Apache Pig

What Does Apache Pig Mean?

Apache Pig is a platform for analyzing large data sets. It consists of a high-level language for expressing data analysis programs, along with the infrastructure for evaluating those programs. The most significant property of Pig programs is that their structure is amenable to substantial parallelization, which in turn allows them to handle very large data sets.

Pig runs on the Hadoop platform: it reads data from and writes data to the Hadoop Distributed File System (HDFS) and carries out processing as one or more MapReduce jobs. Apache Pig is open-source software.
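To make the MapReduce step more concrete, here is a minimal Python sketch of the map, shuffle and reduce phases for a per-user sum. It does not use Hadoop or Pig; the function names (`map_phase`, `shuffle`, `reduce_phase`, `total_bytes_per_user`) and the sample records are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample input: (user, bytes) records.
RECORDS = [("alice", 120), ("bob", 300), ("alice", 80), ("carol", 50), ("bob", 10)]


def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for user, nbytes in records:
        yield user, nbytes


def shuffle(pairs):
    """Collect all values that share a key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def total_bytes_per_user(records):
    """Chain the three phases into one job."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(total_bytes_per_user(RECORDS))
```

In a real cluster the map and reduce phases run in parallel across many machines; this single-process version only shows how the work is divided.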

Apache Pig is also known as Pig Programming Language or Hadoop Pig.

Techopedia Explains Apache Pig

Apache Pig has two parts: the Pig Latin language and the Pig engine. Pig Latin is a scripting language that lets users describe how data from one or more inputs should be read and processed, and where the results should be stored.
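The read, process and store flow described above can be sketched in plain Python. This is an analogy rather than Pig itself: the helper names (`load`, `filter_large`, `group_and_sum`, `store`, `run_flow`), the in-memory input and the 1000-byte threshold are all assumptions. The comments name the Pig Latin statement that each step roughly corresponds to.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-separated input standing in for a file on HDFS.
INPUT_TEXT = "alice\t1200\nbob\t300\nalice\t2500\ncarol\t5000\n"


def load(text):
    """Read (user, bytes) rows; roughly the role of a LOAD statement."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [(user, int(nbytes)) for user, nbytes in reader]


def filter_large(rows, threshold):
    """Keep rows above the threshold; roughly the role of FILTER ... BY."""
    return [row for row in rows if row[1] > threshold]


def group_and_sum(rows):
    """Total bytes per user; roughly GROUP ... BY followed by FOREACH ... GENERATE SUM."""
    totals = defaultdict(int)
    for user, nbytes in rows:
        totals[user] += nbytes
    return sorted(totals.items())


def store(rows):
    """Serialize the result as tab-separated text; roughly the role of STORE ... INTO."""
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()


def run_flow(text, threshold=1000):
    """Run the whole data flow from input text to output text."""
    return store(group_and_sum(filter_large(load(text), threshold)))


if __name__ == "__main__":
    print(run_flow(INPUT_TEXT), end="")
```

Each step takes the previous step's output as its input, which is the sense in which a Pig Latin script is a data flow rather than a single query.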

Some of the key properties of Pig Latin are as follows:

  • Easy to program: Complex tasks made up of multiple interrelated data transformations are explicitly encoded as data flow sequences, which makes them simple to write, understand and maintain.
  • Optimization opportunities: The way in which tasks are encoded allows the system to optimize their execution automatically, so the user can focus on semantics rather than efficiency.
  • Extensibility: Users can create their own functions to carry out special-purpose processing.

The Pig engine is responsible for executing data flows written in Pig Latin. Much like a standard relational database management system (RDBMS), it consists of a parser, an optimizer and a type checker, along with operators that carry out data processing. Unlike an RDBMS, Pig does not provide transactions or a data catalog, and it does not manage data storage or implement the execution framework itself; it relies on HDFS and MapReduce for those.
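The extensibility property listed above, user-written functions for special-purpose processing, can be illustrated with a small Python sketch. It is not Pig's actual function interface: `normalize_domain` and `apply_function` are hypothetical names, and the rows are made-up sample data. The idea is only that a custom function is applied to one field of every record in the flow.

```python
from urllib.parse import urlparse


def normalize_domain(url):
    """Hypothetical special-purpose function: return the lowercase host without 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def apply_function(rows, func, field_index):
    """Apply func to one field of every row, leaving the other fields unchanged."""
    result = []
    for row in rows:
        fields = list(row)
        fields[field_index] = func(fields[field_index])
        result.append(tuple(fields))
    return result


if __name__ == "__main__":
    # Made-up sample rows: (user, url).
    visits = [
        ("alice", "https://WWW.Example.com/path"),
        ("bob", "http://docs.example.org/index.html"),
    ]
    print(apply_function(visits, normalize_domain, 1))
```

Keeping the custom logic in a standalone function means the rest of the data flow does not need to change when the function does.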

Margaret Rouse
Senior Editor

Margaret is an award-winning technical writer and teacher known for her ability to explain complex technical subjects to a non-technical business audience. Over the past twenty years, her IT definitions have been published by Que in an encyclopedia of technology terms and cited in articles by the New York Times, Time Magazine, USA Today, ZDNet, PC Magazine, and Discovery Magazine. She joined Techopedia in 2011. Margaret's idea of a fun day is helping IT and business professionals learn to speak each other’s highly specialized languages.