Apache Pig is a platform for analyzing large data sets. It consists of a high-level language for expressing data analysis programs, along with the infrastructure for evaluating these programs. One of Pig's most significant features is that the structure of its programs is amenable to substantial parallelization.
Pig operates on the Hadoop platform, writing data to and reading data from the Hadoop Distributed File System (HDFS) and performing processing by means of one or more MapReduce jobs. Apache Pig is available as open source.
Apache Pig is also known as Pig Programming Language or Hadoop Pig.
Apache Pig has two parts: the Pig Latin language and the Pig engine. Pig Latin is a scripting language that lets users describe how data from one or more inputs should be read and processed, and where the results should be stored.
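As a rough illustration of that read, process, and store flow, a minimal Pig Latin script might look like the sketch below. The input and output paths, the relation names, and the three-field schema are hypothetical and chosen only for this example; LOAD, FILTER, GROUP, FOREACH ... GENERATE, STORE, PigStorage, and SUM are standard Pig Latin operators and built-ins.

```pig
-- Read: load a tab-separated file from HDFS (path and schema are assumed for illustration).
logs = LOAD 'input/access_log.tsv' USING PigStorage('\t')
       AS (user:chararray, url:chararray, bytes:long);

-- Process: keep only the larger responses, then total the bytes per user.
large    = FILTER logs BY bytes > 1024;
by_user  = GROUP large BY user;
totals   = FOREACH by_user GENERATE group AS user, SUM(large.bytes) AS total_bytes;

-- Store: write the result back to HDFS (output directory is assumed for illustration).
STORE totals INTO 'output/bytes_per_user' USING PigStorage('\t');
```

Each statement defines a new relation from an earlier one, so the script reads as a data flow rather than as step-by-step control logic. The Pig engine is responsible for turning such a script into the MapReduce jobs that actually run on Hadoop.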
Some of the key properties of Pig Latin are as follows: