Job Chaining

What Does Job Chaining Mean?

Job chaining is a MapReduce technique in which several MapReduce jobs run in sequence as a single workflow. The first job's output becomes the input to the second job, whose output feeds the next job in the chain, and so on until the final job completes. It is a form of pipelining that breaks a complex computation into smaller, more manageable jobs.

Techopedia Explains Job Chaining

Job chaining in MapReduce refers to running multiple map and reduce steps in sequence as a single workflow.

For example, a job chain might consist of:

Map1 > Reduce1 > Map2 > Reduce2
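
The chain above can be illustrated with a minimal in-memory sketch. It is plain Python rather than Hadoop code, and the names run_job, map1, reduce1, map2 and reduce2, along with the sample input, are assumptions made for illustration. The first job counts words; the second job takes those counts as its input and groups words by frequency.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Run one in-memory MapReduce job: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Sort keys so the output order is deterministic.
    return [reducer(key, groups[key]) for key in sorted(groups)]


# Job 1 (hypothetical): count occurrences of each word.
def map1(line):
    return [(word, 1) for word in line.split()]


def reduce1(word, ones):
    return (word, sum(ones))


# Job 2 (hypothetical): group words by how often they occurred.
def map2(pair):
    word, count = pair
    return [(count, word)]


def reduce2(count, words):
    return (count, sorted(words))


lines = ["a b a", "b c a"]
word_counts = run_job(lines, map1, reduce1)            # Map1 > Reduce1
words_by_count = run_job(word_counts, map2, reduce2)   # Map2 > Reduce2
print(words_by_count)  # [(1, ['c']), (2, ['b']), (3, ['a'])]
```

The only link between the two jobs is that the output of the first call is passed as the input of the second, which is the essence of the chain.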

The advantage of job chaining is that the developer does not have to move intermediate data by hand between the steps in a pipeline. In that sense, job chaining is similar to input/output redirection in the Unix shell: output from one link in the chain flows into the input of the next. MapReduce lets developers specify dependencies, that is, which jobs must complete before the next jobs in the chain run, through the addDependingJob() method call.
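
The idea behind dependency declarations can be sketched in a few lines of Python. This is not the Hadoop API: SketchJob, add_depending_job and run_all are hypothetical names that only mirror the behavior described for addDependingJob(), namely that a job runs only after the jobs it depends on have finished.

```python
class SketchJob:
    """Hypothetical stand-in for a schedulable job; not a Hadoop class."""

    def __init__(self, name, action):
        self.name = name
        self.action = action
        self.depends_on = []

    def add_depending_job(self, job):
        # Record that `job` must finish before this job may start.
        self.depends_on.append(job)


def run_all(jobs):
    """Run jobs so that each one starts only after its dependencies."""
    finished = []
    pending = list(jobs)
    while pending:
        ready = [
            job for job in pending
            if all(dep.name in finished for dep in job.depends_on)
        ]
        if not ready:
            raise ValueError("circular or unsatisfiable dependency")
        for job in ready:
            job.action()
            finished.append(job.name)
            pending.remove(job)
    return finished


log = []
count_words = SketchJob("count_words", lambda: log.append("count_words ran"))
rank_words = SketchJob("rank_words", lambda: log.append("rank_words ran"))
rank_words.add_depending_job(count_words)

# Submission order is deliberately reversed; the dependency decides.
order = run_all([rank_words, count_words])
print(order)  # ['count_words', 'rank_words']
```

Declaring the dependency on the downstream job, rather than relying on submission order, is what lets a scheduler decide safely when each link in the chain may start.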

Chaining jobs in this way makes it easier for a developer to write a MapReduce program that processes large amounts of data in several stages.

Margaret Rouse
Technology Expert

Margaret is an award-winning technical writer and teacher known for her ability to explain complex technical subjects to a non-technical business audience. Over the past twenty years, her IT definitions have been published by Que in an encyclopedia of technology terms and cited in articles by the New York Times, Time Magazine, USA Today, ZDNet, PC Magazine, and Discovery Magazine. She joined Techopedia in 2011. Margaret's idea of a fun day is helping IT and business professionals learn to speak each other’s highly specialized languages.