Job Chaining

Definition - What does Job Chaining mean?

Job chaining is a MapReduce term for running several MapReduce jobs in sequence as a single workflow. The first job sends its output to the second job, which sends its output to the next job in the chain, and so on until the final job completes. It is a way of pipelining MapReduce jobs that keeps each step manageable.

Techopedia explains Job Chaining

Job chaining in MapReduce refers to linking multiple map and reduce steps so that they run as one workflow.

For example, a job chain might consist of:

Map1 > Reduce1 > Map2 > Reduce2
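
To make the chain above concrete, here is a minimal in-memory sketch in plain Java. It does not use Hadoop; the class and method names (ChainSketch, wordCount, countHistogram, runChain) are hypothetical and chosen only for illustration. The first stage stands in for Map1 > Reduce1 (a word count), and the second stage stands in for Map2 > Reduce2 (how many words share each count), consuming the first stage's output as its input.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory illustration of a two-job chain; not Hadoop code.
final class ChainSketch {

    // Stage 1 (stands in for Map1 > Reduce1): split lines into words
    // and sum the occurrences of each word.
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Stage 2 (stands in for Map2 > Reduce2): take stage 1's output and
    // count how many distinct words occur a given number of times.
    static Map<Integer, Integer> countHistogram(Map<String, Integer> wordCounts) {
        Map<Integer, Integer> histogram = new TreeMap<>();
        for (int occurrences : wordCounts.values()) {
            histogram.merge(occurrences, 1, Integer::sum);
        }
        return histogram;
    }

    // The chain: the output of stage 1 is the input of stage 2.
    static Map<Integer, Integer> runChain(List<String> lines) {
        return countHistogram(wordCount(lines));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a b a", "b c a");
        System.out.println(runChain(input)); // prints {1=1, 2=1, 3=1}
    }
}
```

In a real cluster each stage would be a separate job reading and writing distributed storage; the sketch only shows how one stage's output becomes the next stage's input.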

The advantage of job chaining is that the developer does not have to move intermediate data by hand between the steps of a pipeline. In that sense, job chaining is similar to input/output redirection in the Unix shell: output from one link in the chain flows to the input of the next. Hadoop's MapReduce implementation lets developers specify dependencies, that is, which jobs must complete before a later job in the chain can start, through the addDependingJob() method call.
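
The dependency idea can be sketched without a Hadoop installation. The following plain-Java example is an assumption-laden stand-in: SketchJob and runOrder are hypothetical names, and the addDependingJob method here only imitates the shape of the Hadoop call mentioned above (the argument is the job that must finish first). It shows how declared dependencies, rather than the order in which jobs are submitted, decide the execution order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of dependency-driven ordering; not Hadoop code.
final class DependencySketch {

    static final class SketchJob {
        final String name;
        final List<SketchJob> dependsOn = new ArrayList<>();

        SketchJob(String name) {
            this.name = name;
        }

        // Stand-in named after the Hadoop method: the argument is a job
        // that must complete before this job may start.
        void addDependingJob(SketchJob prerequisite) {
            dependsOn.add(prerequisite);
        }
    }

    // Repeatedly "runs" every job whose prerequisites are all done and
    // returns the names in the order the jobs became runnable.
    // Assumes the list contains no duplicate jobs.
    static List<String> runOrder(List<SketchJob> jobs) {
        List<String> order = new ArrayList<>();
        Set<SketchJob> done = new HashSet<>();
        while (done.size() < jobs.size()) {
            boolean progressed = false;
            for (SketchJob job : jobs) {
                if (!done.contains(job) && done.containsAll(job.dependsOn)) {
                    order.add(job.name);
                    done.add(job);
                    progressed = true;
                }
            }
            if (!progressed) {
                throw new IllegalStateException("cyclic or missing dependency");
            }
        }
        return order;
    }

    public static void main(String[] args) {
        SketchJob job1 = new SketchJob("job1");
        SketchJob job2 = new SketchJob("job2");
        SketchJob job3 = new SketchJob("job3");
        job2.addDependingJob(job1); // job2 waits for job1
        job3.addDependingJob(job2); // job3 waits for job2

        // Submitted in reverse order on purpose.
        System.out.println(runOrder(Arrays.asList(job3, job2, job1)));
        // prints [job1, job2, job3]
    }
}
```

Even though the jobs are handed over in reverse, the declared dependencies force job1 to run first and job3 last, which is the behavior a chained pipeline relies on.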

Because each job in the chain stays small and focused on one step, it is easier for a developer to write a MapReduce program that can process large amounts of data.
