What Does DataStage Parallel Extender Mean?
DataStage Parallel Extender (DataStage PX) is an IBM data integration tool and one of the widely used extract, transform and load (ETL) tools in the data warehousing industry. It collects data from heterogeneous sources, transforms it according to business requirements and loads it into the target data warehouses.
DataStage PX may also be called DataStage Enterprise Edition.
Techopedia Explains DataStage Parallel Extender
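DataStage Parallel Extender processes data using a parallel architecture. It implements two main types of parallelism: pipeline parallelism, in which downstream stages start working on rows while upstream stages are still producing them, and partition parallelism, in which the data is split into subsets that are processed simultaneously. Processing data in parallel substantially reduces run time for large data volumes.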
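The two kinds of parallelism can be illustrated with a small Python sketch. This is only an analogy, not DataStage code: the function names (hash_partition, extract, transform, run_job) and the sample columns (customer_id, name) are hypothetical. Generators stand in for pipelining by passing each row downstream as soon as it is produced (they show the streaming data flow rather than truly concurrent stages), and a thread pool stands in for the processing of partitions side by side.

```python
from concurrent.futures import ThreadPoolExecutor


def hash_partition(rows, key, n):
    """Partition parallelism: split rows into n subsets by hashing a key column."""
    partitions = [[] for _ in range(n)]
    for row in rows:
        partitions[hash(row[key]) % n].append(row)
    return partitions


def extract(rows):
    """Upstream stage: hands rows downstream one at a time."""
    for row in rows:
        yield row


def transform(rows):
    """Downstream stage: consumes each row as soon as the upstream stage yields it."""
    for row in rows:
        yield {**row, "name": row["name"].upper()}


def run_partition(rows):
    """Run the extract -> transform pipeline on a single partition."""
    return list(transform(extract(rows)))


def run_job(rows, key="customer_id", n=4):
    """Partition the input, process every partition in a worker, collect the output."""
    partitions = hash_partition(rows, key, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(run_partition, partitions))
    return [row for part in results for row in part]


# Hypothetical sample data.
rows = [{"customer_id": i, "name": f"c{i}"} for i in range(10)]
output = run_job(rows)
```

Partitioning on a key keeps all rows with the same key value in the same subset, which matters for operations such as joins and aggregations that need related rows together.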
DataStage Parallel Extender provides a variety of stages through which source data is processed and loaded into target databases, whose volumes are often measured in terabytes. Besides stages, DataStage PX uses containers to reuse job components, and sequences to run and schedule multiple jobs at the same time.
Commonly used stages in DataStage Parallel Extender include:
- Transformer
- Aggregator
- Data Set
- Copy
- Change Apply
- Modify
- Filter
- Join
- Merge
- Lookup
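To give a rough idea of what a few of these stages do, the sketch below mimics a Filter, a Lookup and an Aggregator in plain Python. It is an analogy under stated assumptions, not DataStage syntax: the function names, the sample columns and the choice to drop rows that have no lookup match are all hypothetical.

```python
from collections import defaultdict


def filter_stage(rows, predicate):
    """Filter: keep only the rows that satisfy a condition."""
    return [r for r in rows if predicate(r)]


def lookup_stage(rows, reference, key):
    """Lookup: enrich each row with columns from in-memory reference data.

    Assumption: rows without a match are dropped in this sketch.
    """
    return [{**r, **reference[r[key]]} for r in rows if r[key] in reference]


def aggregator_stage(rows, group_key, value_key):
    """Aggregator: group rows by one column and sum another."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[group_key]] += r[value_key]
    return dict(totals)


# Hypothetical sample data.
orders = [
    {"customer_id": 1, "amount": 50},
    {"customer_id": 2, "amount": 5},
    {"customer_id": 1, "amount": 30},
    {"customer_id": 3, "amount": 40},
]
customers = {1: {"region": "EU"}, 2: {"region": "US"}}

kept = filter_stage(orders, lambda r: r["amount"] >= 10)
enriched = lookup_stage(kept, customers, "customer_id")
totals = aggregator_stage(enriched, "region", "amount")
print(totals)  # {'EU': 80}
```

In an actual job these steps would be stages linked on a design canvas rather than function calls, but the data flow is the same: each stage receives rows from the previous one and passes its result on.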