[WEBINAR] The New Normal: Dealing with the Reality of an Unsecure World

Data Preprocessing

Definition - What does Data Preprocessing mean?

Data preprocessing is a data mining technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, and/or lacking in certain behaviors or trends, and is likely to contain many errors. Data preprocessing is a proven method of resolving such issues. Data preprocessing prepares raw data for further processing.

Data preprocessing is used database-driven applications such as customer relationship management and rule-based applications (like neural networks).

Techopedia explains Data Preprocessing

Data goes through a series of steps during preprocessing:

  • Data Cleaning: Data is cleansed through processes such as filling in missing values, smoothing the noisy data, or resolving the inconsistencies in the data.
  • Data Integration: Data with different representations are put together and conflicts within the data are resolved.
  • Data Transformation: Data is normalized, aggregated and generalized.
  • Data Reduction: This step aims to present a reduced representation of the data in a data warehouse.
  • Data Discretization: Involves the reduction of a number of values of a continuous attribute by dividing the range of attribute intervals.

Techopedia Deals

Connect with us

Techopedia on Linkedin
Techopedia on Linkedin
"Techopedia" on Twitter

Sign up for Techopedia's Free Newsletter!

Email Newsletter

Join thousands of others with our weekly newsletter

Free Whitepaper: The Path to Hybrid Cloud
Free Whitepaper: The Path to Hybrid Cloud:
The Path to Hybrid Cloud: Intelligent Bursting To Amazon Web Services & Microsoft Azure
Free E-Book: Public Cloud Guide
Free E-Book: Public Cloud Guide:
This white paper is for leaders of Operations, Engineering, or Infrastructure teams who are creating or executing an IT roadmap.
Free Tool: Virtual Health Monitor
Free Tool: Virtual Health Monitor:
Virtual Health Monitor is a free virtualization monitoring and reporting tool for VMware, Hyper-V, RHEV, and XenServer environments.
Free 30 Day Trial – Turbonomic
Free 30 Day Trial – Turbonomic:
Turbonomic delivers an autonomic platform where virtual and cloud environments self-manage in real-time to assure application performance.