Data Profiling

Definition - What does Data Profiling mean?

Data profiling is a technique for examining data to assess qualities such as accuracy and completeness. The process inspects a data source, such as a database, to uncover errors and inconsistencies in how the data is organized. Acting on the findings helps improve data quality.

Data profiling is also referred to as data discovery.
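
The completeness check mentioned in the definition can be illustrated with a minimal Python sketch. The function name completeness, the sample customers records and the rule that None or an empty string counts as missing are all assumptions made for this example; they do not come from any particular profiling tool.

```python
def completeness(records, field):
    """Return the fraction of records in which `field` is present and non-empty."""
    if not records:
        return 0.0
    # Assumption: None and the empty string both count as missing.
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


# Hypothetical sample data: two of the four records lack a usable email.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3},
    {"id": 4, "email": "d@example.com"},
]

email_completeness = completeness(customers, "email")
print(f"email completeness: {email_completeness:.0%}")
```

A low ratio for a field that is supposed to be mandatory is the kind of problem area that profiling is meant to surface.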

Techopedia explains Data Profiling

Data profiling is the process of examining the data available in a data source and collecting statistics and information about that data. These statistics, which are themselves a form of metadata, help to identify how the data can be used and how good its quality is. The method is widely used in enterprise data warehousing.

Data profiling clarifies the structure, relationships, content and derivation rules of data, which aids in understanding anomalies within the data and its metadata. It uses descriptive statistics such as mean, minimum, maximum, percentile and frequency, along with aggregates such as count and sum. Additional metadata obtained during profiling includes data type, length, discrete values, uniqueness and abstract type recognition.
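
As a rough illustration of the statistics and metadata listed above, the following Python sketch profiles a single column using only the standard library. The function profile_column, the keys of the dictionary it returns and the sample ages column are hypothetical names chosen for this example, and the median stands in for the 50th percentile; real profiling tools collect far more than this.

```python
from collections import Counter
from statistics import mean, median


def profile_column(values):
    """Collect simple descriptive statistics and metadata for one column."""
    present = [v for v in values if v is not None]
    distinct = set(present)
    profile = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(distinct),
        "is_unique": len(distinct) == len(present),
        "types": sorted({type(v).__name__ for v in present}),
        # The three most frequent values with their counts.
        "frequency": Counter(present).most_common(3),
    }
    # Numeric aggregates (bool is excluded because it subclasses int).
    numbers = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numbers:
        profile.update(
            min=min(numbers),
            max=max(numbers),
            sum=sum(numbers),
            mean=mean(numbers),
            median=median(numbers),  # 50th percentile
        )
    # Length metadata for text values.
    strings = [v for v in present if isinstance(v, str)]
    if strings:
        profile["max_length"] = max(len(s) for s in strings)
    return profile


# Hypothetical column with one missing value and one repeated value.
ages = [34, 29, None, 41, 29]
age_profile = profile_column(ages)
for name, value in age_profile.items():
    print(f"{name}: {value}")
```

Running the same function over every column of a table gives a quick overview of where values are missing, duplicated or of an unexpected type.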
