Tech moves fast! Stay ahead of the curve with Techopedia!
Join nearly 200,000 subscribers who receive actionable tech insights from Techopedia.
Clustering involves the grouping of similar objects into a set known as cluster. Objects in one cluster are likely to be different when compared to objects grouped under another cluster. Clustering is one of the main tasks in exploratory data mining and is also a technique used in statistical data analysis. While clustering is not one specific algorithm, it is a general task that can be solved by means of several algorithms. Some of the popular clustering methods that are used include hierarchical, partitioning, density-based and model-based.
Clustering is also known as clustering analysis.
Clustering is the act of creating various clusters that have all objects under the data set. Further, clustering can be distinguished into hard and soft clustering. Under hard clustering, an object either belongs to a cluster or it does not. However, with soft clustering (fuzzy clustering) an object can belong to many clusters. The ultimate aim of clustering is to intrinsically group unlabeled data. It finds applications in market research, pattern recognition, data mining and analysis, data compression, image recognition and more.
The concept of a cluster cannot be easily defined, and this is largely why several algorithms are available for clustering. These algorithms differ in their properties, and therefore, researchers are known to apply different cluster models based on the data set in question and also what it is intended to be used for. For example, hierarchical clustering is based on distance connectivity, while distribution models are based on statistical distributions.