Principal Component Analysis (PCA)

Definition - What does Principal Component Analysis (PCA) mean?

Principal component analysis (PCA) is a technique for deriving a smaller number of uncorrelated variables, known as principal components, from a larger set of possibly correlated variables. The technique is widely used to emphasize variation and bring out strong patterns in a data set. Invented by Karl Pearson in 1901, principal component analysis is used in exploratory data analysis and in building predictive models. It is applied in fields such as image compression, face recognition, neuroscience and computer graphics.

Techopedia explains Principal Component Analysis (PCA)

Principal component analysis helps make data easier to explore and visualize. It is a simple, non-parametric technique for extracting information from complex and confusing data sets. Its goal is to capture as much of the variance in the data as possible with as few principal components as possible. One distinct advantage of principal component analysis is that once patterns have been found in the data, the data can also be compressed by keeping only the leading components. Principal component analysis is typically used to reduce the number of variables, to handle cases where there are too many predictors relative to the number of observations, or to avoid multicollinearity.

It is closely related to canonical correlation analysis and uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables known as principal components. The number of principal components is less than or equal to the number of original variables or the number of observations, whichever is smaller. Principal component analysis is sensitive to the relative scaling of the original variables, so variables measured in different units are usually standardized before the analysis.
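The orthogonal transformation described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than a reference implementation: the function name `pca`, the `standardize` flag and the synthetic data are assumptions made for the example, and the singular value decomposition is only one common way to compute the components.

```python
import numpy as np

def pca(X, n_components, standardize=False):
    """Sketch of PCA. Rows of X are observations, columns are variables.

    Returns (scores, components, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    # Center each variable; PCA is defined on mean-centered data.
    Xc = X - X.mean(axis=0)
    if standardize:
        # Optional: put all variables on the same scale, since PCA
        # is sensitive to the relative scaling of the variables.
        Xc = Xc / Xc.std(axis=0, ddof=1)
    # SVD of the centered data: the rows of Vt are orthonormal
    # directions ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    # Coordinates of each observation along the kept directions.
    scores = Xc @ components.T
    # Variance along each direction is S**2 / (n - 1).
    variances = S**2 / (X.shape[0] - 1)
    ratio = variances[:n_components] / variances.sum()
    return scores, components, ratio

# Hypothetical data: 200 observations of 3 variables,
# the first two of which are strongly correlated.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([
    a,
    2 * a + rng.normal(scale=0.1, size=200),
    rng.normal(size=200),
])

scores, components, ratio = pca(X, n_components=2)
print("share of variance per component:", ratio)
print("covariance of the scores:\n", np.cov(scores, rowvar=False))
```

The covariance matrix of the scores is diagonal up to rounding error, which is what "uncorrelated" means here. Calling `pca(X, 2, standardize=True)` on the same data generally gives different components, illustrating the sensitivity to scaling.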

Principal component analysis is widely used in areas such as market research and the social sciences, and in industries that work with large data sets. The technique can also provide a lower-dimensional picture of the original data. With relatively little effort, principal component analysis reduces a complex and confusing data set to a simpler, more useful one.
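As a rough illustration of the lower-dimensional picture, the sketch below projects five variables onto two principal components and then rebuilds an approximation of the original data from them. The data are synthetic and every name (`factors`, `mixing`, `k`) is hypothetical, chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 100 observations of 5 variables that are
# driven by 2 hidden factors plus a small amount of noise.
factors = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 5))
X = factors @ mixing + rng.normal(scale=0.05, size=(100, 5))

# Center the data and find the principal directions via SVD.
mean = X.mean(axis=0)
Xc = X - mean
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2                       # number of components to keep
Z = Xc @ Vt[:k].T           # 100 x 2 lower-dimensional picture
X_hat = Z @ Vt[:k] + mean   # approximation rebuilt from 2 components

# Share of total variance kept, and relative reconstruction error.
retained = (S[:k] ** 2).sum() / (S ** 2).sum()
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(Xc)
print("variance retained:", retained)
print("relative reconstruction error:", rel_error)
```

Storing `Z`, the two kept rows of `Vt` and `mean` instead of `X` is the sense in which principal component analysis supports compression. The squared relative error equals one minus the share of variance retained.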