Why does ‘bagging’ in machine learning decrease variance?


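Bootstrap aggregation, or "bagging," decreases variance in machine learning by training many models on resampled versions of the same data set and combining their predictions. Specifically, the bagging approach draws bootstrap samples, which are random subsets drawn with replacement and therefore often overlapping, fits a separate model to each one, and then averages the results (or takes a majority vote for classification). Because the errors of the individual models are only partly correlated, averaging cancels some of them out.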
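The variance argument can be stated a little more precisely. As a sketch, assume B models whose predictions at a given point x each have variance σ² and pairwise correlation ρ. The symbols B, σ², ρ and the notation for the individual predictors are introduced here for illustration and are not part of the original discussion. Under those assumptions, the variance of the averaged prediction is:

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B}\hat{f}_b(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
```

The second term shrinks as B grows while the first does not, which suggests why bagging tends to help most when the individual models are not too strongly correlated with one another.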

One straightforward way to see bagging at work is to take a set of random samples, fit a single model such as a decision tree, and note its prediction. Then, using the same set of samples, create dozens of bootstrap subsets, build a decision tree on each one, and average the trees' predictions. The averaged prediction usually gives a more stable picture than the single tree does, because it depends less on the quirks of any one sample. The same idea can be applied to many other estimators computed from a set of data points, though it helps most when the underlying estimator is unstable.
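The procedure described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names `fit_1nn` and `fit_bagged`, the toy data, and the choice of 50 models are all hypothetical, and a one-nearest-neighbor predictor stands in for a decision tree to keep the example free of dependencies (on one-dimensional data it behaves much like a fully grown regression tree).

```python
import random

def fit_1nn(xs, ys):
    """Return a predictor that answers with the y of the nearest training x.

    Stand-in for an unpruned decision tree: flexible, but high in variance.
    """
    pairs = list(zip(xs, ys))

    def predict(x):
        return min(pairs, key=lambda p: abs(p[0] - x))[1]

    return predict

def fit_bagged(xs, ys, n_models=50, rng=None):
    """Fit n_models base learners on bootstrap resamples and average them."""
    rng = rng or random.Random(0)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: n indices drawn with replacement, so some points
        # repeat and others are left out of this particular resample.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_1nn([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        # Aggregation step: average the individual predictions.
        return sum(m(x) for m in models) / len(models)

    return predict

# Toy data (an assumption for illustration): y = x plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(30)]
ys = [x + rng.gauss(0, 0.5) for x in xs]

single = fit_1nn(xs, ys)
bagged = fit_bagged(xs, ys, n_models=50, rng=rng)
print("single model:", single(1.52))
print("bagged model:", bagged(1.52))
```

The single model simply repeats the noisy value of the closest point, while the bagged prediction blends the values of several nearby points, weighted by how often each ends up being the nearest neighbor across the resamples.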

Since this approach averages out the quirks of individual models, it decreases variance and helps with overfitting. Think of a scatterplot with widely scattered data points: a single flexible model may chase every point, whereas a bagged model effectively "shrinks" that complexity and produces a smoother fitted line.

Some talk about the value of bagging as "divide and conquer" or a type of "assisted heuristics." The idea is that through ensemble modeling, such as the use of random forests, those using bagging as a technique can get predictions that are lower in variance. By smoothing out a model's effective complexity, bagging can also help with overfitting. Think of a model that follows every data point too closely: say, a connect-the-dots with 100 unaligned dots. The resulting line will be jagged and volatile. Bagging "irons out" that variance by combining many separately fitted models. In ensemble learning, this is often described as joining several "weak learners" to produce a "strong learner," although that phrase comes mainly from boosting; the base models in bagging are typically flexible, high-variance learners such as deep decision trees. The result is a smoother, more contoured line, and less wild variance in the model.
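The claim about lower variance can be checked with a small simulation. The sketch below is an assumption-laden illustration, not a benchmark: it repeatedly draws a fresh training set from a made-up process (y = x plus noise), fits one jagged "connect-the-dots" learner and one bagged ensemble of the same learner, and compares how much each prediction at a fixed point moves from one training set to the next. All names and constants (`fit_1nn`, `fit_bagged`, `draw_training_set`, 300 repetitions, 25 models) are hypothetical.

```python
import random
import statistics

def fit_1nn(xs, ys):
    """High-variance base learner: predict the y of the nearest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def fit_bagged(xs, ys, n_models, rng):
    """Average n_models base learners, each fitted on a bootstrap resample."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_1nn([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)

def draw_training_set(rng, n=40):
    """Simulated data (an assumption): y = x + Gaussian noise on [0, 3]."""
    xs = [rng.uniform(0, 3) for _ in range(n)]
    ys = [x + rng.gauss(0, 1.0) for x in xs]
    return xs, ys

rng = random.Random(7)
x0 = 1.5  # fixed query point at which the two predictors are compared
single_preds, bagged_preds = [], []

for _ in range(300):
    xs, ys = draw_training_set(rng)
    single_preds.append(fit_1nn(xs, ys)(x0))
    bagged_preds.append(fit_bagged(xs, ys, 25, rng)(x0))

# Variance of the prediction across independently drawn training sets.
var_single = statistics.pvariance(single_preds)
var_bagged = statistics.pvariance(bagged_preds)
print(f"variance of single learner: {var_single:.3f}")
print(f"variance of bagged learner: {var_bagged:.3f}")
```

In this setup the bagged predictions should vary noticeably less across training sets than the single learner's, which is the "ironing out" effect described above. The exact figures depend on the seed and on the simulated data.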

It's easy to see how the idea of bagging can be applied to enterprise IT systems. Business leaders often want a "bird's eye view" of what's going on with products, customers, etc. An overfitted model can return less digestible, more "scattered" results, whereas bagging can "stabilize" a model and make it more useful to end users.

Justin Stoltzfus is an independent blogger and business consultant assisting a range of businesses in developing media solutions for new campaigns and ongoing operations. He is a graduate of James Madison University. Stoltzfus spent several years as a staffer at the Intelligencer Journal in Lancaster, Penn., before the merger of the city’s two daily newspapers in 2007. He also reported for the twin weekly newspapers in the area, the Ephrata Review and the Lititz Record. More recently, he has cultivated connections with various companies as an independent consultant, writer and trainer, collecting bylines in print and Web publications, and establishing a reputation…

