Why is semi-supervised learning a helpful model for machine learning?


Semi-supervised learning is an important part of machine learning and deep learning processes, because it expands and enhances the capabilities of machine learning systems in significant ways.

First, in today's nascent machine learning industry, two models have emerged for training computers: These are called supervised and unsupervised learning. They are fundamentally different in that supervised learning involves using labeled data to infer a result, and unsupervised learning involves extrapolating from unlabeled data through examining the properties of each object in a training data set.

Free Download: Machine Learning and Why It Matters

Experts explain this by use of many different examples: Whether the objects in the training set are fruits or colored shapes or client accounts, the commonality in supervised learning is that the technology starts out knowing what those objects are – the primary classifications have already been made. In unsupervised learning, by contrast, the technology looks at as-of-yet-undefined items and classifies them according to its own use of criteria. This is sometimes referred to as "self-learning."

This, then, is the primary utility of semi-supervised learning: It combines the use of labeled and unlabeled data to get "the best of both" approaches.

Supervised learning gives the tech more direction to go from, but it can be costly, labor-intensive, tedious and require much more effort. Unsupervised learning is more "automated," but the results can be much less accurate.

So in using a set of labeled data (often a smaller set in the grand scheme of things) a semi-supervised learning approach effectively "primes" the system to classify better. For example, suppose a machine learning system is trying to identify 100 items according to binary criteria (black vs. white). It can be extremely useful just to have one labeled instance of each (one white, one black) and then cluster the remaining "gray" items according to whichever criteria is best. As soon as those two items are labeled, though, unsupervised learning becomes semi-supervised learning.

In directing semi-supervised learning, engineers look closely at decision boundaries that influence machine learning systems to classify toward one or the other labeled result when evaluating unlabeled data. They will think about how to best use semi-supervised learning in any implementation: For example, a semi-supervised learning algorithm can "wrap around" an existing unsup algorithm for a "one-two" approach.

Semi-supervised learning as a phenomenon is sure to push the frontiers of machine learning forward, as it opens up all sorts of new possibilities for more effective and more efficient machine learning systems.

Related Terms

Justin Stoltzfus

Justin Stoltzfus is an independent blogger and business consultant assisting a range of businesses in developing media solutions for new campaigns and ongoing operations. He is a graduate of James Madison University.Stoltzfus spent several years as a staffer at the Intelligencer Journal in Lancaster, Penn., before the merger of the city’s two daily newspapers in 2007. He also reported for the twin weekly newspapers in the area, the Ephrata Review and the Lititz Record.More recently, he has cultivated connections with various companies as an independent consultant, writer and trainer, collecting bylines in print and Web publications, and establishing a reputation…