What is the difference between supervised, unsupervised and semi-supervised learning?
The key difference between supervised and unsupervised learning in machine learning is whether the training data is labeled with the correct answers.
Supervised learning uses labeled example data to show what “correct” output looks like. Each example pairs an input with the output it should produce.
A machine learning algorithm that classifies fruits might have pictures of fruits such as apples, bananas, grapes and oranges as inputs and the names of these fruits as outputs.
A real-world example is the Bayesian spam filter found in email programs. These filters are trained on emails that have been marked as spam or as legitimate. The filter then looks for words and phrases that appear far more often in spam than in legitimate mail and moves messages containing them to a spam folder.
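To make the spam filter idea concrete, here is a minimal sketch of a naive Bayes classifier in plain Python. The four training messages, the "spam"/"ham" label names and the function names are all made up for illustration, and real filters are far more elaborate, but the principle of learning from labeled examples is the same.

```python
import math
from collections import Counter

def train(messages):
    """Count words per label. `messages` is a list of (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the higher naive Bayes log-probability."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        words_in_label = sum(word_counts[label].values())
        # Start from the prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_messages)
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (words_in_label + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labeled training data: the "supervision" in supervised learning.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

model = train(TRAINING)
print(classify("free money", *model))      # spam
print(classify("monday meeting", *model))  # ham
```

The classifier never sees a rule such as "free means spam". It infers that from the labels attached to the training messages.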
It’s like showing a human how to do a new task. A person doing data entry might be shown examples of data in the format the company wants and then be expected to follow that format.
Machine learning programs using supervised learning typically make many passes over the training data, adjusting themselves to reduce their errors on the labeled examples. With enough labeled data, the results can be impressive. Google’s Gmail spam filter, for example, benefits from the very large number of users who mark messages as spam or not spam.
Unsupervised learning works without labeled training data. In our fruit classification example, an algorithm might just be shown pictures of fruit and left to group similar ones together on its own, without ever being told the names of the fruits.
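As a rough illustration of grouping without labels, the sketch below runs k-means clustering, one common unsupervised technique, on a handful of fruit measurements. The (weight in grams, length in centimeters) numbers are invented for this example, and using the first k points as starting centroids is a simplification chosen to keep the result deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=10):
    """Group points into k clusters and return one cluster index per point."""
    # Simplification: start from the first k points instead of random ones.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

# Hypothetical unlabeled measurements: (weight in grams, length in cm).
# No fruit names are supplied anywhere.
FRUITS = [(5, 2.0), (120, 18.0), (6, 2.5), (130, 20.0)]

labels = kmeans(FRUITS, k=2)
print(labels)  # [0, 1, 0, 1]
```

The algorithm separates the small, light items from the large, heavy ones, but it only produces group numbers. Deciding that one group is "grapes" and the other "bananas" still takes a human or some labeled data.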
Unsupervised learning has applications in market research, where it can find patterns in customer purchasing habits, and in security, where it can flag unusual activity that may indicate an attack.
Semi-supervised learning attempts to take a middle ground by labeling some of the data. For example, the apple and orange might be labeled in the fruit classification program, but the banana and the grapes aren’t.
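One simple way to use partly labeled data is self-training: start from the few labeled examples and gradually assign labels to the unlabeled ones that look most similar. The sketch below uses a variation of the fruit example in which one apple and one orange are labeled and four more fruits are not. The (hue in degrees, diameter in centimeters) features and the function names are assumptions made for illustration, and nearest-neighbor self-training is only one of several semi-supervised methods.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def self_train(labeled, unlabeled):
    """Label every unlabeled point, most confident (closest) first.

    `labeled` is a list of (features, label) pairs and `unlabeled` is a list
    of feature tuples. Returns (features, label) pairs for the new labels.
    """
    known = list(labeled)
    remaining = list(unlabeled)
    newly_labeled = []
    while remaining:
        # Pick the unlabeled point that sits closest to any labeled point.
        _, index, label = min(
            (
                (squared_distance(point, features), i, lab)
                for i, point in enumerate(remaining)
                for features, lab in known
            ),
            key=lambda candidate: candidate[0],
        )
        point = remaining.pop(index)
        # The new label is treated as known and can help label later points.
        known.append((point, label))
        newly_labeled.append((point, label))
    return newly_labeled

# Hypothetical features: (hue in degrees, diameter in cm).
LABELED = [((5, 7.0), "apple"), ((30, 7.5), "orange")]
UNLABELED = [(8, 7.2), (12, 6.8), (27, 7.4), (24, 7.8)]

result = dict(self_train(LABELED, UNLABELED))
for features, label in result.items():
    print(features, label)
```

Because each newly labeled fruit joins the labeled pool, a point that is far from the original two examples can still be reached through its already-labeled neighbors. The risk is that an early mistake spreads, which is why practical systems usually accept only high-confidence labels.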
Which of these approaches to use depends on the type of data being used. Some tasks have stable patterns, such as credit card fraud or spam messages, and supervised learning is appropriate for these kinds of tasks. Network attacks are less predictable, so unsupervised or semi-supervised learning methods may be more appropriate.
Written by David Delony | Contributor
