What’s a simple way to describe bias and variance in machine learning?

By Justin Stoltzfus | Last updated: June 28, 2022
Made Possible By AltaML

There are any number of complicated ways to describe bias and variance in machine learning. Many of them rely on complex mathematical equations and use graphs to show how specific examples represent different amounts of bias and variance.

Here’s a simple way to describe bias, variance and the bias/variance trade-off in machine learning.

At its core, bias is an oversimplification. It helps to add to that definition the idea of an assumption, or an assumed error, that is built into the model.

If a highly biased result were not in error, if it were on the money, it would be highly accurate. The problem is that the simplified model contains a systematic error, so it is not on the bull’s-eye, and that error keeps getting repeated or even amplified as the machine learning program works. Feeding the program more data does not remove it.
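As a rough illustration of that repeated error, the sketch below fits a straight line to data that actually follows a curve. The target function, noise level, sample sizes and the helper name fit_error are all assumptions chosen for the example, not part of any particular library or workflow.

```python
import numpy as np

def true_f(x):
    # Hypothetical "real" relationship: a simple curve.
    return x ** 2

def fit_error(n_samples, degree, seed=0):
    """Fit a polynomial of the given degree to noisy samples of true_f
    and return its mean squared error against the noise-free curve."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n_samples)
    y = true_f(x) + rng.normal(0.0, 0.1, n_samples)
    coeffs = np.polyfit(x, y, degree)
    grid = np.linspace(-1.0, 1.0, 201)
    return float(np.mean((np.polyval(coeffs, grid) - true_f(grid)) ** 2))

# A straight line (degree 1) is too simple for a curve, so its error
# stays well above zero however many samples it is given.
for n in (20, 200, 2000):
    print(n, "samples, straight-line error:", round(fit_error(n, degree=1), 4))

# A degree-2 model matches the shape of the data and does not share that error.
print("degree-2 error with 2000 samples:", round(fit_error(2000, degree=2), 6))
```

The straight line misses in roughly the same way at every sample size, which is the sense in which a biased model is consistently off target.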

The simple definition of variance is that the results are too scattered. It often goes hand in hand with an overly complex program, and it shows up as a gap between how the model performs on the training set and how it performs on the test set.

High variance means that small changes in the training data create great changes in outputs or results.

Another way to simply describe variance is that the model picks up too much of the noise in its training data, and so it gets harder for the machine learning program to isolate and identify the real signal.
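One way to see this sensitivity is to refit the same model many times on data that differs only in its random noise, then look at how much a single prediction moves. The sketch below does that for a simple and a complex polynomial. The sine-shaped signal, the noise level, the query point x0 and the helper name prediction_spread are assumptions made for the example.

```python
import numpy as np

def prediction_spread(degree, x0=0.95, n_points=15, n_trials=300, seed=0):
    """Refit a polynomial on many noisy copies of the same signal and
    return the standard deviation of its prediction at x0."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_points)
    signal = np.sin(np.pi * x)
    preds = []
    for _ in range(n_trials):
        # Only the noise changes between trials; the signal is identical.
        y = signal + rng.normal(0.0, 0.2, n_points)
        coeffs = np.polyfit(x, y, degree)
        preds.append(np.polyval(coeffs, x0))
    return float(np.std(preds))

spread_simple = prediction_spread(degree=1)
spread_complex = prediction_spread(degree=9)
print("degree 1 prediction spread:", round(spread_simple, 3))
print("degree 9 prediction spread:", round(spread_complex, 3))
```

The more flexible model chases the noise in each training set, so its prediction at the same point varies more from one refit to the next.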

So one of the simplest ways to compare bias and variance is to suggest that machine learning engineers have to walk a fine line between too much bias or oversimplification, and too much variance or overcomplexity.
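That fine line can be sketched by fitting models of increasing complexity to one training set and scoring each on held-out data. The example below is a minimal illustration under assumed settings (a sine-shaped signal, 30 training points, polynomial degrees 1, 3 and 9). The exact numbers depend on the random seed.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical data source: a sine curve plus random noise.
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train, y_train = make_data(30)
x_test, y_test = make_data(1000)

train_mse, test_mse = {}, {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse[degree] = mse(coeffs, x_train, y_train)
    test_mse[degree] = mse(coeffs, x_test, y_test)
    print(f"degree {degree}: train {train_mse[degree]:.3f}, test {test_mse[degree]:.3f}")
```

Training error can only go down as the model gets more complex, so it is the test error that shows where a model is too simple to fit the signal and where it starts fitting noise instead.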

Another way to represent this well is with a four-quadrant chart showing all combinations of high and low bias and variance. In the low bias/low variance quadrant, all of the results are gathered together in an accurate cluster. In a high bias/low variance result, all of the results are gathered together in an inaccurate cluster. In a low bias/high variance result, the results are scattered around a central point that would represent an accurate cluster, while in a high bias/high variance result, the data points are both scattered and collectively inaccurate.
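The four quadrants can be simulated as darts thrown at a target whose bull’s-eye sits at the origin. In this sketch, bias is modeled as an offset of the whole cluster and variance as the spread of the throws. The offsets, spreads and helper names are assumptions picked for illustration.

```python
import numpy as np

def throw_darts(offset, spread, n=500, seed=0):
    """Simulate n throws centered `offset` units from the bull's-eye,
    with random scatter of size `spread` in both directions."""
    rng = np.random.default_rng(seed)
    center = np.array([offset, 0.0])
    return center + rng.normal(0.0, spread, size=(n, 2))

def summarize(points):
    """Return (miss, scatter): how far the cluster's center is from the
    bull's-eye, and the average distance of throws from that center."""
    centroid = points.mean(axis=0)
    miss = float(np.linalg.norm(centroid))
    scatter = float(np.mean(np.linalg.norm(points - centroid, axis=1)))
    return miss, scatter

quadrants = {
    "low bias / low variance": (0.0, 0.2),
    "high bias / low variance": (2.0, 0.2),
    "low bias / high variance": (0.0, 1.0),
    "high bias / high variance": (2.0, 1.0),
}

results = {}
for name, (offset, spread) in quadrants.items():
    results[name] = summarize(throw_darts(offset, spread))
    miss, scatter = results[name]
    print(f"{name}: miss {miss:.2f}, scatter {scatter:.2f}")
```

A large miss with small scatter corresponds to the tight but inaccurate cluster, and a small miss with large scatter corresponds to throws spread around the right spot.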


Written by Justin Stoltzfus | Contributor, Reviewer

Justin Stoltzfus is a freelance writer for various Web and print publications. His work has appeared in online magazines including Preservation Online, a project of the National Historic Trust, and many other venues.
