Margaret Rouse is an award-winning technical writer and teacher known for her ability to explain complex technical subjects simply to a non-technical, business audience. Over…
I am a Senior Machine Learning Developer (Data Scientist) with PhD in Applied Math, 10+ years quantitative modelling experience, and 7+ years practice providing business-focused…
The statistical mean is a type of mathematical average that is widely used in computer science, and in machine learning in particular.
Simply put, the statistical mean is the arithmetic mean: you add up all the numbers in a data set and then divide the total by the number of data points.
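As a minimal sketch of that definition, the following Python function adds up the values and divides by their count. The function name and the sample data are illustrative assumptions, not part of the original article.

```python
def arithmetic_mean(values):
    """Add up all the values, then divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [4, 8, 15, 16, 23, 42]   # hypothetical sample data
mean_value = arithmetic_mean(data)
print(mean_value)               # 108 / 6 = 18.0
```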
That's simple and straightforward, and so the arithmetic mean or statistical mean has been widely used throughout the modern era and into the age of computer programming.
The statistical mean is one of three classical averages known together as the Pythagorean means. The other two are the harmonic mean and the geometric mean.
All three of these can be useful in machine learning and in the engineering of new kinds of artificial intelligence algorithms.
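To make the comparison concrete, here is a short sketch that computes all three Pythagorean means with Python's standard statistics module. The sample values are an assumption chosen for easy arithmetic; for positive data, the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean.

```python
import statistics

data = [1, 2, 4]  # hypothetical positive values

arithmetic = statistics.mean(data)            # (1 + 2 + 4) / 3, about 2.33
geometric = statistics.geometric_mean(data)   # cube root of 1 * 2 * 4, about 2.0
harmonic = statistics.harmonic_mean(data)     # 3 / (1/1 + 1/2 + 1/4), about 1.71

print(round(arithmetic, 2), round(geometric, 2), round(harmonic, 2))
```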
In general, the statistical mean is helpful in all sorts of machine learning classification and decision-support tasks.
Think of it this way: the program takes in all the data points and uses the statistical mean to summarize them as a single average, which it then uses to help the computer learn through its machine learning processes.
The somewhat more complex harmonic mean and geometric mean are also used in machine learning, usually for more specialized purposes.
For instance, the harmonic mean is used to derive the "F-score" (in its common F1 form, the harmonic mean of precision and recall), which helps evaluate information retrieval and classification systems.
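The sketch below shows the F1 variant of the F-score, computed as the harmonic mean of precision and recall. The precision and recall values are hypothetical; the point is that the harmonic mean sits closer to the weaker of the two numbers than a plain arithmetic mean would.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 form of the F-score)."""
    if precision + recall == 0:
        return 0.0  # convention: avoid dividing by zero when both are zero
    return 2 * precision * recall / (precision + recall)


precision = 0.9  # hypothetical: 90% of retrieved items are relevant
recall = 0.5     # hypothetical: 50% of relevant items are retrieved

f1 = f1_score(precision, recall)
simple_average = (precision + recall) / 2

print(round(f1, 3), round(simple_average, 3))  # F1 is lower than the simple average
```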
Going back to the statistical mean, suppose you have five data points, and the total is 25. Your statistical mean would be five, but that number alone does not tell you what each of the five values is. You could have three ones, a two and a twenty, or you could have a perfectly symmetrical five fives.
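The two data sets described above can be checked directly. This sketch uses the numbers from the text; the variable names and the use of the range (largest value minus smallest) as a rough measure of spread are illustrative assumptions.

```python
from statistics import mean

skewed = [1, 1, 1, 2, 20]    # three ones, a two and a twenty
uniform = [5, 5, 5, 5, 5]    # five fives

# Both data sets total 25, so both have a mean of 5.
print(mean(skewed), mean(uniform))

# The spread of the values is very different, which the mean alone hides.
skewed_range = max(skewed) - min(skewed)
uniform_range = max(uniform) - min(uniform)
print(skewed_range, uniform_range)
```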
Suppose you have a data set like the first example above, where a single large value skews the statistical mean. You might have the following five numbers: two, three, six, seven and 38.
The total is 56, so the statistical mean is 11.2, but only one of those numbers is above the mean, which makes the average a little deceptive.
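The same arithmetic can be sketched in a few lines of Python, using the five numbers from the example. The variable names are illustrative.

```python
data = [2, 3, 6, 7, 38]

total = sum(data)                # 56
mean_value = total / len(data)   # 56 / 5 = 11.2

# Collect the values that sit above the mean.
above_mean = [x for x in data if x > mean_value]

print(total, mean_value, above_mean)  # only 38 is above the mean
```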
This is where machine learning engineers talk about bias and how different types of means and averages might show bias in a machine learning program.
Without getting too complex, engineers can address this kind of bias by building more elaborate algorithms that cross-check or re-evaluate classification results.
The random forest model is one such technique: instead of relying on a single model trained on a single data set, it trains many individual decision “trees” on different samples of the data and combines their results.
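The resample-and-combine idea can be illustrated with a deliberately simplified sketch. This is not a random forest: real trees learn decision rules, whereas each "model" here just takes the mean of its own bootstrap sample (a sample drawn with replacement). The function name, the number of models and the fixed seed are all assumptions made for illustration.

```python
import random
from statistics import mean


def bagged_mean(data, n_models=100, seed=0):
    """Average the means of many bootstrap samples of the data."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) values with replacement.
        sample = rng.choices(data, k=len(data))
        estimates.append(mean(sample))
    # Combine the individual results collectively.
    return mean(estimates), estimates


data = [2, 3, 6, 7, 38]
combined, estimates = bagged_mean(data)
print(round(combined, 2), min(estimates), max(estimates))
```

Individual estimates vary widely depending on how often the outlier 38 is drawn, while the combined result is more stable, which is the intuition behind pooling many models.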
The bottom line is that the statistical mean, as a basic type of arithmetic mean, is very broadly useful in providing those simplifications that machine learning algorithms run on.
If you have a scatter plot of data, and you want to distill it into an easily digestible insight, as so many business dashboards do, the statistical mean is a great way to do this.
Many of the finer details about statistical means, other averages, and their deviations are pored over by professional mathematicians and algorithm engineers.
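An arithmetic mean is calculated using the following equation: mean = (x1 + x2 + ... + xn) / n, where x1 through xn are the values in the data set and n is the number of data points.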
Margaret is an award-winning technical writer and teacher known for her ability to explain complex technical subjects to a non-technical business audience. Over the past twenty years, her IT definitions have been published by Que in an encyclopedia of technology terms and cited in articles by the New York Times, Time Magazine, USA Today, ZDNet, PC Magazine, and Discovery Magazine. She joined Techopedia in 2011. Margaret's idea of a fun day is helping IT and business professionals learn to speak each other’s highly specialized languages.