What is ‘precision and recall’ in machine learning?


There are several ways to explain and define “precision and recall” in machine learning. These two metrics are mathematically important in classification and retrieval systems, and conceptually important in ways that relate to the efforts of AI to mimic human thought. After all, people use “precision and recall” in neurological evaluation, too.

One way to think about precision and recall in IT is to define precision as the size of the intersection of relevant items and retrieved items divided by the number of retrieved results, while recall is the size of that same intersection divided by the total number of relevant results.
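The set-based definition can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `precision_recall` and the document IDs are hypothetical, chosen only for this example.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query (illustrative helper)."""
    retrieved, relevant = set(retrieved), set(relevant)
    # Items that are both relevant and retrieved.
    hits = retrieved & relevant
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, made up for illustration.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5", "d6", "d7"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved items are relevant, and those two are the only ones found out of five relevant items, so precision is higher than recall.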

Another way to explain it is that precision measures the proportion of positive identifications that were actually correct, while recall measures the proportion of actual positives that were identified correctly.

These two metrics often trade off against each other. Experts tally true positives, false positives, true negatives and false negatives in a confusion matrix in order to compute precision and recall. Changing the classification threshold also changes both values: raising it typically increases precision and lowers recall, while lowering it typically does the reverse.
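To make the effect of the threshold concrete, here is a small sketch that tallies the four confusion-matrix counts and then derives precision and recall at two thresholds. The labels, scores, thresholds and function names are all assumptions made up for illustration; a real evaluation would use a model's actual outputs.

```python
def confusion_counts(labels, scores, threshold):
    """Count (tp, fp, tn, fn), predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            tp += 1  # correctly flagged positive
        elif predicted_positive:
            fp += 1  # flagged positive, actually negative
        elif label:
            fn += 1  # missed positive
        else:
            tn += 1  # correctly left as negative
    return tp, fp, tn, fn


def precision_recall_at(labels, scores, threshold):
    """Precision and recall derived from the confusion counts."""
    tp, fp, _tn, fn = confusion_counts(labels, scores, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground-truth labels (1 = positive) and model scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

low = precision_recall_at(labels, scores, threshold=0.35)
high = precision_recall_at(labels, scores, threshold=0.75)
print("threshold 0.35:", low)
print("threshold 0.75:", high)
```

With this toy data, the higher threshold removes the false positives but misses half of the actual positives, so precision rises while recall falls.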

Another way to say it is that recall is the number of correct results divided by the number of results that should have been returned, while precision is the number of correct results divided by the number of all results that were returned. This definition is helpful because you can explain recall as the share of relevant results that a system can “remember,” while you can cast precision as the accuracy of the results it does return. Here we get back to what precision and recall mean in a general sense: the ability to remember items, versus the ability to remember them correctly.

Counting true positives, false positives, true negatives and false negatives is extremely useful in machine learning evaluation, because it shows how a classifier behaves. By measuring precision and recall, experts can not only report the results of running a machine learning program, but can also start to explain how that program produces its results: which kinds of errors it makes, and how often, when it evaluates a given data set.

With that in mind, many machine learning professionals talk about precision and recall when analyzing the results returned from test sets, training sets or subsequent performance sets of data. Using an array or matrix helps to organize this information and show more transparently how the program works and what results it produces.

Justin Stoltzfus

Justin Stoltzfus is an independent blogger and business consultant assisting a range of businesses in developing media solutions for new campaigns and ongoing operations. He is a graduate of James Madison University. Stoltzfus spent several years as a staffer at the Intelligencer Journal in Lancaster, Penn., before the merger of the city’s two daily newspapers in 2007. He also reported for the twin weekly newspapers in the area, the Ephrata Review and the Lititz Record. More recently, he has cultivated connections with various companies as an independent consultant, writer and trainer, collecting bylines in print and Web publications, and establishing a reputation…