Why is data annotation important in some machine learning projects?


Data annotation is important in machine learning because in many cases, it makes the work of the machine learning program much easier.

This has to do with the difference between supervised and unsupervised machine learning. With supervised machine learning, the training data is already labeled so the machine can understand more about the desired results. For example, if the purpose of the program is to identify cats in images, the system already has a large number of photos tagged as cat or not. It then uses those examples to contrast new data to make its results.

Free Download: Machine Learning and Why It Matters

With unsupervised machine learning, there are no labels, and so the system has to use attributes and other techniques to identify the cats. Engineers can train the program on recognizing visual features of cats like whiskers or tails, but the process is hardly ever as straightforward as it would be in supervised machine learning where those labels play a very important role.

Data annotation is the process of affixing labels to the training data sets. These can be applied in many different ways – above we talked about binary data annotation – cats or not cats – but other kinds of data annotation are important as well. For example, in the medical field, data annotation may involve tagging specific biological images with tags identifying pathology or disease markers for other medical properties.

Data annotation takes work – and is often done by teams of people – but it is a fundamental part of what makes many machine learning projects function accurately. It provides that initial setup for teaching a program what it needs to learn and how to discriminate against various inputs to come up with accurate outputs.

Related Terms

Justin Stoltzfus

Justin Stoltzfus is an independent blogger and business consultant assisting a range of businesses in developing media solutions for new campaigns and ongoing operations. He is a graduate of James Madison University.Stoltzfus spent several years as a staffer at the Intelligencer Journal in Lancaster, Penn., before the merger of the city’s two daily newspapers in 2007. He also reported for the twin weekly newspapers in the area, the Ephrata Review and the Lititz Record.More recently, he has cultivated connections with various companies as an independent consultant, writer and trainer, collecting bylines in print and Web publications, and establishing a reputation…