Margaret Rouse is an award-winning technical writer and teacher known for her ability to explain complex technical subjects simply to a non-technical, business audience. Over…
Machine bias is the tendency of a machine learning model to make inaccurate or unfair predictions because there are systematic errors in the ML model or the data used to train the model.
Bias in machine learning can be caused by a variety of factors. Some common causes include:
Machine bias is often the result of a data scientist or engineer overestimating or underestimating the importance of a particular feature or hyperparameter during feature engineering and algorithmic tuning. A hyperparameter is a machine learning parameter whose value is chosen before the learning algorithm is trained, rather than learned from the data. Tuning is the process of selecting the hyperparameter values that minimize a learning algorithm’s loss function and provide the most accurate outputs.
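To make the idea of tuning concrete, here is a minimal sketch that tries several values of a single hyperparameter and keeps the one with the lowest loss on held-out data. Everything in it is an assumption made for illustration: the data is synthetic, the model is a one-feature ridge regression, and the names `fit_ridge_1d` and `tune_penalty` are hypothetical rather than part of any library.

```python
import random


def fit_ridge_1d(xs, ys, penalty):
    """One-feature ridge regression through the origin (closed form).

    `penalty` is the hyperparameter: it is fixed before fitting, not learned.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def mean_squared_error(weight, xs, ys):
    """Average squared difference between predictions and targets."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def tune_penalty(train, valid, candidates):
    """Fit once per candidate penalty and score each fit on validation data."""
    losses = {}
    for penalty in candidates:
        weight = fit_ridge_1d(train[0], train[1], penalty)
        losses[penalty] = mean_squared_error(weight, valid[0], valid[1])
    best = min(losses, key=losses.get)
    return best, losses


# Synthetic data (an assumption for this sketch): y = 2x plus noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(60)]
ys = [2.0 * x + rng.gauss(0, 0.3) for x in xs]
train = (xs[:40], ys[:40])
valid = (xs[40:], ys[40:])

best_penalty, losses = tune_penalty(train, valid, [0.0, 0.1, 1.0, 10.0, 100.0])
print("best penalty:", best_penalty)
for penalty, loss in losses.items():
    print(f"penalty={penalty:<6} validation MSE={loss:.3f}")
```

A very large penalty shrinks the learned weight toward zero, so the model underfits and its validation loss rises. That is one way an overestimated hyperparameter can push a model toward systematically wrong outputs.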
It’s important to note that bias, in the statistical sense of a model’s simplifying assumptions, can improve the interpretability of an ML model in certain situations. For example, a simple linear model with high bias will be easier to understand and explain than a complex model with low bias.
When a machine learning model is used to make predictions and decisions, however, bias can cause machine learning algorithms to produce sub-optimal outputs that have the potential to be harmful. This is especially true in the case of credit scoring, hiring, the court system and healthcare. In these cases, bias can lead to unfair or discriminatory treatment of certain groups and have serious real-world consequences.
Bias in machine learning is a complicated topic because bias is often intertwined with other factors such as data quality. To ensure that an ML model remains fair and unbiased, it is important to continually evaluate the model’s performance in production.
Machine learning algorithms use what they learn during training to make predictions about new input. When some types of information are mistakenly assigned more, or less, importance than they deserve, the algorithm’s outputs can be biased.
For example, machine learning software is used by court systems in some parts of the world to recommend how long a convicted criminal should be incarcerated. Studies have found that when data about a criminal’s race, education and marital status are weighted too highly, the algorithmic output is likely to be biased and the software will recommend significantly different sentences for criminals who have been convicted of the same crime.
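The effect of overweighting a feature can be sketched with a toy linear score. This is not how any real sentencing tool works. The feature names, the weights, and the `risk_score` function are all hypothetical, chosen only to show that two people with identical offence records receive different scores once an unrelated attribute carries weight.

```python
def risk_score(features, weights):
    """Weighted sum of feature values (a deliberately simple linear score)."""
    return sum(weights[name] * value for name, value in features.items())


# Two hypothetical people with the same offence and the same record.
# They differ only in an attribute unrelated to the offence.
person_a = {"offence_severity": 3.0, "prior_convictions": 1.0, "demographic_flag": 0.0}
person_b = {"offence_severity": 3.0, "prior_convictions": 1.0, "demographic_flag": 1.0}

# Hypothetical weightings: one ignores the unrelated attribute, one overweights it.
offence_only = {"offence_severity": 4.0, "prior_convictions": 2.0, "demographic_flag": 0.0}
overweighted = {"offence_severity": 4.0, "prior_convictions": 2.0, "demographic_flag": 6.0}


def score_gap(weights):
    """Difference in score between two people with identical offence records."""
    return risk_score(person_b, weights) - risk_score(person_a, weights)


print(score_gap(offence_only))   # 0.0
print(score_gap(overweighted))   # 6.0
```

In a trained model the weights are learned rather than typed in, but the mechanism is the same: whatever weight lands on the unrelated attribute shows up directly as a gap between otherwise identical cases.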
Machine bias can manifest in various ways, such as:
Here are a few examples of stories in the news where people or companies have been harmed by AI:
A 2016 investigation by ProPublica found that COMPAS, a risk-assessment system used in Broward County, Florida, was nearly twice as likely to falsely flag black defendants as future re-offenders as it was white defendants. This raised concerns about AI’s use in policing and criminal justice.
In 2018, it was reported that Amazon’s facial recognition technology, known as Rekognition, had a higher rate of inaccuracies for women with darker skin tones. This raised concerns about the potential for the technology to be used in ways that could harm marginalized communities.
In 2020, a chatbot used by the UK’s National Health Service (NHS) to triage patients during the COVID-19 pandemic was discovered to be providing incorrect information and directing people to seek treatment in the wrong places. This raised concerns about the safety of using AI to make medical decisions.
In 2021, an investigation by The Markup found lenders were up to 80% more likely to deny home loans to people of color than to white people with similar financial characteristics, with the largest gap reported for black applicants. This raised concerns about how black box AI algorithms were being used in mortgage approvals.
In 2022, iTutorGroup, a collection of businesses that provides English-language tutoring services to students in China, was found to have programmed its online recruitment software to automatically reject female applicants aged 55 or older and male applicants aged 60 or older. This raised concerns about age discrimination and resulted in the U.S. Equal Employment Opportunity Commission (EEOC) filing a lawsuit.
There are several methods that can be used to detect machine bias in a machine learning model:
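One such check compares error rates across groups, for example the false positive rate, which is the kind of disparity at issue in the ProPublica finding described earlier. The sketch below is a minimal illustration under stated assumptions: the records are made up, the group labels "A" and "B" are placeholders, and `false_positive_rates` and `fpr_ratio` are hypothetical helper names.

```python
from collections import defaultdict


def false_positive_rates(records):
    """Per-group false positive rate.

    `records` is an iterable of (group, predicted_positive, actually_positive).
    Only cases that were actually negative count toward the rate.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted, actual in records:
        if not actual:
            negatives[group] += 1
            if predicted:
                false_positives[group] += 1
    return {group: false_positives[group] / negatives[group] for group in negatives}


def fpr_ratio(rates, group, reference):
    """How many times higher one group's rate is than the reference group's.

    Raises ZeroDivisionError if the reference group has no false positives.
    """
    return rates[group] / rates[reference]


# Made-up evaluation records: (group, model flagged as positive, true outcome).
records = (
    [("A", True, False)] * 4 + [("A", False, False)] * 6 + [("A", True, True)] * 5
    + [("B", True, False)] * 2 + [("B", False, False)] * 8 + [("B", True, True)] * 5
)

rates = false_positive_rates(records)
print(rates)                       # {'A': 0.4, 'B': 0.2}
print(fpr_ratio(rates, "A", "B"))  # 2.0
```

A ratio well above 1 does not prove that a model is unfair on its own, but it is a signal that the model's mistakes fall more heavily on one group and that the training data and features deserve a closer look.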
There are several techniques that can be used to foster responsible AI and prevent machine bias in machine learning models. It is recommended to use multiple methods and combine them by doing the following:
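Bias and variance are two concepts used to describe the performance and accuracy of a machine learning model. Bias is the error that comes from overly simple assumptions, which cause a model to underfit. Variance is the error that comes from sensitivity to the particular training data, which causes a model to overfit. A model with low bias and low variance is likely to perform well on new data, while a model with high bias and high variance is likely to perform poorly.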
In practice, finding the optimal balance between bias and variance can be challenging. Techniques such as regularization and cross-validation can be used to manage the bias and variance of the model and help improve its performance.
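As a rough illustration of using cross-validation to manage this trade-off, the sketch below scores three settings of a k-nearest-neighbors regressor on synthetic data. The choice of k-nearest-neighbors is an assumption made to keep the example short. Here the number of neighbors stands in for model complexity: one neighbor gives a high-variance model, while using every training point gives a high-bias model that always predicts the average. The data and the function names `knn_predict` and `cross_val_mse` are hypothetical.

```python
import math
import random


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cross_val_mse(data, k_neighbors, folds=5):
    """Mean squared error averaged over `folds` held-out slices of the data."""
    errors = []
    for i in range(folds):
        held_out = data[i::folds]
        train = [point for j, point in enumerate(data) if j % folds != i]
        errors.extend(
            (knn_predict(train, x, k_neighbors) - y) ** 2 for x, y in held_out
        )
    return sum(errors) / len(errors)


# Synthetic data (an assumption for this sketch): a noisy sine wave.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0, 2 * math.pi)
    data.append((x, math.sin(x) + rng.gauss(0, 0.3)))

# With 200 points and 5 folds, each training set has 160 points,
# so k=160 averages every training target (maximum bias).
scores = {k: cross_val_mse(data, k) for k in (1, 10, 160)}
for k, score in scores.items():
    print(f"k={k:<4} cross-validated MSE={score:.3f}")
```

On data like this, the high-bias setting (k=160) should score clearly worst because it ignores the shape of the curve, and a moderate k will typically beat k=1, which chases the noise in individual points. The exact numbers depend on the random seed.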
Margaret is an award-winning technical writer and teacher known for her ability to explain complex technical subjects to a non-technical business audience. Over the past twenty years, her IT definitions have been published by Que in an encyclopedia of technology terms and cited in articles by the New York Times, Time Magazine, USA Today, ZDNet, PC Magazine, and Discovery Magazine. She joined Techopedia in 2011. Margaret's idea of a fun day is helping IT and business professionals learn to speak each other’s highly specialized languages.