Whether they’re talking about a new government policy or defending their right to put pineapple on pizza, people are biased about every subject on the planet.
The problem is that these judgments are passed on to machine learning models: through the data used to train them, the algorithms used to build them and the way they're deployed in practice.
What does this mean for a global society that’s increasingly dependent on robotic assistants? It means that oversight is needed to make sure the outputs of an AI system are fair and unbiased.
What Biases Exist Within AI?
The literature often distinguishes between data, algorithmic, social and societal bias in AI. Data and algorithmic biases can stem from skewed data sampling or preparation, while social and societal biases can stem from factors such as the underrepresentation of minority groups on IT teams.
In machine learning, data/algorithmic bias is the result of a systematic error or deviation from the truth in a model’s predictions. When unconscious and conscious biases find their way into the data sets used to train the model, they will also sneak their way into AI outputs.
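One practical way to catch this kind of bias is to break a model's error rate down by demographic group and look for large gaps. Here's a minimal sketch in plain Python; the record format and group labels are hypothetical, purely for illustration:

```python
from collections import defaultdict

def error_rates_by_group(records):
    """Compute per-group error rates from (group, y_true, y_pred) records.

    A large gap between groups is a simple red flag that bias in the
    training data has leaked into the model's outputs.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, y_true, y_pred in records:
        totals[group] += 1
        if y_true != y_pred:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Hypothetical audit records: (demographic group, true label, predicted label)
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 1), ("B", 1, 1), ("B", 0, 0),
]
print(error_rates_by_group(records))  # {'A': 0.25, 'B': 0.5}
```

In this toy data, the model misclassifies group B twice as often as group A, exactly the kind of disparity the facial recognition audits below uncovered.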
When MIT researcher Joy Buolamwini studied how AI from vendors like Amazon, Google, Microsoft and IBM treats different skin types and genders, she found that the systems worked significantly better for lighter-skinned faces (both male and female) and made the most mistakes for darker-skinned women. The problem? The training data was not diverse enough. No surprise there — the companies were using images of their disproportionately white, male tech workforce to train their AI systems.
And that brings us to gender bias. Genevieve Smith found that out in 2019, when she and her husband applied for the same credit card. Despite having a slightly better credit score and the same income, expenses and debt as her husband, she was given a credit limit almost half the amount of his. No surprise there either — the AI systems were trained on historical data, and historically, women have been given lower credit limits than men.
Other important types of machine bias include:
- Age bias – when biased algorithms are used in healthcare, for example, it can result in incorrect diagnoses or treatments for older individuals.
- Socioeconomic bias – when biased algorithms are used in lending, for example, it can result in unequal access to credit for low-income individuals.
- Geographical bias – when biased algorithms are used in disaster response, for example, it can result in unequal allocation of resources based on geographic location.
- Disability bias – when biased algorithms are used in hiring, for example, it can result in discrimination against individuals with disabilities.
- Political bias – when biased algorithms are used in news recommendations, for example, it can result in unequal exposure to different political perspectives.
How Does Concept Drift Create Bias?
Concept drift occurs when the statistical relationship between a model's inputs and outputs changes over time, so that patterns learned from historical data no longer hold and predictions grow stale.
To mitigate the impact of concept drift on AI outputs, it's important for MLOps teams to regularly monitor the performance of their AI models and update them with new data to ensure their predictions remain accurate and fair. Additionally, techniques like transfer learning can be used to help AI models continuously adapt to changing data distributions.
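A common proxy for spotting drift in monitoring pipelines is to compare the distribution of a feature in recent production traffic against its training-time baseline. The sketch below implements a simple two-sample Kolmogorov–Smirnov statistic by hand (production systems would typically use a library such as SciPy); the feature values and alert threshold are made up for illustration:

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between
    the empirical CDFs of two samples. Larger values mean the two
    distributions have diverged more."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = sum(x <= v for x in a) / len(a)
        cdf_b = sum(x <= v for x in b) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap

# Hypothetical values of one feature at training time vs. in production.
baseline = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5]
recent   = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

DRIFT_THRESHOLD = 0.3  # assumed alerting threshold; tune per use case
if ks_statistic(baseline, recent) > DRIFT_THRESHOLD:
    print("Drift suspected: retrain or recalibrate the model.")
```

Strictly speaking, this checks for a shift in the input distribution (data drift), which is a leading indicator rather than direct proof of concept drift; pairing it with ongoing accuracy monitoring covers both.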
What Role Does Bias Play in Self-supervised Learning?
AI requires oversight to stay relevant. While human-in-the-loop (HITL) intervention can be used to label a more diverse balance of data points when fine-tuning a model, don't forget that bias also plays a role in self-supervised learning, a scenario in which the model is trained without human-provided labels.
In this form of learning, the model is trained on a large amount of unstructured or unlabeled data and generates its own training labels from the structure of the data itself — so any skew in that data shapes the labels, too.
To mitigate the impact of bias in self-supervised learning, it’s important for data scientists and machine learning engineers to carefully curate their training data. They can also use techniques like fairness constraints and bias correction algorithms to reduce bias.
Conclusion: Overcoming Biases in AI
Unfortunately, humankind will always have biases, so AI needs to be nurtured carefully. If even a hint of gender, socioeconomic, age or political bias weaves through a popular model's training data and leads to biased outputs, the vendor could face a revolt against the technology.
Luckily, MLOps and AIOps engineers can minimize the impact of bias by carefully curating their training data and continually monitoring and updating their models to ensure their outputs are accurate and fair. Moreover, new approaches and techniques to combat AI bias — for example, counterfactual fairness — are emerging continually.