Unlearning or forgetting things learned is an important action that artificial intelligence (AI) must undergo from time to time.
Unlearning is also known as selective amnesia in AI, and it can be needed for all kinds of reasons, including removing bias, correcting inaccuracies, or updating information.
Over time, AI learns from vast and varied datasets and inevitably picks up bias, inaccuracies, and discrimination along the way. These flaws can be dangerous and can be exploited by malicious actors.
Still, unlearning is one of the important ways to improve AI.
What is AI Unlearning?
Let’s try to understand AI unlearning with an example of imaginary John Smith.
John has been exposed to learning about the food habits of the people of a region, which makes him think that the people demonstrate poor food habits.
He has learned from hearsay, people’s secondhand experiences, the media, and the internet, and all of this information has shaped his opinion.
You can say that John’s learning has made his opinion biased, false, and even defamatory.
Now, people with real-world exposure to the food habits of that region find that much of what John believes is untrue and baseless.
When John finally visits the area for an extended period, eats their food, and experiences their food habits, he returns with a new perspective. He finds that his recent experiences challenge or update his old beliefs and thoughts. He has unlearned a lot of things that he knew.
In other words, new data has replaced old data.
AI unlearning happens in a similar manner. An AI system can be exposed to incorrect and biased datasets over time and end up amplifying that inaccurate knowledge.
At times, AI must be able to be taken through an unlearning process that replaces or updates old datasets with newer, more accurate ones. This is a continuous process that may need to happen regularly.
Circumstances Behind Unlearning AI
The primary purpose is to remove inaccurate and biased output. However, another concern is that AI may leak private data, and that knowledge, too, must be unlearned.
Various regulatory authorities have already been asking companies to eliminate data that violates privacy.
In 2018, the data regulator in the UK warned that companies using AI could be subject to the GDPR. The US Federal Trade Commission (FTC) forced Paravision, a facial recognition software company, to delete a collection of photos it had collected without following protocol, and also to delete the AI models that had been trained on those photos.
Unlearning is a Complex Proposition
From the perspective of the companies that train AI systems, the circumstances leading to unlearning create a problematic situation.
One, the need to protect privacy drives continuous changes to laws like the GDPR, and companies must keep their AI systems compliant with the regulations, which can be costly and time-consuming.
Two, currently, unlearning means removing the data from the AI system and retraining the system from scratch. Add to this the effort of scrubbing the data from every other model or artifact derived from it.
This means you may be facing the possibility of repeated, full retraining.
Ideally, you would remove the contested data without having to retrain the AI system at all.
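To make the cost of the current approach concrete, here is a minimal sketch of "naive" unlearning, using a toy averaging "model" as a stand-in for a real network (the function names and data are illustrative, not from any production system): forgetting a record means filtering it out and retraining on everything that remains.

```python
def train(records):
    """Toy 'model': the average of the training values.
    A real system would train a neural network here, which is
    exactly what makes retraining from scratch so expensive."""
    return sum(records) / len(records)

def unlearn_by_retraining(records, to_forget):
    """Remove the contested records, then retrain from scratch."""
    kept = [r for r in records if r not in to_forget]
    return train(kept)

data = [2.0, 4.0, 6.0, 100.0]   # 100.0 is the record to be forgotten
model = train(data)              # skewed by the contested record: 28.0
clean_model = unlearn_by_retraining(data, {100.0})  # retrained: 4.0
```

Note that every deletion request triggers a full `train` pass over the remaining data; with a large model, that cost is paid again for each request.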
Can You Forget But Avoid Retraining an AI Model?
Aaron Roth, a researcher on AI unlearning at the University of Pennsylvania, frames the key question this way: “Can we remove all influence of someone’s data when they ask to delete it, but avoid the full cost of retraining from scratch?” A lot of effort is being put into answering it.
One example is a project by researchers at the universities of Toronto and Wisconsin-Madison, in which the training data is split into multiple smaller shards, a sub-model is trained on each shard, and the sub-models are combined into a larger model.
The research paper describes the project as “a framework that expedites the unlearning process by strategically limiting the influence of a data point in the training procedure.
“While our framework is applicable to any learning algorithm, it is designed to achieve the largest improvements for stateful algorithms like stochastic gradient descent for deep neural networks.
“Training reduces the computational overhead associated with unlearning, even in the worst-case setting where unlearning requests are made uniformly across the training set.”
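The core sharding idea behind this kind of framework can be sketched as follows. This is an illustrative toy, not the authors' code: the per-shard "model" is just an average, and all names are assumptions. The payoff is that forgetting a data point only requires retraining the single shard that contained it, not the whole ensemble.

```python
def train(shard):
    """Toy per-shard 'model': the average of the shard's values."""
    return sum(shard) / len(shard)

def train_sharded(data, n_shards):
    """Split the data into shards and train one sub-model per shard."""
    shards = [data[i::n_shards] for i in range(n_shards)]
    models = [train(s) for s in shards]
    return shards, models

def unlearn(shards, models, point):
    """Forget one point by retraining only the shard that held it."""
    for i, shard in enumerate(shards):
        if point in shard:
            shard.remove(point)
            models[i] = train(shard)   # only this shard is retrained
            break
    return models

def predict(models):
    """Aggregate the sub-models (here: average their outputs)."""
    return sum(models) / len(models)

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
shards, models = train_sharded(data, n_shards=3)
models = unlearn(shards, models, 6.0)  # retrains 1 of 3 shards
```

The design trade-off is clear even in the toy: the more shards, the cheaper each deletion, but the less data each sub-model sees.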
Are There Any Limitations?
The approach has a limitation, as researchers from Harvard, Pennsylvania, and Stanford universities have pointed out: if deletion requests arrive in a particular sequence, whether crafted by a malicious actor or occurring by chance, the scheme can break.
Apart from this, there is the further problem of verifying whether an AI system has actually unlearned the data.
This is not to question a company’s intentions, but to confirm that the unlearning effort has fully succeeded.
According to Gautam Kamath, a professor at the University of Waterloo, “It feels like it’s a little way down the road, but maybe they’ll eventually have auditors for this sort of thing.”
Other ideas include differential privacy, a technique that puts mathematical bounds on how much private data an AI system can actually leak. The technique must still be vetted by experts before it can be rolled out widely.
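The core idea of differential privacy can be sketched with the Laplace mechanism on a simple counting query. The dataset, query, and epsilon value below are illustrative assumptions, not from the article: calibrated random noise bounds how much any single person's record can shift the published answer, and hence how much the output can reveal about them.

```python
import math
import random

def noisy_count(records, predicate, epsilon):
    """Count matching records, plus Laplace(0, 1/epsilon) noise.
    A counting query has sensitivity 1: adding or removing one
    person changes the true count by at most 1, so noise scaled
    to 1/epsilon masks any individual's presence."""
    true_count = sum(1 for r in records if predicate(r))
    # Inverse-CDF sampling from a Laplace distribution.
    u = random.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

ages = [23, 37, 41, 52, 29, 60]
answer = noisy_count(ages, lambda a: a >= 40, epsilon=1.0)
# 'answer' is close to the true count of 3, but randomized, so the
# output cannot pin down whether any one person is in the dataset.
```

Smaller epsilon means more noise and stronger privacy; larger epsilon means a more accurate but more revealing answer.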
Unlearning is at a nascent stage, and it will be a while before it is treated as a mature and proven system that can enable AI systems to not only unlearn but also retrain with minimal effort.
Constant pressure from regulatory bodies, laws, regulations, and litigation will keep companies using AI systems on their toes, especially in regions like the European Union (EU), where strong laws like the GDPR are enforced.
Unlearning is an extremely complex proposition, and it will take a deeper look at how AI systems learn to find out how they can potentially unlearn.