MIT Researchers Unveil Database that Curates AI’s Past Failures

Key Takeaways

  • MIT's AI Risk Database aids businesses in learning from past AI failures.
  • Low-quality training data can cause erratic AI system behavior.
  • Lack of transparency in AI systems complicates error detection.

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), working with international partners, have developed the MIT AI Risk Database, also known as the AI Risk Repository.

The database is intended as a comprehensive resource to help businesses understand and mitigate potential AI threats.

The database serves as a library of more than 700 documented cases in which AI systems did not perform as expected, referred to as “failures” or “near-misses”. The open-access repository is meant to let companies learn from mistakes others have already made, and to anticipate and prevent similar problems so that their own AI systems perform more reliably.

Why AI Risks Are a Big Deal for Businesses

The database classifies each risk by its cause, domain, and subdomain. According to the report, 51% of the risks analyzed stemmed from failures of AI systems rather than from human error.

One notable finding from building the repository is that far more risks emerged during deployment (65%) than during development (10%).
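To make the classification concrete, here is a minimal Python sketch of how entries tagged by cause, domain, and subdomain could be tallied into percentages like the ones quoted in the report. The `RiskEntry` fields (including the `timing` field), the `share_by` helper, and the four sample records are all hypothetical and made up for illustration; they are not the repository's actual schema or data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEntry:
    """One catalogued risk. Field names are illustrative assumptions,
    not the repository's real schema."""
    description: str
    cause: str      # e.g. "AI" or "Human"
    timing: str     # e.g. "Pre-deployment" or "Post-deployment"
    domain: str
    subdomain: str


def share_by(entries, field):
    """Return {value: percentage of entries} for the given field name."""
    counts = Counter(getattr(entry, field) for entry in entries)
    total = len(entries)
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Made-up toy records, for illustration only.
entries = [
    RiskEntry("Model leaks personal data", "AI", "Post-deployment",
              "Privacy and security", "Data leakage"),
    RiskEntry("Biased hiring recommendations", "AI", "Post-deployment",
              "Discrimination", "Unfair outcomes"),
    RiskEntry("Poisoned training set", "Human", "Pre-deployment",
              "Privacy and security", "Attacks on AI systems"),
    RiskEntry("Operator uses chatbot for spam", "Human", "Post-deployment",
              "Malicious use", "Disinformation"),
]

cause_share = share_by(entries, "cause")
timing_share = share_by(entries, "timing")
print(cause_share)
print(timing_share)
```

On these four toy records, half of the risks are attributed to the AI system and three quarters arise after deployment. Running the same tally over real entries is the kind of calculation that would produce headline figures such as 51% and 65%.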

Dr. Neil Thompson, head of the MIT FutureTech Lab, explained that, as far as the team knows, this is the first attempt to curate, analyze, and extract AI risk frameworks into an accessible database.

For businesses, MIT’s AI Risk Repository gathers the lessons learned from past AI mishaps in one place.

By examining these cases, companies can refine their own systems and sidestep pitfalls that have affected others.

“We are starting with a comprehensive checklist to help us understand the breadth of potential risks. We plan to use this to identify shortcomings in organizational responses. For instance, if everyone focuses on one type of risk while overlooking others of similar importance, that’s something we should notice and address,” Thompson added.

One common issue highlighted in the database is, unsurprisingly, low-quality training data, which can cause AI systems to behave erratically.

The choice of training data alone can have far-reaching consequences when building large language models (LLMs). By studying these examples, businesses can fortify their AI systems against such vulnerabilities, cultivating a more robust and reliable AI ecosystem.

Fixing AI Security and Transparency Problems

However, even when businesses learn from past failures, security and transparency concerns persist. AI systems are often hard to understand – so much so that they are called “black boxes.”

This opacity makes it difficult to understand how or why an AI system reached a particular decision, which in turn makes mistakes hard to catch. On top of that, AI systems are becoming a target for cybercriminals, who break in, alter the data, and wreak havoc.

One way adversaries attack machine learning (ML) systems is data poisoning, in which malicious actors manipulate training data to corrupt the learning process.

A notable example of this attack is Tay, Microsoft’s Twitter chatbot, which was intentionally corrupted in 2016. Malicious users repeatedly sent it offensive tweets, eventually damaging its ability to function correctly and showing how vulnerable machine learning systems can be to data manipulation.
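The mechanics of data poisoning can be shown with a toy Python sketch. It is not a reconstruction of the Microsoft incident: it assumes a deliberately simple nearest-centroid classifier over a single made-up numeric feature, and the labels, data points, and function names are all hypothetical. The attacker injects a few mislabeled samples into the training set, which drags one class's centroid toward the other and flips the prediction for a borderline input.

```python
from statistics import mean


def train_centroids(samples):
    """Fit a nearest-centroid classifier: one mean feature value per label."""
    by_label = {}
    for x, label in samples:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def predict(centroids, x):
    """Return the label whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


# Hypothetical clean training data: (feature value, label).
clean = [(0, "benign"), (1, "benign"), (2, "benign"),
         (8, "harmful"), (9, "harmful"), (10, "harmful")]

# Poison: clearly harmful-looking samples deliberately mislabeled as benign.
poison = [(10, "benign")] * 3

clean_model = train_centroids(clean)
poisoned_model = train_centroids(clean + poison)

# The same borderline input is classified differently by the two models.
clean_pred = predict(clean_model, 6)
poisoned_pred = predict(poisoned_model, 6)
print(clean_pred, poisoned_pred)
```

With clean data the "benign" centroid sits at 1 and the input 6 is flagged as harmful. After poisoning, the "benign" centroid shifts to 5.5 and the same input passes as benign, even though the model code itself never changed.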

As AI becomes a bigger part of our lives, tools like MIT’s AI Risk Database could be crucial in helping businesses use this technology responsibly and securely.