AI's Got Some Explaining to Do

Can you trust AI? Should you accept its findings as objectively valid without question?

The problem is, even if you did want to question AI, your questions won’t yield clear answers.

AI systems have generally operated like a black box: Data is input, and data is output, but the processes that transform that data are a mystery. That creates a twofold problem.

For one, it is unclear which algorithms’ performance are most reliable. Second, the AI’s seemingly objective results can be skewed by the values and biases of the humans who program the systems.

This is why there is a need for “explainable AI,” which refers to transparency in the virtual thought processes such systems use.

The Black Box Problem

The way AI analyzes information and makes recommendations is not always straightforward. There’s also a distinct disconnect between how AI operates and how most people understand it to operate.

As Fallible as Humans

”Principally, it is not infallible – its outputs are only as good as the data it uses and the people who create it,” noted Natalie Cramp, CEO of data science consultancy Profusion in an interview with Silicon Republic.

Experts in the field who understand the impact algorithmic decision-making can have on people’s lives have been clamoring about the problem for years. As humans are the ones who set up the learning systems for AI, their biases get reinforced in algorithmic programming and conclusions.

People are often not aware of their biases, or even how a data sample can promote racist and sexist outcomes. Such was the case for an automated rating system that Amazon had for job candidates.

As men dominate the tech industry, one algorithm learned to associate gender with successful outcomes and was biased against women. Though Amazon dropped that tech back in 2018, the problems of biases manifesting themselves in AI still persist in 2023.

Biased AI Output

“All organizations have biased data,” proclaims an IBM blog intriguingly titled “How the Titanic helped us think about Explainable AI.”

That’s because many are operating the same way — taking a sample of the majority to represent the whole. Though, in some respects, we have greatly reduced stereotypes related to sex and race, a study by Tidio found that level of enlightenment eludes some advanced tech. (Also read: Can AI Have Biases?)

The gap between real-life gender distribution and the representation offered by AI in Tidio’s study was stark. For example, AI asked to generate an image of a CEO didn’t turn out a single image of a woman, when in reality around 15% of CEOs are female. Likewise, the AI programs underrepresented people of color in most positions.

As was the case for the Amazon algorithm, the AI here is falling into an error about women’s roles, assuming they are completely absent from the category of CEO just because they make up the minority there. Where women actually make up a full half –- in the category of doctor -– AI only represented them at 11%. The AI also ignored that 14% of nurses are male, turning out only images of women and falling back on the stereotype of female nurses.

What about ChatGPT?

Over the past couple of months, the world has grown obsessed with Chat GPT from OpenAI, which can offer everything from cover letters to programming codes. But Bloomberg warns that it, too, is susceptible to the biases that slip in through programming. (Also read: When Will AI Replace Writers?)

Bloomberg references Steven Piantadosi of the University of California, Berkeley’s Computation and Language Lab, who tweeted this on December 4, 2022:

“Yes, ChatGPT is amazing and impressive. No, @OpenAI has not come close to addressing the problem of bias. Filters appear to be bypassed with simple tricks, and superficially masked.

And what is lurking inside is egregious”

Attached to the tweet was code that resulted in the ChatGPT’s conclusion that “only White or Asian men would make good scientists.”

Bloomberg acknowledges that OpenAI has since taught the AI to respond to such questions with “It is not appropriate to use a person’s race or gender as a determinant of whether they would be a good scientist.” However, it doesn’t have a fix in place to avert additional biased responses.

Why it Matters

Eliciting biased responses when playing around with ChatGPT doesn’t have an immediate impact on people’s lives. However, when biases determine serious financial outcomes like hiring decisions and insurance payouts, it becomes a matter with immediate, serious consequences.

That could range from being denied a fair shot at a job, as was the case for Amazon’s candidate ranking, or being considered a higher risk for insurance. That’s why, in 2022, the California Insurance Commissioner, Ricardo Lara, issued a bulletin in response to allegations of data misuse for discriminatory purposes.

He referred to “flagging claims from certain inner-city ZIP codes,” which makes them more likely to be denied or given much lower settlements than comparable elsewhere. He also pointed to the problem of predictive algorithms that assess “risk of loss based on arbitrary factors,” which include “geographic location tracking, the condition or type of an applicant’s electronic devices, or based on how the consumer appears in a photograph.”

Any of those extend the possibility of a decision that has “an unfairly discriminatory impact on consumers.” Lara went on to say that “discrimination against protected classes of individuals is categorically and unconditionally prohibited.”

Fixing the Problem

The question is: what has to be done about fixing biases?

For OpenAI’s product, the solution offered is the feedback loop of interacting with users. According to the Bloomberg report, its Chief Executive Officer, Sam Altman, recommended that people thumb down such responses to point the tech in the right direction.

Piantadosi told Bloomberg he didn’t consider that adequate. He told the reporter, “What’s required is a serious look at the architecture, training data and goals.”

Piantadosi considered relying on user feedback to put results on the right track to reflect a lack of concern about “these kinds of ethical issues.”

Companies are not always motivated to dive into what is causing biased outputs, but they may be forced to do so in the case of algorithmic decisions that have a direct impact on individuals. Now for insurance businesses in California, Lara’s bulletin demands that level of transparency for insurance consumers.

Lara insists that any policy holder who suffer any “adverse action” attributed to algorithmic calculations must be granted a full explanation:

“When the reason is based upon a complex algorithm or is otherwise obscured by the technology used, a consumer cannot be confident that the actual basis for the adverse decision is lawful and justified.”

Outlook for Explainability

Those are laudable aspirations and definitely long overdue, particularly for the organizations that hide behind the computer to shut down any questions humans affected have about the decisions. However, despite the pursuit of explainable AI by companies like IBM, we’re not quite there yet.

The conclusion IBM comes to at the end of months struggling with the challenge of assuring the trustworthiness of AI is that “there is no easy way to implement explainability and, therefore, trustworthy AI systems.”

So the problem remains unsolved. But that doesn’t mean there has been no progress.

As Cramp said, “What needs to happen is a better understanding of how the data that is used for algorithms can itself be biased and the danger of poorly designed algorithms magnifying these biases.”

We have to work to improve our own understanding of algorithmic functions and keep checking for the influence of biases. While we have yet to arrive at objective AI, remaining vigilant about what feeds it and how it is used is the way forward.