To feed artificial intelligence with usable datasets, a good dose of the manual human touch is not just highly desirable; it is a compelling necessity.
However, data labeling is a repetitive and often punishing job, and many companies exploit underpaid labor to feed their AIs with vast, vast amounts of data. From destitute workers in underdeveloped countries to student interns and even prison inmates, companies that tout their investment in “ethical AI” may turn out to be rather underhanded when it comes to hiring vastly underpaid workers.
Are humans really important in training AI? Who are the poorest data labelers and annotators working in digital sweatshops, and where do they come from? Are we sure there are no other alternatives to human touch when it comes to feeding AI with training data? Let’s have a look.
Brittleness, MAD Cows and AI Autophagy Disorders: Why Humans Are a Necessity
The human touch, or, as experts call it, the human-in-the-loop model, is essential to ensure the quality of data when training an AI.
Back in 2018, one of the first deadly accidents caused by a self-driving car involved a woman who was walking her bike across the street. While the algorithms could recognize a pedestrian or a bike as separate entities, they weren’t capable of identifying a combination of the two that they had never been trained on. Machine learning-based models are extremely rigid and can hardly react with the same flexibility as humans when encountering something outside their training data.
Humans need to sort out all the “edge cases” where an informed decision is necessary, saving AI from the inherent brittleness that makes it crumble so quickly when facing the unknown.
But that’s not the only reason why they (we) are needed.
When companies turn to types of datasets that do not require human labelers to complete excruciating tasks, such as machine-generated or structured data, the results have not been good either.
Models trained solely on other AI-generated outputs tend to go mad after a while. Quite literally so. According to a study from Rice and Stanford Universities, the quality of outputs starts degrading as a phenomenon called Model Autophagy Disorder (MAD) sets in. In what truly looks like a neurodegenerative disorder affecting the machines’ brains, the outputs, such as images or videos, become wonkier and more absurd. The researchers drew a parallel to cows that were fed the remains of other cows and went on to develop the infamous “mad cow disease.”
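The self-consuming loop behind MAD can be sketched with a toy simulation — a heavily simplified, hypothetical stand-in for the generative models in the actual study. Here the “model” is just a Gaussian fitted to its training data, and each new generation trains only on samples drawn from the previous generation’s model, with no fresh real data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0 trains on "real" data: samples from a standard normal.
data = rng.normal(loc=0.0, scale=1.0, size=20)

stds = []
for generation in range(500):
    # "Train" this generation's model: fit a Gaussian by maximum likelihood.
    mu, sigma = data.mean(), data.std()
    stds.append(sigma)
    # The next generation sees only synthetic samples from this model;
    # no fresh real data ever re-enters the loop.
    data = rng.normal(loc=mu, scale=sigma, size=20)

print(f"model diversity (std), generation 0:   {stds[0]:.3f}")
print(f"model diversity (std), generation 499: {stds[-1]:.2e}")
```

The fitted spread shrinks on average with every round (the maximum-likelihood fit slightly underestimates variance, and the errors compound), so after a few hundred self-consuming generations the model has collapsed toward near-certainty. Real generative models degrade in richer ways — distorted images, repetitive text — but the mechanism is the same: errors feeding on errors.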
Low- and Middle-Income Country Workers: The Usual Suspects
One of the most gaping wounds of globalization is, hands down, the opportunity to relocate jobs to countries where wages are extremely low and working conditions are exploitatively poor. The simpler and more unspecialized the tasks a job requires, and the more fully they can be performed remotely, the easier it becomes for large organizations to take advantage of offshoring. Overcrowded workplaces and high churn rates are inconsequential if anyone can perform the task without any real training, after all.
A new army of 21st-century mine workers comes, unsurprisingly, from the less developed economies of Africa and Asia, where wages are small and so, often, are the rights of the workers who earn them. A recent investigation from TIME found that OpenAI, the creator of the globally famous ChatGPT, outsourced work to laborers in Kenya, Uganda, and India to scrub toxicity, violent language, and bias from its chatbot. Besides having to deal with terrifying datasets (more on this later), the Kenyan workers received a whopping $1.32 to $2 per hour to help improve this multi-billion-dollar market, while the agency behind the work was allegedly paid $12.50 an hour per worker.
African employees are not the only “clickworkers,” as they are often called, hired at what we would consider inhumane pay. In the Philippines, thousands of young, unspecialized workers spend their days distinguishing light poles from pedestrians in videos used to train self-driving cars, identifying celebrity pictures, and editing text snippets. How much are these people paid? No more than $6 to $10 a day. Do they have basic workers’ rights? Obviously not, since they’re hired through freelancing platforms that outsource their work to the larger AI companies. The same platforms often hold payments for a week, seize them over any purported “violation” with no possibility of recourse, or ban workers who try logging in from a different device. Your wage depends on the country you come from, so using a VPN is a surefire way to lose your job on the spot; and if your pay was already close to the poverty line, suddenly being left with none may very well upend your life.
How are things going in other, more prosperous countries? Not much better; different places, different tactics, but the outcome is still the same: human exploitation. In China, parts of the AI industry simply decided to make an unholy deal with vocational schools. Students are obligated to do laborious, patience-eroding data labeling and annotation tasks as a requirement for graduation. They are made to take internships advertised as “career-improving jobs” that are nothing but cheap, repetitive, assembly-line work. All for breakfast money, since they rarely reach even the local minimum wage after the greedy vocational schools take their cut.
But it is in the highly civilized Western world that we probably hit the rock bottom of true digital slavery. In the Nordics, where data must be collected in local languages spoken by very few people, like Finnish or Danish, it is harder to enroll underpaid African or Indian workers. So, who better to do the job at a fraction of the price than a prisoner? A miserable €1.54 ($1.65) an hour, in a country where a Starbucks espresso costs €2.80 ($2.99), is way, way below the poverty line. But, hey, you’re a prisoner, so what’s better than a job that will “prepare you for the digital world of work” once you’re released, as the prison system boasts?
The Horrors of a Job Nobody Wants to Do
East or West, North or South, clickworker jobs are far drearier and more punishing than anyone can imagine. You just sit in front of a PC and click, click, click your day away to earn money. That’s way better than breaking your back under the sun in some tomato field in Southern Europe, isn’t it? Well, maybe, or maybe not. In reality, many of the jobs performed by data collectors, labelers, and, even worse, social media moderators are quite horrifying.
The Kenyan workers contracted to clean up ChatGPT, as discussed above, had to expunge toxicity from the chatbot. To do so, they had to identify it, and the only way to do that was to read it, watch it, and experience it. And toxicity comes from the darkest recesses of the internet and, often, from the darkest recesses of the human mind. Social media moderators who feed the content moderation algorithms are routinely exposed to terrifying images, videos, and content filled with violence against humans and animals, pornography, gore, and soul-crushing abuse. All for a few dollars, under working conditions that are often little short of abysmal.
The Dark (Mostly Grey) Takeaway
Small and Big Tech alike use all kinds of exploitative tactics to ensure no right is guaranteed to employees, but we already know that. Worse, the system itself is not built to improve workers’ conditions over time. Whoever bids the lowest takes the job, and since outsourcing agencies and freelancer platforms take a share of the money paid to workers, their final wage is even lower. The middlemen ensure there’s a constant flow of new people willing to take the job, who can be hired or fired at any time, under the constant threat of a ban that could exclude them from future work. Not that you can make a career out of labeling: there’s no real skill to be learned, no way to market yourself as a professional, and no chance of better pay over time.
We often imagine a world dominated by machines as a scary reality where our synthetic overlords force us to live in grey, dull, repetitive routines. We’re nothing but cogs in a gigantic mechanism that devours us like cattle with no names, faces, or identities. However, we hardly ever stop to think that the world we live in today is already like that, in so many ways, for the vast majority of the world’s less fortunate people. How could machines make it even worse than it already is?