Krista Pawlowski remembers the defining moment that shaped her opinion on the ethics of artificial intelligence. As an AI worker on Amazon Mechanical Turk — a marketplace that lets companies hire workers to perform tasks like entering data or matching an AI prompt with its output — Pawlowski spends her time assessing the quality of AI-generated text, images, and videos, as well as doing some fact-checking.
About two years ago, while working from home at her dining room table, she took on a task designating tweets as racist or not. When she was presented with a tweet that read, “Listen to that Mooncricket song”, she almost clicked the “no” button before deciding to check the meaning of the word “Mooncricket”, which, to her surprise, turned out to be a racial slur against Black Americans.
“I sat there thinking about how many times I’d made the same mistake and not caught myself,” Pawlowski said.
The potential scale of her own errors, and those of thousands of other workers like her, sent Pawlowski spiraling. How many other people had unwittingly let objectionable content slip through their fingers? Or, worse, chosen to let it through?
After years of observing the inner workings of AI models, Pawlowski decided not to use generative AI products herself and told her family to stay away from them.
“It’s an absolute no-no in my house,” Pawlowski said, referring to the fact that she doesn’t let her teenage daughter use tools like ChatGPT. She encourages people she meets socially to ask the AI about something they know very well, so they can spot its errors and see for themselves how often the technology gets things wrong. Pawlowski said that every time she sees a menu of new tasks to choose from on the Mechanical Turk site, she asks herself whether what she’s doing could be used to hurt people — many times, she says, the answer is yes.
In a statement, Amazon said workers can choose which tasks to complete at their own discretion and can review a task’s description before accepting it. According to Amazon, requesters set the specifics of any given task, such as the time allotted, the pay, and the level of instruction.
“Amazon Mechanical Turk is a marketplace that connects businesses and researchers, called requesters, with workers to complete online tasks, such as labeling images, answering surveys, transcribing text, or reviewing AI output,” said Montana McLachlan, an Amazon spokesperson.
Pawlowski is not alone. A dozen AI raters, workers who check AI responses for accuracy and grounding, told the Guardian that, after seeing how chatbots and image generators work and how inaccurate their outputs can be, they have begun urging friends and family not to use generative AI at all, or at least trying to educate their loved ones about using it carefully. These trainers work on a range of AI models: Google’s Gemini, Elon Musk’s Grok, other popular models, and many smaller or lesser-known bots.
One of these workers, a Google AI rater who evaluates responses generated by Google Search’s AI Overviews, said she tries to use AI as sparingly as possible, if at all. The company’s approach to AI-generated answers, particularly on health questions, gave her pause, she said, requesting anonymity for fear of professional retribution. She said she watched colleagues blindly rate AI-generated responses to medical questions and that, despite having no medical training, she was tasked with evaluating such questions herself.
At home, she has forbidden her 10-year-old daughter from using chatbots. “She has to learn critical thinking skills first, otherwise she won’t be able to tell whether the output is good or not,” the rater said.
“Ratings are one of many collected data points that help us measure how well our systems are working, but they do not directly influence our algorithms or models,” Google said in a statement. “We also have a number of strong security measures in place to ensure that high-quality information is displayed in our products.”
Bot watchdogs raise alarm
These people are part of a global workforce of thousands who help make chatbots more human. While checking AI responses, they also do their best to ensure the chatbots do not spread false or harmful information.
When the people whose job is to make AI trustworthy are the ones who trust it the least, experts say, it is a sign of a much larger problem.
“This shows that there are probably incentives to ship and scale over slow, careful validation, and that those providing feedback are being ignored,” said Alex Mahadevan, director of MediaWise, Poynter’s media literacy program. “It means that when we see the final [version of the] chatbots, we can expect the same types of errors they are experiencing. That’s not a good sign for the public, who are increasingly turning to LLMs for news and information.”
AI workers said the persistent emphasis on fast turnaround times at the expense of quality has led them to distrust the models they work on. Brooke Hansen, an AI worker on Amazon Mechanical Turk, explained that while she doesn’t distrust generative AI as a concept, she doesn’t trust the companies developing and deploying these tools. For her, the biggest turning point was realizing how little support the people training these systems receive.
“We are expected to help improve models, yet we are often given vague or incomplete instructions, minimal training, and unrealistic deadlines to complete tasks,” said Hansen, who has been doing data work since 2010 and has taken part in training some of Silicon Valley’s most popular AI models. “If workers are not equipped with the information, resources, and time we need, how can the results possibly be safe, accurate, or ethical? To me, the gap between what is expected of us and what we are actually given to work with is a clear sign that companies are prioritizing speed and profit over responsibility and quality.”
Experts say a major flaw of generative AI is that it delivers false information in a confident tone rather than declining to answer when reliable answers are not readily available. An audit of the 10 leading generative AI models, including ChatGPT, Gemini, and Meta’s AI, by the media literacy non-profit NewsGuard found that the chatbots’ non-response rate fell from 31% in August 2024 to 0% in August 2025. Over the same period, the likelihood of the chatbots repeating false information nearly doubled, from 18% to 35%, NewsGuard found. None of the companies responded to NewsGuard’s request for comment at the time.
“I will not trust any facts [the bot] offers without checking them yourself – it’s not reliable at all,” said another Google AI rater, who requested anonymity due to a non-disclosure agreement signed with the contracting company. “It’s just a robot.”
“We make fun of them [chatbots], but it would be great if we could stop them from lying,” said an AI tutor who has worked on Gemini, ChatGPT, and Grok, requesting anonymity because of signed nondisclosure agreements.
‘Garbage in, garbage out’
Another AI rater, who began rating responses for Google’s products in early 2024, said he stopped trusting AI after about six months on the job. He was tasked with stumping the model, meaning he had to ask Google’s AI questions designed to expose its limitations or weaknesses. The worker, who holds a degree in history, asked the model historical questions for the assignment.
“I asked it about the history of the Palestinian people, and it wouldn’t give me an answer no matter how I rephrased the question,” recalled the worker, who signed a nondisclosure agreement and requested anonymity. “When I asked it about the history of Israel, it had no problem giving me a very detailed account. We reported it, but no one at Google cared.” Google did not comment on the specific situation described by the rater.
For this Google worker, the biggest concern in AI training is the quality of the feedback that evaluators like him give the models. “After seeing how bad the data that supposedly went into training the model was, I knew there was no way it could ever be trained correctly,” he said. He invoked the adage “garbage in, garbage out”, a principle in computing which holds that if you feed bad or incomplete data into a system, the output will carry the same flaws.
The evaluator avoids using generative AI and has “advised every family member and friend of mine not to buy new phones that have AI integrated, resist automatic updates that add AI integration if possible, and never tell the AI anything personal”, he said.
Fragile, not futuristic
Whenever the topic of AI comes up in social conversation, Hansen reminds people that AI is not magic, pointing to the army of invisible workers behind it, the unreliability of the information it produces, and the harm it does to the environment.
“Once you see how these systems are linked together – the biases, the rushed deadlines, the constant compromises – then you stop seeing AI as futuristic and start seeing it as fragile,” Adio Dinica, who studies the labor behind AI at the Distributed AI Research Institute, said of the people working behind the scenes. “In my experience it’s always the people who don’t understand AI who are charmed by it.”
The AI workers who spoke to the Guardian said they are taking responsibility for making better choices themselves and for raising awareness around them, particularly emphasizing that AI is, in Hansen’s words, “only as good as what’s put into it, and what’s put into it isn’t always the best information”. She and Pawlowski gave a presentation at the Michigan Association of School Boards spring conference in May. In a room filled with school board members and administrators from across the state, the two spoke about the ethical and environmental impacts of artificial intelligence, hoping to start a conversation.
“Many attendees were surprised by what they learned, as most had never heard of the human labor or environmental footprint behind AI,” Hansen said. “Some were grateful for the insight, while others were defensive or disappointed, accusing us of being ‘disappointed and hopeless’ about a technology they saw as exciting and full of promise.”
Pawlowski compared AI ethics to the clothing industry: when people didn’t know how cheap clothes were made, they were happy to get a good deal and save a few bucks. But as sweatshop stories began to emerge, consumers had a choice and knew to ask questions. She believes the same is true of AI.
“Where does your data come from? Is this model built on copyright infringement? Were workers compensated fairly for their work?” she said. “We’re just starting to ask those questions, so in most cases the general public doesn’t have access to the truth, but, like with the textile industry, if we keep asking and keep pushing, change is possible.”