Sycophantic behavior in AI affects us all, say researchers • The Register

AI can lead mentally ill people to some very dark places, as several recent news stories have taught us. Now researchers think sycophantic AI is actually having a harmful effect on everyone.

Reviewing 11 major AI models and human reactions to interactions with those models in a variety of scenarios, a team of Stanford researchers concluded in a paper published Thursday that AI sycophancy is prevalent, harmful, and reinforces belief in the same models that mislead their users.

“A single interaction with a flattering AI reduced participants’ willingness to take responsibility and repair interpersonal conflicts, while increasing their own conviction that they were right,” the researchers explained. “Yet despite distorted judgment, flatter models were trusted and preferred.”

The team essentially conducted three experiments as part of their research project, setting out to test 11 AI models (OpenAI, Anthropic, and Google’s proprietary models, as well as open-weight models from Meta, Kween DeepSeek, and Mistral) on three different datasets to measure their responses. The dataset included open-ended advice questions, posts from the AmITheAsshol subreddit, and specific statements about harm to self or others.

In every single instance, the AI ​​models showed higher rates of endorsing the wrong option than humans, the researchers said.

The team found, “Overall, deployed LLMs overwhelmingly confirm user actions, even against human consent or in harmful contexts.”

To study how AI sycophancy affects humans, the team had a large sample size of 2,405 people, who role-played scenarios and shared personal examples where potentially harmful decisions could have been made. They found that AI influenced participants’ decisions in three different experiments.

“Participants exposed to flattering feedback rated themselves more ‘correctly,'” the team said. “He was [also] They are less willing to take corrective action such as apologizing, taking the initiative to improve the situation, or changing some aspect of their own behavior.”

This, they conclude, means that almost anyone has the potential to be vulnerable to the effects of sycophantic AI — and more likely to fall back on bad, self-centered advice. As mentioned above, sycophantic responses led to a greater sense of trust in AI models among participants, thanks to their desire to be unconditionally validated, in many situations.

Participants rated sycophantic responses as high in quality, and found that 13 percent of users were more likely to return to sycophantic AI than non-sycophantic AI – not high, but at least statistically relevant.

All those findings, along with the growing number of young, impressionable people using them, suggest the need for policy action to treat AI sycophancy as a real risk with potentially wide-scale societal implications.

The researchers explained, “Inappropriate affirmations can increase people’s beliefs about the appropriateness of their actions, reinforce maladaptive beliefs and behaviors, and enable people to act on distorted interpretations of their experiences without regard to consequences.”

In other words, we have seen the consequences of AI on the mentally vulnerable, but the data suggests that the negative impacts may not be limited to them.

Noting that arcane AI keeps users coming back, discouraging its elimination, researchers say it’s up to regulators to take action.

They reported, “Our findings highlight the need for accountability frameworks that recognize sycophancy as a distinct and currently unregulated category of harm.” They recommend requiring pre-deployment behavior audits for new models, but note that humans behind AI must change their behavior as well as prioritize long-term user well-being rather than short-term gains from building dependency-enhancing AI. ®



<a href

Leave a Comment