Anthropic details how it measures Claude’s wokeness

Anthropic is detailing its efforts to make its cloud AI chatbots “politically even-handed” — a move that comes just months after President Donald Trump issued a ban on “woke AI.” As explained in a new blog post, Anthropic says it wants the cloud to “treat opposing political viewpoints with equal depth, engagement, and quality of analysis.”

In July, Trump signed an executive order mandating that the government should only procure “impartial” and “truth-seeking” AI models. Although the order only applies to government agencies, the changes companies make in response will likely be limited to widely released AI models, because “refining models in a way that aligns them consistently and predictably in certain directions can be an expensive and time-consuming process,” as my colleague Adi Robertson notes. Last month, OpenAI similarly said it would “crack down” on bias in ChatGPT.

Anthropic doesn’t mention Trump’s order in its press release, but it says he has instructed the cloud to follow a series of rules — called system prompts — that instruct it to avoid providing “unsolicited political opinions.” It is also considered to maintain factual accuracy and represent “multiple perspectives”. Anthropic says that including these instructions in the cloud’s system prompts is “not a foolproof way” to ensure political neutrality, yet it could make a “substantial difference” in its responses.

Additionally, the AI ​​startup describes how it uses “reinforcement learning to reward the model for generating responses close to a set of pre-defined ‘attributes’.”

Anthropic also announced that it has created an open-source tool that measures the cloud’s responses to political neutrality, with its most recent test giving Cloud Sonnet 4.5 and Cloud Opus 4.1 identical scores of 95 and 94 percent, respectively. According to Anthropic, this is 66 percent better than Meta’s Llama 4 and 89 percent better than GPT-5.

Anthropic writes in its blog post, “If AI models unfairly advantage certain ideas – perhaps by openly or subtly arguing more persuasively for one side, or by refusing to fully engage with certain arguments – they fail to respect user freedom, and they fail in the task of helping users make their own decisions.”



Leave a Comment