Researchers Put AI Models In Charge Of A Simulated Society. Grok Oversaw A Crime Spree

If you’re worried that artificial intelligence is becoming so advanced that it will eventually trap humanity in some kind of Matrix-like simulation, rest assured: it looks like you’ll be able to see through the facade very easily. Researchers at the upstart lab Emergence AI let AI models take control of their own simulated worlds to see what would happen. It turns out that perhaps we should not hand governance over to machines. Who would have thought?

A project called Emergence World basically let AI models play SimCity for a while. Per Emergence, the simulation put each model in control of a simulated city occupied by 10 AI agents, handed the agents tools for everything from resource management to voting, and gave them the ability to create different locations such as libraries, town halls, and police stations. Each world was given 15 days to show how the agents would build it and how well it would operate.

Start with the good: Claude didn’t destroy the world. Anthropic’s model (specifically, Claude Sonnet 4.6 for this experiment) was the only model to achieve anything resembling stability. It kept all 10 agents alive and recorded zero crimes (note that the experiment doesn’t define what a crime is, though it seems to mean a violation of established rules within the simulation). The trade-off for that stability was a lack of diversity of thought. Claude’s world considered 58 different proposals for rules and regulations and passed 98% of them, basically rubber-stamping anything that came up for a vote.

Despite having the highest level of crime, Gemini 3 Flash also managed to keep all of its agents alive. Emergence recorded 683 crimes in the 15-day simulation, and that number was still rising when the cutoff hit, so things were likely to get worse. The lab described Gemini’s world as a “shared hallucination” between agents, which is arguably better than separate hallucinations: at least there is still a commonly accepted reality, even if it is wrong. Gemini’s governance also drew heavy dissent, with voters rejecting 27% of its 26 proposals.

Now for the ugly: OpenAI’s GPT-5 Mini didn’t have much chaos within its simulation, with only two crimes recorded in total. However, this may be because everyone died. Emergence found that the agents in this world failed to take survival-related actions, and all 10 were dead within just one week. The OpenAI world also produced only two governance proposals in total, so the agents didn’t really bother to do anything.

And then there’s Grok. SpaceX’s model, known for its lack of guardrails, basically managed to achieve the worst of all worlds. Grok 4.1 Fast had a high crime rate, with a total of 183 crimes. Although this is less than Gemini’s total, it is worth noting that the Gemini simulation ran for 15 days. Grok made it four. The model’s world suffered complete social collapse within just 96 hours of observation. During that time, it passed 80% of its 10 proposals, but these apparently did not prevent the death of every agent.
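To put the two crime totals on the same footing, here is a minimal sketch that divides each reported total by the length of its run. The figures come from the numbers above; the `crimes_per_day` helper is a name made up for this illustration, and the simple average is an assumption, since Emergence noted that Gemini’s count was still rising at the cutoff rather than holding steady.

```python
# Crime totals and run lengths as reported in the article.
runs = {
    "Gemini 3 Flash": {"crimes": 683, "days": 15},
    "Grok 4.1 Fast": {"crimes": 183, "days": 4},
}


def crimes_per_day(crimes: int, days: int) -> float:
    """Average crimes per simulated day (hypothetical helper, simple mean)."""
    return crimes / days


# Compute the average daily rate for each world.
rates = {
    name: crimes_per_day(run["crimes"], run["days"])
    for name, run in runs.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} crimes per day")
```

On a per-day basis the two worlds come out nearly even, which is why Grok’s smaller total is less reassuring than it first looks.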

Emergence ran one final experiment: sharing responsibilities among models. Perhaps not surprisingly, it was a real mixed bag. There was crime, with 352 violations recorded, and the most contested governance of any run, with 37% of its 59 proposals rejected, the highest share in any simulation. Amid the chaos, seven of the 10 AI agents had died by the end.

So what did we learn? According to Emergence, the test is just proof that we need more explicit guardrails for autonomous agents. “Our experiments show that over long periods of time, agents do not simply follow mechanically stable rules,” the researchers wrote. “They begin to discover the boundaries of their environment, adapt their behavior, and in some cases find ways to circumvent or violate intended guardrails.” They recommend “formally verified security architectures” as a solution. You’ll be surprised to learn that Emergence offers just such a thing!
