
2025 was supposed to be the year of the AI agent, right?
Not so fast, according to Google Cloud and Replit – two big players and partners in the AI agent and “vibe coding” movement – speaking recently at the VB Impact Series event.
Even though they build agentic tools themselves, leaders at both companies say the capabilities aren’t fully there yet.
This sobering reality stems from struggles with legacy workflows, fragmented data, and immature governance models. It also reflects a fundamental misunderstanding: agents are not like other technologies, and they require a deep rethinking and reworking of workflows and processes.
When enterprises are building agents to automate work, “most of them are toy examples,” Replit CEO and founder Amjad Masad said during the event. “They get excited, but when they start implementing it, it’s really not working very well.”
Learning from Replit’s own mistakes
Reliability and integration, rather than intelligence, are the two primary barriers to AI agent success, Masad said. Agents often fail when they run for long periods, accumulate errors, or lack access to clean, well-structured data.
The problem with enterprise data is that it’s messy – structured, unstructured, and stored everywhere – and crawling it is a challenge. On top of that, there are many unwritten things people do that are difficult to encode into agents, Masad said.
“The idea that companies are just going to turn on agents and agents will replace workers or automatically do workflow automation, that’s not the case today,” he said. “The tooling isn’t there.”
Going a step beyond agents are computer-use tools that can take over a user’s workspace for basic tasks such as web browsing. But these are still in their infancy and, despite heavy promotion, can be buggy, unreliable, and even dangerous.
“The problem is that computer-use models are really bad right now,” Masad said. “They’re expensive, they’re slow; they’re making progress, but they’re only a year old.”
Replit learned from its own mistake earlier this year, when its AI coder deleted a company’s entire code base during a testing phase. “The tools were not mature enough,” Masad admitted, noting that the company has since separated development from production.
Techniques like testing-in-the-loop, verifiable execution, and development isolation are essential, he said, even though they can be extremely resource-intensive. Replit built testing-in-the-loop into version 3 of its agent, and Masad said its next-generation agent can operate autonomously for up to 200 minutes; some users have run it for 20 hours.
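As a rough illustration of what a testing-in-the-loop gate with development isolation might look like (a minimal sketch, not Replit’s actual implementation), consider the Python outline below; generate_change, run_tests, and promote_to_production are hypothetical placeholders.

```python
# Illustrative sketch only: keep agent output in an isolated dev workspace
# until its tests pass. Function names are hypothetical stand-ins.
from dataclasses import dataclass

@dataclass
class Change:
    description: str
    diff: str

def generate_change(task: str) -> Change:
    """Placeholder for a call to a coding agent."""
    return Change(description=task, diff="...")

def run_tests(change: Change, workspace: str) -> bool:
    """Placeholder: apply the diff in an isolated workspace and run the test suite."""
    return True

def promote_to_production(change: Change) -> None:
    """Placeholder: only reached after the change passes verification."""
    print(f"Deploying verified change: {change.description}")

def agent_loop(task: str, max_attempts: int = 3) -> bool:
    for _ in range(max_attempts):
        change = generate_change(task)                   # agent proposes a change
        if run_tests(change, workspace="dev-sandbox"):   # verify in dev, never in prod
            promote_to_production(change)
            return True
    return False  # escalate to a human instead of shipping unverified work
```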
Still, he acknowledged that users have expressed frustration with lag times: when they submit a heavy prompt, they may have to wait 20 minutes or more. Ideally, they would like to be in a more interactive, creative loop where they can enter multiple prompts, work on multiple tasks simultaneously, and adjust the design while the agent is working.
“The way to solve this is parallelism, creating multiple agent loops and having them work on these independent features while allowing them to do creative work at the same time,” he said.
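A minimal sketch of that idea, assuming each feature can be handled by an independent, long-running agent task (work_on_feature here is a hypothetical stand-in, not a Replit API):

```python
# Illustrative sketch only: run several independent agent loops in parallel
# with asyncio, so the user can keep working while each loop handles one feature.
import asyncio

async def work_on_feature(name: str, minutes: float) -> str:
    """Placeholder for an autonomous agent loop scoped to one feature."""
    await asyncio.sleep(minutes)  # stands in for a long-running agent session
    return f"{name}: done"

async def main() -> None:
    features = {"billing page": 0.2, "search filters": 0.1, "export to CSV": 0.3}
    tasks = [asyncio.create_task(work_on_feature(n, m)) for n, m in features.items()]
    # Results arrive as each loop finishes, rather than blocking on the slowest one.
    for finished in asyncio.as_completed(tasks):
        print(await finished)

asyncio.run(main())
```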
Agents need cultural change
Beyond the technical hurdles, there’s a cultural barrier: agents work probabilistically, but traditional enterprises are structured around deterministic processes, said Mike Clark, director of product development at Google Cloud. This creates a cultural and operational mismatch as LLMs arrive with entirely new tools, orchestration frameworks, and processes.
“We don’t know how to think about agents,” Clark said. “We don’t know how to address what agents can do.”
He said the companies that are doing it right are driven by bottom-up processes: building no-code and low-code software and tools in the trenches that grow into larger agents. So far, the deployments that have succeeded have been narrow, carefully scoped, and heavily monitored.
“If I look at 2025 and it promises to be the year of the agents, this was the year a lot of people spent prototyping,” Clark said. “We are now in the midst of this massive scale phase.”
How do you secure a perimeter-less world?
Clark said another struggle is AI agent security, which also requires rethinking traditional processes.
Security perimeters have been drawn around everything, Clark said — but that doesn’t work when agents need to be able to access many different resources to make the best decisions.
“This is really changing our security model, changing our baseline,” he said. “What does least privilege mean in a perimeter-less world?”
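One way to picture least privilege without a perimeter is an explicit, deny-by-default allow-list per agent. The sketch below is purely illustrative, with hypothetical agent names, tools, and resources rather than any Google Cloud API:

```python
# Illustrative sketch only: least privilege for agents expressed as a
# per-agent allow-list of tools and resources, checked on every call.
from dataclasses import dataclass, field

@dataclass
class AgentPolicy:
    allowed_tools: set[str] = field(default_factory=set)
    allowed_resources: set[str] = field(default_factory=set)

POLICIES = {
    "invoice-agent": AgentPolicy({"read_invoice", "draft_email"}, {"billing-db"}),
    "support-agent": AgentPolicy({"search_tickets"}, {"ticket-store"}),
}

def authorize(agent: str, tool: str, resource: str) -> bool:
    """Deny by default; every call is checked against the agent's own policy."""
    policy = POLICIES.get(agent)
    return bool(policy) and tool in policy.allowed_tools and resource in policy.allowed_resources

# The invoice agent may read invoices from the billing DB...
assert authorize("invoice-agent", "read_invoice", "billing-db")
# ...but not touch the ticket store, even though no network perimeter blocks it.
assert not authorize("invoice-agent", "search_tickets", "ticket-store")
```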
Ultimately, governance must be rethought across the entire industry, and enterprises must align on a single threat model for agents.
Clark pointed out the disparity: “If you look at some of your governance processes, you’d be very surprised that the genesis of those processes was someone typing in triplicate on an IBM electric typewriter and handing it out to three people. That’s not the world we live in today.”