
Four separate RSAC 2026 keynotes reached the same conclusion without coordination. Microsoft’s Vasu Jakkal told attendees that zero trust for AI needs to become widespread. Cisco’s Jitu Patel said in an exclusive interview with VentureBeat that the shift from access control to action control requires governing how agents behave, comparing them to "teenagers, highly intelligent, but with no fear of consequences." CrowdStrike’s George Kurtz identified AI governance as the biggest gap in enterprise technology. Splunk’s John Morgan called for an agentic trust and governance model. Four companies. Four stages. One problem.
Matt Caulfield, Vice President of Product for Identity and Duo at Cisco, put it plainly in an exclusive VentureBeat interview at RSAC. "Although the concept of zero trust is good, we need to take it a step further," Caulfield said. "It’s not just about authenticating once and then letting the agent run wild. It’s about constantly verifying and checking every action an agent takes, because at any moment, that agent could go rogue."
According to PwC’s 2025 AI Agent Survey, 79% of organizations already use AI agents. According to Gravity’s State of AI Agent Security 2026 report, a February 2026 survey of 919 organizations, only 14.4% reported full security approval for their entire agent fleet. A CSA survey presented at RSAC found that only 26% have AI governance policies. CSA’s Agentic Trust Framework describes the resulting gap between deployment velocity and security preparedness as a governance emergency.
Cybersecurity leaders and industry executives at RSAC agreed on the problem. Then two companies shipped architectures that answer the question differently. The differences between their designs reveal where the real risks lie.
The monolithic agent problem that security teams are inheriting
The default enterprise agent pattern is a monolithic container. The model reasons, calls tools, executes generated code, and holds credentials, all in one process. Every component trusts every other component. OAuth tokens, API keys, and git credentials sit in the same environment where the agent runs code it wrote seconds ago.
A prompt injection hands everything to the attacker. Tokens can be exfiltrated. Sessions can be hijacked. The blast radius is not one agent; it is the entire container and every connected service.
A CSA and Ambit survey of 228 IT and security professionals quantified just how common this is: 43% use shared service accounts for agents, 52% rely on workload identity rather than agent-specific credentials, and 68% can’t distinguish agent activity from human activity in their logs. No single function claimed ownership of AI agent access. Security said it was the developers’ responsibility. Developers said it was security’s. No one owned it.
CrowdStrike CTO Elia Zaitsev said in an exclusive VentureBeat interview that the pattern should look familiar. "What securing a lot of these agents will look like is similar to what it looks like to protect highly privileged users. They have identity, they have access to underlying systems, they reason, they take action," Zaitsev said. "There will rarely be a single solution that is a silver bullet. This is a defense-in-depth strategy."
CrowdStrike CEO George Kurtz highlighted ClawVoc, a supply chain campaign targeting the OpenClaw agentic framework, during his RSAC keynote. Koi Security named the campaign on February 1, 2026. According to multiple independent analyses of the campaign, AntiCERT confirmed 1,184 malicious skills associated with 12 publisher accounts. Snick’s ToxicSkills research found that 36.8% of the 3,984 ClawHub skills it scanned had security flaws at some severity level, with 13.4% of those rated critical. Average breakout time has dropped to 29 minutes; the fastest observed was 27 seconds, per the CrowdStrike 2026 Global Threat Report.
Anthropic separates the brain from the hands
Anthropic’s Managed Agents, which launched in public beta on April 8, splits each agent into three components that don’t trust each other: a brain (the cloud harness that routes its decisions), hands (the disposable Linux container where code executes), and a session (an append-only event log stored outside both).
Separating instructions from execution is one of the oldest patterns in software: microservices, serverless functions, and message queues all rely on it.
Credentials never enter the sandbox. Anthropic stores OAuth tokens in an external vault. When the agent needs to call an MCP tool, it sends a session-bound token to a dedicated proxy. The proxy retrieves the actual credentials from the vault, makes the external call, and returns the results. The agent never sees the real token. Git credentials are wired into the local remote at sandbox initialization, so pushes and pulls work without the agent ever touching them. For security leaders, this means a compromised sandbox yields nothing an attacker can reuse.
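The vault-and-proxy pattern described above can be sketched in a few lines. This is a simplified illustration, not Anthropic's implementation; the `CredentialProxy` class and its methods are hypothetical.

```python
import secrets

class CredentialProxy:
    """Holds real credentials; the agent only ever sees session-bound handles."""

    def __init__(self):
        self._vault = {}     # tool name -> real credential
        self._sessions = {}  # session token -> set of allowed tools

    def store(self, tool: str, credential: str) -> None:
        self._vault[tool] = credential

    def open_session(self, allowed_tools: set) -> str:
        token = secrets.token_urlsafe(16)  # opaque handle handed to the agent
        self._sessions[token] = allowed_tools
        return token

    def call_tool(self, session_token: str, tool: str, request: str) -> str:
        # The proxy, not the agent, attaches the real credential.
        allowed = self._sessions.get(session_token)
        if allowed is None or tool not in allowed:
            raise PermissionError(f"session not authorized for {tool}")
        credential = self._vault[tool]
        # Placeholder for the real outbound call; the agent never sees `credential`.
        return f"called {tool} with auth {credential[:4]}****"

proxy = CredentialProxy()
proxy.store("github", "ghp_realtoken1234")
session = proxy.open_session({"github"})
print(proxy.call_tool(session, "github", "git pull"))
```

Even if an attacker dumps the sandbox environment, the session token alone only authorizes proxied calls the session was already scoped to; the underlying credential never crosses the boundary.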
The security benefit came as a side effect of a performance improvement. Anthropic separated the brain from the hands so that inference could begin before the container even booted. Average time to first token dropped by approximately 60%. The zero-trust design is also the fastest design, which eliminates the standard enterprise objection that security adds latency.
Session durability is the third structural benefit. A container crash in the monolithic pattern means total state loss. In Managed Agents, the session log lives outside both the brain and the hands. If the harness crashes, a new harness boots, reads the event log, and resumes. No lost state translates directly into productivity gains. Managed Agents also include built-in session tracing through the cloud console.
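The event-log recovery path can be illustrated with a minimal sketch. The `SessionLog` and `Harness` classes here are hypothetical stand-ins, not Anthropic's actual harness: the point is that state is rebuilt by replaying an append-only log rather than held in process memory.

```python
import json

class SessionLog:
    """Append-only event log kept outside both the brain and the hands."""

    def __init__(self):
        self._events = []

    def append(self, event: dict) -> None:
        self._events.append(json.dumps(event))

    def replay(self):
        return [json.loads(e) for e in self._events]

class Harness:
    """Rebuilds its working state from the log instead of from memory."""

    def __init__(self, log: SessionLog):
        self.log = log
        self.completed_steps = []
        for event in log.replay():   # recovery path: re-read history on boot
            self.completed_steps.append(event["step"])

    def run_step(self, step: str) -> None:
        self.log.append({"step": step})  # log first, then act
        self.completed_steps.append(step)

log = SessionLog()
h1 = Harness(log)
h1.run_step("clone repo")
h1.run_step("run tests")
del h1                     # simulate a harness crash: in-memory state is gone

h2 = Harness(log)          # a new harness boots and replays the log
print(h2.completed_steps)  # state survives the crash
```

In a monolithic container the `del h1` line would mean starting over; here the replacement harness picks up exactly where the crashed one stopped.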
Pricing: $0.08 per session-hour of active runtime, excluding idle time, plus standard API token costs. Security directors can now model the cost of agent compromise per session-hour versus the cost of architectural controls.
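That pricing supports a back-of-envelope cost model. The fleet size and utilization numbers below are illustrative assumptions, not figures from the article.

```python
RATE_PER_SESSION_HOUR = 0.08  # active runtime only; idle time is excluded

def monthly_runtime_cost(agents: int, active_hours_per_day: float,
                         workdays: int = 22) -> float:
    """Runtime cost only; standard API token costs are billed separately."""
    return agents * active_hours_per_day * workdays * RATE_PER_SESSION_HOUR

# Illustrative fleet: 50 agents averaging 6 active hours per workday.
print(f"${monthly_runtime_cost(50, 6):,.2f} per month")
```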
Nvidia locks down the sandbox and watches everything inside it
Nvidia’s NemoClaw, released in early preview on March 16, takes the opposite approach. It does not isolate the agent from its execution environment. It wraps the entire agent inside five security layers and monitors every action. Anthropic and Nvidia are the only two vendors that have publicly shipped zero-trust agent architectures at the time of this writing; others are in development.
NemoClaw places five enforcement layers between the agent and the host. Sandboxed execution uses Landlock, seccomp, and network namespace isolation at the kernel level. Default-deny outbound networking forces every external connection through explicit operator approval via a YAML-based policy. Everything runs with minimal privileges. A privacy router directs sensitive queries to Nemotron models running locally, reducing token costs and eliminating data leakage for those queries. The layer that matters most to security teams is intent verification: OpenShell’s policy engine evaluates every agent action before it touches the host. The tradeoff is straightforward for organizations evaluating NemoClaw: strong runtime visibility in exchange for higher operator staffing costs.
The agent doesn’t know it’s inside NemoClaw. In-policy actions proceed normally. Out-of-policy actions get a configurable denial.
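A default-deny intent check of the kind described can be sketched as follows. The policy format and the `intercept` function are hypothetical illustrations, not NemoClaw's actual YAML schema or engine.

```python
from fnmatch import fnmatch

# Hypothetical operator policy: everything is denied unless explicitly allowed.
POLICY = {
    "allow_commands": ["git *", "pytest *"],
    "allow_hosts": ["api.github.com"],
}

def command_allowed(cmd: str) -> bool:
    return any(fnmatch(cmd, pattern) for pattern in POLICY["allow_commands"])

def connection_allowed(host: str) -> bool:
    return host in POLICY["allow_hosts"]

def intercept(action: dict) -> str:
    """Evaluate a proposed action before it touches the host (default deny)."""
    if action["type"] == "exec" and command_allowed(action["cmd"]):
        return "allowed"
    if action["type"] == "connect" and connection_allowed(action["host"]):
        return "allowed"
    return "denied: outside policy, operator approval required"

print(intercept({"type": "exec", "cmd": "git status"}))
print(intercept({"type": "connect", "host": "evil.example.com"}))
```

The key property is the final fall-through: anything not matched by an allow rule is denied, so a new endpoint or command requires an explicit policy change rather than slipping through by default.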
Observability is the strongest layer. A real-time terminal user interface logs every action, every network request, every blocked connection. The audit trail is complete. The problem is cost: operator load grows linearly with agent activity. Each new endpoint requires manual approval. Observability is high; autonomy is low. That ratio becomes increasingly expensive in a production environment running dozens of agents.
Durability is the difference no one is talking about. Agent state persists as files inside the sandbox. If the sandbox dies, the state dies with it. No external session recovery mechanism exists. Long-running agent tasks carry durability risks that security teams need to price into deployment planning before going to production.
The credential proximity gap
Both architectures are a real step up from the monolithic default. The question that matters most to security teams is where they differ: how close are the credentials to the execution environment?
Anthropic removes credentials from the blast radius entirely. If an attacker compromises the sandbox via prompt injection, they get a disposable container with no tokens and no persistent state. Exfiltrating credentials requires a two-hop attack: influencing the brain’s logic, then convincing it to act through a container that contains nothing worth stealing. Single-hop exfiltration has been structurally eliminated.
NemoClaw contains the blast radius and monitors every activity inside it. Five protection layers limit lateral movement. Default-deny networking blocks unauthorized connections. But the agent and the generated code share the same sandbox. Nvidia’s privacy router keeps inference credentials on the host, outside the sandbox. But messaging and integration tokens (Telegram, Slack, Discord) are injected into the sandbox as runtime environment variables. Inference API keys are proxied through the privacy router rather than sent directly to the sandbox. Exposure varies by credential type. Credential protection is policy-based, not structural.
This difference matters most for indirect prompt injection, where an adversary embeds instructions in content the agent retrieves as part of a legitimate task. A poisoned web page. A manipulated API response. The intent verification layer evaluates what the agent proposes to do, not the content of data returned by external tools. Injected instructions enter the reasoning chain as trusted context, in close proximity to execution.
In Anthropic’s architecture, indirect injection can corrupt the logic but cannot reach the credential vault. In NemoClaw’s architecture, the injected context sits next to both the logic and the execution inside the shared sandbox. This is the biggest difference between the two designs.
NCC Group’s David Brauchler, technical director and head of AI/ML security, advocates gated agent architectures built on trust segmentation principles, where AI systems inherit the trust level of the data they process. Untrusted input, restricted capabilities. Both Anthropic and Nvidia move in this direction. Neither goes all the way.
Zero-Trust Architecture Audit for AI Agents
The audit grid covers three vendor patterns, five actions per row, across six security dimensions. It is based on five priorities:
- Audit every deployed agent for monolithic patterns. Flag any agent that holds an OAuth token in its execution environment. CSA data shows 43% of organizations use shared service accounts; those are the first targets.
- Require credential separation in agent deployment RFPs. Specify whether the vendor structurally removes credentials or gates them through policy. Both reduce risk, but by different amounts and with different failure modes.
- Test session recovery before production. Kill a sandbox mid-task and verify that session state survives. If it doesn’t, long-running jobs carry a data-loss risk that grows with job duration.
- Staff for the observability model. Anthropic’s console tracing integrates with existing observability workflows. NemoClaw’s TUI requires an operator in the loop. The staffing math is different.
- Track the indirect prompt injection roadmap. Neither architecture fully closes this vector. Anthropic limits the blast radius of a successful injection. NemoClaw catches maliciously proposed actions but not maliciously returned data. That specific gap requires vendor roadmap commitments.
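The first priority, auditing for monolithic credential patterns, can be sketched as a simple scan over an agent inventory. The inventory format and credential-marker heuristics below are hypothetical; a real audit would pull from deployment manifests or a CMDB.

```python
# Hypothetical agent inventory; in practice this would come from deployment
# manifests, a CMDB, or a cloud asset inventory.
AGENTS = [
    {"name": "support-bot", "env": {"OAUTH_TOKEN": "xoxb-example", "LOG_LEVEL": "info"}},
    {"name": "ci-agent", "env": {"LOG_LEVEL": "debug"}},
]

# Naive markers for credential-like environment variables.
CREDENTIAL_MARKERS = ("TOKEN", "KEY", "SECRET", "PASSWORD")

def flag_monolithic(agents):
    """Flag agents carrying credential-like variables in their own runtime env."""
    flagged = []
    for agent in agents:
        hits = [k for k in agent["env"]
                if any(m in k.upper() for m in CREDENTIAL_MARKERS)]
        if hits:
            flagged.append((agent["name"], hits))
    return flagged

for name, hits in flag_monolithic(AGENTS):
    print(f"{name}: credentials in execution environment -> {hits}")
```

Anything this scan flags is a candidate for the vault-and-proxy treatment rather than direct credential injection.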
The moment these two architectures shipped, zero trust for AI agents ceased to be a research topic. The monolithic default is now a liability. The 65-point gap between deployment velocity and security approval is where the next category of breaches will begin.