ATA was born out of an internal Amazon hackathon in August 2024, and security team members say it has since become an important tool. The core idea behind ATA is that it is not a single AI agent built for comprehensive security testing and threat analysis. Instead, Amazon developed multiple specialized AI agents that compete against one another in pairs of teams, rapidly investigating real attack techniques and methods that could be used against Amazon’s systems – and then proposing security controls for human review.
“The initial concept was intended to address a critical gap in security testing – the challenge of keeping detection capabilities operational in a limited coverage and rapidly evolving threat landscape,” Steve Schmidt, Amazon’s chief security officer, tells WIRED. “Limited coverage means you can’t reach all the software or you can’t reach all the applications because you don’t have enough humans. And then it’s great to analyze a set of software, but if you don’t keep detection systems up to date with changes in the threat landscape, you’re missing half of the picture.”
As part of expanding its use of ATA, Amazon has developed special “high-fidelity” test environments that are a deeply realistic reflection of Amazon’s production systems, so ATA can both ingest and produce real telemetry for analysis.
The company’s security teams also emphasized designing ATA so that every technique it uses and every detection capability it produces is validated with real, automated test and system data. Red-team agents work to find attacks that could be used against Amazon’s systems, executing real commands in ATA’s special test environment that generate verifiable logs. Blue-team agents, focused on defense, use real telemetry to confirm whether the protections they propose are effective. And whenever an agent develops a new technique, it also produces time-stamped logs to prove that its claims are accurate.
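The verification loop described here, in which an agent may only make claims it can back with observable log evidence, can be sketched roughly as follows. Every name and structure in this snippet is an illustrative assumption, not Amazon's actual ATA implementation:

```python
from datetime import datetime, timezone

def execute_in_sandbox(command: str, telemetry: list) -> str:
    """Simulate running a red-team command in a high-fidelity test
    environment; every action emits a time-stamped log line into the
    shared telemetry stream. (Hypothetical helper for illustration.)"""
    line = f"{datetime.now(timezone.utc).isoformat()} exec {command}"
    telemetry.append(line)
    return line

def claims_verified(claimed_logs: list, telemetry: list) -> bool:
    """Blue-team check: accept a red-team finding only if every log
    line the agent claims to have produced actually appears in the
    environment's telemetry. A claim with no matching evidence, such
    as a hallucinated log line, is rejected."""
    return bool(claimed_logs) and all(line in telemetry for line in claimed_logs)
```

In this toy version, a finding backed entirely by logs that really exist in telemetry passes, while a finding that includes even one fabricated log line fails, which is the gist of treating observable evidence as the acceptance criterion.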
This verifiability reduces false positives, says Schmidt, and serves as a form of “hallucination management.” Because the system is designed to demand certain standards of observable evidence, Schmidt claims that “hallucinations are architecturally impossible.”