OpenAI upgrades its Responses API to support agent skills and a complete terminal shell

Until recently, the practice of building AI agents was a bit like training a long-distance runner with a thirty-second memory.

Yes, you can give your AI model tools and instructions, but after a few dozen interactions – several laps around the track, to extend our running analogy – it will inevitably lose context and start hallucinating.

With OpenAI’s latest update to its Responses API – the application programming interface that allows developers on OpenAI’s platform to access multiple agentic tools like web search and file search with a single call – the company is signaling that the era of the limited agent is waning.

The updates announced today include server-side compaction, hosted shell containers, and a new "skills" standard for agents.

With these three major updates, OpenAI is effectively handing agents a permanent desk, a terminal, and a memory that doesn’t fade – helping them develop into reliable, long-term digital workers.

Technology: overcoming ‘context amnesia’

The most significant technical hurdle for autonomous agents has always been the chaos of long-running tasks. Whenever an agent calls a tool or runs a script, the conversation history grows.

Eventually, the model reaches its token limit, and the developer is forced to truncate the history – often deleting the very "logic" the agent needs to finish the job.

OpenAI’s answer is server-side compaction. Unlike simple truncation, which discards history outright, compaction compresses it – allowing agents to run for hours or even days.

Early data from e-commerce platform Triple Whale shows this is a breakthrough in stability: their agent, Moby, successfully navigated a session containing 5 million tokens and 150 tool calls without a drop in accuracy.

In practice, this means the model can "summarize" its own past work into a compressed state, keeping essential context alive while clearing out the noise. This transforms the model from a forgetful assistant into an ongoing system process.
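OpenAI has not published the server-side mechanics, but the underlying idea can be sketched client-side: when the history nears a token budget, older turns are collapsed into a single summary message while the most recent turns stay verbatim. Everything below – the `compact_history` helper, the four-characters-per-token estimate, the stand-in summarizer – is illustrative, not OpenAI’s implementation.

```python
# Illustrative sketch of compaction (not OpenAI's actual server-side code):
# collapse older turns into one summary message once the history approaches
# a token budget, keeping the most recent turns verbatim.

def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def compact_history(messages, budget=1000, keep_recent=4):
    if estimate_tokens(messages) <= budget:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    # In production the summary would come from a model call;
    # here a trivial concatenation stands in for it.
    summary = " | ".join(m["content"][:40] for m in old)
    return [{"role": "system",
             "content": f"[Compacted context] {summary}"}] + recent

history = [{"role": "user", "content": f"step {i}: " + "x" * 400}
           for i in range(20)]
compacted = compact_history(history)
print(len(history), "->", len(compacted), "messages")
```

The payoff is that the working set stays bounded no matter how many tool calls accumulate, which is what makes multi-hour sessions like Triple Whale’s feasible.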

Managed Cloud Sandbox

The introduction of shell tools takes OpenAI into the realm of managed computation. Developers can now choose container_auto, which provisions a Debian 12 environment hosted by OpenAI.

It’s not just a code interpreter: it gives each agent its own full terminal environment, preloaded with:

  • A native execution environment that includes Python 3.11, Node.js 22, Java 17, Go 1.23, and Ruby 3.1.

  • Persistent storage through /mnt/data, allowing agents to generate, save, and download artifacts.

  • Networking capabilities that allow agents to access the internet to install libraries or interact with third-party APIs.

The hosted shell and persistent /mnt/data storage provide a managed environment where agents can perform complex data transformations in Python or Java without requiring the team to build and maintain custom ETL (extract, transform, load) middleware for each AI project.
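What a session inside such a container looks like can be mimicked locally: each tool call runs a shell command in the same working directory, so artifacts written on one turn are still there on the next – the role /mnt/data plays in the hosted version. The `sandbox_run` helper and directory layout here are a local stand-in, not OpenAI’s container API.

```python
# Illustrative local stand-in for a hosted shell container: each call runs
# a shell command inside one persistent working directory, the way files
# written under the container's /mnt/data survive between tool calls.
import subprocess
import tempfile
from pathlib import Path

sandbox = Path(tempfile.mkdtemp())  # stands in for the container's /mnt/data

def sandbox_run(command: str) -> str:
    """Execute a shell command with the sandbox as its working directory."""
    result = subprocess.run(command, shell=True, cwd=sandbox,
                            capture_output=True, text=True, timeout=30)
    return result.stdout

# Turn 1: the agent generates an artifact.
sandbox_run("printf 'sku,units\\nA,3\\nB,7\\n' > report.csv")
# Turn 2: a later tool call can still see and reuse it.
listing = sandbox_run("ls")
content = (sandbox / "report.csv").read_text()
print(listing.strip())
```

In the hosted version, that persistence is what lets an agent build up intermediate files across a long job instead of regenerating them on every call.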

By leveraging these hosted containers, data engineers can run high-performance data processing tasks while shedding the operational burden of specialized infrastructure management – the overhead of building, maintaining, and securing their own sandboxes. OpenAI is essentially saying: “Give us the instructions; we’ll provide the computers.”

OpenAI’s skills vs Anthropic’s skills

While OpenAI is racing toward a unified agent orchestration stack, it faces a significant philosophical challenge from Anthropic’s Agent Skills.

Both companies have converged on a remarkably similar file structure – a SKILL.md Markdown file with YAML frontmatter – but their underlying strategies reveal different visions for the future of work.
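To make the shared shape concrete, here is a sketch of what such a file looks like and how a loader might split frontmatter from instructions. The skill name, fields, and the naive parser are illustrative; a real loader would use a proper YAML library rather than line-splitting.

```python
# Illustrative example of the shared SKILL.md shape: YAML frontmatter
# (name, description) followed by Markdown instructions. The parser is a
# naive sketch; production code would use a real YAML library.
SKILL_MD = """\
---
name: quarterly-report
description: Formats raw sales CSVs into the company quarterly template.
---
# Quarterly report skill

1. Load the CSV from /mnt/data.
2. Apply the brand template.
"""

def parse_skill(text: str):
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()

meta, body = parse_skill(SKILL_MD)
print(meta["name"], "->", meta["description"])
```

The convergence matters because any runtime that can read this one format can, in principle, load skills written for the other ecosystem.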

OpenAI’s approach prioritizes a "programmable substrate" optimized for developer velocity. By bundling shell, memory, and skills into the Responses API, it offers a "turnkey" experience for building complex agents faster.

Already, enterprise AI search startup Glean has reported a jump in tool accuracy from 73% to 85% using OpenAI’s skills framework.

In contrast, Anthropic has launched Agent Skills as an independent open standard (agentskills.io).

While OpenAI’s system is tightly integrated into its own cloud infrastructure, Anthropic’s skills are designed for portability. Skills built for Claude can theoretically be moved to VS Code, Cursor, or any other platform that adopts the specification.

Indeed, the hit new open-source AI agent OpenClaw adopts precisely this SKILL.md manifest and folder-based packaging, allowing it to tap a wealth of specialized procedural knowledge originally designed for Claude.

This architectural adaptability has fueled a community-driven skills boom on platforms like Clawhub, which now hosts over 3,000 community-built extensions ranging from smart-home integration to complex enterprise workflow automation.

This cross-pollination shows that a "skill" has become a portable, versioned asset rather than a vendor-locked feature. Because OpenClaw supports multiple models – including OpenAI’s GPT-5 series and local Llama variants – developers can now write a skill once and deploy it across heterogeneous fleets of agents.

For technology decision-makers, this open standard is becoming the industry’s preferred way to externalize and share "agentic knowledge," moving proprietary know-how toward a shared, auditable, and interoperable infrastructure.

But there is another important difference between OpenAI’s and Anthropic’s approaches to skills.

OpenAI uses server-side compaction to manage the active state of long-running sessions. Anthropic uses Progressive Disclosure, a three-tier system where the model initially knows only the name and description of the skill.

Full descriptions and supporting scripts are loaded only when the task specifically requires them. This allows massive skill libraries—brand guidelines, legal checklists, and code templates—to exist without impacting the working memory of the model.
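The three-tier idea can be sketched in a few lines: only names and descriptions go into the prompt, while full instructions load lazily when a task invokes the skill. The `SkillLibrary` class and its contents are hypothetical illustrations, not Anthropic’s actual loader.

```python
# Illustrative sketch of progressive disclosure: the prompt carries only
# each skill's name and description (tier 1); full instructions (tier 2)
# and supporting scripts (tier 3) load only when the skill is invoked.
class SkillLibrary:
    def __init__(self, skills):
        # skills: {name: {"description": ..., "body": ..., "scripts": [...]}}
        self._skills = skills

    def index(self) -> str:
        """Tier 1: the lightweight listing placed in the system prompt."""
        return "\n".join(f"- {name}: {s['description']}"
                         for name, s in self._skills.items())

    def load(self, name: str) -> str:
        """Tier 2: full instructions, fetched only when a task needs them."""
        return self._skills[name]["body"]

library = SkillLibrary({
    "brand-guidelines": {
        "description": "House style for decks and docs.",
        "body": "Use the serif template; logo top-left...",
        "scripts": ["apply_template.py"],
    },
    "legal-checklist": {
        "description": "Pre-publication legal review steps.",
        "body": "1. Check trademarks.\n2. Check claims.",
        "scripts": [],
    },
})
prompt_index = library.index()
print(prompt_index)
```

Because only the index occupies context, the library can grow to hundreds of skills without crowding out the model’s working memory – the property the three-tier design is built for.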

Implications for enterprise technology decision makers

For engineers focused on fast deployment and fine-tuning, the combination of server-side compaction and hosted execution boosts productivity at scale.

Instead of creating custom state management for each agent run, engineers can leverage built-in compression to handle multi-hour tasks.

Skills enable "packaged IP," where specific fine-tuning or specialized procedural knowledge can be modularized and reused across internal projects.

For those tasked with turning an AI "chat box" into a production-grade workflow, OpenAI’s announcement marks the end of the era of hand-rolled infrastructure.

Historically, orchestrating an agent required significant manual scaffolding: developers had to build custom state-management logic to handle lengthy conversations and a secure, short-lived sandbox for executing code.

The challenge is no longer "How do I give this agent a terminal?" but "Which skills are authorized for which users?" and "How do we audit artifacts produced in a hosted file system?" OpenAI has provided the engine and chassis; the orchestrator’s job is now to define the rules of the road.

For security operations (SecOps) managers, giving shell and network access to AI models is a high-risk development. OpenAI’s use of domain secrets and org aliases provides a defense-in-depth strategy, ensuring that agents can call APIs without exposing raw credentials to the model context.
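The defense-in-depth pattern described here can be sketched simply: the model only ever sees an alias token, and the execution layer substitutes the real credential just before the outbound request. The `{{secret:NAME}}` syntax, the `resolve_aliases` helper, and the demo key are all hypothetical illustrations of the pattern, not OpenAI’s actual mechanism.

```python
# Illustrative sketch of credential aliasing: the model emits only an alias;
# the execution layer swaps in the real secret at run time, so raw
# credentials never enter the model's context window.
import re

# Hypothetical secret store; the key value is a placeholder, not real.
SECRET_STORE = {"CRM_API_KEY": "sk-live-demo-123"}

ALIAS_PATTERN = re.compile(r"\{\{secret:(\w+)\}\}")  # hypothetical syntax

def resolve_aliases(command: str) -> str:
    """Replace {{secret:NAME}} placeholders at execution time only."""
    return ALIAS_PATTERN.sub(lambda m: SECRET_STORE[m.group(1)], command)

# What the model sees and produces contains no raw credential.
model_output = "curl -H 'Authorization: Bearer {{secret:CRM_API_KEY}}' ..."
executed = resolve_aliases(model_output)
print("model saw:", model_output)
```

The design choice is that a prompt-injection attack can at worst make the model emit an alias it was already allowed to use – it can never exfiltrate a credential it has never seen.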

But as skills make agents easier to deploy, SecOps must remain vigilant against malicious skills that could introduce prompt-injection vulnerabilities or unauthorized data-exfiltration paths.

How should enterprises make decisions?

OpenAI is no longer just selling the "brain" (the model); it is selling the "office" (the container), the "memory" (compaction), and the "training manual" (skills). For enterprise leaders, the choice is becoming clear:

  • Choose OpenAI if you need an integrated, high-velocity environment for long-term autonomous work.

  • Choose Anthropic if your organization needs model-agnostic portability and an open ecosystem standard.

Ultimately, the announcements signal that agentic AI is moving out of the chat box and into system architecture – from "prompt spaghetti" to maintainable, versioned, and scalable business workflows.


