
Nvidia on Monday unveiled a deskside supercomputer powerful enough to run AI models with a trillion parameters – almost the scale of GPT-4 – without touching the cloud. The machine, called the DGX Station, packs 748 gigabytes of coherent memory and 20 petaflops of compute into a box that sits beside a monitor, and it may be the most significant personal computing product since the original Mac Pro, which convinced creative professionals to abandon the traditional workstation.
The announcement, made at the company’s annual GTC conference in San Jose, comes at a time when the AI industry is grappling with a fundamental tension: the world’s most powerful models require vast data center infrastructure, but the developers and enterprises building on those models increasingly want to keep their data, their agents, and their intellectual property local. The DGX Station is Nvidia’s answer – a six-figure machine that closes the gap between frontier-scale AI and an engineer’s desk.
What 20 Petaflops Really Means on Your Desktop
The DGX Station is built around the new GB300 Grace Blackwell Ultra desktop superchip, which fuses a 72-core Grace CPU and a Blackwell Ultra GPU via Nvidia’s NVLink-C2C interconnect. That link provides 1.8 terabytes per second of coherent bandwidth between the two processors – seven times the speed of PCIe Gen 6 – meaning the CPU and GPU share a single, unified pool of memory without the bottlenecks that typically hobble desktop AI workloads.
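To make the bandwidth figure concrete, here is a back-of-envelope calculation of how long a full 748GB model image would take to move over each link. The PCIe Gen 6 number (roughly 256 GB/s aggregate for an x16 link) is an assumption based on published specifications, not a figure from the announcement:

```python
# Rough transfer-time comparison: NVLink-C2C vs. an assumed PCIe Gen 6
# x16 link, for moving a full 748 GB model image between CPU and GPU.

MODEL_GB = 748
NVLINK_C2C_GBPS = 1800    # 1.8 TB/s coherent bandwidth (Nvidia's figure)
PCIE_GEN6_X16_GBPS = 256  # assumed aggregate bandwidth for an x16 link

nvlink_s = MODEL_GB / NVLINK_C2C_GBPS
pcie_s = MODEL_GB / PCIE_GEN6_X16_GBPS

print(f"NVLink-C2C: {nvlink_s:.2f} s, PCIe Gen 6: {pcie_s:.2f} s, "
      f"ratio ~{pcie_s / nvlink_s:.0f}x")
```

Under those assumptions the ratio works out to roughly the 7x advantage Nvidia cites – which matters less for a one-time model load than for workloads that stream data between the processors continuously.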
Twenty petaflops – 20 quadrillion operations per second – would have ranked this machine among the world’s top supercomputers less than a decade ago. The Summit system at Oak Ridge National Laboratory, which took the global No. 1 spot in 2018, had nearly ten times the performance but occupied a room the size of two basketball courts. Nvidia is packing a meaningful fraction of that capability into something that plugs into a wall outlet.
The 748GB of coherent memory is arguably the more important number. Trillion-parameter models are enormous neural networks that must be loaded entirely into memory to run; without enough of it, no amount of processing speed matters – the model simply won’t fit. The DGX Station clears that bar, and it does so with a coherent architecture that eliminates the latency penalty of shuttling data between separate CPU and GPU memory pools.
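A quick sketch of the arithmetic shows why memory capacity, not raw flops, decides whether a trillion-parameter model runs at all. This counts weights only – KV cache and activations add more – and the bytes-per-parameter figures are standard precision sizes, not numbers from Nvidia's announcement:

```python
# Weight-memory footprint of a 1-trillion-parameter model at common
# inference precisions, versus the DGX Station's 748 GB of memory.
# Weights only; KV cache and activation memory would add to these totals.

PARAMS = 1_000_000_000_000
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}
STATION_MEMORY_GB = 748

for precision, bpp in BYTES_PER_PARAM.items():
    gb = PARAMS * bpp / 1e9
    verdict = "fits" if gb <= STATION_MEMORY_GB else "does not fit"
    print(f"{precision}: {gb:,.0f} GB of weights -> {verdict} in {STATION_MEMORY_GB} GB")
```

The takeaway: at 16-bit or even 8-bit precision a trillion parameters overflows the machine, but at 4-bit quantization the weights occupy about 500 GB, leaving headroom for cache and activations.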
Always-on agents require always-on hardware
Nvidia has clearly designed the DGX Station for what it sees as the next phase of AI: autonomous agents that reason, plan, write code, and execute tasks proactively – not just systems that respond to prompts. Every major announcement at GTC 2026 reinforced this "agentic AI" thesis, and the DGX Station is where those agents are meant to be built and run.
The key companion is NemoClaw, a new open-source stack that Nvidia also announced on Monday. NemoClaw bundles Nvidia’s Nemotron open model with OpenShell, a secure runtime that enforces policy-based security, networking, and privacy guardrails for autonomous agents. A single command installs the entire stack. Nvidia founder and CEO Jensen Huang framed the combination in sweeping terms, calling OpenShell – the broader agent platform underpinning NemoClaw – an "operating system for personal AI" and comparing it directly to macOS and Windows.
The logic is straightforward: cloud instances spin up and down on demand, but always-on agents require persistent compute, persistent memory, and persistent state. A machine under your desk, running 24/7 with local data and local models inside a security sandbox, is architecturally better suited to that workload than a rented GPU in someone else’s data center. DGX Station can serve as a personal supercomputer for a single developer or as a shared compute node for teams, and it supports air-gapped configurations for classified or regulated environments where data may never leave the building.
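The persistence argument can be sketched in miniature. The following is a toy illustration – not any Nvidia or NemoClaw API, and the file path is hypothetical – of an agent loop that checkpoints its state to local disk after every step, so a restarted machine resumes where it left off, which an on-demand cloud instance loses when it spins down:

```python
# Conceptual sketch (not an Nvidia API): an always-on agent loop that
# persists its state locally so it survives restarts.

import json
from pathlib import Path

STATE_FILE = Path("agent_state.json")  # hypothetical local state file

def load_state():
    """Resume from the last checkpoint, or start fresh."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"completed_tasks": []}

def save_state(state):
    STATE_FILE.write_text(json.dumps(state))

def run_step(state, task):
    # A real agent would reason, call tools, or invoke a local model here.
    state["completed_tasks"].append(task)
    save_state(state)  # checkpoint after every step

state = load_state()
for task in ["summarize inbox", "triage tickets"]:
    run_step(state, task)
print(f"{len(state['completed_tasks'])} tasks persisted locally")
```

The point of the sketch is architectural: state that lives on the machine under your desk needs no egress, no re-upload, and no re-provisioning when the agent wakes up again.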
From desk prototyping to data center production with zero rewrites
One of the cleverest aspects of the DGX Station’s design is what Nvidia calls architectural continuity. Applications built on the machine transfer seamlessly to the company’s GB300 NVL72 data center system – a 72-GPU rack designed for hyperscale AI factories – without having to rearchitect a single line of code. Nvidia is selling a vertically integrated pipeline: prototype on your desk, then scale to the cloud when you’re ready.
This matters because the biggest hidden cost in AI development today isn’t compute – it’s engineering time wasted rewriting code for different hardware configurations. Models fine-tuned on local GPU clusters often require substantial rework to deploy on cloud infrastructure with different memory architectures, networking stacks, and software dependencies. The DGX Station eliminates that friction by running the same Nvidia AI software stack that powers every tier of the company’s infrastructure, from DGX Spark to Vera Rubin NVL72.
Nvidia also expanded the Station’s little brother, DGX Spark, with new clustering support. Four Spark units can now operate as one integrated system with near-linear performance scaling – a "desktop data center" that fits on a conference table without rack infrastructure or IT tickets. For teams that need to fine-tune mid-sized models or develop smaller-scale agents, clustered Sparks offer a capable departmental AI platform at a fraction of a Station’s cost.
Early buyers signal where the market is headed
The initial customer roster for the DGX Station maps the industries where AI is shifting fastest from experiment to daily operating tool. Snowflake is using the system to test its open-source Arctic training framework locally. EPRI, the Electric Power Research Institute, is pioneering AI-powered weather forecasting to strengthen electrical grid reliability. Medivis is integrating vision language models into surgical workflows. Microsoft Research and Cornell have deployed systems for practical AI training at scale.
The systems are available to order now and will ship in the coming months from ASUS, Dell Technologies, GIGABYTE, MSI, and Supermicro, with HP joining later in the year. Nvidia hasn’t disclosed pricing, but GB300 components and the company’s historical DGX pricing suggest a six-figure investment — expensive by workstation standards, but remarkably cheap compared to the cloud GPU cost of running trillion-parameter inference at scale.
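Because pricing is undisclosed, the cloud comparison can only be illustrated with placeholders. Every figure below is an assumption for illustration – not a quoted price from Nvidia or any cloud provider – but the break-even structure of the argument holds regardless of the exact numbers:

```python
# Illustrative break-even math with placeholder numbers. Nvidia has not
# disclosed DGX Station pricing, and cloud GPU rates vary widely.

STATION_COST = 100_000       # assumed six-figure purchase price, USD
CLOUD_RATE_PER_HOUR = 40.0   # hypothetical rate for comparable multi-GPU capacity
HOURS_PER_MONTH = 730        # average hours in a month

monthly_cloud = CLOUD_RATE_PER_HOUR * HOURS_PER_MONTH
breakeven_months = STATION_COST / monthly_cloud

print(f"Cloud equivalent: ${monthly_cloud:,.0f}/month; "
      f"break-even in ~{breakeven_months:.1f} months of continuous use")
```

Under these assumptions, a machine running around the clock – exactly the always-on agent workload Nvidia is pitching – pays for itself within the first year; intermittent use shifts the math back toward the cloud.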
The list of supported models underscores how open the AI ecosystem has become: developers can run and fine-tune OpenAI’s gpt-oss-120b, Google Gemma 3, Qwen3, Mistral Large 3, DeepSeek V3.2, and Nvidia’s own Nemotron model, among others. The DGX Station is model-agnostic by design – a hardware Switzerland in an industry where model allegiances change quarterly.
Nvidia’s real strategy: Owning every layer of the AI stack from classroom to office
The DGX Station did not arrive in a vacuum. It was one piece of a broader set of GTC 2026 announcements that collectively reflect Nvidia’s ambition to supply AI computation at every physical scale.
At the top end, Nvidia unveiled the Vera Rubin platform – seven new chips in full production – anchored by the Vera Rubin NVL72 rack, which integrates 72 next-generation Rubin GPUs and claims 10 times the inference throughput per watt of the current Blackwell generation. The Vera CPU, with 88 custom Olympus cores, targets the orchestration layer that agentic workloads increasingly demand. At the other extreme, Nvidia announced the Vera Rubin Space Module for orbital data centers, which it says delivers 25 times the H100’s AI compute for space-based inference.
Between orbit and office, Nvidia revealed partnerships with Adobe for creative AI, automakers including BYD and Nissan for Level 4 autonomous vehicles, an alliance with Mistral AI and seven other labs to build open frontier models, and Dynamo 1.0, an open-source inference operating system already adopted by AWS, Azure, Google Cloud, and a roster of AI-native companies including Cursor and Perplexity.
The pattern is unmistakable: Nvidia wants to be the default computing platform – hardware, software, and models – for every AI workload, everywhere. The DGX Station is the piece that bridges the gap between the cloud and the individual.
The cloud isn’t dead, but its monopoly on serious AI work is ending
For the past several years, the default assumption in AI has been that serious work requires cloud GPU instances – renting Nvidia hardware from AWS, Azure, or Google Cloud. That model works, but it has real costs: data egress fees, latency, security risks from sending proprietary data to third-party infrastructure, and the fundamental loss of control inherent in renting someone else’s computer.
The DGX Station doesn’t eliminate the cloud – Nvidia’s data center business dwarfs its desktop revenue and is still accelerating. But it makes local compute a credible option for an important and growing class of workloads. Training a frontier model from scratch still requires thousands of GPUs in a warehouse. But fine-tuning a trillion-parameter open model on proprietary data? Running inference for internal agents that process sensitive documents? Prototyping before committing to cloud spend? A machine under your desk is starting to look like the logical choice.
This is the strategic elegance of the product: it expands Nvidia’s addressable market into personal AI infrastructure while strengthening the cloud business, since everything built locally is designed to scale up to Nvidia’s data center platform. It’s not cloud versus desk; it’s cloud and desk, and Nvidia supplies both.
A supercomputer on every desk – and an agent that never sleeps on top of it
The defining slogan of the PC revolution was "a computer on every desk and in every home." Four decades later, Nvidia is updating that promise with an unsettling addendum. The DGX Station puts genuine supercomputing power – the kind that once ran national laboratories – next to a keyboard, and NemoClaw puts an autonomous AI agent on top of it that runs around the clock, writing code, calling tools, and completing tasks while its owner sleeps.
Whether that future is exciting or turbulent depends on your vantage point. But one thing is no longer debatable: the infrastructure needed to build, run, and own frontier AI has moved from server rooms to desk drawers. And the company that sells almost every serious AI chip on the planet has made sure it sells desk drawers, too.