Jensen Huang announces Vera Rubin at CES 2026: A new AI computing platform marks a breakthrough for the industry

After five years without a consumer graphics-card launch at CES, NVIDIA CEO Jensen Huang focused on a different goal: introducing the Vera Rubin computing platform, a 2.5-ton AI server system designed to accelerate training of next-generation AI models. This is not just hardware; it is a comprehensive strategy to reshape how businesses build and deploy AI infrastructure.

Huang appeared at three events within 48 hours, from NVIDIA Live to a partnership with Siemens on industrial AI, then at Lenovo TechWorld. His core message: the approximately $10 trillion worth of computing resources invested over the past decade need a complete upgrade.

Vera Rubin — An integrated 6-chip architecture to surpass Blackwell

Vera Rubin breaks NVIDIA's own internal rules. Instead of changing one or two chips per generation, the company redesigned six different chips simultaneously and has already moved them into mass production. The reason: traditional performance-enhancement methods can no longer keep pace with AI models that grow roughly 10x per year, especially as Moore's Law slows down.

NVIDIA’s approach is “extreme co-design” — innovating simultaneously at every level, from individual chips to the entire platform. The Vera CPU integrates 88 custom Olympus cores with 176 threads, supporting 1.5 TB of system memory and 1.2 TB/s of LPDDR5X bandwidth, three times that of the Grace generation. The Rubin GPU delivers 50 PFLOPS of NVFP4 inference performance (five times Blackwell) with 336 billion transistors, featuring a third-generation Transformer engine with dynamic precision adjustment.
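NVFP4 is a block-scaled 4-bit floating-point format. As a rough illustration of how such formats trade precision for throughput — a simplified sketch, not NVIDIA's actual implementation; the block size and rounding scheme here are assumptions — each small block of values shares one scale factor, and each value is snapped to the nearest 4-bit-representable level:

```python
import numpy as np

# Magnitudes representable by an FP4 E2M1 value (the sign is stored separately).
E2M1_LEVELS = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quantize_nvfp4(x, block_size=16):
    """Quantize-dequantize a 1-D array with one shared scale per block.
    Illustrative only: real NVFP4 encodes block scales in hardware."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        peak = np.max(np.abs(block))
        # Map the block's largest magnitude onto the top FP4 level (6.0).
        scale = peak / E2M1_LEVELS[-1] if peak > 0 else 1.0
        mags = np.abs(block) / scale
        # Snap each scaled magnitude to the nearest representable level.
        idx = np.abs(mags[:, None] - E2M1_LEVELS[None, :]).argmin(axis=1)
        out[start:start + block_size] = np.sign(block) * E2M1_LEVELS[idx] * scale
    return out
```

Because only 4 bits are stored per value (plus a shared scale), memory traffic and math throughput improve dramatically; the "dynamic precision adjustment" mentioned above would decide where such low precision is safe to use.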

To connect all these components, NVIDIA deployed ConnectX-9 (800 Gb/s network card), BlueField-4 DPU (storage AI endpoint processor), NVLink-6 switch chip (connects 18 nodes, supporting up to 72 Rubin GPUs as a single unit), and Spectrum-6 Ethernet switch chip (512 channels, 200 Gbps each).

Unmatched performance: From training to inference

The Vera Rubin NVL72 system delivers impressive numbers. In NVFP4 inference, performance reaches 3.6 EFLOPS — five times Blackwell’s. For NVFP4 training, performance hits 2.5 EFLOPS, a 3.5x increase. LPDDR5X memory capacity reaches 54 TB (three times), while HBM reaches 20.7 TB with HBM4 bandwidth at 1.6 PB/s (2.8x increase).
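The rack-level and chip-level figures quoted here are mutually consistent; a quick cross-check using only the numbers above:

```python
# Cross-check: 72 Rubin GPUs at 50 PFLOPS (NVFP4 inference) each
# should add up to the quoted NVL72 rack figure of 3.6 EFLOPS.
GPU_NVFP4_PFLOPS = 50
GPUS_PER_NVL72 = 72
rack_eflops = GPU_NVFP4_PFLOPS * GPUS_PER_NVL72 / 1000  # PFLOPS -> EFLOPS
print(rack_eflops)  # 3.6
```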

Remarkably, despite the huge performance boost, transistor count increases only 1.7 times (to 220 trillion), demonstrating NVIDIA’s semiconductor optimization. Training a 100-trillion-parameter model with Vera Rubin requires only a quarter of the systems needed with Blackwell, and the cost of generating tokens drops to one-tenth.

Most importantly, throughput (AI tokens per watt per dollar) increases tenfold over Blackwell. For a gigawatt data center costing $50 billion, this means revenue capacity doubles — each dollar invested generates twice the value.

From 43 cables to zero: Rethinking node assembly

Vera Rubin also introduces a breakthrough in mechanical design. Previously, each supercomputer node required 43 cables, took two hours to assemble, and was prone to errors. A Vera Rubin node uses zero cables and only six liquid-cooling tubes, and can be assembled in five minutes.

Behind the server chassis, nearly 3.2 km of copper wiring and 5,000 copper cables form the NVLink main network transmitting at 400 Gbps. Huang humorously notes: “Weighing hundreds of pounds, you need a very strong CEO to handle this.”

Infinite KV Cache: Context memory no longer a bottleneck

A major AI challenge is that during long conversations, the KV Cache (key-value memory) overflows HBM memory. Vera Rubin’s solution is deploying BlueField-4 processors inside the server chassis to manage KV Cache separately.

Each node has four BlueField-4s, each with an additional 150 TB of context memory allocated to GPUs. Each GPU gets 16 TB of memory — compared to about 1 TB onboard. Crucially, bandwidth remains at 200 Gbps, with no slowdown in data transfer.
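Conceptually, this is a tiered cache: the GPU keeps hot KV blocks in HBM and spills cold ones to the much larger context memory instead of dropping them. A minimal sketch of that idea — a hypothetical structure, not NVIDIA's implementation; the class name and capacities are illustrative:

```python
from collections import OrderedDict

class TieredKVCache:
    """Sketch of a two-tier KV cache: a small fast tier (standing in for
    HBM) spills least-recently-used blocks to a large slow tier (standing
    in for BlueField-managed context memory) instead of evicting them."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()   # block_id -> KV tensor (hot, LRU-ordered)
        self.slow = {}              # block_id -> KV tensor (spilled)

    def put(self, block_id, kv):
        self.fast[block_id] = kv
        self.fast.move_to_end(block_id)
        while len(self.fast) > self.fast_capacity:
            victim, tensor = self.fast.popitem(last=False)  # LRU block
            self.slow[victim] = tensor                      # spill, don't drop

    def get(self, block_id):
        if block_id in self.fast:
            self.fast.move_to_end(block_id)
            return self.fast[block_id]
        # Fetch the block back from the slow tier and promote it to hot.
        kv = self.slow.pop(block_id)
        self.put(block_id, kv)
        return kv
```

The engineering challenge the article describes is making the slow tier's fetch path fast enough (here, the quoted 200 Gbps links) that spilled context does not stall inference.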

Spectrum-X: AI-specific network saves $5 billion

To support nodes spread across dozens of server racks, with thousands of GPUs acting as a unified memory pool, the network must be large, fast, and reliable. Spectrum-X is the world’s first end-to-end Ethernet network platform dedicated to generative AI, built with silicon photonics on TSMC’s COUPE process and offering 512 channels at 200 Gbps each.

Huang estimates: a $50 billion gigawatt data center using Spectrum-X can boost throughput by 25%, saving around $5 billion. “You could say this network system is almost ‘cost-free.’”

Computing security: All data encrypted during transmission

Vera Rubin supports Confidential Computing — all data is encrypted during transfer, storage, and processing, including PCIe, NVLink, CPU-GPU communication, and other buses. Enterprises can confidently deploy their models externally without worrying about data leaks.

Physical AI: From robotics to autonomous vehicles, NVIDIA focuses on the real world

Huang emphasizes the “three-core computer” architecture for physical AI: training computers built from GPUs, inference “cerebellum” computers placed in robots or cars, and simulation platforms (Omniverse and Cosmos) providing virtual training environments.

Based on this architecture, NVIDIA announced Alpamayo — the first autonomous driving model capable of reasoning. Unlike traditional self-driving systems, Alpamayo is trained end to end and can handle the “long tail” of driving scenarios: when faced with complex, unseen traffic situations, it does not merely execute commands but reasons about them like a human driver.

The Mercedes-Benz CLA, equipped with Alpamayo technology, will debut in the US in Q1, then expand to Europe and Asia. NCAP has rated the vehicle as the safest car globally, thanks to its “dual safety stack” design: when the end-to-end AI model lacks confidence, the system immediately switches to a more conventional, stable safety mode.
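The arbitration logic described for the dual safety stack can be sketched as a confidence-gated fallback — hypothetical code, with names and threshold chosen for illustration rather than taken from Mercedes' or NVIDIA's actual stack:

```python
def plan_action(scene, e2e_model, fallback_stack, confidence_threshold=0.9):
    """Use the end-to-end model's plan when it is confident enough,
    otherwise hand control to a conventional rule-based stack."""
    plan, confidence = e2e_model(scene)
    if confidence >= confidence_threshold:
        return plan, "end_to_end"
    return fallback_stack(scene), "fallback"
```

In this sketch `e2e_model` returns a plan together with a self-reported confidence score; how a real system estimates that confidence reliably is a separate, hard problem.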

On stage, Huang invited humanoid and quadruped robots from companies such as Boston Dynamics and Agility Robotics to demonstrate, while emphasizing that the biggest robot is actually a factory. All of these robots will be equipped with Jetson mini computers and trained in Isaac Simulator on the Omniverse platform. NVIDIA is also integrating this technology into industrial software ecosystems such as Synopsys, Cadence, and Siemens.

Open-source models: NVIDIA’s strategic choice

Huang praised the open-source community, noting that DeepSeek R1’s breakthrough last year directly spurred a wave of industry-wide development. On his slides, models such as Kimi K2 and DeepSeek V3.2 ranked #1 and #2 among open-source models.

While current open models may lag the top proprietary models by about six months, new open models emerge every six months. This rapid iteration cycle keeps startups, giants, and researchers engaged — nobody wants to miss out, including NVIDIA.

This time, NVIDIA isn’t just selling “shovels” in the form of graphics cards; it is building DGX Cloud supercomputers and developing advanced models such as La Proteina (protein structure generation) and OpenFold 3. Its open-source ecosystem spans biomedicine, physical AI, agent models, robotics, and autonomous driving.

Many open models from NVIDIA’s Nemotron family are also notable, including speech, multimodal, generative retrieval, and safety models, which have achieved top rankings and are widely adopted by enterprises.

Future outlook: From virtual to physical worlds

Previously, NVIDIA made chips for the virtual world. Now Huang has clearly shifted focus to physical AI, with autonomous driving and humanoid robots as its flagship applications, aiming to enter the fiercely competitive real, physical world.

Amid debate over an AI bubble, Huang is not only unveiling the Vera Rubin supercomputing platform to meet computational demand but also investing heavily in applications and software. The goal is to make AI’s transformative impact tangible — from safer autonomous driving to reasoning robots.

Ultimately, only when the battle moves into the real world can these weapons continue to sell.
