Locator: 49193VERARUBIN.
For the archives.
From the blog, August 14, 2025.
Huge reminder: two different business models -- this is an incredibly important distinction
- the cloud: Nvidia and almost everyone else;
- the edge: Apple
In a nutshell:
- Nvidia Vera Rubin platform
- scheduled for release in 2H26
- combination of a new Rubin GPU with the custom Vera CPU to create a powerful, high-bandwidth engine for science and AI
- a single rack offers 100TB of memory and 1.7 PB/s of memory bandwidth
- the platform will include next-generation components, including HBM4 memory and NVLink 6 interconnects
- will be able to handle million-token software coding and generative video
- HBM -- wiki -- an industry standard; adopted in 2013; iterations: HBM2, HBM3, and now HBM4
- 3D-stacked synchronous dynamic random-access memory (SDRAM)
- NVLink 6: high-speed interconnect technology
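A quick back-of-envelope check on the rack-level figures in the bullets above (100TB of memory, 1.7 PB/s of bandwidth), assuming decimal units as is customary for memory capacity and bandwidth specs:

```python
# Rack-level figures quoted above; decimal (SI) units assumed.
memory_bytes = 100e12            # 100 TB of fast memory per rack
bandwidth_bytes_per_s = 1.7e15   # 1.7 PB/s aggregate memory bandwidth

# Time to sweep the entire memory pool once at full aggregate bandwidth --
# a rough indicator of how quickly the rack can touch all of its state.
sweep_seconds = memory_bytes / bandwidth_bytes_per_s
print(f"{sweep_seconds * 1000:.1f} ms")  # ~58.8 ms
```

In other words, the rack can in principle read its entire memory pool in well under a tenth of a second.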
Rubin -- microarchitecture -- wiki.
Nvidia is using its own Blackwell GPUs to accelerate the design of Vera and Rubin, as well as Rubin's successor, Feynman.
Nvidia unveils Rubin CPX: a new class of GPU designed for massive-context inference. Link here. September 9, 2025.
Vera Rubin: wiki. The origin of the naming of Nvidia's new chip.
More Background
Nvidia unveils Rubin CPX: a new class of GPU designed for massive-context inference. Link here. September 9, 2025:
NVIDIA® today announced NVIDIA Rubin CPX, a new class of GPU purpose-built for massive-context processing. This enables AI systems to handle million-token software coding and generative video with groundbreaking speed and efficiency.
Rubin CPX works hand in hand with NVIDIA Vera CPUs and Rubin GPUs inside the new NVIDIA Vera Rubin NVL144 CPX platform. This integrated NVIDIA MGX system packs 8 exaflops of AI compute to provide 7.5x more AI performance than NVIDIA GB300 NVL72 systems, as well as 100TB of fast memory and 1.7 petabytes per second of memory bandwidth in a single rack. A dedicated Rubin CPX compute tray will also be offered for customers looking to reuse existing Vera Rubin NVL144 systems.
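The release quotes 8 exaflops for the Vera Rubin NVL144 CPX and "7.5x more AI performance" than GB300 NVL72; the implied GB300 NVL72 figure follows by simple division (the division is mine, not a number NVIDIA states here):

```python
# Figures from the press release above; the implied GB300 number is derived.
vera_rubin_exaflops = 8.0   # Vera Rubin NVL144 CPX, per the release
speedup = 7.5               # claimed advantage over GB300 NVL72

implied_gb300_exaflops = vera_rubin_exaflops / speedup
print(f"{implied_gb300_exaflops:.2f} exaflops")  # ≈ 1.07
```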
“The Vera Rubin platform will mark another leap in the frontier of AI computing — introducing both the next-generation Rubin GPU and a new category of processors called CPX,” said Jensen Huang, founder and CEO of NVIDIA. “Just as RTX revolutionized graphics and physical AI, Rubin CPX is the first CUDA GPU purpose-built for massive-context AI, where models reason across millions of tokens of knowledge at once.”
NVIDIA Rubin CPX enables the highest performance and token revenue for long-context processing — far beyond what today’s systems were designed to handle. This transforms AI coding assistants from simple code-generation tools into sophisticated systems that can comprehend and optimize large-scale software projects.
To process video, AI models can take up to 1 million tokens for an hour of content, pushing the limits of traditional GPU compute. Rubin CPX integrates video decoders and encoders, as well as long-context inference processing, in a single chip for unprecedented capabilities in long-format applications such as video search and high-quality generative video.
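To put "1 million tokens for an hour of content" in perspective, this is the sustained token rate a system must keep up with to process video in real time (the rate calculation is mine, based on the figure quoted above):

```python
# "Up to 1 million tokens for an hour of content" -- from the release above.
tokens_per_hour = 1_000_000
seconds_per_hour = 3600

# Sustained rate needed to keep pace with real-time video.
tokens_per_second = tokens_per_hour / seconds_per_hour
print(f"{tokens_per_second:.0f} tokens/s")  # ~278
```

A modest-sounding per-second rate, but the context window must still hold the full million tokens at once, which is where the 100TB memory pool comes in.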
Built on the NVIDIA Rubin architecture, the Rubin CPX GPU uses a cost‑efficient, monolithic die design packed with powerful NVFP4 computing resources and is optimized to deliver extremely high performance and energy efficiency for AI inference tasks.