Hewlett Packard Enterprise and AMD have signaled a notable shift in how large-scale AI systems will be built in the second half of the decade. The two longtime partners unveiled a rack-scale platform based on AMD’s Helios architecture, an open, Ethernet-based design engineered to train trillion-parameter models and absorb the swelling inference loads of cloud providers.

At its core, Helios is AMD’s answer to the dense, vertically integrated AI racks promoted by competitors. The system blends next-generation Instinct accelerators, EPYC CPUs, Pensando networking silicon and the ROCm software stack into a unified platform. HPE will be one of the first companies to commercialize a complete Helios rack, slated for global availability in 2026.

Faster Deployment, Clearer Path to Scale

HPE executives say customers are asking for faster deployment cycles and a clearer path to scale. The company’s engineers have paired the Helios compute stack with a new HPE Juniper Networking scale-up switch to create a rack that behaves as a single, tightly coupled GPU domain. That switch is the first to run the Ultra Accelerator Link over Ethernet (UALoE) protocol, a standards-based alternative to proprietary GPU interconnects.

HPE’s pitch leans heavily on openness. By running scale-up traffic over standard Ethernet and basing the rack on the Open Compute Project’s Open Rack Wide specification, the company aims to curb vendor lock-in while giving operators the flexibility to mix and match components over time. The open-rack format also accommodates a double-wide chassis and direct liquid cooling, practical necessities as per-rack power pushes toward triple-digit kilowatts.

The performance targets reflect this new era of accelerated computing. A single Helios rack links 72 next-generation Instinct MI450-series GPUs, delivering up to 31 TB of HBM4 memory and rack-scale connectivity over AMD’s Ethernet-based scale-up fabric. AMD projects up to 2.9 exaFLOPS of FP4 throughput per rack, placing Helios firmly in the class of systems purpose-built for generative AI and multi-modal workloads.
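The rack-level figures follow directly from the per-accelerator specifications. The short sketch below checks that arithmetic; the per-GPU numbers it assumes (roughly 40 petaFLOPS of FP4 and 432 GB of HBM4 per accelerator) come from AMD’s public MI400-series projections rather than from this announcement, so treat them as illustrative inputs only.

```python
# Back-of-the-envelope check of the published Helios rack figures.
# Per-GPU values are assumptions based on AMD's public MI400-series
# projections (~40 PFLOPS FP4, 432 GB HBM4 per accelerator), not
# figures taken from the HPE/AMD announcement itself.

GPUS_PER_RACK = 72
FP4_PFLOPS_PER_GPU = 40        # assumed FP4 throughput per accelerator
HBM4_GB_PER_GPU = 432          # assumed HBM4 capacity per accelerator

rack_fp4_exaflops = GPUS_PER_RACK * FP4_PFLOPS_PER_GPU / 1000
rack_hbm4_tb = GPUS_PER_RACK * HBM4_GB_PER_GPU / 1000

print(f"FP4 throughput: ~{rack_fp4_exaflops:.1f} exaFLOPS per rack")  # ~2.9
print(f"HBM4 capacity:  ~{rack_hbm4_tb:.1f} TB per rack")             # ~31.1
```

Both results land on the rack-level numbers quoted above, which is why the 72-GPU domain, rather than any single chip, is the unit vendors now market.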

Tightly Integrated Platforms 

For chipmakers and system builders alike, this is the new cutting edge of infrastructure: tightly integrated platforms designed to support AI factories composed of massive clusters optimized for continuous training, inference and data pipelining. NVIDIA’s NVLink-based GB200 NVL72 rack, with its proprietary fabric and Grace CPUs, is the dominant model. Helios, by contrast, presents an alternate route using ubiquitous Ethernet and open software tooling.

As generative AI gains traction across cloud, enterprise and research sectors, the push for more interoperable and energy-efficient infrastructure has become urgent. Cloud service providers, particularly neo-cloud GPU-hosting firms, are pressing vendors for solutions that shorten deployment timelines. HPE says Helios, with its turnkey design and AI-native automation in the networking layer, is meant to answer that call.

What’s emerging is a competitive landscape defined less by individual chips and more by how entire racks, and soon even entire data centers, are architected. Vendors are leaning into modularity, liquid cooling and open interconnects as customers navigate both soaring demand and mounting concerns over power budgets. Against that backdrop, Helios provides AMD and HPE with a clear narrative: openness, interoperability and a full-stack alternative for the next wave of AI clusters.

With commercial availability scheduled for 2026, the companies now face the work of turning reference designs into production systems at scale. If demand for large GPU clusters continues to rise, as nearly every indicator suggests, the collaboration may prove well timed.