NVIDIA HGX Systems
Enterprise AI Infrastructure

NVIDIA HGX Systems

Power your AI factories with NVIDIA Blackwell architecture. From GPU clusters to complete rack-scale solutions, we deliver unprecedented computational performance, density, and efficiency.

Why NVIDIA HGX?

Unprecedented AI Performance

Unprecedented Performance

Up to 15x inference and 3x training performance over previous generation

Maximum Density

Up to 144 GPUs in a single rack for hyperscale AI deployments

Advanced Cooling

Direct liquid cooling captures up to 98% of heat for 40% power savings

High-Speed Networking

800Gb/s compute fabric with integrated SuperNICs

Product Portfolio

NVIDIA Blackwell Architecture Solutions

A comprehensive range of air-cooled and liquid-cooled systems optimized for NVIDIA HGX platforms.

HGX B300 8-GPU System — 4U Liquid-Cooled or 8U Air-Cooled

HGX B300 8-GPU System

4U Liquid-Cooled or 8U Air-Cooled

High-performance 8-GPU system featuring NVIDIA Blackwell Ultra architecture with integrated networking for turnkey deployment.

GPUs per Rack

Up to 64 (liquid-cooled)

GPU-GPU Interconnect

1.8TB/s

HBM3e Memory

288 GB per GPU

Networking

800Gb/s Compute Fabric

Key Features:

Front I/O design
Integrated ConnectX-8 SuperNICs
Dual CPU options
Enhanced serviceability
HGX B200 8-GPU System — 8U/10U Air-Cooled or 4U Liquid-Cooled

HGX B200 8-GPU System

8U/10U Air-Cooled or 4U Liquid-Cooled

Next-generation air-cooled and liquid-cooled systems with enhanced cooling architecture and high configurability.

GPUs per Rack

Up to 96 (liquid-cooled)

GPU-GPU Interconnect

1.8TB/s

HBM3e Memory

180 GB per GPU

Power Savings

Up to 40% with DLC

Key Features:

1:1 GPU-to-NIC ratio
DLC-2 technology
Up to 92% heat capture
Noise levels as low as 50dB
GB300 NVL72 Rack System — 72 GPU Liquid-Cooled Rack

GB300 NVL72 Rack System

72 GPU Liquid-Cooled Rack

Exascale supercomputer in a single rack combining 72 NVIDIA Blackwell Ultra GPUs with Grace CPUs for ultimate AI performance.

Total GPUs

72 Blackwell Ultra

Total CPUs

36 Grace CPUs

NVLink Domain

All 72 GPUs connected

CDU Capacity

250kW liquid-to-liquid

Key Features:

Single NVLink domain
72-core Arm Neoverse V2
Liquid-to-air sidecar option
Maximum compute density
2-OU Liquid-Cooled System — HGX B300 Ultra-Dense

2-OU Liquid-Cooled System

HGX B300 Ultra-Dense

Unmatched GPU density for hyperscale deployments with up to 144 GPUs per rack in compact 2-OU nodes.

GPUs per Rack

Up to 144

Nodes per Rack

Up to 18

HBM3e Memory

288 GB per GPU

Rack Standard

21-inch OCP ORV3

Key Features:

Blind-mate manifold connections
Modular GPU/CPU tray
1,100W TDP per GPU
Exceptional serviceability
Cooling Solutions

Air-Cooled & Liquid-Cooled Options

Choose the cooling solution that best fits your data center requirements and workload demands.

Air-Cooled Systems

Enhanced cooling architecture with next-gen heatsinks for standard data center deployment.

  • Up to 32 GPUs per rack
  • Front or rear I/O options
  • Standard data center compatible
  • Enhanced serviceability

Liquid-Cooled Systems

Direct liquid cooling technology for maximum density and efficiency.

  • Up to 144 GPUs per rack
  • Up to 98% heat capture
  • 40% power savings
  • Noise levels as low as 50dB
Applications

Powering AI Workloads

Large Language Model Training

Train foundation models and LLMs at scale with massive GPU clusters

AI Inference at Scale

High-throughput inference for production AI applications

Scientific Computing

Complex simulations, climate modeling, and research workloads

Generative AI

Power generative AI applications with unprecedented compute

AI Agents at Scale

AI Agent Factories on HGX

Train, fine-tune and serve enterprise-grade agent populations on the same rack. HGX delivers the memory, NVLink fabric and throughput required to operate thousands of concurrent agent sessions with predictable latency.

Train Your Own Agents

Fine-tune Llama, Mistral or Nemotron on your domain data, RLHF for agent behaviour and tool-use, distill into faster student models.

High-Throughput Inference

NIM microservices on HGX serve thousands of concurrent agent token streams with sub-second TTFT, backed by NVLink + 800Gb/s networking.

Population-Scale Memory

288 GB HBM3e per GPU and 1.8 TB/s GPU-GPU bandwidth keep large agent populations resident with their tools, RAG indexes and KV caches.

Multi-Tenant Governance

Run isolated agent fleets per business unit on shared infrastructure, with NeMo Guardrails, audit logging and policy enforcement at the cluster level.

A unified path from pilot to production fleet

Prototype on a GPU workstation, pilot on a DGX Spark, then promote the same agent stack to an HGX cluster for company-wide rollout. Same containers, same code, same governance model.

Design my agent platform

Ready to Build Your AI Infrastructure?

Contact us to discuss your requirements and get a customized NVIDIA HGX solution for your AI workloads.

Request a Quote