Amazon Cools Nvidia’s AI with Custom Hardware
To handle the intense heat generated by **Nvidia**'s high-powered AI GPUs, **Amazon**'s cloud division has engineered its own specialized cooling hardware.
In-Row Innovation
**Amazon Web Services** opted to create its own cooling solution after assessing third-party liquid-cooled systems and finding them unsuitable for its needs, according to **Dave Brown**, vice president of compute and machine learning services at AWS.
According to **Brown**, “They would take up too much data center floor space or increase water usage substantially.”
He added that current solutions could handle lower volumes at other providers, but could not support **Amazon**’s scale.
Instead, **Amazon** engineers designed the In-Row Heat Exchanger (IRHX), which can be integrated into both existing and new data centers. The step was necessary because, while previous generations of **Nvidia** chips could be adequately cooled by traditional air cooling, the latest GPUs run too hot for those methods.
P6e Instances and Nvidia’s GB200 NVL72
Customers can access the IRHX-cooled hardware through new computing instances called P6e, **Brown** shared in a blog post. The instances are built around **Nvidia**'s GB200 NVL72, a dense computing design that packs 72 **Nvidia** Blackwell GPUs into a single rack, wired together for training and running large AI models.
Competition in the Cloud
Previously, computing clusters based on **Nvidia**'s GB200 NVL72 were available through **Microsoft** and **CoreWeave**. AWS remains the world's largest cloud infrastructure provider, capturing 31% of total market revenue in 2023, ahead of **Microsoft** Azure and Google Cloud (Statista 2024).
Amazon’s History of Hardware Innovation
**Amazon** has a track record of rolling out its own infrastructure hardware. The company has built custom chips for general-purpose computing and AI, and has designed its own storage servers and networking routers. By using homegrown hardware, **Amazon** reduces its reliance on third-party suppliers, benefiting its bottom line.
In the first quarter, AWS delivered its widest operating margin since at least 2014, contributing the majority of **Amazon**'s net income. Meanwhile, **Microsoft**, the second-largest cloud provider, has also developed its own hardware: in 2023, it announced Sidekicks, a cooling system for the Maia AI chips it designed.