Nvidia Unleashes New Generation of Interconnects and Switches to Alleviate Data Bottlenecks

Key Takeaways:

– Nvidia unveils Blackwell architecture amidst concerns of network layer bottlenecks affecting AI and data analytics workloads.
– Nvidia launches NVLink 5.0, along with new interconnects and switches, to address the bottleneck.
– With 18 NVLink connections, a Blackwell GPU provides a total bandwidth of 1.8 terabytes per second.
– The new NVLink switches can connect multiple NVL72 frames into a single namespace, optimizing and accelerating GPU-heavy workloads.
– Nvidia’s new X800 technology, set to be available in 2025, will get backing from heavyweights including Microsoft Azure, Oracle Cloud, and CoreWeave.

Main Article:

While Nvidia’s new Blackwell architecture garnered plaudits at the recent GPU Technology Conference in San Jose, concerns were raised about emerging bottlenecks at the network layer that could disrupt workloads for AI, high-performance computing (HPC), and big data analytics. Nvidia responded by introducing new interconnects and switches, including NVLink 5.0 and 800Gb/s InfiniBand and Ethernet switches for storage connections.

The highlight of Nvidia’s announcement is the latest generation of NVLink, designed to speed data movement between processors. The fifth generation of the GPU-to-GPU and GPU-to-CPU interconnect moves data at 100 gigabytes per second per link; with 18 links per Blackwell GPU, that works out to 1.8 terabytes per second of aggregate bandwidth.
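To put those numbers in perspective, here is a minimal back-of-envelope sketch, assuming the figures cited above and in the key takeaways (100 GB/s per link and 18 NVLink connections per Blackwell GPU):

```python
# Back-of-envelope check of the NVLink 5.0 figures cited in the article.
# Assumes 100 GB/s per link and 18 links per Blackwell GPU, as stated above.

PER_LINK_GB_S = 100   # NVLink 5.0 bandwidth per link, in GB/s
LINKS_PER_GPU = 18    # NVLink connections per Blackwell GPU

total_gb_s = PER_LINK_GB_S * LINKS_PER_GPU
print(f"Aggregate NVLink bandwidth per GPU: {total_gb_s} GB/s "
      f"({total_gb_s / 1000:.1f} TB/s)")
# -> Aggregate NVLink bandwidth per GPU: 1800 GB/s (1.8 TB/s)
```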

Nvidia is leveraging NVLink 5.0 as a foundational block for building robust GPU supercomputers. A fully loaded NVL72 rack, equipped with 36 GB200 Grace Blackwell Superchips, each pairing one Grace CPU with two Blackwell GPUs, embodies this vision.

According to a blog post by Nvidia, nine NVLink switches are required to connect all the Grace Blackwell Superchips in the liquid-cooled NVL72 frame. This ensures that data moves smoothly and at a remarkable speed within the GPU cluster.
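For readers keeping count, the sketch below tallies the rack-level arithmetic, assuming the NVL72 configuration described above (36 GB200 Superchips of one Grace CPU and two Blackwell GPUs each, plus nine NVLink switches):

```python
# Rough tally of the GB200 NVL72 building blocks described in the article.
# Assumes 36 GB200 Superchips per rack (1 Grace CPU + 2 Blackwell GPUs each)
# and 9 NVLink switches, per the configuration cited above.

SUPERCHIPS_PER_RACK = 36
CPUS_PER_SUPERCHIP = 1
GPUS_PER_SUPERCHIP = 2
NVLINK_SWITCHES = 9

cpus = SUPERCHIPS_PER_RACK * CPUS_PER_SUPERCHIP
gpus = SUPERCHIPS_PER_RACK * GPUS_PER_SUPERCHIP
print(f"NVL72 rack: {cpus} Grace CPUs, {gpus} Blackwell GPUs, "
      f"{NVLINK_SWITCHES} NVLink switches")
# -> NVL72 rack: 36 Grace CPUs, 72 Blackwell GPUs, 9 NVLink switches
```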

Boosting Storage Connections

In addition to enhancing the flow of data within the GPU cluster, Nvidia also launched new switches this week. These switches connect GPU clusters to the massive storage arrays that hold data for AI training, HPC simulations, and analytics workloads.

Paving the way toward this goal, Nvidia unveiled its X800 line of switches, which promise to deliver 800 gigabits per second of throughput in both Ethernet and InfiniBand variants. These switches and their associated network interface cards (NICs) are designed to handle the intense data demands of trillion-parameter AI models.
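As a rough illustration of what that throughput means for storage traffic, the sketch below converts 800 Gb/s into bytes per second and times the transfer of a hypothetical 50 TB training dataset; the dataset size is an assumption for illustration, not a figure from Nvidia:

```python
# Quick sense of what 800 Gb/s of switch throughput means for storage traffic.
# The dataset size below is a hypothetical example, not a figure from Nvidia.

LINK_GBIT_S = 800               # X800 per-port throughput, gigabits per second
link_gbyte_s = LINK_GBIT_S / 8  # ~100 GB/s of raw line rate

dataset_tb = 50                 # hypothetical training dataset, terabytes
seconds = dataset_tb * 1000 / link_gbyte_s
print(f"{LINK_GBIT_S} Gb/s is {link_gbyte_s:.0f} GB/s; streaming a "
      f"{dataset_tb} TB dataset takes ~{seconds / 60:.1f} minutes at line rate")
# -> 800 Gb/s is 100 GB/s; streaming a 50 TB dataset takes ~8.3 minutes at line rate
```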

Future of Nvidia’s X800 Tech and its Industry Adoption

Projected for availability in 2025, Nvidia’s new X800 technology has already sparked interest among major cloud providers. Microsoft Azure, Oracle Cloud, and CoreWeave have pledged their support for the incoming technology.

Moreover, a bevy of storage providers such as Dell Technologies, Hitachi Vantara, and Hewlett Packard Enterprise have committed to delivering storage systems based on the X800 line, signifying the industry’s anticipation for this high-performance solution.

In Conclusion

With the newly introduced NVLink 5.0 interconnects and switches, Nvidia has reaffirmed its commitment to resolving network layer bottlenecks. By delivering a solution that keeps data flowing smoothly between processors and connects GPU clusters to expansive storage arrays, Nvidia is poised to reshape computing for AI, HPC, and big data analytics.

Jonathan Browne (https://livy.ai) is the CEO and Founder of Livy.AI.