On Monday, Nvidia introduced the H200, a graphics processing unit designed for training and deploying the advanced artificial intelligence models that are fueling the current generative AI boom.
The new GPU is an upgrade of the H100, the chip OpenAI used to train its most advanced language model, GPT-4. Limited supply of these chips has sparked intense competition among major corporations, startups, and government entities.
Experts estimate the cost of H100 chips at $25,000 to $40,000 each, and training the largest AI models requires thousands of them working in concert.
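To put that scale in perspective, here is a rough back-of-the-envelope calculation. The per-chip prices are the analyst estimates above; the 10,000-GPU cluster size is a hypothetical example for illustration, not a figure from Nvidia:

```python
# Illustrative estimate of GPU hardware cost for a large training cluster.
# Prices come from the analyst range cited above; the cluster size of
# 10,000 GPUs is a hypothetical assumption, not a confirmed figure.
LOW_PRICE = 25_000     # USD per H100, low end of estimates
HIGH_PRICE = 40_000    # USD per H100, high end of estimates
CLUSTER_SIZE = 10_000  # hypothetical number of GPUs

low_total = CLUSTER_SIZE * LOW_PRICE
high_total = CLUSTER_SIZE * HIGH_PRICE
print(f"Estimated GPU cost: ${low_total / 1e6:.0f}M - ${high_total / 1e6:.0f}M")
# → Estimated GPU cost: $250M - $400M
```

Even before counting networking, power, and data-center costs, the GPUs alone for a cluster of this (assumed) size would run into the hundreds of millions of dollars.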
Strong demand for Nvidia's AI GPUs has propelled the company's stock, which has surged more than 230% in 2023. Nvidia expects roughly $16 billion in revenue for its fiscal third quarter, up about 170% from a year earlier.
The H200's key upgrade is 141GB of next-generation "HBM3e" memory. The faster, larger memory pays off most in "inference," the phase after training in which the model generates text, images, or predictions, letting the chip serve results more quickly and efficiently.
Nvidia says the H200 generates output nearly twice as fast as the H100, based on a test using Meta's Llama 2 LLM.
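Inference speed claims like this are typically reported as generation throughput, in tokens produced per second. A minimal sketch of how such a comparison might be computed (the token counts and timings below are made-up placeholders, not Nvidia's benchmark data):

```python
# Toy throughput comparison in tokens per second. The token counts and
# elapsed times are hypothetical placeholders, not measured results
# from the H100 or H200.
def tokens_per_second(num_tokens: int, elapsed_s: float) -> float:
    """Generation throughput: tokens produced per wall-clock second."""
    return num_tokens / elapsed_s

h100_tps = tokens_per_second(num_tokens=1000, elapsed_s=2.0)  # hypothetical
h200_tps = tokens_per_second(num_tokens=1000, elapsed_s=1.1)  # hypothetical
speedup = h200_tps / h100_tps
print(f"Speedup: {speedup:.2f}x")  # ~1.82x with these made-up numbers
```

A real benchmark would average over many prompts and fix the batch size, sequence length, and precision, since all of these strongly affect measured throughput.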
Expected to hit the market in the second quarter of 2024, the H200 will compete directly with AMD's MI300X GPU. Like the H200, AMD's chip adds memory capacity over its predecessors, which helps fit large models on the hardware and speeds up inference.
Nvidia says the H200 will be compatible with the H100, meaning AI companies already training on the prior model won't need to change their server systems or software to use the new chip.
The H200 will be available in four-GPU or eight-GPU configurations in Nvidia's HGX complete systems, and in a chip called GH200, which pairs the H200 GPU with an Arm-based processor.
The H200 may not hold the title of Nvidia's fastest AI chip for long. In semiconductor development, the biggest performance leaps typically come every two years, when manufacturers move to a new architecture; those jumps dwarf incremental improvements such as adding memory. Both the H100 and H200 are based on Nvidia's Hopper architecture, so larger gains await the company's next design.
In October, Nvidia told investors it would move from a two-year architecture cadence to a one-year release pattern, citing strong demand for its GPUs.
As part of this accelerated schedule, Nvidia said it plans to announce and release its B100 chip, based on the forthcoming Blackwell architecture, in 2024.
The faster release cadence signals Nvidia's bet that agility, as much as raw performance, will keep it ahead of rivals in the rapidly advancing field of AI hardware.