NVIDIA GTC 2024 ROUNDUP
As captivated conference attendees, AHEAD engineers decode a handful of highlights.
If the NVIDIA® GTC conference were a prize fight, CEO Jensen Huang’s keynote delivered a haymaker to the competition. Huang walked on stage to a roaring crowd peppered with celebrities like Ashton Kutcher, Kendrick Lamar, George Lucas, Nas, and more. And with NVIDIA’s stock surging 256% in value over the past twelve months, the excitement was warranted.
AHEAD attendees were in full force, representing each key practice area that designs and supports solutions based on NVIDIA’s technology stack. Here, we highlight some of the biggest announcements and boil down key takeaways from the conference that, from AHEAD’s point of view, will play an important role in the acceleration and impact of AI technology.
Blackwell for AI Training
NVIDIA emphasized the need for more powerful and energy-efficient accelerated computing for AI training. With that, it announced the arrival of the Blackwell platform for trillion-parameter-scale generative AI. According to NVIDIA, the Blackwell architecture will enable up to 30 times greater inference performance and up to 25 times lower energy consumption compared to Hopper™, the previous-generation architecture. NVIDIA confirmed two Blackwell GPU designs for x86-based systems, the B100 and B200, as successors to the Hopper H100 and H200.
To power a new era of computing, the Blackwell platform consists of five revolutionary technologies:
- AI Superchip: Packed with 208B transistors
- 2nd Gen Transformer Engine: FP4/FP6 tensor core
- RAS Engine: 100% in-system self-test for reliability, availability, and serviceability
- Secure AI: Full performance encryption and TEE (trusted execution environment)
- Decompression Engine: 800 GB/sec
For the highest AI performance, NVIDIA announced the NVIDIA GB200 NVL72, a liquid-cooled rack pod that combines 36 GB200 Grace Blackwell Superchips, for a total of 72 Blackwell GPUs and 36 Grace CPUs, interconnected by fifth-generation NVLink. NVIDIA BlueField-3 DPUs provide network acceleration, composable storage acceleration, and zero-trust security.
One of the overarching takeaways is that NVIDIA continues to move the needle with measurably higher performance at lower relative power consumption. We realize every enterprise’s needs are different; for example, some may never need a rack-scale Blackwell solution or liquid cooling in their data centers. So, it’s good to know NVIDIA is still building air-cooled options for its latest technology so that more organizations can use it. Blackwell GPU servers target the very high end of AI infrastructure needs. Luckily, NVIDIA offers a wide range of GPUs and servers to match each use case.
With the higher performance of Blackwell GPUs comes the need for even higher InfiniBand and Ethernet networking performance. NVIDIA announced new Quantum InfiniBand 800Gb switches as well as Spectrum Ethernet 800Gb switches. ConnectX®-8 cards with 800Gb performance were also announced, alongside BlueField-3.
RAG’s Role in Transformative AI
RAG (retrieval augmented generation) made a massive mark across the conference. Virtually every company was focused on RAG, from panel discussions to exhibitor demos and consultations. At the center of it all was NVIDIA NeMo, an end-to-end platform for developing and deploying custom GenAI across clouds, data centers, and the edge.
RAG is a way to improve the accuracy and reliability of LLMs by retrieving information from external sources, which could be anything from industry-specific publications to proprietary company data. It allows LLMs to cite actual sources for the responses they give, so users can verify that the information is trustworthy.
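To make the retrieve-then-generate idea concrete, here is a minimal sketch of the retrieval and prompt-building steps in plain Python. It does not use NeMo or any NVIDIA API. The document names, their contents, and the helper functions (`tokenize`, `retrieve`, `build_prompt`) are all hypothetical, and simple word-overlap scoring stands in for the embedding models a production system would likely use.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory corpus standing in for proprietary company data.
DOCUMENTS = {
    "hr-policy.txt": "Employees accrue fifteen vacation days per year.",
    "it-guide.txt": "Reset your password through the self-service portal.",
    "travel.txt": "Travel expenses must be submitted within thirty days.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=1):
    """Return the top_k (source, text) pairs most similar to the question."""
    q_vec = Counter(tokenize(question))
    scored = sorted(
        documents.items(),
        key=lambda item: cosine_similarity(q_vec, Counter(tokenize(item[1]))),
        reverse=True,
    )
    return scored[:top_k]


def build_prompt(question, documents, top_k=1):
    """Augment the question with retrieved passages, each labeled by source."""
    passages = retrieve(question, documents, top_k)
    context = "\n".join(f"[{source}] {text}" for source, text in passages)
    return (
        "Answer using only the context below and cite the source in brackets.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


prompt = build_prompt("How many vacation days do employees get?", DOCUMENTS)
print(prompt)
```

The resulting prompt would then be sent to an LLM of your choice. Because each retrieved passage carries its source label, the model has what it needs to cite where an answer came from.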
NeMo makes RAG applications faster and easier to develop and deploy. Developers can quickly train, customize, and deploy LLMs with NeMo at scale to bring applications to market more quickly. NeMo Retriever™ microservices enable developers to link their AI applications to their business data, including text, images, and visualizations such as bar graphs, line plots, and pie charts. With these RAG capabilities, enterprises can offer more data to copilots, chatbots, and generative AI productivity tools to improve accuracy and insight.
IGX for GenAI Solves Real-World Challenges
Use cases for the NVIDIA IGX Orin edge AI platform were on full display, from industrial automation to healthcare. IGX is high-performance, industrial-grade hardware built to run enterprise software at the edge. With 2x 100Gb ConnectX-7 NICs, a 12-core Arm® CPU, and 2,048 Ampere® GPU cores, the platform is well suited to low-latency video applications.
IGX is a very appealing choice for GenAI at the edge, and there is a plethora of microservices to choose from, developed to accelerate applications in industries like retail and transportation.
This is not AHEAD’s first foray into the power of IGX. In fact, AHEAD appeared in the GTC IGX presentation deck. As a long-standing NVIDIA partner, AHEAD’s Engineered Solutions team developed a rack-mountable version of the platform in a 2U form factor.
Another AHEAD partner, Advantech, showcased the power of IGX with a computer vision endoscopy application combined with a generative AI LLM (large language model) chatbot. LLMs, best suited for text-based tasks, can interact with VLMs (vision language models), which creates the ability to perceive and understand the real world using computer vision. The demo showed how a care team can interact with the chatbot during surgeries to obtain value-added insights from the camera’s visual findings.
NVIDIA also displayed the low-latency capabilities of IGX using the NVIDIA Holoscan Sensor Bridge, which enables easy sensor-based application development over a low-latency Ethernet protocol. It was demonstrated with a computer vision application that tracked a laser pointer through an IP-connected camera and drew an NVIDIA logo on a display that trailed the pointer by only a few milliseconds.
Innovating AI at the Edge
AI at the edge is revolutionizing every industry, as evidenced by the conference’s breakout sessions. Numerous application developers across industries like transportation, agriculture, manufacturing, healthcare, and retail showcased their edge solutions. The common theme: leveraging development building blocks and pre-trained AI models like NVIDIA Metropolis and Clara as the starting point. Then, by tweaking and training beyond the foundational models, the applications’ unique capabilities can shine.
Computer vision, with its ability to use cameras and software to solve problems, was the most prominent application. For example, computer vision applications are solving problems such as luggage tracking at airports, theft prevention in retail, understanding traffic patterns in transportation to make roads safer, eliminating costly sensors like Lidar in automotive manufacturing, and reducing the use of chemicals in agriculture with autonomous sprayers to make the soil more sustainable.
Simplifying AI Deployment with NIM Microservices
NVIDIA also announced a new catalog of GPU-accelerated NVIDIA NIM microservices compatible with IGX for enterprises, designed to support AI use cases including LLMs, VLMs, and multiple other models for speech, images, video, 3D, and more.
NIM stands for NVIDIA Inference Microservice, a containerized set of cloud-native microservices including industry-standard APIs, domain-specific code, optimized inference engines, and enterprise runtime. As part of NVIDIA AI Enterprise, the microservices speed up AI deployment by minimizing time spent on writing code and setting up the environment. According to Huang, “This is how we’re going to write software in the future.”
In the case of VLMs and NIM, it’s possible to enable contextual understanding of video. With the VLM’s textual output stored in a database, LLMs can be used to answer questions about the data being processed by hundreds of cameras. For example, with a VLM interpreting road traffic from edge AI-enabled cameras, an operator can run a search for unusual traffic situations.
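As a rough illustration of that pattern, the sketch below stores text descriptions in an in-memory SQLite database and searches them by keyword. It does not call NIM or any VLM. The camera IDs, timestamps, descriptions, table layout, and the `build_store` and `search` helpers are all hypothetical. In a real deployment the descriptions would come from the VLM, and an LLM would presumably translate the operator’s question into the search terms that are hard-coded here.

```python
import sqlite3

# Hypothetical VLM output: one text description per camera observation.
CAPTIONS = [
    ("cam-001", "2024-03-18T08:00:00", "Traffic flowing normally on the northbound lanes."),
    ("cam-002", "2024-03-18T08:01:30", "Stalled vehicle blocking the right lane."),
    ("cam-003", "2024-03-18T08:02:10", "Pedestrian crossing at a marked crosswalk."),
    ("cam-002", "2024-03-18T08:05:45", "Vehicle driving the wrong way on a one-way street."),
]


def build_store(captions):
    """Create an in-memory database holding the camera descriptions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE captions (camera_id TEXT, captured_at TEXT, description TEXT)"
    )
    conn.executemany("INSERT INTO captions VALUES (?, ?, ?)", captions)
    conn.commit()
    return conn


def search(conn, keywords):
    """Return rows whose description contains any keyword, oldest first."""
    if not keywords:
        return []
    # One parameterized LIKE clause per keyword; SQLite's LIKE is
    # case-insensitive for ASCII text by default.
    clauses = " OR ".join("description LIKE ?" for _ in keywords)
    params = [f"%{keyword}%" for keyword in keywords]
    query = (
        "SELECT camera_id, captured_at, description FROM captions "
        f"WHERE {clauses} ORDER BY captured_at"
    )
    return conn.execute(query, params).fetchall()


conn = build_store(CAPTIONS)
# Hard-coded stand-in for terms an LLM might derive from
# "show me unusual traffic situations".
for camera_id, captured_at, description in search(conn, ["stalled", "wrong way"]):
    print(camera_id, captured_at, description)
```

Keeping the video understanding (the VLM) separate from the question answering (the LLM over stored text) means operators query a lightweight text store rather than reprocessing footage from every camera.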
With its streamlined path for developing AI-powered enterprise applications and deploying AI models in production, NIM microservices will shorten the time to market and simplify the deployment of GenAI models.
Robotics at Every Corner
Robotics was one of the hottest topics, not only in the keynote but throughout the conference. More and more, robotics solutions are being used across industries to reduce risk, optimize processes, and reduce overhead.
NVIDIA supports a multi-pronged approach to robotics:
- AI model training on DGX
- Omniverse digital twin for simulation and model training
- Jetson AGX Orin™ and Thor for the robotic computer and stack
The NVIDIA Isaac™ extensible robotics simulation platform gives enterprises a fast, easy way to design, test, and train AI-based robots, such as mobile robots, robotic manipulator arms, and mobile outdoor equipment.
Humanoid Robots with Project GR00T
During the keynote, Huang also announced Project GR00T, a general-purpose foundation model for human-centric robots. While the humanoid form factor is one of the most hotly contested topics in robotics today, there is no doubt modern AI will accelerate development and robots will become a part of daily life.
The GR00T foundation model will be trained on the NVIDIA DGX hardware and software platform as well as Isaac Lab, a lightweight reference application used for simulation training in the NVIDIA Omniverse platform. The GR00T hardware stack will be powered by a new Jetson™ Thor robotic computer, designed for running simulation workflows, generative AI models, and more for the humanoid form factor.
As part of the GR00T initiative, Jetson Thor can perform complex tasks and interact safely and naturally with people. The Thor SoC includes an NVIDIA Blackwell architecture GPU with a transformer engine delivering 800 teraflops of 8-bit floating-point (FP8) AI performance to run multimodal generative AI models. It also features an integrated safety processor, a high-performance CPU cluster, and 100Gb Ethernet.
It’s clear NVIDIA is squarely focused on robotics for its next wave of innovation. In addition to Project GR00T, the NVIDIA Isaac release includes Isaac Manipulator, a collection of foundation models, robotics tools, and GPU-accelerated libraries that provide ultramodern dexterity and modular AI capabilities for robotic arms. NVIDIA is truly pushing the boundaries in helping developers design robots and broaden their application across industries.
AI is on Every Organization’s Horizon
For anyone who missed the conference or hasn’t had the opportunity to listen to the NVIDIA keynote on-demand, here’s a summation of NVIDIA AI innovations that AHEAD sees moving technology forward over the next year and beyond:
- NVIDIA Blackwell providing a massive performance leap with less power consumption to help address the challenge of training ever-larger models
- A plethora of NIM Microservices to help enterprises design powerful edge AI solutions on IGX
- RAG to enhance the accuracy and reliability of generative AI models with facts fetched from external sources
- Robotics as an amped-up focus area for NVIDIA that has futurists dreaming about the art of the possible
NVIDIA has built an unmatched ecosystem of software and hardware building blocks for accelerating AI solutions across all industries. AHEAD + NVIDIA is a powerful partnership to accelerate any organization’s AI journey.
Feel free to tap us to talk about your AI journey and how AHEAD can help you bring your vision to reality.