| 
 March 21, 2025
By Bob O'Donnell

One of the biggest challenges in analyzing a rapidly growing company like Nvidia is trying to make sense of all the different businesses the company participates in, all the products it announces, and the overall strategy it’s following. Following the keynote speech by CEO Jensen Huang at the company’s annual GTC Conference this year, the task was particularly daunting. As usual, Huang covered an enormous range of topics over a long period of time and, frankly, left more than a few people scratching their heads trying to make sense of it all.

At a very enlightening Q&A session that Huang held a few days later with industry analysts, however, he shared several comments that suddenly made all the various product and partnership announcements he covered, as well as the thinking behind them, crystal clear. In essence, he said, Nvidia is now an AI infrastructure provider building out a platform of hardware and software that everyone from large cloud computing providers to other tech vendors and enterprise IT departments can use to create AI-powered applications.

Needless to say, that’s an extraordinarily far cry from its role as a provider of graphics chips for PC gaming, or even from its efforts to help drive the creation of machine learning algorithms. But it does tie together a number of seemingly disparate announcements from recent events and lays out a fairly clear path toward where the company is headed.
It also makes clear that Nvidia has moved well past its origins and common perception as a semiconductor design house and into a newly created role of critical infrastructure enabler for the future world of AI-powered capabilities and, as Huang noted, “an intelligence manufacturer.”

During his GTC keynote, Huang discussed the company’s efforts to help enable both the most capable and most efficient ways of generating tokens for use with modern foundation models, and he associated those tokens with the intelligence that organizations will need to generate revenues in the future. He calls these efforts an AI factory and strongly believes they have relevance for businesses across an extremely broad range of industries. While it’s a bit of a heady vision, the signs of the coming information economy and even the information-driven efficiencies of the traditional manufacturing economy are starting to become clear. From businesses built solely on AI services (think ChatGPT) through the robotic manufacturing and distribution of traditional goods, there’s little doubt we’re moving into an exciting new economic era.

In this context, Huang spent a good portion of the GTC keynote describing how Nvidia’s latest offerings are helping enable the creation of tokens faster and more efficiently than ever before. His initial discussion focused on the notion of AI inference, which many believed was a simpler task than the AI training efforts that first brought Nvidia into the spotlight several years ago. With the new chain-of-thought reasoning models such as DeepSeek R1 and OpenAI’s o1 in particular, he argued, inferencing will end up requiring about 100x the computing resources that current one-shot inference responses do. In other words, there’s no reason to worry that more efficient large language models will reduce the demand for computing infrastructure, and we’re still in the early stages of the AI factory infrastructure buildout.
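
To give a feel for where a figure like 100x could come from, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that inference compute scales linearly with the number of tokens a model generates and that a reasoning model emits many intermediate "thinking" tokens before its visible answer. The token counts and the function name `relative_compute` are hypothetical and are not figures from Nvidia or the keynote.

```python
def relative_compute(one_shot_tokens: int, reasoning_tokens: int, answer_tokens: int) -> float:
    """Ratio of tokens generated by a reasoning model to those of a one-shot model.

    Assumption (illustrative only): compute is proportional to generated tokens.
    """
    if one_shot_tokens <= 0:
        raise ValueError("one_shot_tokens must be positive")
    return (reasoning_tokens + answer_tokens) / one_shot_tokens


# Hypothetical numbers: a 200-token one-shot reply versus a reasoning model that
# produces 19,800 intermediate tokens before giving a 200-token answer.
ratio = relative_compute(one_shot_tokens=200, reasoning_tokens=19_800, answer_tokens=200)
print(f"{ratio:.0f}x")  # prints "100x"
```

Under these assumed numbers the reasoning model does 100 times the token generation work for a single response, which is the kind of multiplier Huang was describing.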

One of the most important, but least understood, announcements that Huang made in his keynote was for a new piece of software called Nvidia Dynamo that’s designed to make the inferencing process for these more sophisticated models better. Specifically, Dynamo, which is an updated version of the company’s Triton Inference Server software, can dynamically allocate GPU resources to deal with the different aspects of inferencing, including the prefill and decode stages, which have different types of computing requirements. It can also create dynamic caches of information and move that information across different types of memory in the system.

Working in an analogous manner to how Docker coordinates containers in a cloud computing environment, Dynamo intelligently allocates and orchestrates the resources and data needed for generating tokens in AI factory environments. This level of control is why Nvidia dubbed it the OS of AI factories. Practically speaking, leveraging Dynamo allows organizations to get up to a 30x increase in the number of inferencing requests that a given system can handle.

Of course, it wouldn’t be GTC if Nvidia didn’t also have chip and hardware announcements, and there were plenty this time around. As expected, Huang offered a roadmap for future GPUs, including an update to the company’s current Blackwell line called Blackwell Ultra (GB300 series) that offers more onboard HBM for faster performance. He also unveiled the new Vera Rubin architecture, featuring both a new Arm-based CPU called Vera and a next-generation GPU called Rubin, both of which incorporate larger numbers of cores and other capabilities. He even teased the generation beyond that, to be named after physicist Richard Feynman, which takes the company into 2028 and beyond.

At the previously mentioned Q&A session, Huang explained that the rationale behind showing so many future products is that, as an infrastructure provider, Nvidia must give a great deal of advance notice to current and future ecosystem partners so that they can be ready for these new generations.

Speaking of which, Huang also discussed many more partnerships at this year’s event. Based on the much larger presence of other tech vendors this year, it was also clear throughout the show that many of them are eager to participate in this growing new ecosystem.

On the compute side, Huang explained that to really maximize the infrastructure for faster and more efficient token generation, Nvidia needed to make refinements and advancements to all parts of the traditional computing stack, including networking and storage. To that end, the company both unveiled new silicon photonics technology for optical networking between racks of GPU-accelerated servers and discussed a partnership with Cisco. The Cisco deal allows Cisco silicon to be used in routers and switches designed to integrate these GPU-accelerated AI factories into enterprise environments. In addition, it entails the creation of a common software management layer for these devices. For storage, Nvidia worked with a number of leading hardware providers and data platform companies to ensure that their solutions could be accelerated with GPUs, thus further diversifying the range of markets that Nvidia silicon could impact.

And finally, building on the diversification strategy, Huang introduced more work that the company is doing for autonomous vehicles (notably a deal with GM) and robotics, both of which he described as part of the next big stage in AI development: physical AI. Nvidia has been providing components to automakers for many years now and, similarly, has had robotics platforms for several years as well.
What’s different now, however, is that they’re being tied back to AI infrastructure that can be used to better train the models that will be deployed into those devices, as well as providing the real-time inferencing data that’s needed to operate them in the real world. While this tie back to infrastructure is arguably a relatively modest advance, in the bigger context of the company’s overall AI infrastructure strategy, it does make more sense and helps tie together many of the company’s initiatives into a cohesive whole.

Making sense of all the various elements that Huang and Nvidia unveiled at this year’s GTC isn’t easy, particularly because of the firehose-like nature of all the different announcements and the much broader reach of the company’s ambitions. Once the pieces do come together, however, it’s not hard to see that Nvidia is taking on a much bigger role than it ever has and is well positioned to achieve the big-picture goals it has laid out for itself.

At the end of the day, Nvidia knows that being an infrastructure and ecosystem provider means that it can benefit both directly and indirectly as the overall tide of AI computing rises, even as its direct competition is bound to increase. It’s a clever strategy and one that could lead to even greater growth in the future.

Here’s a link to the original column: https://www.linkedin.com/pulse/nvidia-positions-itself-ai-infrastructure-provider-bob-o-donnell-9sfwc

Bob O’Donnell is the president and chief analyst of TECHnalysis Research, LLC, a market research firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on LinkedIn at Bob O’Donnell or on Twitter @bobodtech. |