Deep Dives: NVIDIA

🏝 TheTechOasis 🏝

In AI, learning is winning.

This post is part of the Premium Subscription, in which we analyze technology trends, companies, products, and markets to try to answer the most pressing questions in AI.

You will also find this deep-dive in TheWhiteBox ‘Company/Product Deep-dives’ section, for future reference and to ask me any questions you may have.

By now, NVIDIA is probably the best story ever told about markets and creating shareholder value.

A decades-long bet on the important role AI will play transformed a company that provided hardware to gamer kids in basements into a company the size of the sixth-largest economy in the world, with more than $90 billion in data center revenues for 2024 alone.

And with some already predicting the company will be worth $10 trillion by the decade's end, the hype around NVIDIA shows no signs of slowing down.

For those reasons, today you will discover:

  • NVIDIA’s real moat (it’s not the hardware),

  • the ultimate guide on what NVIDIA is and does,

  • the very bullish reasons to believe it will become the most valuable company ever created,

  • and the various topics NVIDIA enthusiasts always seem to forget, like geopolitics, supply and demand issues, and technological disruption.

All so that you have the complete picture of the hottest company in AI today and can make more informed decisions about the future of your investments or business.

Let’s dive in!

From Gaming to AI, but Why?

Two words took NVIDIA to where it is today: linear algebra.

In our world, mathematicians and engineers use linear algebra whenever possible (because we understand it deeply), to the point that it’s literally everywhere.

But how did NVIDIA’s story begin, and what does linear algebra have to do with a company growing trillions of dollars in value in months?

From 3D to 2D

When playing video games, with the exception of virtual/augmented reality headsets, players look at a screen that displays the game in two dimensions.

Your screen is nothing more than a bunch of pixels. Each pixel depicts a color, and combined, they represent a frame, the image you see at a particular time.

Traditionally, screens had far fewer pixels, which made images look blocky and pixelated.

But today, TVs have millions of pixels, allowing them to display extremely high-definition scenes. And when playing video games, you don’t see a single frame per second; you see up to 120 frames per second (fps) or more.

This means that your screen displays 120 frames of the scene every second, so every single one of those millions of pixels gets assigned a color 120 times per second.

To assign a color, your computer or console has to calculate, based on the 3D scene (the environment, the objects, how everything has moved since the previous instant, and so on), how that scene will look on a 2D screen, a process known as rendering.

This is a linear projection… one that has to be performed many millions of times (every pixel requires its own calculation).

That means that your computer, for a mid-tier Full HD screen (roughly 2 million pixels) refreshing at 120 fps, needs to perform at least 240 million projections per second.

Additionally, those calculations must be performed in parallel, as users must see the new update for the 2 million pixels simultaneously (because it’s an image).
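To make the projection concrete, here is a minimal, illustrative sketch (a pinhole projection in homogeneous coordinates, not a real graphics pipeline): mapping millions of 3D points onto the 2D screen reduces to one big matrix multiplication followed by a divide.

```python
import numpy as np

# 4x4 perspective projection matrix in homogeneous coordinates (focal = 1).
# The last row copies z into w, so the final divide by w is the perspective divide.
P = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])

def project(points_3d: np.ndarray) -> np.ndarray:
    """Project Nx3 camera-space points onto the 2D image plane."""
    homogeneous = np.hstack([points_3d, np.ones((len(points_3d), 1))])  # Nx4
    clip = homogeneous @ P.T           # ONE matrix multiplication for all points
    return clip[:, :2] / clip[:, 3:4]  # perspective divide: x/z, y/z

# Two million points in one vectorized call: roughly the per-pixel workload
# a GPU parallelizes, up to 120 times per second.
points = np.random.default_rng(0).uniform(1.0, 10.0, size=(2_000_000, 3))
screen = project(points)
print(screen.shape)  # (2000000, 2)
```

Every point goes through the exact same independent calculation, which is precisely the kind of workload that can be split across thousands of cores.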

This is what GPUs excel at.

Unlike Central Processing Units (CPUs), the heart of your computer, which excel at sequential processing (performing very complex calculations one after another), GPUs have much more modest cores (the cores are the calculators). Each individual GPU core can’t handle complex calculations, but thousands of them can perform many simple ones simultaneously.

For reference, the most powerful consumer CPUs have around 24 cores, while a state-of-the-art GPU has thousands. The gaming use case we have just described explains why.

This precise use case is what led to the creation of GPUs. And in the GPU world, NVIDIA is the name of the game.

But what does this have to do with AI?

It’s Linear Algebra all along

I wasn’t trying to scare you with the linear algebra thing.

My point is that projecting pixels on a screen and training/running an LLM is, mathematically speaking, a highly similar linear algebra problem: matrix multiplications.

In fact, ChatGPT, in pure reductionist form, is just a bunch of matrix multiplications; that’s it. Once you realize this, it’s no wonder all AI companies are going crazy for NVIDIA’s GPUs.

In fact, without deviating too much from the topic at hand: ChatGPT’s underlying architecture, the Transformer (shared by all frontier models, including Gemini and Claude, and by other modalities like Stable Diffusion or Sora), was specifically designed to make the most of GPUs.

In other words, models like ChatGPT are meant to be run on this particular hardware.

Fun fact: ChatGPT ingests the entire sequence at once. It does not process words one by one; it processes all of them simultaneously (adding positional encodings so it still knows the order of the words). The takeaway is that the model parallelizes everything.
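The “bunch of matrix multiplications” claim can be made concrete. Below is a toy sketch of a single, heavily simplified self-attention step, the Transformer’s core operation, with made-up dimensions; note how every token is processed in one parallel pass, with no word-by-word loop anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 8, 16                 # 8 tokens, 16-dim embeddings (made up)
tokens = rng.normal(size=(seq_len, d_model))

# Learned weights are just matrices; applying them is one matmul over ALL
# tokens at once.
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))
Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v

scores = Q @ K.T / np.sqrt(d_model)            # token-to-token affinities
scores -= scores.max(axis=1, keepdims=True)    # softmax, numerically stabilized
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
output = weights @ V                           # weighted mix of every token

print(output.shape)  # (8, 16): all tokens updated in one parallel pass
```

Mathematically, this is the same shape of problem as the rendering workload: large, independent matrix multiplications, which is exactly what GPU cores are built for.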

Everything starts to make sense, right? However, we still have to answer: why is everyone going crazy for NVIDIA?

The Safest Bet

The reason is not that the technology isn’t legit (we all know it is), but that it isn’t generating nearly enough revenue and value creation to justify how much markets are rewarding those efforts.

The hyperscalers (Microsoft, Amazon, Google) alone have added more than $3 trillion in market value since the release of ChatGPT while barely surpassing $20 billion in AI revenue, 150 times less.

To make matters worse, their booming cloud segments are inflated by the equity deals they sign with AI labs like OpenAI, Anthropic, or Cohere.

GenAI investment rounds led by hyperscalers, such as OpenAI’s, Anthropic’s, or Mistral’s, are, in essence, compute-credit trades: Big Tech companies buy equity in the lab in exchange for cloud credits (GPU usage), not money.

In summary, the whole setup acts like a perpetual motion machine, meaning these valuations are as inflated as a hot-air balloon in Cappadocia unless revenues start piling in soon.

Meanwhile, NVIDIA’s data center segment (its AI segment) has a current run rate of $90 billion (2024’s projected AI-based revenues), which, if materialized, would represent more than half the total revenue of the entire AI market.

According to ProfG Markets, NVIDIA accounts for almost half of the S&P500’s growth this year (Apple added $300 billion in just two days, so that number may have dropped a bit).

Complete madness.

Thus, seeing how all investors want exposure to AI but acknowledge the poor revenues across the board, it’s no surprise that they all pile into NVIDIA as the safest bet, just like Cisco in the dot-com boom.

Still, as we argued in the AI bubble article, Cisco and NVIDIA are not comparable. At its peak, Cisco traded at 140x forward earnings, roughly triple NVIDIA’s current forward p/e of around 48.

With all this said, it’s clear that NVIDIA is overly exposed to the AI market (the AI segment accounts for 80% of the projected revenues), a market that is clearly underdelivering as we speak.

Knowing this, what’s NVIDIA doing about it?

The Bullish Case for NVIDIA

NVIDIA’s leadership is acutely aware of its overexposure to AI hardware. Even so, there are still plenty of reasons to be optimistic about the company.

The AI accelerator market (GPUs et al.) is projected to grow at a 35-39% CAGR until 2030. If this is true, and if NVIDIA holds its market share, its expected revenues from AI will grow to a staggering $800 billion, almost ten times its current run rate.

At a price-to-sales ratio (P/S, a multiple of a company’s value relative to its revenues) of 10, that’s an $8 trillion company, dear reader. That doesn’t seem improbable for a company trading at around 40 P/S as we speak.

Although probably out of the question, if NVIDIA maintained its current P/S ratio of around 40, those revenues would elevate its value to roughly $32 trillion, almost a third of the world’s entire GDP today.
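For transparency, here is the back-of-the-envelope arithmetic behind those figures. The inputs come from the text; the 37% rate is the midpoint of the projected 35-39% CAGR and the 7-year horizon reaches roughly the decade’s end, both my assumptions within the article’s stated ranges. None of this is a forecast.

```python
# Compound the current AI run rate at the projected accelerator-market CAGR,
# then apply the price-to-sales multiples discussed in the text.
run_rate = 90e9        # current AI run rate: $90B
cagr = 0.37            # midpoint of the projected 35-39% CAGR
years = 7              # roughly to the decade's end

future_revenue = run_rate * (1 + cagr) ** years
print(f"Projected AI revenue: ${future_revenue / 1e9:.0f}B")  # ~$815B

for ps in (10, 40):
    print(f"At {ps}x P/S: ${future_revenue * ps / 1e12:.1f}T")
```

Small changes in the assumed rate or horizon move the revenue figure substantially, which is worth keeping in mind whenever a headline quotes a single number.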

Despite this, NVIDIA isn’t comfortable having all its eggs in the same basket and is committed to diversifying as much as possible while still being fully AI.

But what does that mean? Bear with me.

The Conquest of Physicality

Simulation. That was the most important word Jensen Huang mentioned in his widely-followed COMPUTEX keynote.

Jensen explicitly envisioned a future where every device around you has some generative intelligence and autonomy. In simple terms, he was betting that every device around you would carry some sort of AI accelerator (a GPU, for instance) inside.

Indirectly, Jensen Huang referred to the other big term he used in the speech, ‘Physical AI’, the conquest of physicality.

But this is easier said than done. Even today, AIs remain constrained by Moravec’s paradox, coined in 1988, which states that ‘what’s easy for humans is hard for AI, and vice versa’.

In other words, embodied intelligence is, without a doubt, AI’s hardest problem. We are still struggling to teach robots basic movements that a 3-year-old performs seamlessly.

But why am I telling you all this?

Well, what Jensen was implicitly telling us is that he’s trying to position NVIDIA at the forefront of AI’s conquest of Embodied Intelligence, the capacity of AI systems to live in our world.

And when you realize this, the entire speech and all the features he mentioned fall into place.

  1. Simulation. He consistently mentioned NVIDIA’s efforts on Isaac Gym and Omniverse, NVIDIA’s bets on training autonomous AIs in simulation. In simple terms, the idea is to generate plausible representations of reality and train robots inside them. The robots are then transferred into real life, hopefully applying what they learned in those highly realistic simulated worlds to inhabit the physical world effectively.

In this regard, they are putting their money where their mouth is, training impressive robots like DrEureka’s circus robot, which was trained in simulation yet can balance itself on a real-life yoga ball.

If NVIDIA becomes the main training ground of robots, that’s a solid case for another multi-billion dollar revenue segment.

  2. Digital twins. Those two words have been thrown around for many years, but are poised to become a reality soon. NVIDIA is deepening its reach into factories, helping manufacturers automate and industrialize their plants. It’s also trying to create the perfect digital twin of the world, Earth-2, a copy of our planet that could potentially forecast the weather with meter-level precision. They even mentioned that Earth-2 would account for your surrounding buildings, meaning it could forecast the weather at street level.

Again, it all boils down to simulation, or representing the real world in simulated environments for robotics training, weather prediction, or, quite frankly, limitless applications, a world where everything has its own digital twin.

This seemed like an outlandish statement years ago. But scientists have already created a generative representation of a rat, a digital twin so faithful that the virtual rat behaves much like a physical one, which tells me this future is not only plausible, but near.

But NVIDIA also gave us more insight into how it intends to keep milking the LLM cow.

NIMS, Deploying GenAI in a Breeze

When companies go for tremendous moonshots like the ones we just described, it’s tempting for them to forget the business segments that really pay the bills.

But NVIDIA won’t fall for that.

In fact, they are doubling down on the LLM segment, starting with their insane commitment to accelerating state-of-the-art AI hardware from a new platform every two years to one per year (they announced the Rubin platform for 2026 before even delivering a single Blackwell chip, the upcoming generation).

This feels like a strategic move to consolidate future cash flows.

I have no proof of this, but I assume Jensen wants to start racking up 2026 orders (in exchange for a discount; they have almost three times Apple’s margins, which are already quite healthy) to secure revenues for years ahead and calm investors down, potentially even lowering the forward p/e ratio so the stock looks cheaper than it really is, inflated-value arguments aside.

But they are also venturing into LLM serving.

In other words, they will no longer be a hardware-only company; they are building a platform of microservices, known as NIMs, that allows the seamless deployment of LLMs and other frontier models in Docker containers.

AI engineers know how hard it is to run LLMs efficiently. These models can weigh hundreds of gigabytes (or even terabytes), forcing them to be deployed across a distributed GPU cluster and requiring extensive development in CUDA (NVIDIA’s software for managing GPUs).

By providing an on-ramp to LLM deployment through Docker containers that come pre-configured to handle these loads, NVIDIA makes the setup a breeze and, importantly, easily scalable.

If successful, NVIDIA will become not only a hardware provider but also an AI provider, increasing revenues considerably and safeguarding the company against an undeniable fact: most of its revenue comes from a small set of companies that are desperately trying to build custom hardware and end their dependence on NVIDIA.

All in all, NVIDIA seems like a no-brainer, right?

Well, hold your horses just one second.

The Hawkish Case for NVIDIA

Three constraints threaten NVIDIA’s dominance: energy grids, semiconductor supply chain geopolitics, and technological disruptors.

The Growing Imbalance

According to the Electric Power Research Institute, data centers currently account for around 4% of total electricity demand.

However, that number is expected to grow to 10% by 2030, with AI accounting for half the growth requirements, according to Semianalysis.

Long story short (although I will write a full article on this topic very soon), our energy grids are far from ready to withstand that demand, requiring huge investments in generation and transmission infrastructure.

Prospects for a huge increase in energy generation and transmission are so dim that many companies are seriously considering side-stepping the grid completely and building small nuclear reactors to power their data centers.

One of the biggest advocates for this is none other than Sam Altman.

The main issue is that data center efficiency improvements (how much actual computing we get per unit of energy) have been stalling over the last few years.

Adding insult to injury, AI requests are very expensive. The average ChatGPT request consumes around 2.9 watt-hours (Wh), 10 times the energy of a standard Google query.

And with AI-powered search around the corner, some estimates put this number at 9 Wh, 30 times the average Google query.

And we still have no precedent for image, video and audio models, which could potentially cost much more.

And if we factor in long-inference models, LLMs that generate multiple candidate solutions per request, it’s safe to say that the average query will easily grow beyond 10-15 Wh, unsustainable for current energy grids and even for projected ones.
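To get a feel for the scale, here is a rough, illustrative calculation. The per-query energy figures come from the text; the daily query volume is an assumed placeholder, not a real statistic.

```python
# Convert per-query watt-hours into daily gigawatt-hours at an assumed volume.
daily_queries = 9e9                  # assumed search-scale daily volume
per_query_wh = {
    "Google query": 0.3,             # ~1/10 of the ChatGPT figure cited
    "ChatGPT request": 2.9,          # average cited in the text
    "AI-powered search": 9.0,        # estimate cited in the text
}

# Wh per query -> GWh per day at the assumed volume.
daily_gwh = {name: wh * daily_queries / 1e9 for name, wh in per_query_wh.items()}

for name, gwh in daily_gwh.items():
    print(f"{name}: ~{gwh:.1f} GWh/day")
```

Even under these made-up volumes, a 10x jump in per-query energy translates directly into a 10x jump in daily grid load, which is the crux of the problem.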

Seeing these numbers, it’s no surprise that Sam Altman or Mark Zuckerberg have been strikingly clear on what keeps them up at night: energy constraints.

NVIDIA is naturally highly exposed to these issues, and our only hope, considering that increasing energy supply takes years, is for research scientists to create more efficient frontier models.

But that’s a topic for another day.

With all that said, if I were a one-trick-pony NVIDIA investor, the main thing that would keep me up at night is, surprisingly, Taiwan.

Wait, what?

The Foundry of the World, at Huge Risk

Amazingly, the supply chain of the most sought-after asset in the world right now, GPUs, is absolutely dependent on a small island roughly 100 miles (160 km) off the coast of China: Taiwan.

This wouldn’t be a problem if not for the fact that China has been very outspoken on its intentions to invade Taiwan at some point.

They consider Taiwan part of China, and are very clear on that commitment, even suggesting a potential invasion before the decade's end.

This is a huge risk for NVIDIA. Unlike Intel or Samsung, which are Integrated Device Manufacturers (IDMs), NVIDIA only designs its chips and outsources production, making it what’s known as a ‘fabless’ company.

Subscribe to Leaders to read the rest.
