
Elon's World's-Largest AI Cluster, Cohere's Monstrous Round, a New Deepfake Method, & More

THEWHITEBOX
TLDR;

  • 🤜🏼🤛🏾 Mistral and Nvidia join forces

  • 🫣 Elon announces world’s largest AI cluster

  • 🔍 An overview of the future of search

  • 🫡 Cohere raises monstrous $500 million round

  • 👁️ A new method to identify deepfakes

SOFTWARE PARTNERSHIPS
Mistral and NVIDIA join forces 🤜🏼🤛🏾

A state-of-the-art at its size

Mistral, the French star AI company valued at $6 billion, and NVIDIA, the world’s main GPU provider and one of the top three most valuable companies in the world, have presented Mistral-NeMo.

This state-of-the-art model packs 12 billion parameters and a 128k-token context window, meaning you can send roughly 100k words in a single prompt, similar to what ChatGPT offers.

TheWhiteBox's take:

Besides being quite small, the model was trained in FP8 precision. In other words, every parameter takes up just one byte, so the weights total only about 12 GB; the model easily fits on mid-tier GPUs and can be served at scale very efficiently.
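The 12 GB figure follows directly from the precision. A minimal sketch of the arithmetic (assuming decimal gigabytes and counting raw weights only, ignoring activation/KV-cache memory):

```python
# Rough memory footprint of a 12B-parameter model at different precisions.
# Bytes per parameter: FP32 = 4, FP16/BF16 = 2, FP8 = 1.
PARAMS = 12_000_000_000

def weight_size_gb(params: int, bytes_per_param: int) -> float:
    """Return the raw weight size in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

for name, width in [("FP32", 4), ("FP16", 2), ("FP8", 1)]:
    print(f"{name}: {weight_size_gb(PARAMS, width):.0f} GB")
# FP32: 48 GB, FP16: 24 GB, FP8: 12 GB
```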

It has also been deployed as an NVIDIA microservice, also known as a NIM, ensuring the model is optimized to run on NVIDIA GPUs. Moreover, it excels at function calling, letting you run the model in pipelines with other tools, which enhances its overall usefulness and, importantly, reduces the chances of hallucinations (e.g., having the model call a calculator tool to perform computations instead of relying on its standard, error-prone next-word prediction).
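To make the calculator point concrete, here is a minimal, hypothetical sketch of the tool-dispatch side of function calling. The tool schema, the names (`calculator`, `dispatch_tool_call`), and the JSON shape are all illustrative assumptions, not Mistral-NeMo's actual API:

```python
import json

# Hypothetical tool registry: the model is told these tools exist and,
# instead of guessing an answer, emits a JSON "call" that we execute locally.
def calculator(expression: str) -> float:
    # Toy evaluator for illustration; a real deployment would use a safe math parser.
    return eval(expression, {"__builtins__": {}}, {})

TOOLS = {"calculator": calculator}

def dispatch_tool_call(model_output: str) -> str:
    """Parse a (hypothetical) model-emitted tool call and run the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    result = fn(**call["arguments"])
    return str(result)

# The model answers "what is 37 * 41?" with a tool call, not a guess:
print(dispatch_tool_call('{"name": "calculator", "arguments": {"expression": "37 * 41"}}'))
# 1517
```

The point is that the arithmetic is done by deterministic code, so the only thing the model has to get right is the call itself.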

Elon announces world's largest AI cluster 🫣

Finally delivering on his promise, Elon Musk can now brag about having the world's largest AI cluster: a staggering 100,000 NVIDIA H100 GPUs, the state of the art ahead of the release of the NVIDIA Blackwell platform sometime this year.

The average H100 costs around $35k per GPU, although that figure is probably not exact, as purchase orders as large as xAI's will have included discounts from NVIDIA. Still, once you factor in other data center costs, the cluster is surely worth around $4 billion.

With such a cluster, it's probably a matter of time before Grok-3, the upcoming version of xAI's multimodal model, becomes the largest model ever trained.

TheWhiteBox’s take:

The numbers around this cluster are ridiculous. Case in point: running it takes a staggering 138 MW, enough to power well over 100k homes at average US consumption rates.
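The back-of-the-envelope cost and power figures above can be checked in a few lines. The $35k price and 138 MW draw come from the text; the ~1.2 kW average US household load is my own assumption, and the exact homes count moves with whatever household figure you plug in:

```python
GPUS = 100_000
PRICE_PER_GPU = 35_000          # rough H100 price in USD, pre-discount
CLUSTER_POWER_W = 138_000_000   # 138 MW, as reported
AVG_HOME_W = 1_200              # ~1.2 kW average US household draw (assumed)

gpu_cost = GPUS * PRICE_PER_GPU           # $3.5B in GPUs alone
homes = CLUSTER_POWER_W / AVG_HOME_W      # ~115,000 homes

print(f"GPU cost: ${gpu_cost / 1e9:.1f}B, powers ~{homes:,.0f} homes")
# GPU cost: $3.5B, powers ~115,000 homes
```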

For reference, GPT-4 was trained on a 20k-A100 cluster for 90–100 days. How long would training that same model take on this cluster?

- Knowing each H100 has a theoretical peak of roughly 2 × 10^15 FLOP/s (about 2 PetaFLOP/s in FP8), the 100k-GPU cluster peaks at about 198 × 10^18 FLOP/s, or 198 ExaFLOP/s,

- and knowing GPT-4 was a workload of about 2.15 × 10^25 FLOP, or 21.5 million ExaFLOP,

- this means the cluster needs to run for 21,500,000 / 198 ≈ 109k seconds, or about 1.25 days. However, as realized utilization in H100 clusters (the MFU, or model FLOPs utilization) is typically around 40%, the cluster would actually require between 3 and 4 days to train GPT-4.

For reference, that run would have taken 100 days in 2022. Put differently, running this cluster for those same 100 days would let you train a model with roughly 31 times the compute of GPT-4.
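The steps above condense into a short calculation. The ~2 PetaFLOP/s per-GPU peak (FP8), the 40% MFU, and the 2.15 × 10^25 FLOP estimate for GPT-4 are the assumptions:

```python
GPUS = 100_000
PEAK_PER_GPU = 1.98e15       # ~2 PetaFLOP/s per H100 in FP8 (assumed)
MFU = 0.40                   # typical model FLOPs utilization (assumed)
GPT4_FLOPS = 2.15e25         # estimated GPT-4 training compute

cluster_peak = GPUS * PEAK_PER_GPU              # ~1.98e20 FLOP/s = 198 ExaFLOP/s
ideal_days = GPT4_FLOPS / cluster_peak / 86_400 # at 100% utilization
real_days = ideal_days / MFU                    # at 40% MFU

models_in_100_days = 100 / real_days            # ~31x GPT-4's compute

print(f"ideal: {ideal_days:.2f} days, at 40% MFU: {real_days:.1f} days, "
      f"100 days buys ~{models_in_100_days:.0f}x GPT-4")
```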

The world is ready for GPT-5… or dare I say Grok-3?

An overview of the future of search 🔍

Perplexity founders

This interesting article dives into the world of search and how AI could disrupt it, offering insights on adoption levels, current issues, key rivals to Google like Perplexity, and how these companies might monetize.

TheWhiteBox’s take:

If you are familiar with LLMs and how they work, you will have realized that there's a non-zero chance AI-powered search never fully works.

Because the model must enhance (or augment) its context in real time using some sort of retrieval API, it has to generate an answer from data it may be seeing for the first time, which commonly leads to hallucinated responses.
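A minimal sketch of what "augmenting the context in real time" looks like. The retriever here is a stand-in keyword match over a toy corpus, not any real search API, and all names are illustrative:

```python
# Toy retrieval-augmented generation (RAG) pipeline: retrieve snippets, then
# stuff them into the prompt so the model answers from fresh data rather than
# from (possibly stale or never-seen) training memory.
CORPUS = [
    "xAI's cluster uses 100,000 NVIDIA H100 GPUs.",
    "Mistral-NeMo has 12 billion parameters.",
    "Cohere raised $500 million at a $5.5B valuation.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank snippets by naive keyword overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(CORPUS, key=lambda s: -len(words & set(s.lower().split())))
    return scored[:k]

def build_prompt(query: str) -> str:
    """Assemble the augmented context the LLM would actually see."""
    context = "\n".join(retrieve(query))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("how big is xai's cluster"))
```

The failure mode the article worries about lives exactly here: if retrieval returns the wrong snippets, the model must still produce fluent text, and hallucination is the usual result.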

You may counter that LLMs are already trained on all the public information on the Internet, but:

1. this data increases by the day, and,

2. for them to actually memorize that data, they must perform more than one epoch over it, which is uncommon in large-scale LLM training pipelines due to excessive costs.

While it’s safe to say that the hallucination problem will decrease over time, the search business model is built on trust, meaning that the tolerance for error is very, very low, and people won’t hesitate to go back to traditional search otherwise.

Cohere raises monstrous $500 million round 🫡

Cohere co-founders

In the latest round of huge raises from LLM providers, Cohere, a Canadian firm focused on enterprise GenAI, has raised half a billion at a valuation of $5.5 billion.

Contrary to other labs like OpenAI, they are much more down-to-earth and not focused on building artificial general intelligence, or AGI, or whatever that is (nobody really knows).

TheWhiteBox’s take:

Cohere has quite the challenge ahead. Enterprise adoption is much lower than individual-user adoption due to the nature of LLMs, which never offer sufficient guarantees against mistakes.

Concerningly, The Information estimates Cohere is already valued at 227 times projected revenues (based on its 2024 run rate; in other words, the multiple holds only if they actually hit those projected revenues). Hence, this company is one of the prime examples of the AI bubble.

That said, they are really nice guys and seem to know what they are doing, so hopefully they get things straight.

New method to identify deepfakes 👁️

A new, physics-based method inspired by astronomy looks at the eyes and can tell, with some confidence, whether a portrait is AI-generated.

TheWhiteBox’s take:

Very cool method, but I'm afraid its usefulness will be limited over time as models learn to generate better images.
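The article doesn't detail the method, but the astronomy-inspired idea can be hedged into a sketch: a real photo lights both eyes the same way, so reflection statistics in the two eye regions should roughly agree, while generated faces often render each eye independently. The Gini coefficient (borrowed from galaxy-morphology work) is one such statistic; the measure, the threshold, and the function names here are all illustrative assumptions:

```python
def gini(pixels: list[float]) -> float:
    """Gini coefficient of non-negative pixel intensities (0 = perfectly uniform)."""
    xs = sorted(pixels)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # Standard closed form over sorted values.
    cum = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return cum / (n * total)

def reflections_consistent(left_eye: list[float], right_eye: list[float],
                           tol: float = 0.1) -> bool:
    """Flag the portrait as plausibly real if both eyes' statistics agree."""
    return abs(gini(left_eye) - gini(right_eye)) < tol

# One bright highlight in each eye -> statistics agree -> plausibly real:
print(reflections_consistent([0, 0, 10, 0], [0, 10, 0, 0]))   # True
# A highlight in one eye but flat intensity in the other -> suspicious:
print(reflections_consistent([0, 0, 0, 10], [5, 5, 5, 5]))    # False
```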

I see the idea of forcing model providers to watermark generated images as more viable, as methods already exist that, while invisible to the naked eye, can be used to identify AI-generated images.
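For flavor, here is a toy version of an invisible watermark: hiding a tag in the least significant bit of each pixel byte, which shifts intensities by at most 1 out of 255. Production schemes (e.g., statistical watermarks baked into the generator itself) are far more robust; everything below is an illustrative sketch:

```python
def embed(pixels: bytes, message: bytes) -> bytes:
    """Hide message bits in the least significant bit of each pixel byte."""
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    assert len(bits) <= len(pixels), "image too small for message"
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # at most a +-1 change per pixel
    return bytes(out)

def extract(pixels: bytes, length: int) -> bytes:
    """Recover `length` bytes from the least significant bits."""
    bits = [p & 1 for p in pixels[: length * 8]]
    return bytes(
        sum(bits[b * 8 + i] << i for i in range(8)) for b in range(length)
    )

img = bytes(range(200, 256)) * 2           # fake 112-byte "image"
marked = embed(img, b"AI")
print(extract(marked, 2))                  # b'AI'
```

Note that simple LSB marks are easy to destroy with recompression, which is exactly why the robust, generator-level schemes matter.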

What’s coming next?

This week, in our weekly premium Leaders segment, we will be looking into the depths of one of the world's most powerful corporations, Google, and what its AI strategy and outlook are for the coming years.

For business inquiries, reach out to me at [email protected]