
Google does it again, Microsoft disappoints, & more.

In partnership with

THEWHITEBOX
TLDR;

  • 📰 News from Microsoft, mind-controlled Apple Vision Pros, Perplexity, & more.

  • 📚 The Insights Corner

  • 👏 Google Conquers Depth

An entirely new way to present ideas

Gamma’s AI creates beautiful presentations, websites, and more. No design or coding skills required. Try it free today.

🚨 News of the Week

MARKETS
Microsoft Disappoints

Starting with Microsoft, whose quarterly earnings, presented on Tuesday, were quite a disappointment: the stock fell 6% after hours.

The correction came after Microsoft announced slower growth in its Azure (cloud computing) business, from 31% last quarter to 29% this one, even though Azure is still growing faster than Google Cloud despite being the larger cloud.

In the meantime, they are increasing their investments in Generative AI equipment, which reached an astonishing $19 billion last quarter alone (split between land purchases, data-center construction, and outright AI accelerator spending), for a projected yearly cost of almost $80 billion.

For reference, NVIDIA’s AI data center business has a projected revenue run rate of $90 billion.

TheWhiteBox’s take:

At first, it may seem that investors are overreacting. However, they might not be: people are becoming genuinely anxious about the huge investments in Generative AI with no discernible return yet.

For reference, if we take the Azure CEO at his word, they are investing around $3 billion a month on H100s, meaning GPU spending might have accounted for $12 billion of the $19 billion this quarter.

Once again, just like Google (who also fumbled the bag, as I covered in detail in this piece), Microsoft failed to mention any specific Generative AI revenues (and as you’ll see below, things look even worse), which strongly suggests they aren’t great (barely existent at this point).

Investors might have been fooled in the past by the growth of the cloud business (AI revenues are bundled into it), but the failure to disclose actual returns on GenAI investments is starting to take a toll on these companies’ valuations, as they might simply be pretending to have a business they don’t.

Adding insult to injury, Sundar and Zuck, the CEOs of Google and Meta, have openly acknowledged that some of these investments will probably be written off and that they are long-term plays—a narrative echoed by Amy Hood, Microsoft’s CFO, who called them a “15-year investment”—so the complete absence of potential returns in the short term has investors panicking.

But should they? As I’ve often commented (click here and here for in-depth analyses), they definitely should.

AI AGENTS
Microsoft Copilot Also Disappoints

In an article by Business Insider, an IT executive at an undisclosed pharmaceutical company acknowledged they dumped Microsoft 365 Copilot “due to high cost and low value.”

This is Microsoft’s AI agent, which lets customers perform actions across the Office 365 suite by communicating with an AI.

He even compared the PowerPoint presentations it creates to “middle school presentations,” prompting the company to drop the service altogether.

TheWhiteBox’s take:

If you’ve been reading my newsletter for quite some time, you should NOT be surprised. As I’ve covered, Generative AI products are not yet fully ready for enterprise adoption (they are ready in terms of unit economics, but a lot of fine-tuning is required to make them work well).

In this regard, agents are the least enterprise-ready of all GenAI deployments because they require both LLM-level understanding and seamless integration with third-party services, which compounds the failure rate.

This leads to a rather daunting question: Will agents ever work?

My gut tells me yes, but there’s a real irony going on here: we want to use stochastic models (LLMs always have a random component in the predictions they make, leading to occasional inaccuracies) for actions that should be fully deterministic (no mistakes at all), like interacting with an Excel file or a PowerPoint.
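To see why that matters, here’s a minimal sketch with made-up numbers: even when a hypothetical spreadsheet agent assigns 97% probability to the right action, stochastic sampling still occasionally picks a catastrophic one.

```python
# Made-up action probabilities for a hypothetical spreadsheet agent.
import random

actions = {"update_cell_A1": 0.97, "delete_sheet": 0.03}

def sample_action() -> str:
    """Sample an action from the distribution, as stochastic decoding does."""
    r, cumulative = random.random(), 0.0
    for action, p in actions.items():
        cumulative += p
        if r <= cumulative:
            return action
    return action  # guard against floating-point rounding

trials = 10_000
failures = sum(sample_action() == "delete_sheet" for _ in range(trials))
print(f"Catastrophic action rate: {failures / trials:.1%}")  # ~3%, intolerable in Excel
```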

Only time will tell if this makes sense, but one thing’s clear based on the previous news: investors want to know, and the clock is ticking.

FUTURE
Mind-controlled Apple Vision Pros

Synchron, a rival to Elon Musk’s Neuralink that enables people with disabilities to interact with digital interfaces using their minds, has announced technology that lets users control Apple’s Vision Pro goggles with thought alone (the link includes a video demonstration).

TheWhiteBox’s take:

Brain-computer interfaces, or BCIs, as with any AI, are data transformations.

If ChatGPT turns input text into text continuations, and MidJourney takes in text and outputs images, BCIs take brain signals (in Synchron’s case, through implants placed directly into the brain tissue) and decode—transform—those signals into actions, in this case on the screen of an Apple Vision Pro.

Read my article on Neuralink to further understand how they work.

Out of all the things AI will disrupt, reducing or eliminating disabilities by helping people’s brains decode actions or information better is probably the one I’m most excited about.

For context, according to the World Health Organization, 16% of the global population, or 1.3 billion people, have a significant disability and could one day benefit from this technology.

BUSINESS MODELS
Perplexity’s revenue-sharing business

Perplexity, the AI-search company accused of plagiarism by several publishers, such as Forbes and Wired, has announced a revenue-sharing business model in which money flows to publishers whose content is used as part of an AI-generated search response.

The company is also exploring ads in related follow-up questions, where brands pay to appear, as another revenue source; publishers will receive “double-digit revenue percentages” of that ad money whenever their content is used to answer the question, acting as a sort of reimbursement out of the ad spend.

TheWhiteBox’s take:

There is more to this news than meets the eye; it’s a clear view into what might become one of the best business models in history, or a legendary flop: AI-generated search.

I am concerned about how well companies like Google, Microsoft, Perplexity, and even OpenAI, with the just-announced SearchGPT, will monetize this enormous business.

If we take Google’s recently announced Google Search ad revenues ($192 billion run rate) and factor in Google’s 91% market share, we are talking about a roughly $211-billion-a-year business (192 / 0.91 ≈ 211).

In Perplexity’s case, instead of ads, they present a subscription service that costs $20/month. Worth every penny, but here’s the thing:

On top of the sizable cost of goods sold (COGS) from using the different model providers’ APIs, we now need to factor in a revenue-sharing mechanism that pays publishers per search.

For reference, I estimate Perplexity’s API costs at around 17% of revenues in a very conservative analysis, which is not great considering that further COGS (like running an in-house search engine) could push gross margins closer to 50%, leaving little room for operating costs (salaries, bills, rent, etc.).
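Here’s the back-of-the-envelope math behind those figures as a minimal sketch; the market numbers come from above, while the ‘other COGS’ share is my own rough assumption chosen to land near a 50% gross margin.

```python
# Market sizing: Google's ad run rate divided by its market share.
google_ads_run_rate = 192e9   # $/year, from Google's earnings
google_market_share = 0.91

total_market = google_ads_run_rate / google_market_share
print(f"Implied search market: ${total_market / 1e9:.0f}B/year")  # ~$211B

# Perplexity margin sketch (revenue normalized to 1). The 17% API-cost share
# is my conservative estimate; the 33% 'other COGS' share is an assumption.
api_cogs_share = 0.17
other_cogs_share = 0.33
gross_margin = 1 - api_cogs_share - other_cogs_share
print(f"Estimated gross margin: {gross_margin:.0%}")  # ~50%, before opex
```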

Long story short, the outlook isn’t great.

I published the full-blown analysis of the AI search business model on the Notion Premium site.

LEARN
The Insights Corner

🧐 Test-time training, byCloud. A look into test-time training, originally proposed by OpenAI, which could be the next reasoning frontier.

😒 Can LLMs Reason? by Prof. Kambhampati in the MLST podcast, illustrating the clear limitations of LLM intelligence.

PREMIUM NOTION SITE
Things you’ve missed by not being Premium…

I’ve launched a Notion site for Premium members, where you will receive much more content (with no additional email overload, don’t worry), including insights not shared anywhere else.

It also allows you to ask direct questions you want me to answer for a more tailored experience.

For this week, we have new pieces of Premium-only content such as:

  • 😟 The Worst Mistake You Can Make With LLMs. The single best piece of advice I can offer on everyday LLM use.

  • 😖 $210 billion on the line. Will the AI Search Business Work? A thorough analysis of how AI-generated search will work and the estimated margins.

  • 🤔 Are Markets Pricing in NVIDIA’s Risks to non-GPU Hardware? An analysis of NVIDIA’s current market state and potential risk factors not currently priced in.

  • 🥸 AI and Its Effects on SaaS Companies. Answering a subscriber question on how AI impacts SaaS companies and what key metrics to look for.

  • 🤖 How Klarna Substituted 700 Humans with AI. How Klarna became one of OpenAI’s greatest success stories.

DEMYSTIFYING FRONTIER AI IN SIMPLE WORDS
Google Depth Breakthrough

Google Deepmind has done it again. And this time, it’s a double win.

They have presented AlphaProof and AlphaGeometry 2, models that have achieved silver medalist-level performance by solving challenging International Mathematical Olympiad problems, competing with the best humanity has to offer.

This is a highly interesting piece of research, as it offers a glimpse into the cutting edge of the AI industry’s progress on mathematics and reasoning.

At the same time, it signals that researchers have yet to crack the code on what would certainly take us close to a real super AI: a depth generalizer.

But what do I mean by that?

Depth vs Breadth

At some point in your AI journey, you may have wondered: what made ChatGPT different from everything that came before?

The field of AI is older than most of us and can be traced back to the ‘Dartmouth Summer Research Project on Artificial Intelligence’ in 1956.

That said, Alan Turing first introduced the notion of AI in his historically significant ‘Computing Machinery and Intelligence’ in 1950. Either way, it’s safe to say that AI has become prominent in our lives over the last two years.

A man way ahead of his time.

To put into perspective how far ahead of his time Alan Turing was, he conceived machine thinking as ‘The Imitation Game.’ Well, 74 years later, most AI systems, including ChatGPT, are literally the embodiment of this idea.

And only a handful of AI systems, coincidentally the very ones we are looking at today, are not based on pure human imitation, making our story today even more relevant.

Before the advent of Large Language Models (LLMs) like ChatGPT, all AI was deeply ‘narrow.’ In other words, we trained models to perform one task as well as possible.

This idea, called ‘depth,’ governed the industry for decades, not because it was everything we wanted, but because generalization, where a model can perform many different tasks, was just a pipe dream.

LLMs like ChatGPT sort of solved that problem. However, it has come at a cost.

From AlphaGo to ChatGPT and Back

While this ‘weak generalization’ we have achieved with ChatGPT (LLMs still completely fail at tasks where their memorization capabilities can’t help them) has been extremely fruitful economically for those building these models (especially Big Tech, which has added $7.26 trillion in combined market value since November 2022, excluding Tesla), it’s also a setback in other regards.

For reference, $7.26 trillion is more than the combined value of the UK, German, and Spanish stock markets, with a spare trillion dollars left over.

However, despite what market valuations might indicate, LLMs are good at many things but great at none.

In other words, we have traded task-specific greatness for models that can write a Shakespearean poem and talk to you about nuclear fission, but whose responses, if evaluated by experts, will be surprisingly average.

In more technical terms, our best models are trading ‘depth’ (per-task prowess) for better ‘generalization’ (doing many tasks but being ‘meh’ at them).

While this turn is understandable from a business perspective, it has had tragic consequences, and we might be going backward.

Before ChatGPT came into our lives, the most impressive AI the world had ever seen was the AlphaGo model family, deep neural networks (just like ChatGPT in that sense) that achieved superhuman capabilities in Go, a Chinese board game (with AlphaZero being the state of the art not only in Go but also in shogi and chess).

AlphaGo even defeated Lee Sedol, then one of the world’s best Go players, in a historic event that spawned documentaries.

But how do AlphaGo and AlphaZero work?

They are based on Monte Carlo Tree Search (MCTS), but on steroids. The idea is that for each new move, the models explore thousands or even millions of possible continuations, settling on the move their probability estimates suggest is best.

In a way, you can think of these models as machines that ‘look ahead’ beyond their current move to choose the best one. Interestingly, this is the precise capability researchers are trying to instill in LLMs to improve their reasoning, using MCTS-like methods such as Tree-of-Thought, depicted below, where the LLM explores different solution paths before settling on one.

For more detail on the convergence of both worlds in the path to conquer true intelligence, read my deep dive ‘Is AI Really Intelligent?’

Tree-of-Thought. Source
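For the curious, here’s a minimal sketch of the look-ahead idea. It is not AlphaGo’s actual algorithm (real MCTS adds rollouts, value networks, and visit statistics); propose() and score() are hypothetical stand-ins for an LLM generating candidate steps and estimating how promising they are.

```python
# A toy 'look ahead' search in the spirit of MCTS/Tree-of-Thought.
import random

def propose(state: list, k: int = 3) -> list:
    """Generate k candidate continuations of a partial solution (LLM call in practice)."""
    return [state + [random.random()] for _ in range(k)]

def score(state: list) -> float:
    """Estimate the quality of a (partial) solution path (value model in practice)."""
    return sum(state)

def tree_search(state: list, depth: int = 3, k: int = 3):
    """Explore the tree of candidate paths and return the best-scoring one."""
    if depth == 0:
        return state, score(state)
    best_path, best_score = state, score(state)
    for child in propose(state, k):
        path, path_score = tree_search(child, depth - 1, k)
        if path_score > best_score:
            best_path, best_score = path, path_score
    return best_path, best_score

best, value = tree_search([], depth=3)
print(f"Best path has {len(best)} steps with score {value:.2f}")
```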

Although these models could only play Go (and a handful of other board games), they were better at them than any human in history. But as money has flowed into LLMs, the quest for superhuman AIs like these has mostly stalled in recent years.

Obviously, merging both worlds would be the ultimate triumph, and here is where our friends at Google come in: can we create an LLM with star-level capabilities across many tasks?

On the Measure of Rewards

Despite what many may believe thanks to OpenAI’s fame, when it comes to achieving depth, Google Deepmind is head and shoulders above the rest.

This is because they are great at Reinforcement Learning (RL), a field where models (often robots) are incentivized to perform actions in an environment by rewarding or punishing them depending on the quality of the outcome.

As we need to measure the rewards continuously, this method requires an auxiliary model, a reward model, to perform this evaluation. In a way, RL is like playing a game, but a game in which the model learns based on this reward feedback.

As you may have guessed, the quality of the outcome depends heavily on choosing the correct reward and punishment mechanisms, which are really hard to define, especially in robotics.
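To make the reward idea concrete, here’s a toy sketch, not a real RL algorithm: the agent samples actions in proportion to its policy weights, the environment rewards one ‘correct’ action, and rewarded behavior is reinforced.

```python
# A toy reward loop. Purely illustrative; real RL uses far richer setups.
import random

actions = [0, 1, 2, 3]
target = 2                              # the action the environment rewards
weights = {a: 1.0 for a in actions}     # the agent's (very simple) policy

def reward(action: int) -> float:
    """The reward function: 1 for the desired action, 0 otherwise."""
    return 1.0 if action == target else 0.0

for _ in range(1000):
    total = sum(weights.values())
    r, cumulative = random.random() * total, 0.0
    for a in actions:
        cumulative += weights[a]
        if r <= cumulative:
            chosen = a
            break
    weights[chosen] += 0.1 * reward(chosen)   # reinforce rewarded behavior

print(max(weights, key=weights.get))  # converges to the rewarded action, 2
```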

In fact, NVIDIA has been proposing AIs that build reward functions for some time, achieving impressive results we covered in this newsletter a while back.

In a particular set of cases, like AlphaGo, this method can create ‘superhuman AIs’ that, by performing self-improvement—sometimes known as self-play, where a model uses its own outputs as feedback—can transcend human limitations and become much better than us at certain tasks.

A good example is this video we shared in the newsletter months ago of a robot that, in just six hours of training, becomes superhuman at the “Labyrinth” game.

Well, now, Deepmind is experimenting with this idea in mathematical theorem proving. And the results are humbling (for humans).

In The Conquest of Maths Reasoning

Google Deepmind has presented two models:

  • AlphaProof, a formal theorem-proving model, and

  • AlphaGeometry 2, a geometry problem solver.

In both cases, LLMs play a crucial role.

AlphaProof

To design AlphaProof, they created a self-improvement theorem-proving loop around AlphaZero, the model we discussed earlier, using an LLM to draft mathematical statements in a formal language that AlphaZero can then try to prove.

Then, for those theorems that AlphaZero successfully proves, they use them to ‘reinforce’ that behavior, aka they use them as a signal to improve the model.

The reason for this is that adequate data for this type of training is almost non-existent.

Thus, using Gemini’s capabilities to rewrite data (depicted as the ‘Formalizer network’ above), they created 100 million formal problems on which they trained AlphaZero.
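Schematically, the loop described above looks something like this; every function here is a hypothetical stand-in for the real components (Gemini’s formalizer, AlphaZero’s proof search), not DeepMind’s actual code.

```python
# Schematic of the formalize -> prove -> reinforce loop described above.

def formalize(informal_statement: str) -> str:
    """Gemini-style 'formalizer': rewrite a statement in a formal language (e.g., Lean)."""
    return f"theorem : {informal_statement}"

def attempt_proof(formal_statement: str) -> bool:
    """AlphaZero-style prover: search for a proof; True if one is found."""
    return len(formal_statement) % 2 == 0  # placeholder for real proof search

def self_improvement_round(informal_problems: list) -> list:
    """One round: formalize problems, attempt proofs, collect successes."""
    solved = []
    for problem in informal_problems:
        formal = formalize(problem)
        if attempt_proof(formal):
            solved.append(formal)
    # In the real system, 'solved' becomes the training signal that
    # reinforces the prover before the next round.
    return solved

print(self_improvement_round(["a + b = b + a", "there are infinitely many primes"]))
```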

AlphaGeometry 2

As for the latter, AlphaGeometry 2 is an even more interesting model. In a nutshell, it’s a neurosymbolic AI model: a combination of an LLM (Gemini) and symbolic engines.

But what do I mean by that?

A neurosymbolic system is an AI system that combines a neural network, in this case an LLM, with hard-coded, human-written systems that can perform exact calculations or actions as long as we constrain the problem enough.

For instance, a symbolic engine might be mathematics software written by humans that takes in a set of constraints and calculates the output (like being given the lengths of two sides of a right triangle and using the Pythagorean theorem to compute the third side).
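As a trivially small example of such an engine, here’s that Pythagorean case as one hard-coded, fully deterministic rule:

```python
# A one-rule 'symbolic engine': given two legs of a right triangle,
# deterministically compute the hypotenuse. No learning, no randomness.
import math

def hypotenuse(a: float, b: float) -> float:
    """Hard-coded rule: c = sqrt(a^2 + b^2)."""
    return math.sqrt(a**2 + b**2)

print(hypotenuse(3, 4))  # 5.0, exact every time
```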

But what is the role of the LLM here? It searches. But searches for what?

Symbolic engines are lists of conditions written by humans in the form of ‘if x happens, do y'. They are the epitome of what we call ‘Symbolic AI,’ which, in reality, is just machines imitating intelligent human actions in highly constrained environments.

But here’s the thing: when facing an open geometry theorem problem, there are potentially infinite ways to approach the solution.

Symbolic engines can’t search; they are limited by the number of scenarios that the humans who coded that engine thought of. In other words, when faced with open problems, they don’t work.

So what does Gemini (Google’s LLM) do? When faced with an open problem like proving a geometry problem, it suggests what researchers call ‘auxiliary constructions’, cues added to the problem that constrain the space of possible solutions.

For instance, in the image below, Gemini proposes computing point E (the right angle in triangle AEB), which is then used to compute other triangles that narrow the problem and facilitate the solution.

AlphaGeometry 2 solving a problem. Source

In layman’s terms, Gemini suggests possible solution paths to the symbolic engines performing the computation. Simply put, we are narrowing an open problem down into something the symbolic engines can compute, then testing whether the result is sufficient to prove the theorem.
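Here’s a minimal sketch of that division of labor; both functions are hypothetical stand-ins (Gemini’s actual proposals and DeepMind’s deduction engine are far more sophisticated):

```python
# The LLM proposes auxiliary constructions; a symbolic engine deterministically
# tests whether each one narrows the problem enough to complete the proof.
# Both components are hypothetical placeholders for the real systems.

def llm_propose_constructions(problem: str) -> list:
    """Stand-in for Gemini suggesting auxiliary constructions."""
    return ["add point E so that angle AEB = 90 degrees",
            "draw the midpoint M of segment AB"]

def symbolic_engine_proves(problem: str, construction: str) -> bool:
    """Stand-in for a rule-based deduction engine testing a construction."""
    return "angle" in construction  # placeholder for real deduction

def solve(problem: str) -> str:
    for construction in llm_propose_constructions(problem):
        if symbolic_engine_proves(problem, construction):
            return f"Proved via: {construction}"
    return "No proof found; ask the LLM for more constructions"

print(solve("Prove that angle ABC equals angle ACB given AB = AC"))
```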

And that silver medal tells you all you need to know about how powerful these components are when combined.

Once again, Google is openly telling the world: When it comes to deep AI, everyone else is in our rearview mirror.

TheWhiteBox’s take

Technology:

In a way, these two models exemplify what, in a nutshell, is the next frontier of AI: combining LLM-powered search with RL-trained models that excel in depth (aka, excel at specific tasks).

Unlocking this paradigm at scale could create the first deep generalizer that, akin to humans, can not only perform several tasks but, upon facing a complex problem, can search the ‘space of possible solutions’ until it finds the best one.

Products:

It’s unclear how Google will monetize this; these models seem more like research projects.

However, the very smart use of neurosymbolic systems, which not only learn faster but are cheaper to run, might suggest that Google could release AlphaGeometry 2, at least to academia, to enhance the works of mathematicians worldwide.

Markets:

Research like this should have a major effect on markets, but it doesn’t, as investors usually look at numbers and what a handful of biased tech tycoons say in deeply orchestrated interviews.

However, considering that these investors’ sole job is to predict the future and make a return, seeing a company like Google present research like this should be extremely encouraging, and even a major signal that Google might soon release a commercially available AI mathematician copilot.

THEWHITEBOX
Closing Thoughts 🧐

This week perfectly exemplifies AI’s current reality: absolutely impressive in the lab, but a real disappointment in production environments, with investors starting to lose patience as only Meta has presented over-the-top results.

AI desperately needs killer applications to emerge before it’s too late. Therefore, this Sunday, we’ll look at the top killer applications I predict AI will develop over the next year.

Until next time!

For business inquiries, reach out to me at [email protected]