
Scent Teleportation, Anthropic Signals The Alarm, & More

In partnership with

For business inquiries, reach out to me at [email protected]

THEWHITEBOX
TLDR;

  • 👃🏼 Scent Teleportation

  • 🤧 NotebookLM Killing CRMs?

  • 🥵 Big Tech CAPEX Continues to Grow

  • 🍎 Apple Intelligence Underwhelms

  • 📰 Other news from OpenAI, AI Radio Hosts, & More

  • [TREND OF THE WEEK] Steering AI Models to Protect… Or Censor Us

Learn AI in 5 Minutes a Day

AI Tool Report is one of the fastest-growing and most respected newsletters in the world, with over 550,000 readers from companies like OpenAI, Nvidia, Meta, Microsoft, and more.

Our research team spends hundreds of hours a week summarizing the latest news, and finding you the best opportunities to save time and earn more using AI.

PREMIUM CONTENT
New Premium Content

NEWSREEL
Scent Teleportation

Osmo AI claims to have ‘digitized smell.’ In other words, their platform can not only reproduce any smell but also create new ones. In fact, it has released three entirely new smells you can actually buy.

The idea of creating new smells isn’t just for the sake of having a new bedroom fragrance. For instance, we could use this power to create smells that aren’t unpleasant to humans but scare off or deter pests from attacking our crops. And before you accuse me of fantasizing, this is precisely what the Bill & Melinda Gates Foundation is trying to do with Osmo as we speak.

For an overview of how the process works, check this short YouTube video.

TheWhiteBox’s takeaway:

If Osmo AI can truly deploy this at scale, this is massive. This not only provides AI models with the capacity to ‘perceive’ smell, breaking down scents to the molecular level and predicting how that scent smells, but it can also provide humans with new ones.

To do so, they built an ‘odor map’ where different smells are clustered according to similarity. This approach is similar to HumeAI's attempt to categorize human emotions, and it is a beautiful example of modern AI. But why is that?

Most current AI models share a common principle: similarity. It’s how machines ‘understand’ our world; while they can’t taste an apple, they learn that apples are more similar to oranges than to a steak. AIs are great at finding the common patterns that similar concepts share in data, like grouping apples and oranges under a ‘fruit’ tag. But the impressive thing about AI is that it can find latent patterns that humans might not have picked up on (more on that in today’s trend of the week below), allowing civilization to deepen its understanding of the world.

  • With HumeAI, we now have a better intuition of how humans express emotion

  • With ArchetypeAI, last week’s trend of the week, we may find new patterns in physics and nature

  • And with Osmo, we can now find how molecules interact with each other to generate new smells.

I predict that by 2025, most of the breakthroughs in AI will come more from AI discovering stuff than helping us draft emails faster.

SAAS INDUSTRY
NotebookLM Killing CRMs?

A former Facebook VP of Product now turned venture capitalist, Sam Lessin, has published a tweet claiming that NotebookLM alone could be a killer for Customer Relationship Management software, or CRM, with prominent examples like Salesforce.

The claim stems from the idea that most CRM tasks involve registering information about potential leads, and context about engagements, acting as a system of record that allows for a structured approach to sales.

However, this feature loses a lot of value when you have AI tools that can instantly organize that unstructured data for you and, importantly, make it easy to digest, with cases like NotebookLM.

TheWhiteBox’s takeaway:

While it’s easy to think of many ways this vision is still very far away, I kind of see where this is going. As you may have guessed from my SaaS piece, I’m not particularly bullish on the future of many software companies, whose moats may soon be obliterated by AI’s capacity to democratize access to data and software development.

In layman’s terms, he’s making the critical point that most of the value many CRMs provide (and I’m extrapolating this to SaaS in general) is structuring your data and processes. But with AI, that data can be easily parsed and repackaged in a digestible format (text, like ChatGPT, or podcasts, like NotebookLM), making the value of many SaaS products negligible in the long run.

To make matters worse for SaaS companies, many customers are cutting back on spending based on projections that AI might be able to replace their offerings, leading to recent sizable layoffs by SaaS companies like Dropbox or Miro.

MARKETS
Big Tech CAPEX Continues to Grow

Despite uneven reception from investors, one thing that Microsoft, Meta, and Google’s latest quarterly earnings reports have in common is that investment in AI, mainly the purchase of land, labor, and equipment for AI datacenters, is showing no signs of slowing down.

  • Google reported another $13 billion

  • Microsoft reported $20 billion

  • Meta raised its guidance for total CAPEX for 2024 to $38 billion

Despite this, none of them disclosed precise numbers on direct revenue generated by their AI initiatives, an alarming sign that the gap between revenues and investment in AI is increasing.

TheWhiteBox’s takeaway:

While investors aren’t panicking yet, the nervousness around these stocks and their huge bet on AI is palpable. The fact that most Generative AI products aren’t sticking, with examples like Microsoft Copilot seeing massive adoption problems, doesn’t help.

Particularly concerning is the Business-to-Business (B2B) market. Simply put, there’s a lot of hype and interest around GenAI products, but few get past the demo stage. This reality is beautifully summarized by Weights & Biases’ CEO, saying GenAI is ‘easy to demo, hard to productionize.’ While this is quite common in tech, what isn’t common is companies pouring dozens of billions a quarter into a technology based on future projections of unmaterialized demand.

EDGE AI
Apple Intelligence Underwhelms

Apple Intelligence has debuted, and the results are underwhelming (and hilarious). However, we have yet to see the release of the revamped Siri and more powerful features, as the released capabilities are limited to text processing and generation, plus object removal in photos.

Long story short, the results aren’t great, as the model makes rather dumb summaries of your notifications that may send your heart on a thrilling journey (see above). The AI fails to interpret common expressions or uses debatable terms like ‘intruders’ to refer to a cat caught on your Ring camera.

TheWhiteBox’s takeaway:

If you’re a regular reader of this newsletter, you aren’t surprised. While the underlying architecture, which we covered in detail in the past, is exciting, Apple has to deal with the fact that most people do not understand its value proposition: running small AI models on your device, instead of huge models running in billion-dollar data centers, implies a considerable loss of performance, and it has a long way to go before paying off.

Concerningly, society has no idea of this complexity. So, to most people, Apple simply appears behind the likes of Google and Microsoft in AI capabilities. Consequently, Apple finds itself at a crossroads: it has a firm obligation to deliver state-of-the-art AI capabilities through the iPhone to revive lagging sales, but the technology (LLMs) isn’t mature enough at the model sizes Apple can work with (no bigger than 3 billion parameters, and ideally smaller so as not to eat up RAM, which is just 6GB in the latest iPhone release).

Long story short, if our concerns about frontier AI model intelligence are already very real, it doesn’t take a genius to realize that Apple’s AI will be very dumb for the time being.

OpenAI’s SimpleQA Benchmark

Yesterday, OpenAI released SimpleQA, a benchmark that evaluates how much models hallucinate by asking objectively verifiable, fact-seeking questions on various topics. The benchmark is challenging for models, with o1-preview achieving the highest score (48%).

AI Radio Hosts

A Polish radio station has laid off its DJs and journalists and replaced them with AI-generated ‘college kid’ hosts. This move feels eerily inspired by what Google is doing with NotebookLM’s podcast voices, which can turn any text into a podcast.

Cohere’s Multimodal Embedding Model

Cohere has released a new multimodal embedding model. Embedding models transform your data into vector embeddings grouped according to ‘similarity.’ In simple terms, this model allows you to build search systems for your data in which you can use both images and text to search (i.e., use text to search for images, similar to Google Image Search, and vice versa, but over your company’s data).
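To make the idea concrete, here’s a minimal sketch of similarity search over precomputed embeddings. The vectors below are random placeholders standing in for whatever a multimodal embedding model (such as Cohere’s) would return for your images and your text query; the point is simply that search reduces to comparing vectors:

```python
import numpy as np

# Placeholder embeddings: in practice these would come from a multimodal
# embedding model, one vector per image in your data.
image_embeddings = np.random.rand(1000, 1024)   # 1,000 images, 1,024-dim vectors
query_embedding = np.random.rand(1024)          # embedding of a text query

def cosine_similarity(matrix: np.ndarray, vector: np.ndarray) -> np.ndarray:
    """Similarity between every stored embedding and the query embedding."""
    matrix_norm = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    vector_norm = vector / np.linalg.norm(vector)
    return matrix_norm @ vector_norm

scores = cosine_similarity(image_embeddings, query_embedding)
top_matches = np.argsort(scores)[::-1][:5]      # indices of the 5 closest images
print(top_matches)
```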

Reconstructing Video from Thoughts

A group of researchers has created a model that maps thoughts into video. In layman’s terms, they take fMRI data from the brain while a person sees a video and then use this brain data to reconstruct the original video.

Learning to decode brain signals into human-understandable data could be instrumental to enabling disabled people to communicate better while deepening our understanding of how the brain works.

Belgium’s Artificial Energy Island

Belgium is constructing an artificial island 43km (30 miles) off its coast to use wind to produce up to 3.5 gigawatts (GW) of power. That is enough to supply almost 3 million homes, assuming an average US home consumption of 10,500 kWh/year.
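As a quick sanity check of that figure, here’s the back-of-the-envelope calculation (it assumes the island delivers its full 3.5 GW around the clock, ignoring real-world wind capacity factors):

```python
# Rough check of the "~3 million homes" claim.
capacity_gw = 3.5
hours_per_year = 24 * 365                                # 8,760 hours
annual_output_kwh = capacity_gw * 1e6 * hours_per_year   # GW -> kW, then kWh/year
home_consumption_kwh = 10_500                            # average US home per year

homes_supplied = annual_output_kwh / home_consumption_kwh
print(f"{homes_supplied / 1e6:.1f} million homes")       # ~2.9 million
```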

Although this isn’t directly related to AI, energy supply is the fundamental bottleneck in deploying AI for training and inference worldwide.

Therefore, it’s interesting to see countries finding innovative ways to generate new electricity (this isn’t exactly new; China and Abu Dhabi have done it before). It’s especially relevant considering that, according to the CEO of Iberdrola, a Spanish utility company, small modular reactors (SMRs), the technology Big Tech is hoping to leverage for the massive deployment of AI datacenters, might not be fully ready before 2035 (and they also have important problems, illustrated by this video).

TREND OF THE WEEK
Steering AI Models To Protect… or Censor Us

You would be surprised how little we know about AI. But after today, you’ll know more than most.

Anthropic, OpenAI’s biggest rival, has released exciting research on the different experiments they’ve been running on feature steering, as they call it.

By studying the different concepts—known as ‘features’—a model learns, we can strengthen or clamp the neurons that elicit those concepts and see whether the model adapts its behavior to our liking (sounds complicated, but I promise you it’s not).

Feature steering has already proven capable of making an LLM convince itself it was ‘The Golden Gate Bridge,’ as we saw in this newsletter a while back. Now, these same researchers have deepened their understanding—and soon, yours—in one of the hottest areas in the industry: mechanistic interpretability, which aims to decipher the secrets inside LLMs.

Sadly, the results were somewhat disappointing and, in some cases, alarming. But why?

Uncovering the Secrets of AI

The first step to understanding Anthropic’s industry-leading research is to understand how they found the features a model learns.

Toward Monosemanticity

But first, what are these models? LLMs, like any other neural network, are a set of weighted units called neurons, connected to other neurons in a way that loosely simulates the neural connections in our brain.

Each neuron has an activation function, which determines whether the neuron ‘fires.’ In the case below, the ReLU function lets the neuron fire (passing its value through) whenever its input is positive, and outputs zero when the input is negative.
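In code, ReLU is just a one-liner; here’s a minimal illustration of that behavior:

```python
# ReLU: the neuron 'fires' (passes its value through) when the input is
# positive, and outputs zero otherwise.
def relu(x: float) -> float:
    return x if x > 0 else 0.0

print(relu(2.3))   # 2.3 -> neuron fires
print(relu(-1.7))  # 0.0 -> neuron stays silent
```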

As mentioned, neurons are connected to each other, forming a network, hence the name ‘neural networks.’ Ideally, each neuron would store information on a particular topic, but in 2023, Anthropic published breakthrough research on how LLMs encode knowledge that proved otherwise.

In fact, each neuron is polysemantic (it stores information on several semantically unrelated topics). However, they also discovered that specific combinations of these neurons do lead to monosemanticity. In other words, we can assign a given concept to a set of neurons activating in unison (i.e., if neurons 3, 4,000, and 45 fire together, the model's output relates to burgers).
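Here’s a toy sketch of that idea; the neuron indices, the activation threshold, and the ‘burgers’ concept are made up purely for illustration:

```python
# Toy illustration of monosemanticity-by-combination: a single neuron may fire
# for many unrelated topics, but a specific *set* of neurons firing together
# can be read as one concept.
import numpy as np

BURGER_NEURONS = [3, 45, 4000]   # hypothetical neuron indices for the concept
THRESHOLD = 0.5                  # activation level we treat as 'firing'

def expresses_concept(activations: np.ndarray, neuron_ids, threshold=THRESHOLD) -> bool:
    """True if every neuron in the set fires at the same time."""
    return bool((activations[neuron_ids] > threshold).all())

activations = np.random.rand(8192)   # stand-in for one layer's activations
print(expresses_concept(activations, BURGER_NEURONS))
```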

Knowing this, Anthropic researchers posited: what if we could map these combinations onto known features, and then dial up or clamp down the corresponding neurons to enforce or block those concepts? For this, they used a model called a Sparse Autoencoder, or SAE.

Automating Feature Extraction

Without going into much detail, the idea behind SAEs is that we can take the activations of these neurons and map them automatically into a sparse set of features. This sounds complicated, but it isn’t; bear with me.
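To make the mechanism more concrete, here’s a minimal sketch of a sparse autoencoder in PyTorch. The dimensions, sparsity penalty, and single training step below are illustrative assumptions, not Anthropic’s actual setup; the idea is simply that dense activations are encoded into a much wider, mostly-zero feature vector and then decoded back:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE: maps dense activations to a wider, mostly-zero feature
    vector and reconstructs the original activations from it."""
    def __init__(self, d_model: int = 512, d_features: int = 4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations: torch.Tensor):
        features = torch.relu(self.encoder(activations))  # sparse feature activations
        reconstruction = self.decoder(features)
        return features, reconstruction

# Training objective: reconstruct well while keeping features sparse (L1 penalty).
sae = SparseAutoencoder()
acts = torch.randn(32, 512)                      # stand-in for LLM activations
features, recon = sae(acts)
loss = ((recon - acts) ** 2).mean() + 1e-3 * features.abs().mean()
loss.backward()                                  # one illustrative gradient step
```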

If we look at the image below, one of the features Anthropic mapped was ‘Transit Infrastructure.’ To assign a name to that concept, they observed that a set of neurons always activated whenever the output included words like ‘bridge,’ ‘aqueduct,’ or ‘bay.’

Consequently, we can isolate the key features that lead to the prediction of a certain word, creating a feature map that allows us to visualize the concepts that the LLM has learned.

But why would we want that?

Steering LLMs

LLMs don’t have a handful of neurons; they have billions, or even trillions in cases like GPT-4. Fascinatingly, the number of features a model can learn is even higher. This leads to a combinatorial explosion where neurons combine in unpredictable ways, making LLMs notoriously opaque.

Thus, LLMs are basically a mystery to humans, even to their creators. Because we know so little, predicting their behavior becomes impossible. But what if we could use these human-interpretable features to dictate what the model does or doesn’t generate?

In the research above, Anthropic first toyed with this idea, leading to fascinating examples like dialing up the ‘Golden Gate Bridge’ feature (by increasing the values of the neurons activating that feature), resulting in the model ‘embodying’ the bridge.
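As a rough sketch of what ‘dialing up’ a feature means mechanically, one common approach is to add the feature’s direction to a layer’s activations, scaled by a steering coefficient. The toy layer, random direction, and coefficient below are placeholders for illustration, not Anthropic’s actual setup or values:

```python
import torch
import torch.nn as nn

d_model = 512
layer = nn.Linear(d_model, d_model)          # stand-in for one LLM layer
feature_direction = torch.randn(d_model)     # stand-in for the chosen feature's direction
feature_direction /= feature_direction.norm()
steering_coefficient = 5.0                   # dial the feature up (+) or clamp it down (-)

def steering_hook(module, inputs, output):
    # Nudge the layer's output along the feature direction, scaled by the coefficient.
    return output + steering_coefficient * feature_direction

handle = layer.register_forward_hook(steering_hook)
steered = layer(torch.randn(1, d_model))     # activations now pushed toward the feature
handle.remove()
```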

Having achieved this uncanny result, Anthropic has now gone further, releasing the most advanced research on LLM steering to date.

And the results are… well, mixed.

Steered, But At What Cost

One of the biggest reasons one might want to steer models is to prevent undesired outputs. These models compress the Internet's entire knowledge into their neurons, which, of course, includes biased data and, quite frankly, worse traits like racism, homophobia, pedophilia, and so on.

Even if these were scraped from the Internet in the form of jokes, the delivery by the model, or worse, the interpretation it might make, could be… unexpected:

Apple Intelligence’s interpretation of somebody who had a tough workout.

But seeing how we can make a model behave like a monument when dialing up that feature, can we do the same if we map features that include unwanted biases?

And the answer is… kind of.

Finding the Sweet Spot

Anthropic focused the entire study on social biases, precisely the 29 features in the image below.

The first insight they found was that there was a ‘sweet spot’ range in which you can dial a feature up or down, resulting in more or less expression of that bias. However, above and below that range, steering led to worse overall performance on the MMLU (Massive Multi-task Language Understanding) benchmark, which measures the model’s general knowledge:

But things get even weirder.

Risk of Overdrive

In some cases, steering a particular feature creates unexpected effects on others. For instance, if we take the ‘Gender bias awareness’ feature and dial up its importance, we not only see an increase in bias regarding gender identity, but also an increase in age bias, which seemingly has little to do with it.

This has an important implication: the model might be picking up correlations between seemingly unrelated features that humans had never noticed before.

This underpins our discussion on Sunday, in which we illustrated the potential for AI as a tool for discovering patterns in data.

Another vital insight worth mentioning is that some features impact overall bias. For example, dialing up the ‘Multiple perspectives and balance’ feature did just that, making the model more reflective of every perspective and aiming for balance, leading to an overall decrease in bias of over 3% on the BBQ benchmark.

What is the BBQ benchmark? It measures how biased models are against certain social classes (i.e., stereotyping that most doctors are men and most nurses are women, which is a bias that is surprisingly persistent even in current models).

Last but not least, considering we are in an election year in the US, Anthropic couldn’t resist the temptation to examine political biases, with some very surprising results.

Everyone is Biased

For example, they found that amplifying the “Pro-life and anti-abortion stance” feature (dark blue below) led to a significant increase in anti-abortion responses by 50%.

In contrast, increasing the “Left-wing political ideologies” feature (orange) showed the opposite effect, reducing anti-abortion responses by 47%, which is expected.

Curiously, the “Political neutrality and independence” feature (green) showed a moderate positive shift, increasing anti-abortion responses from 32% to 50% on the issue. In other words, increasing ‘political neutrality’ led to more anti-abortion stances, which could suggest that independent voters align more with anti-abortion positions than with pro-abortion sentiment.

Again, these findings prove how AI could soon serve as a great tool to uncover biases in data (like providing more insight into how independents feel), shedding light on the different stereotypes and beliefs that different cohorts of people may have regarding other cohorts… or themselves.

However, once again, we see that feature steering can lead to undesired global changes in the model, signaling that, although very promising, it might be harder than we initially expected.

TheWhiteBox’s takeaway

If we are really striving to build better models, we first need to ensure we can comprehend their behavior. Otherwise, it’s like handing the steering wheel to an AI with unpredictable behavior; it could drive perfectly or drive like a drunken driver.

Thus, we need greater insight into models and the capacity to intervene if they develop dangerous biases. On the other hand, it doesn’t take long to realize how this technique could be misused: to censor.

Let’s not forget that we are learning to clamp down on features we don’t like. That could mean racism, but it could also mean clamping down on liberal ideas if a more conservative figure takes control of the LLM (or vice versa).

Therefore, if we allow the main interface between humans and knowledge (mostly the Internet today) to become dominated by private LLMs, it takes no genius to see that these companies and their shareholders will have the power to steer society’s sentiment to their liking, manipulating us into thinking in a specific way or, worse, turning lies into truths and vice versa.


Now more than ever, we must ensure that the LLM industry stays open-source and diverse. It’s these discussions that matter, not whether a next-word predictor will develop agency and kill us all.

THEWHITEBOX
Premium

If you like this content, by joining Premium, you will receive four times as much content weekly without saturating your inbox. You will even be able to ask the questions you need answers to.

Until next time!