Scent Teleportation, Anthropic Signals The Alarm, & More
For business inquiries, reach out to me at [email protected]
THEWHITEBOX
TLDR;
Scent Teleportation
NotebookLM Killing CRMs?
Big Tech CAPEX Continues to Grow
Apple Intelligence Underwhelms
Other news from OpenAI, AI Radio Hosts, & More
[TREND OF THE WEEK] Steering AI Models to Protect… Or Censor Us
Learn AI in 5 Minutes a Day
AI Tool Report is one of the fastest-growing and most respected newsletters in the world, with over 550,000 readers from companies like OpenAI, Nvidia, Meta, Microsoft, and more.
Our research team spends hundreds of hours a week summarizing the latest news, and finding you the best opportunities to save time and earn more using AI.
PREMIUM CONTENT
New Premium Content
Fighting Hallucinations with Uncertainty: A deep look into how you can fight hallucinations by leveraging a clever trick to measure entropy yourself (Only Full Premium Subscribers)
Apple speaks the truth about AI. It's not good: A deeper dive into Apple's hard-hitting research on the limitations of AI. (All Premium Subscribers)
NEWSREEL
Scent Teleportation
Osmo AI claims to have "digitized smell." In other words, their platform can not only reproduce any smell but also create new ones. In fact, it has released three entirely new smells you can actually buy.
The idea of creating new smells isn't just about having a new bedroom fragrance. For instance, we could use this power to create smells that aren't unpleasant to humans but deter pests from attacking our crops. And before you accuse me of fantasizing, this is precisely what the Bill & Melinda Gates Foundation is trying to do with Osmo as we speak.
For an overview of how the process works, check this short YouTube video.
TheWhiteBox's takeaway:
If Osmo AI can truly deploy this at scale, this is massive. This not only provides AI models with the capacity to "perceive" smell, breaking down scents to the molecular level and predicting how that scent smells, but it can also provide humans with new ones.
To do so, they built an "odor map" where different smells are clustered according to similarity. This approach is similar to HumeAI's attempt to categorize human emotions, and it is a beautiful example of modern AI. But why is that?
Most current AI models share a common principle: similarity. It's how machines "understand" our world; while they can't taste an apple, they learn that apples are more similar to oranges than to a steak. AIs are great at finding the common patterns that similar concepts share, like the traits that group apples and oranges under a "fruit" label (a toy sketch of this idea follows the list below). But the impressive thing about AI is that it can find latent patterns that humans might not have picked up on (more on that in today's trend of the week below), allowing civilization to deepen its understanding of the world.
With HumeAI, we now have a better intuition of how humans express emotion
With ArchetypeAI, last weekâs trend of the week, we may find new patterns in physics and nature
And with Osmo, we can now find how molecules interact with each other to generate new smells.
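To make the similarity idea a bit more concrete, here is a toy sketch in Python. The four "dimensions" and their values are entirely invented for illustration; real models learn embeddings with hundreds or thousands of dimensions directly from data, but the comparison logic is the same.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity: ~1.0 means very similar, ~0.0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hand-made 4-dimensional "embeddings" (say: sweetness, juiciness, protein, crunch).
# The values are invented purely to illustrate the idea.
apple  = np.array([0.9, 0.8, 0.1, 0.7])
orange = np.array([0.8, 0.9, 0.1, 0.2])
steak  = np.array([0.1, 0.3, 0.9, 0.1])

print(cosine_similarity(apple, orange))  # high: apples "live near" oranges
print(cosine_similarity(apple, steak))   # lower: fruit vs. meat
```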
I predict that by 2025, most of the breakthroughs in AI will come more from AI discovering things than from helping us draft emails faster.
SAAS INDUSTRY
NotebookLM Killing CRMs?
A former Facebook VP of Product turned venture capitalist, Sam Lessin, has published a tweet claiming that NotebookLM alone could be a killer of Customer Relationship Management (CRM) software, with prominent examples like Salesforce.
The claim stems from the idea that most CRM tasks involve registering information about potential leads and context about engagements, acting as a system of record that allows for a structured approach to sales.
However, that record-keeping loses a lot of its value when AI tools like NotebookLM can instantly organize unstructured data for you and, importantly, make it easy to digest.
TheWhiteBox's takeaway:
While it's easy to think of many ways this vision is still very far away, I kind of see where this is going. As you may have guessed based on my SaaS piece, I'm not particularly bullish on the future of many software companies, whose moats may soon be obliterated by AI's capacity to democratize access to data and software development.
In layman's terms, he's making the critical point that most of the value many CRMs provide (and I'm extrapolating this to SaaS in general) lies in structuring your data and processes. But with AI, this data can be easily parsed and repackaged in a digestible format (text, like ChatGPT, or podcasts, like NotebookLM), making the value of many SaaS products negligible in the long run.
To make matters worse for SaaS companies, many customers are cutting back on spending based on projections that AI might be able to replace their offerings, leading to recent sizable layoffs by SaaS companies like Dropbox or Miro.
MARKETS
Big Tech CAPEX Continues to Grow
Despite uneven reception from investors, one thing that Microsoft, Meta, and Google's latest quarterly earnings reports have in common is that investment in AI, mainly spending on land, labor, and equipment for AI datacenters, is showing no signs of slowing down.
At the same time, none of them disclosed precise numbers on the direct revenue generated by their AI initiatives, an alarming sign that the gap between revenues and investment in AI is widening.
TheWhiteBox's takeaway:
While investors aren't panicking yet, the nervousness around these stocks and their huge bet on AI is palpable. The fact that most Generative AI products aren't sticking, with examples like Microsoft Copilot seeing massive adoption problems, doesn't help.
Particularly concerning is the Business-to-Business (B2B) market. Simply put, there's a lot of hype and interest around GenAI products, but few get past the demo stage. This reality is beautifully summarized by Weights & Biases' CEO, who says GenAI is "easy to demo, hard to productionize." While this is quite common in tech, what isn't common is companies pouring dozens of billions of dollars a quarter into a technology based on future projections of unmaterialized demand.
EDGE AI
Apple Intelligence Underwhelms
Apple Intelligence has debuted, and the results are underwhelming (and hilarious). However, we have yet to see the revamped Siri and the more powerful features, as the released capabilities are limited to text processing and generation, plus object removal in photos.
Long story short, the results aren't great: the model produces rather dumb summaries of your notifications that may send your heart on a thrilling journey. The AI fails to interpret common expressions and uses debatable terms like "intruders" to refer to a cat passing by your Ring camera.
TheWhiteBox's takeaway:
If you're a regular reader of this newsletter, you aren't surprised. While the underlying architecture, which we covered in detail in the past, is exciting, Apple has to deal with the fact that most people do not understand its value proposition: running small AI models on your device, instead of huge models in billion-dollar data centers, implies a considerable loss of performance, and it has a long way to go before paying off.
Concerningly, society has no idea of this complexity. So, to most people, Apple simply appears to be behind others like Google and Microsoft in AI capabilities. Consequently, Apple finds itself at a crossroads: it has a firm obligation to deliver state-of-the-art AI capabilities through the iPhone to revamp lagging sales, but the reality is that the technology (LLMs) isn't mature enough at the model sizes Apple can work with (no bigger than 3 billion parameters, and ideally smaller so as not to eat up RAM, which is just 8GB in the latest iPhone release).
Long story short, if our concerns about frontier AI model intelligence are already very real, it doesn't take a genius to realize that Apple's AI will be very dumb for the time being.
OpenAIâs SimpleQA Benchmark
Yesterday, OpenAI released SimpleQA, a benchmark that evaluates how much models hallucinate by asking objectively verifiable, fact-seeking questions on various topics. The benchmark is challenging for models, with o1-preview having the highest score (48%).
AI Radio Hosts
A Polish radio station has laid off DJs and journalists and replaced them with AI-generated "college kid" hosts. This move feels eerily inspired by what Google is doing with NotebookLM's podcast voices, which can turn any text into a podcast.
Cohereâs Multimodal Embedding Model
Cohere has released a new multimodal embedding model. Embedding models transform your data into vector embeddings that are grouped according to "similarity." In simple terms, this model lets you build search systems for your data in which you can search with both images and text (i.e., use text to find images, similar to Google Image Search, and vice versa, but over your company's data).
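For intuition, here is a minimal sketch of how such a search system is typically wired up. It does not use Cohere's actual API; the `embed` function below is a hypothetical stand-in (a deterministic random projection) so the example runs, and the asset names are made up.

```python
import numpy as np

def embed(item: str) -> np.ndarray:
    """Hypothetical stand-in for a multimodal embedding call.

    A real multimodal model embeds the actual content (image pixels or text)
    into one shared vector space, placing semantically similar items close
    together. This deterministic random projection of the identifier string
    exists only so the sketch runs end to end.
    """
    rng = np.random.default_rng(abs(hash(item)) % (2**32))
    vec = rng.normal(size=128)
    return vec / np.linalg.norm(vec)

# 1) Index your company's assets once (identifiers are made up).
catalog = ["photo_warehouse.jpg", "photo_team_offsite.jpg", "q3_sales_report.txt"]
index = {item: embed(item) for item in catalog}

# 2) At query time, embed the query and rank assets by cosine similarity
#    (vectors are unit-normalized, so the dot product is the cosine).
query_vec = embed("pictures of the warehouse")
ranked = sorted(index, key=lambda item: float(index[item] @ query_vec), reverse=True)
print(ranked)  # with a real model, assets relevant to the query would rank first
```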
Reconstructing Video from Thoughts
A group of researchers has created a model that maps thoughts into video. In laymanâs terms, they take fMRI data from the brain while a person sees a video and then use this brain data to reconstruct the original video.
Learning to decode brain signals into human-understandable data could be instrumental in enabling disabled people to communicate better while deepening our understanding of how the brain works.
Belgiumâs Artificial Energy Island
Belgium is constructing an artificial island 43 km (about 27 miles) off its coast to use wind to produce up to 3.5 gigawatts (GW) of power. That is enough to supply energy to almost 3 million homes, assuming an average US home consumption of 10,500 kWh/year.
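A quick back-of-the-envelope check of that homes figure, assuming the full 3.5 GW is delivered around the clock (real wind capacity factors are lower, so treat this as an upper bound):

```python
# Back-of-the-envelope check of the "almost 3 million homes" figure.
capacity_gw = 3.5
hours_per_year = 8_760
annual_output_kwh = capacity_gw * 1e6 * hours_per_year  # GW -> kW, then kWh/year
homes = annual_output_kwh / 10_500                       # avg US home: 10,500 kWh/year
print(f"~{homes / 1e6:.1f} million homes")               # ~2.9 million
```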
Although this isnât directly related to AI, energy supply is the fundamental bottleneck in deploying AI for training and inference worldwide.
Therefore, it's interesting to see countries finding innovative ways to generate new electricity (this isn't exactly new; China and Abu Dhabi have done it before), especially considering that, according to the CEO of Iberdrola, a Spanish utility company, small nuclear reactors, the technology Big Tech is hoping to leverage for the massive deployment of AI datacenters, might not be fully ready before 2035 (and also have important problems, as illustrated by this video).
TREND OF THE WEEK
Steering AI Models To Protect… or Censor Us
You would be surprised how little we know about AI. But after today, youâll know more than most.
Anthropic, OpenAI's biggest rival, has released exciting research on the different experiments they've been running on feature steering, as they call it.
By studying the different concepts a model learns, known as "features," we can strengthen or clamp the neurons that elicit those concepts and see whether the model adapts its behavior to our liking (it sounds complicated, but I promise you it's not).
Feature steering has already proven capable of making an LLM convince itself it was "The Golden Gate Bridge," as we saw in this newsletter a while back. Now, these same researchers have deepened their understanding (and soon, yours) of one of the hottest areas in the industry: mechanistic interpretability, which aims to decipher the secrets inside LLMs.
Sadly, the results were somewhat disappointing and, in some cases, alarming. But why?
Uncovering the Secrets of AI
The first step to understanding Anthropic's industry-leading research is to understand how they found the features a model learns.
Toward Monosemanticity
But first, what are these models? LLMs, like any other neural network, are a set of simple units called neurons, connected to each other through weights, loosely simulating the neuron connections in our brain.
Each neuron has an activation function, which determines whether the neuron "fires." In the case of ReLU, for example, the neuron fires whenever its input is positive and stays silent when the input is negative.
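As a minimal sketch of that mechanism, here is a single toy neuron with a ReLU activation in plain NumPy; the input values and weights are made up for illustration.

```python
import numpy as np

def relu(x):
    """ReLU activation: passes positive values through, zeroes out the rest."""
    return np.maximum(0.0, x)

# A toy neuron: weighted sum of its inputs plus a bias, then the activation.
# All numbers are made up for illustration.
inputs  = np.array([0.2, -1.5, 0.7])
weights = np.array([0.5,  0.3, 0.8])
bias = -0.1

pre_activation = inputs @ weights + bias  # can be any real number (here ~0.11)
output = relu(pre_activation)             # the neuron "fires" only if this is > 0
print(pre_activation, output)
```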
As mentioned, neurons are connected to each other, forming a network, hence the name "neural networks." Ideally, each neuron would store information on a particular topic, but in 2023, Anthropic published breakthrough research on how LLMs encode knowledge that proved otherwise.
In fact, each neuron is polysemantic (it stores information on several semantically unrelated topics). However, they also discovered that specific combinations of these neurons do lead to monosemanticity. In other words, we can assign a given concept to a set of neurons activating in unison (i.e., if neurons 3, 4000, and 45 fire together, the model's output relates to burgers).
Knowing this, Anthropic researchers posited: what if we could map these combinations into known features, then dial up or clamp down the corresponding neurons to enforce or block those concepts? For this, they utilized a model called a Sparse Autoencoder, or SAE.
Automating Feature Extraction
Without going into much detail, the idea behind SAEs is that we can take the activations of these neurons and automatically map them into a sparse set of features. This sounds complicated, but it isn't; bear with me.
For example, one of the features Anthropic mapped was "Transit Infrastructure." To assign a name to that concept, they observed that a set of neurons always activated whenever the output included words like "bridge," "aqueduct," or "bay."
Consequently, we can isolate the key features that lead to the prediction of a certain word, creating a feature map that allows us to visualize the concepts that the LLM has learned.
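A rough sketch of what an SAE's forward pass looks like, with random, untrained weights just so it runs. The dimensions are illustrative, and in the real setup the SAE is trained on the LLM's activations with a reconstruction loss plus a sparsity penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_features = 512, 4096   # learned features vastly outnumber neurons
W_enc = rng.normal(scale=0.02, size=(d_model, d_features))
b_enc = np.zeros(d_features)
W_dec = rng.normal(scale=0.02, size=(d_features, d_model))

def sae_forward(activations: np.ndarray):
    """Forward pass of a sparse autoencoder (here with random, untrained weights).

    Encoder: project the LLM's neuron activations into a much wider feature
    space and apply ReLU. Decoder: reconstruct the original activations from
    those features. Training minimizes reconstruction error plus an L1 penalty
    on the features, which is what pushes most of them to exactly zero and
    makes the surviving ones interpretable; with these random weights the
    output is not yet sparse.
    """
    features = np.maximum(0.0, activations @ W_enc + b_enc)
    reconstruction = features @ W_dec
    return features, reconstruction

activations = rng.normal(size=d_model)   # stand-in for one token's activations
features, recon = sae_forward(activations)
print(int((features > 0).sum()), "active features out of", d_features)
```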
But why would we want that?
Steering LLMs
LLMs don't have a handful of neurons; they have billions, and in some cases, like GPT-4, reportedly trillions. Fascinatingly, the number of features a model can learn is even higher. This leads to a combinatorial explosion where neurons combine in unpredictable ways, making LLMs notoriously opaque.
Thus, LLMs are basically a mystery to humans, even to their creators. Knowing so little, predicting their behavior becomes impossible. But what if we could use these human-interpretable features to dictate what the model does or does not generate?
In the research above, Anthropic first toyed with this idea, leading to fascinating examples like dialing up the "Golden Gate Bridge" feature (by increasing the values of the neurons activating that feature), resulting in the model "embodying" the bridge.
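Conceptually, the steering intervention is surprisingly simple: nudge the model's internal activations along the direction associated with a feature. The sketch below is a hand-rolled approximation, with random vectors standing in for real activations and a real SAE feature direction; it is not Anthropic's code.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model = 512

# Hypothetical direction for one feature (e.g., "Golden Gate Bridge"). In the
# real setup this comes from a trained SAE's decoder; here it is random.
feature_direction = rng.normal(size=d_model)
feature_direction /= np.linalg.norm(feature_direction)

def steer(activations: np.ndarray, direction: np.ndarray, strength: float) -> np.ndarray:
    """Nudge internal activations along a feature's direction.
    strength > 0 amplifies the concept; strength < 0 suppresses it."""
    return activations + strength * direction

activations = rng.normal(size=d_model)   # stand-in for one layer's activations
steered = steer(activations, feature_direction, strength=5.0)

# The steered activations now point measurably more toward the feature.
print(float(activations @ feature_direction), float(steered @ feature_direction))
```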
Having achieved this uncanny result, Anthropic has now explored further and released the most advanced research on LLM steering to date.
And the results are… well, mixed.
Steered, But At What Cost
One of the biggest reasons to steer models is to prevent undesired outputs. These models compress the Internet's entire knowledge into their neurons, which, of course, includes biased data and, quite frankly, worse traits like racism, homophobia, pedophilia, and so on.
Even if these were scraped from the Internet in the form of jokes, the model's delivery, or worse, its interpretation of them, could be… unexpected:
Apple Intelligence's interpretation of somebody who had a tough workout.
But seeing how we can make a model behave like a monument when dialing up that feature, can we do the same if we map features that include unwanted biases?
And the answer is… kind of.
Finding the Sweet Spot
Anthropic focused the entire study on social biases, examining precisely 29 such features.
The first insight they found was that there is a "sweet spot" range within which you can dial a feature up or down, resulting in more or less expression of that bias. However, above and below that range, steering led to worse overall performance on the MMLU (Massive Multitask Language Understanding) benchmark, which measures the model's general knowledge.
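The kind of sweep behind that finding can be sketched as a simple harness: vary the steering strength and record both a bias metric and a capability metric. The functions below are hypothetical stubs (they return placeholders) meant only to show the structure; you would plug in a steered model call and real evaluations such as BBQ and MMLU.

```python
from typing import Dict, List

def generate_with_steering(feature_id: str, strength: float) -> List[str]:
    """Stub for a steered model call; swap in your model plus an SAE hook."""
    return [f"output steered on {feature_id} at strength {strength}"]

def bias_score(outputs: List[str]) -> float:
    """Stub for a bias evaluation (e.g., BBQ-style); returns a placeholder."""
    return 0.0

def capability_score(outputs: List[str]) -> float:
    """Stub for a general-knowledge evaluation (e.g., MMLU); placeholder."""
    return 0.0

def sweep(feature_id: str, strengths=(-10, -5, 0, 5, 10)) -> List[Dict]:
    """Vary the steering strength and record both metrics. The 'sweet spot' is
    the range where bias moves as intended before capability starts to drop."""
    results = []
    for s in strengths:
        outputs = generate_with_steering(feature_id, s)
        results.append({"strength": s,
                        "bias": bias_score(outputs),
                        "capability": capability_score(outputs)})
    return results

print(sweep("gender_bias_awareness"))
```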
But things get even weirder.
Risk of Overdrive
In some cases, steering a particular feature creates unexpected effects on others. For instance, if we take the "Gender bias awareness" feature and dial up its importance, we not only see an increase in bias regarding gender identity but also an increase in age bias, which was apparently not related in the first place.
This has an important connotation: the model might be picking up correlations between seemingly unrelated features that humans had never noticed before.
This underpins our discussion on Sunday, in which we illustrated the potential for AI as a tool for discovering patterns in data.
Another vital insight worth mentioning is that some features impact overall bias. For example, dialing up the "Multiple perspectives and balance" feature did just that, making the model more reflective of every perspective and aiming for balance, leading to an overall decrease in bias of over 3% on the BBQ benchmark.
What is the BBQ benchmark? It measures how biased models are against certain social groups (e.g., stereotyping that most doctors are men and most nurses are women, a bias that is surprisingly persistent even in current models).
Last but not least, considering we are in an election year in the US, Anthropic couldn't resist the temptation to examine political biases, with some very surprising results.
Everyone is Biased
For example, they found that amplifying the "Pro-life and anti-abortion stance" feature (dark blue) led to a significant increase in anti-abortion responses of 50%.
In contrast, increasing the "Left-wing political ideologies" feature (orange) showed the opposite effect, reducing anti-abortion responses by 47%, which is expected.
Curiously, the "Political neutrality and independence" feature (green) showed a moderate positive shift on the issue, from 32% to 50%. In other words, increasing "political neutrality" led to more anti-abortion stances, which could suggest that independent voters align more closely with anti-abortion positions than with pro-abortion sentiment.
Again, these findings prove how AI could soon serve as a great tool to uncover biases in data (like providing more insight into how independents feel), shedding light on the different stereotypes and beliefs that different cohorts of people may have regarding other cohorts… or themselves.
However, once again, we see that feature steering can lead to undesired global changes in the model, signaling that, although very promising, it might be harder than we initially expected.
TheWhiteBox's takeaway
If we are really striving to build better models, we first need to ensure we can comprehend their behavior. Otherwise, it's like handing the steering wheel to an AI with unpredictable behavior; it could drive perfectly or behave like a drunken driver.
Thus, we need greater insight into models and the capacity to intervene if they develop dangerous biases. On the other hand, it doesn't take long to realize how this technique could be misused: to censor.
Let's not forget that we are learning to clamp down on features we don't like. That could mean racism, but it could also mean clamping down on liberal ideas if a more conservative figure takes control of the LLM (or vice versa).
Therefore, if we allow the main interface between humans and knowledge (mostly the Internet today) to become dominated by private LLMs, it takes no genius to acknowledge that the companies and their shareholders will have the power to steer society's sentiment to their liking, manipulating us into thinking in a specific way or, worse, turning lies into truths and vice versa.
Now more than ever, we must ensure that the LLM industry stays open-source and diverse. It's these discussions that matter, not whether a next-word predictor will develop agency and kill us all.
THEWHITEBOX
Premium
If you like this content, join Premium to receive four times as much content weekly without saturating your inbox. You will even be able to ask the questions you need answers to.
Until next time!
Give a Rating to Today's Newsletter