I-XRAY, Nobel Prizes, War Drones, & More

AI's Consolidation Week

In partnership with

For business inquiries, reach out to me at [email protected]

THEWHITEBOX
TLDR;

  • 🎖️ The Nobel Prize Embraces AI

  • 🔫 Anduril Wins $250 Million US Defense Contract

  • 🧐 An AI Melts Down When It Discovers It Isn’t Human

  • 🤨 OpenAI Disenchanted with Microsoft?

  • 🌥️ AI is Good for the Climate?

  • 🔮 Writer’s Impressive Palmyra X 004 Model

  • 😣 AI & The Loneliness Epidemic

  • 🤔 Cursor Founders Go on Lex Fridman

  • [TREND OF THE WEEK] Meta’s AI Glasses Can Be Used as Spyware

Learn AI in 5 Minutes a Day

AI Tool Report is one of the fastest-growing and most respected newsletters in the world, with over 550,000 readers from companies like OpenAI, Nvidia, Meta, Microsoft, and more.

Our research team spends hundreds of hours a week summarizing the latest news and finding you the best opportunities to save time and earn more using AI.

NOBEL PRIZES
The Nobel Prize Embraces AI

This week, the traditional scientific community acknowledged AI's already-crucial impact on the world.

First, John Hopfield and Geoffrey Hinton, two pioneers of neural networks, the data-compressing algorithms that underpin developments like ChatGPT, were awarded the Nobel Prize in Physics. Then, yesterday, the Nobel Prize in Chemistry was awarded to three more scientists, two of whom work at Google DeepMind (CEO Demis Hassabis and director John Jumper), for their contributions to protein folding prediction.

TheWhiteBox’s take:

As for the Physics prize, the consensus is that both Hopfield and Hinton have been instrumental to the success of Deep Learning, the branch of AI built on neural networks that explains much of the field’s progress over the last two decades.

However, their selection has been met with mixed feelings, especially from within the AI field itself. Particularly harsh was Jürgen Schmidhuber, one of the most prominent AI researchers, who directly accused the Academy of ‘rewarding plagiarism’ because, according to him, both laureates failed to cite crucial prior work by researchers like Shun-Ichi Amari.

The Chemistry prize has been far more warmly received, as it’s hard to deny that the emergence of ‘protein folding predictor’ AIs like AlphaFold has massively contributed to the field.

AlphaFold takes sequences of amino acids (the building blocks of proteins) and predicts the protein’s 3D structure. Proteins are the quintessential component of all living organisms, and understanding their structure gives us insight into their function.

Predicting this shape is complex and slow, but AlphaFold has significantly accelerated the process, revolutionizing protein structure prediction and providing crucial insights into key fields like drug discovery, biotechnology, and fundamental biology.

Are we, as humans, handing the torch of scientific discovery to AI? I don’t know, but it certainly feels that way.

WARFARE
Anduril Wins $250 Million Contract

Anduril has won a $250 million US Defense contract to deploy its drone-intercepting AI technology, known as Roadrunner, which you can see in action here.

TheWhiteBox’s take:

As you may imagine, this technology is extremely AI-heavy. But Roadrunner’s key feature, similar to SpaceX’s rockets, is reusability.

With US forces burning through millions of dollars of expensive munitions, having smart hardware that can return to base if it doesn’t intercept a drone is exceptionally cost-effective. This points to a new era of warfare: reusable, fully autonomous (AI-driven), cheap weapons in lieu of over-the-top, million-dollar weaponry and systems.

SAFETY
An AI Melts Down When It Discovers It’s Not Human

NotebookLM is a Google product and one of the few that has left very hard-to-impress AI experts speechless. It allows you to turn any piece of text, even your own, into a podcast, turning boring documentation into an interesting, engaging back-and-forth conversation between two AIs.

Now, someone sent the product a document explaining that the hosts aren’t real humans, and, well, the results are uncanny, to put it mildly, as the AIs literally break down.

Before you listen, you need not worry; it’s not that the AI has developed feelings. It’s simply replicating how a human would react upon discovering their ‘non-human’ nature. These models are brilliant at imitating humans in all our facets, but they do not embody them. At least not currently.

FRONTIER AI
OpenAI & Microsoft Disenchantment?

According to The Information, OpenAI has grown concerned about the speed at which Microsoft is spinning up data centers, which has even led it to strike deals with other infrastructure providers like Oracle, which intends to build a 2 GW data center (1 GW by mid-2026, 2 GW eventually).

As we discussed in our Google deep dive, Microsoft and OpenAI are largely behind Google in their data center buildouts, and Sam Altman, OpenAI’s CEO, is also allegedly very concerned about how fast Elon Musk is building his data center network.

TheWhiteBox’s take:

As we covered previously in this newsletter, AI has become a race for who manages to rack up the most compute power.

The almost-religious belief that “larger compute budgets will yield better models,” combined with the fact that demand for compute far outstrips supply, has all the important players obsessed with winning every possible deal they can afford.

My previous link gives much more insight into what you can build with data centers this size. Still, in a nutshell, 2 GW could power 1.6 million homes, and that’s considering that average US household consumption (around 10,500 kWh/year) is much, much higher than the global average.

This data center would consume 17.3 TWh a year, as much as the entire island of Puerto Rico consumed in 2021 (homes, businesses, and infrastructure included).
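If you want to sanity-check those figures, the back-of-the-envelope arithmetic is simple; a quick sketch, assuming the facility draws its full 2 GW around the clock:

```python
# Back-of-the-envelope check of the figures above, assuming the
# data center draws its full rated power 24/7.
power_gw = 2                       # rated power of the planned facility
hours_per_year = 8_760             # 24 h * 365 days

annual_twh = power_gw * hours_per_year / 1_000      # GWh -> TWh
print(f"Annual consumption: {annual_twh:.1f} TWh")  # ~17.5 TWh

# Average US household consumption, as cited above: ~10,500 kWh/year.
home_kwh = 10_500
homes_powered = annual_twh * 1e9 / home_kwh         # TWh -> kWh
print(f"Homes powered: {homes_powered / 1e6:.1f} million")  # ~1.7 million
```

The slightly lower figures cited above (17.3 TWh, 1.6 million homes) simply assume the facility runs a touch below full power.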

ENERGY
AI is Good for the Climate

When it comes to AI and energy, AI is considered one of the most polluting technologies of the future. While that may well be the case, this article examines the question from another perspective: the writer argues that AI will, in fact, be positive for the climate.

The logic is that clean energy is expensive, and since all Big Tech companies are desperate for more power and are very rich, they will be some of the leading backers of such projects. All four Hyperscalers are already planning to become their own utility companies (companies that generate the energy they need for their businesses), with examples like Amazon’s and Microsoft’s purchases of entire nuclear power plants to sustain their insane data center energy demands.

This, added to their commitments to achieving carbon neutrality, could lead us to a world where carbon-free energy sources are the norm. While the article could have been signed by Satya Nadella (Microsoft’s CEO) himself, it’s an interesting way to look at AI’s climate impact.

The timing of this article is particularly ironic, considering that Eric Schmidt, the legendary former Google CEO, recently stated that he believes Hyperscalers should ignore their sustainability commitments (carbon neutrality) and just build more data centers because “we aren’t going to hit climate goals anyway.”

PRODUCT
Writer’s Impressive Palmyra X 004 Model

Writer, an enterprise AI company and a sponsor of this newsletter (last week, for instance), has released Palmyra X 004, a new LLM that sets new records in function calling, a key feature of AI agents, as we have described in the past.

In simple terms, agents are LLMs that can use external tools, like functions or APIs, to perform actions on the user’s behalf.
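For intuition, here’s a minimal sketch of one function-calling round trip. The `get_weather` tool, the schema shape, and the `call_llm` stub are illustrative assumptions, not Writer’s actual API:

```python
# A tool the model is allowed to call. Everything here is illustrative.
def get_weather(city: str) -> str:
    return f"Sunny and 24°C in {city}"  # stubbed tool result

# JSON-schema-style description of the tool, the shape most LLM APIs expect.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def call_llm(prompt: str, tools: list) -> dict:
    # Stand-in for a real model call. A function-calling LLM returns a
    # structured tool request like this instead of free-form text.
    return {"tool": "get_weather", "arguments": {"city": "Madrid"}}

# The basic agent loop: the model picks a tool, we execute it, and the
# result is handed back to the model to compose its final answer.
request = call_llm("What's the weather in Madrid?", tools)
result = globals()[request["tool"]](**request["arguments"])
print(result)  # Sunny and 24°C in Madrid
```

Benchmarks like Berkeley’s essentially measure how reliably a model emits that structured request with the right function and the right arguments.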

On this front, the model doesn’t just reach state-of-the-art (SOTA) results; it obliterates rival models from OpenAI, Google, Meta, and Anthropic, reaching 78% on Berkeley’s Function Calling Benchmark (results not yet officially published), more than 20% over the previous SOTA, GPT-4 Turbo, while also posting very competitive results on Stanford’s HELM benchmarks.

Crucially, unlike the proprietary models it competes with, Palmyra X 004 can be deployed on your own GPU cluster (cloud or on-premises), allowing you to use its powerful features without putting your data at risk.

HEALTHCARE
The Loneliness Epidemic

According to the US Surgeon General, the US (and the world in general) is suffering from a loneliness epidemic. Interestingly, Generative AI chatbots like Replika or Character.ai (whose founders Google recently hired as part of a licensing deal) have been proposed as a way to deal with it.

In this thought-provoking article, the writer engages with several of these systems to assess whether they could decrease loneliness or make it worse. While I’m not a fan of portraying AI as a substitute for human interaction (nor are some of the cited researchers), the writer makes several valid points about how these systems, if well trained, could provide therapy and, while not replacing humans entirely, offer some comfort whenever you’re feeling down.

AI & PROGRAMMING
Cursor Founders Go on Lex Fridman

The founders of Cursor, the Visual Studio Code fork that allows the integration of LLMs into programming in a seamless and, quite frankly, impressive fashion, went on the Lex Fridman podcast.

Although I recommend listening to the full interview (they explain different approaches and hacks they’ve used to build probably the best GenAI consumer product in the world right now), they also had some spicy takes, including whether AI will replace programmers.

TREND OF THE WEEK
Meta’s AI Glasses Can Be Used as Spyware

Two Harvard researchers have released a video showing a rather concerning new tool they’ve developed, I-XRAY, one so dangerous that they won’t release it publicly.

It’s a tool that, using Meta’s Ray-Ban glasses, facial recognition, and Large Language Models (LLMs), can find all the public information available on the Internet about you in seconds and hand it to the glasses’ wearer.

Your name, past experiences, home address, social media handles… everything. As you’ll see, the results are pretty remarkable, and they raise important questions about how simple it is for strangers to find information about you these days.

Today, you’ll learn about the dark side of AI systems and what they can really enable, as well as a way to protect yourself from these systems for free.

Cool and Dangerous Can Go Hand in Hand

So, what is I-XRAY?

I-XRAY, A Cheap Doxxer

This system uses Meta’s Ray-Ban smart glasses with widely accessible AI to “dox” individuals in real time by identifying their faces and retrieving sensitive personal information such as names, addresses, phone numbers, and even information about relatives.

It works as follows:

  • It leverages the smart glasses’ ability to live stream video to Instagram. A computer program monitors this video feed, and a facial recognition service, PimEyes, is used to identify faces in the footage.

  • Once the system recognizes a face, it queries public databases to pull up personal details.

  • The system also searches the Internet for online articles and voter registration databases to find additional information.

  • The retrieved information is then returned to the user via a smartphone app, where an LLM parses through all the information sources, connects the dots, and assembles a profile of the person in question.
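Put together, the pipeline looks roughly like the sketch below. Every helper here is a hypothetical stand-in, since Nguyen and Ardayfio have not released their code:

```python
# Rough sketch of an I-XRAY-style pipeline. All helpers are hypothetical
# stand-ins; the actual implementation was never released.

def detect_faces(frame):
    """Find face crops in a video frame."""
    return ["face_crop_1"]  # stub

def search_faces(face_crop):
    """Reverse image search (PimEyes-style): URLs of matching photos."""
    return ["https://example.com/blog-post-with-photo"]  # stub

def scrape_sources(urls):
    """Pull the text surrounding each matched photo."""
    return ["Jane Doe, lives in Cambridge, graduated from ..."]  # stub

def ask_llm(documents):
    """Ask an LLM to merge the scraped text into a single profile."""
    return {"name": "Jane Doe", "city": "Cambridge"}  # stub

def process_stream(frames):
    for frame in frames:
        for face in detect_faces(frame):
            urls = search_faces(face)      # step 1: find photo matches
            docs = scrape_sources(urls)    # step 2: gather surrounding text
            profile = ask_llm(docs)        # step 3: connect the dots
            yield profile                  # step 4: push to the phone app

for profile in process_stream(["frame_from_instagram_livestream"]):
    print(profile)
```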

And what can this product do? Well, amazingly illegal stuff.

Making Doxxing an Art

In their demonstration, Nguyen and Ardayfio used the glasses to identify classmates, revealing their addresses and relatives’ names. 

They also approached strangers on public transportation and pretended to know them, using the data they had extracted through the system.

Of course, the students clarified that the project’s purpose was not to exploit the technology but to raise awareness of how easily available tech can be used for harmful purposes, showing that the dystopian future many fear is already achievable with current technology.

It’s important to note that such technology has been widely available to government agencies for years, so even though this hardware makes this technology accessible to everyone, it has been accessible to ‘the powers that be’ for a good while.

Importantly, this tool also shows that AI safety measures still have a long way to go.

While Meta has a privacy policy for these glasses, encouraging users to respect others’ privacy and to signal clearly when recording, the reality is that these policies are more about etiquette than enforceable rules, and there is no guarantee users will comply.

In fact, when The Verge, one of the news outlets that covered this tool, asked Meta about it, the spokesperson responded by referring to the company’s terms of service without providing any guidance or solutions to fix the issue.

But how does this tool work under the hood?

LLMs & ResNets

Three technologies underpin the success of this system.

Meta’s RayBan Glasses

Meta’s Ray-Ban smart glasses are a collaboration between Meta and Ray-Ban’s parent company, EssilorLuxottica, offering features such as:

  • Hands-free photo and video capture: Built-in cameras allow users to take photos and record short videos (up to 30 seconds) using voice commands or physical buttons.

  • Audio features: The glasses include open-ear speakers and microphones, enabling users to listen to music, take calls, and interact with Meta’s voice assistant without wearing headphones.

  • Social integration: Content captured with the glasses can be easily shared to platforms like Facebook and Instagram via the Meta View app.

In summary, the glasses provide a more seamless way to capture and share moments while incorporating smart features into everyday eyewear.

PimEyes, Facial Recognition Technology

Another important part of the puzzle is PimEyes, a facial recognition technology that takes your photo and uses image search to find ‘hits,’ other images on the open web that are most likely to be you.

Although PimEyes doesn’t provide much information on how it works, a competitor, Clearview AI, does, so let’s use them as a reference.

This technology works similarly to how most search engines today do: using vector embeddings. Although I provide a more detailed description in my personal blog for those brave enough, the idea is as follows:

As classical computers can only process numbers, we need a way to express these images as numbers (in a sense, they already are, as each pixel is represented numerically). However, we add a twist: we express each image in an encoded form, a vector embedding.

Source: Clearview AI

As seen above, to make such a transformation, we first preprocess the original image to crop all parts except the face. Then, we use embedding models, neural networks that take in the image and output a vector representation.

This is literally the same process that multimodal models like OpenAI’s GPT-4o perform upon receiving an image.

We then transform the images into vectors. But why? Besides the need to process numbers, vectors can be compared to other vectors mathematically using methods like cosine similarity or Euclidean distance.

Consequently, we can transform semantic similarity between two images (how similar images are) into a mathematical exercise; the closer and similar their vector representations are, the more similar their underlying images are, too.
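As a quick illustration of that math, here’s a toy example with made-up four-dimensional ‘embeddings’ (real face embeddings have hundreds of dimensions, but the comparison works identically):

```python
import numpy as np

# Toy 4-dimensional "embeddings". Real face embeddings have hundreds
# of dimensions, but the math is the same.
person_a_photo_1 = np.array([0.9, 0.1, 0.3, 0.7])
person_a_photo_2 = np.array([0.8, 0.2, 0.3, 0.6])   # similar attributes
person_b_photo   = np.array([0.1, 0.9, 0.8, 0.1])   # different attributes

def cosine_similarity(u, v):
    # 1.0 means identical direction (very similar); lower means less alike.
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine_similarity(person_a_photo_1, person_a_photo_2))  # ~0.99
print(cosine_similarity(person_a_photo_1, person_b_photo))    # ~0.34
```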

Source: Clearview AI

You can think of the numbers in these vectors as attributes of the underlying concept. Thus, the more similar two vectors are, the more similar the attributes of both images are, too.

Knowing this, it becomes apparent how this search works: upon seeing a new image, the system encodes (transforms) it into a vector embedding, compares it against the vectors already stored in a vector database, and extracts the ‘k’ most similar ones, which, if the system is granular enough, will most likely be other images of you on the open web.

Of course, this means the system doesn’t search the web in real time; it continuously scrapes images from the web and stores their embeddings in advance, in case you ever perform the retrieval.
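A minimal sketch of that query-time lookup, using a brute-force top-k search over a toy, pre-scraped index (real systems use approximate nearest-neighbor indexes to cope with billions of images):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend index: embeddings scraped from the web ahead of time,
# normalized so that dot products equal cosine similarities.
index = rng.normal(size=(10_000, 128))
index /= np.linalg.norm(index, axis=1, keepdims=True)
urls = [f"https://example.com/photo/{i}" for i in range(10_000)]

# Query time: embed the newly captured face the same way.
query = rng.normal(size=128)
query /= np.linalg.norm(query)

scores = index @ query                 # cosine similarity to every photo
k = 5
top_k = np.argsort(scores)[::-1][:k]   # indices of the k best hits

for i in top_k:
    print(urls[i], round(float(scores[i]), 3))
```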

In short, systems like Clearview AI and PimEyes receive an image and output a list of images of that same person published on the open web, while also scraping any metadata that came with each image (like its source page, which may contain further information about the person).

Finally, we have the LLM, the parser of all this data.

LLMs Are Great Parsers

As mentioned, the system uses the retrieved images as references to access other data (e.g., if the system retrieves an image from a blog post, it also accesses the text in the blog post).

This way, it assembles scores of information about that person. All this data is then sent to an LLM tasked with profiling the person, parsing and inferring all kinds of details from the unstructured blob of data the PimEyes step has produced.
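That profiling step might look something like the sketch below, where `complete` is a hypothetical stand-in for whichever LLM API is used:

```python
import json

def complete(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call.
    return '{"name": "Jane Doe", "city": "Cambridge", "employer": null}'

scraped_docs = [
    "Jane Doe presented at the Cambridge robotics meetup...",
    "Voter registration: J. Doe, Cambridge, MA...",
]

prompt = (
    "Extract a profile (name, city, employer) as JSON "
    "from these documents:\n\n" + "\n---\n".join(scraped_docs)
)

profile = json.loads(complete(prompt))
print(profile["name"], profile["city"])  # Jane Doe Cambridge
```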

Finally, that data is sent to the custom app, which displays all the information to the user.

And just like that, anyone with access to this simple system can walk down the street and assemble all kinds of data about people, even striking up real conversations in which the wearer pretends to know everything about a random stranger they first saw just a few seconds ago.

Scary.

With Great Power Comes Great Responsibility

Naturally, I-XRAY will not be released. Importantly, you can erase yourself from many of these image-search databases by following the steps in this short guide.

However, that won’t stop many AI doomers from using these examples to convince the world that AI is dangerous. Yet the same tools that help bad actors are also crucial to fighting those very cohorts.

The very same AIs that help agencies identify nefarious actors engaged in criminal activities (PimEyes), the LLMs used by millions around the world to be more productive, and the seemingly harmless glasses that help people share their lives with the world (Meta’s Ray-Ban glasses) can be combined to build use cases that walk the thin edge of legality.

But do these edge cases force us to ban the technology entirely? Should we prosecute the creators of PimEyes or the LLM if someone uses them to dox people, as California’s vetoed SB-1047 intended?

Of course not!

The solution is much easier. While we should set a higher bar for powerful companies like Meta to ensure they release safe technology, the solution is ultimately to ban the application.

Ban facial recognition in Meta’s Ray-Ban glasses, not the glasses themselves.

Closing Thoughts

This week will be remembered as the week the scientific world bent its knee to pay its respects to AI, with no fewer than five new Nobel laureates linked to the AI field.

In his remarks after the win, Hinton, one of the awardees, largely ignored the award itself and focused on the need to put stringent guardrails on AI, a timely message given what a handful of Harvard students have done with open-source AI and a pair of consumer smart glasses.

Also, we continue to see AI’s inevitable transition from a purely pen-and-paper technology to one where billions of dollars, war drones, the climate, giga-scale data centers, and its effects on loneliness are discussed more than the technology itself.

AI is moving into mainstream territory: a technology that, while already instrumental to society yet hidden from most of it, is now top of mind for everyone, and likely forever.

THEWHITEBOX
Premium

If you like this content, join Premium to receive four times as much content weekly without saturating your inbox. You will even be able to ask the questions you need answers to.

Until next time!