
Anthropic's Claude Can Now Ingest All Six Star Wars Films At Once

šŸ TheTechOasis šŸ

🤖 This Week's AI Insight 🤖

We've grown accustomed to continuous breakthroughs in AI over the last few months.

But not to record-breaking announcements that raise the bar tenfold over the previous one, which is precisely what Anthropic, OpenAI's biggest rival, has done with the newest version of Claude, its ChatGPT competitor.

Soon, you'll be turning hours of reading and searching through text… into seconds.

A Chatbot focused on harmlessness

Despite the countless benefits Generative AI is bringing to the world, as with anything in technology, it comes with a trade-off.

With GenAI, we've opened a window for this technology to generate content, like text or images, which is awesome.

But the problem is that GenAI models lack any awareness of what's 'good' or 'bad'. They are trained on a humongous amount of raw data in almost every form you can imagine, data that in many cases carries debatable biases and dubious content.

Sadly, since these models get better as they get bigger, the incentive to simply feed them any text you can find, no matter the content, is particularly enticing.

This has led to several cases where these models have acted in a sketchy, almost vile way toward their users, as we saw with Bing, forcing Microsoft to act.

Robot in the style of Hebru Brantley, Diffusion model

To prevent this, these base models are fine-tuned with human feedback, a technique dubbed Reinforcement Learning from Human Feedback (RLHF), to create instruction-tuned models that respond, almost every time, according to the guidelines those humans gave them.

Examples of such models include ChatGPT and Bard.

But as we saw with Bing (based on ChatGPT), this solution isn't perfect.

For that reason, Anthropic decided to take it a step further with a concept called Constitutional AI, a new training paradigm with a single objective: creating the first truly harmless chatbot.

And this takes us to Claude.

Allegedly harmless and now super powerful

The biggest difference between Claude and other chatbots is that it was trained against a Constitution.

But what does that mean?

Using documents like the Universal Declaration of Human Rights, Anthropic taught the model not only to predict the next word in a sentence very well (like any other language model), but also to weigh each and every response against a constitution that determines what it can and cannot say.

But what could really make all the difference for Claude is that, this week, Anthropic has announced that it has become 10 times more powerful.

Specifically, Anthropic has increased Claude's context window from 9k tokens to 100k, an unprecedented number with far-reaching implications.

Let me digress.

It's all about tokens

Despite what many people may tell you, LLMs don't predict the next word in a sequence… at least not literally.

They predict the next token, which usually represents between three and four characters. A token may be a whole word, or a word may be composed of several tokens.

For reference, 100 tokens represent around 75 words.
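To make the idea concrete, here is a minimal sketch of subword tokenization using a greedy longest-match strategy over a tiny hand-picked vocabulary. This is purely illustrative: real models like Claude use learned BPE-style vocabularies with tens of thousands of entries, and `TOY_VOCAB` is an invented example.

```python
# Toy greedy longest-match subword tokenizer (illustrative only --
# real LLMs use learned vocabularies, not this hand-picked one).
TOY_VOCAB = {"un", "believ", "able", "token", "s", " ", "!", "iz", "ation"}

def tokenize(text, vocab=TOY_VOCAB):
    """Split text into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible match at position i first.
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # Unknown character: emit it as its own token.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("unbelievable tokenization!"))
# -> ['un', 'believ', 'able', ' ', 'token', 'iz', 'ation', '!']
```

Notice how two words plus punctuation become eight tokens: common fragments like "token" survive whole, while rarer words get split into pieces.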

To do so, the model breaks the text you give it into tokens and performs a series of matrix calculations, a concept known as self-attention, that combine all the different tokens in the text to learn how each token impacts all the rest.

That way, the model "learns" the meaning and context of the text and can then proceed to respond.
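The mixing step above can be sketched in a few lines of plain Python. This is a deliberately simplified single-head attention: it uses the token vectors directly as queries, keys, and values, omitting the learned projection matrices a real transformer applies, and the toy 2-d embeddings are invented for illustration.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(embeddings):
    """Naive single-head self-attention (no learned projections):
    each token's output is a weighted mix of ALL tokens' vectors,
    weighted by how strongly the vectors align (dot product)."""
    d = len(embeddings[0])
    out = []
    for q in embeddings:                       # every token attends...
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in embeddings]         # ...to every token: O(n^2)
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, embeddings))
                 for j in range(d)]
        out.append(mixed)
    return out

# Three toy 2-d token embeddings:
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in self_attention(tokens):
    print([round(x, 3) for x in row])
```

The key point for what follows: the inner loop pairs every token with every other token, which is where the quadratic cost comes from.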

The issue is that this process is computationally intensive for the model.

To be precise, the computation requirements grow quadratically with the input length. So the longer the text you give it, known as the context window, the more expensive the model is to run, both during training and at inference time.
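A quick back-of-envelope calculation shows why quadratic growth hurts so much. The function below only captures the relative cost of the attention step itself, ignoring every other part of the model:

```python
def attention_cost_ratio(n_old, n_new):
    """Relative cost of the O(n^2) attention step when the context
    window grows from n_old to n_new tokens."""
    return (n_new / n_old) ** 2

# Doubling the window quadruples the attention cost:
print(attention_cost_ratio(2_000, 4_000))           # 4.0
# Jumping from an 8k window to 100k multiplies it by ~156x:
print(round(attention_cost_ratio(8_000, 100_000)))  # 156
```

So a 12.5x longer input costs roughly 156x more attention compute, which is why context windows stayed small for so long.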

This forced researchers to considerably limit the allowed input size of these models, to a standard between 2k and 8k tokens, the latter of which is around 6,000 words.

This is okay for chatting, but what if you want to summarize an entire book?

Not a chance… until now.

"All the knowledge in the world", Diffusion model

The Great Gatsby in seconds

I'll get to the point.

The newest version of Claude can ingest, in one go, 100,000 tokens, or around 75,000 words.

I know those numbers may not mean much on their own, so let me give you some references:

  • That's around the length of Mary Shelley's Frankenstein

  • A human would take around 5 hours of nonstop reading to get through that many words

  • It's enough to include all the dialogue from 8 Star Wars films… combined
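Those references are easy to sanity-check with the "100 tokens ≈ 75 words" rule from earlier. The reading speed of ~250 words per minute below is an assumed typical figure for adult silent reading, not something from Anthropic's announcement:

```python
def tokens_to_words(n_tokens, words_per_100_tokens=75):
    """Convert a token count to words via the ~75 words per 100 tokens rule."""
    return n_tokens * words_per_100_tokens // 100

def reading_hours(n_words, words_per_minute=250):
    """Hours of nonstop reading, assuming ~250 wpm (a common estimate)."""
    return n_words / words_per_minute / 60

words = tokens_to_words(100_000)
print(words)                 # 75000 words
print(reading_hours(words))  # 5.0 hours
```

Both bullets check out: 100k tokens is roughly 75,000 words, or about five hours of nonstop reading.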

Now, think about a chatbot that can, in a matter of seconds, give you the power to ask it anything you want about that text.

This is the ultimate solution for lawyers, research scientists, and basically anyone or any company that needs to go through lots of text at once.

If you want to see this Claude version live, you can check Assembly AI's awesome video.


The technology we thought was decades away is now here… one token at a time.

Key AI concepts you've learned from reading this article:

- Large Language Model tokens

- Constitutional AI

- Context Window of LLMs

👾 Top AI news for the week 👾

🗣 Stability AI launches its first text-to-animation model

👍 Google expands TensorFlow's capabilities

👨🏼‍🏫 Deep dive into Claude's constitutional AI

🎼 Google releases MusicLM, its text-to-music model