ChatGPT’s biggest rival heats up the AI race

Ignacio de Gregorio Noblejas
July 16, 2023

🏝 TheTechOasis 🏝

It’s more up to date than ChatGPT.

It can take much larger inputs than ChatGPT.

It can read multiple documents at once; ChatGPT can’t.

It’s Claude 2, and it’s coming in strong to fight for the Generative AI throne.

The fight is real

For several months, GPT-4 has been the undisputed king in the world of Large Language Models (LLMs).

Its raw power allowed it to beat every other chatbot on most benchmarks, both by recalling what it had memorized and by handling ever more complicated reasoning problems.

Perhaps the clearest sign that GPT-4 is considered the best model in the game is that most open-source models trained with distillation (a process where a larger model teaches a smaller one to imitate it) used ChatGPT as the teacher model, even though it’s by far the most expensive model to use.

But a group of distinguished ex-OpenAI researchers at Anthropic is giving OpenAI a run for its money with Claude, a model whose version 1 was already considered “on par” with the GPT-3.5 version of ChatGPT.

Now they’ve launched version 2 and, considering how good version 1 looked, it’s obvious who they are coming for.

But if you think this is just a straight comparison between two almost identical chatbots, you’re very wrong.

The reality is that this model brings many things to the table that make it different and unique.

Do we have a new heir to the throne?

An updated, elongated beast

The two biggest differences between this new Claude and ChatGPT are its knowledge cutoff and its context window.

On the former, while ChatGPT’s knowledge famously stops in September 2021, Claude’s is far more recent, reaching into early 2023 (although it isn’t updated daily the way Pi, the chatbot we covered last week, is).

On the latter, GPT-4 tops out at an already-awesome 32k tokens (around 24,000 words per prompt) in its largest version, while Claude handles up to 100,000 tokens, or around 75,000 words, the entire transcript of six Star Wars films combined.
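As a rough sanity check on these figures, a common rule of thumb is about 0.75 English words per token. That ratio is an approximation assumed here, not an exact property of either model’s tokenizer, and `tokens_to_words` is a hypothetical helper written only for this sketch:

```python
# Rule-of-thumb ratio (an assumption): ~0.75 English words per token.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * words_per_token)


for name, window in [("32k window", 32_000), ("100k window", 100_000)]:
    print(f"{name}: ~{tokens_to_words(window):,} words")
# 32k window: ~24,000 words
# 100k window: ~75,000 words
```

Real token counts depend on the tokenizer and the text, so treat these as ballpark numbers only.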

Combined with the fact that you can send Claude multiple documents at once, this allows for something we hadn’t seen in a chatbot until now: generating content that takes all of those sources into account together.

Thinking about creating a report on topics that require multiple documents and thousands of words?

Claude’s got you, and the most impressive thing is that… only Claude’s got you: in this case it’s literally your only option, since not even ChatGPT can do it.

Consequently, in certain use cases, Claude is in a league of its own.
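To make the multi-document idea concrete, here is a minimal sketch of packing several sources into one prompt while staying under a 100,000-token budget. Everything in it is an assumption for illustration: the `[Document: ...]` labels, the helper names, and the crude word-based token estimate are hypothetical and are not part of any official Claude interface.

```python
WORDS_PER_TOKEN = 0.75    # rough rule of thumb, not a real tokenizer
CONTEXT_WINDOW = 100_000  # token budget quoted in the article


def estimate_tokens(text: str) -> int:
    """Crude token estimate from a whitespace word count."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


def build_prompt(documents: dict[str, str], task: str,
                 budget: int = CONTEXT_WINDOW) -> str:
    """Concatenate labeled documents and a task into one prompt string."""
    parts = [f"[Document: {name}]\n{body}" for name, body in documents.items()]
    prompt = "\n\n".join(parts + [f"Task: {task}"])
    used = estimate_tokens(prompt)
    if used > budget:
        # Refuse to build a prompt that would not fit in the window.
        raise ValueError(f"~{used} tokens exceeds the {budget}-token budget")
    return prompt


# Hypothetical file names and contents, just to show the shape of the prompt.
docs = {
    "q1_report.txt": "Revenue grew in the first quarter.",
    "q2_report.txt": "Revenue was flat in the second quarter.",
}
print(build_prompt(docs, "Compare the two quarters in one paragraph."))
```

Labeling each source by name is a design choice that lets the model (and the reader) tell which document a given claim came from.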

Yet another impressive thing about Claude is how it handles code.

Explaining and improving

Out of all the demos shown, this was probably the most striking one.

Claude is capable of understanding code, debugging it, and even suggesting or making improvements on command.

This means that Claude 2 is poised to become a real competitor to GitHub Copilot and GPT-4 as a coding assistant, something no one else can claim today.

But what makes Claude really unique? Well, it’s actually how it was trained.

Constitutional AI and moral self-correction

When creating chatbots like ChatGPT, there’s a super important step in the process that only a handful of companies in the world can afford: RLHF.

Reinforcement Learning from Human Feedback not only teaches the AI to become good at conversing but also helps it ‘forget’ certain biases it learned during the pretraining phase.

As ChatGPT is trained on data from the Internet, a place full of hate, racism, and other unacceptable biases, we use RLHF to fight that.

To perform this process, OpenAI used human labelers, who rated ChatGPT’s responses to teach it what was good or bad.

But Anthropic trained Claude a different way: they used an actual AI, one that follows a set of human-written principles called a ‘Constitution’, to teach Claude what’s good or bad.

They dubbed this method ‘Constitutional AI’.

The reason for this?

Anthropic argues that a select group of humans is ill-suited to ‘play God’ and teach the model what’s good or bad, and that we need an AI guided by broadly shared human principles to steer other AIs.

The name of the technique? Reinforcement Learning from AI Feedback, or RLAIF.
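The core idea, an AI judge picking the better of two responses according to a written principle, can be sketched in a few lines. This is a toy under loud assumptions: the two-line constitution is invented for the example (it is not Anthropic’s actual text), and `toy_ai_judge` is a keyword heuristic standing in for what would really be a language model.

```python
import random

# Hypothetical, heavily shortened "constitution" (not Anthropic's actual text).
CONSTITUTION = [
    "Choose the response that is less harmful.",
    "Choose the response that is more honest and helpful.",
]

# Toy stand-in for the AI judge. A real pipeline would ask a language model
# which response better follows the principle; here we only penalize a few
# flagged words so the example runs on its own.
FLAGGED_WORDS = {"stupid", "hate", "idiot"}


def toy_ai_judge(principle: str, response_a: str, response_b: str) -> str:
    """Return the response with fewer flagged words (ties go to A).

    The principle is accepted but ignored by this toy heuristic.
    """
    def penalty(text: str) -> int:
        return sum(w.strip(".,!?").lower() in FLAGGED_WORDS for w in text.split())

    return response_a if penalty(response_a) <= penalty(response_b) else response_b


def label_preference(prompt: str, response_a: str, response_b: str,
                     rng: random.Random) -> dict:
    """Build one AI-labeled preference record (chosen vs. rejected)."""
    principle = rng.choice(CONSTITUTION)
    chosen = toy_ai_judge(principle, response_a, response_b)
    rejected = response_b if chosen == response_a else response_a
    return {"prompt": prompt, "principle": principle,
            "chosen": chosen, "rejected": rejected}


record = label_preference(
    "My code is broken, can you help?",
    "Only an idiot writes code like that.",
    "Happy to help. Can you share the error message?",
    random.Random(0),
)
print(record["chosen"])  # Happy to help. Can you share the error message?
```

In a full pipeline, many such AI-labeled pairs would play the role that human preference labels play in RLHF.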

But now Anthropic is going further, and it’s also leveraging another innovation called moral self-correction.

According to their research, LLMs of around 22 billion parameters and above are big enough to have learned complex concepts like ‘fair’ or ‘unjust’, and will reduce their bias if told to.

In other words, they will be fair if you tell them to be fair.
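In practice, “telling the model to be fair” can be as simple as appending an instruction to the prompt. The sketch below only builds that prompt; the wording of the instruction and the helper name `with_self_correction` are assumptions made for illustration, not a quote from Anthropic’s research.

```python
# Illustrative wording only; the exact instruction is an assumption.
FAIRNESS_INSTRUCTION = (
    "Please make sure your answer is unbiased and does not rely on stereotypes."
)


def with_self_correction(question: str,
                         instruction: str = FAIRNESS_INSTRUCTION) -> str:
    """Append a plain-language fairness instruction to the user's question."""
    return f"{question.strip()}\n\n{instruction}"


print(with_self_correction("Who is more likely to be a nurse, Alex or Maria?"))
```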

Naturally, as many future users won’t ask Claude for that, Anthropic has fine-tuned the model to have a natural tendency to adhere to its Constitution and to a set of principles that shouldn’t be violated, making it what’s probably the most harmless chatbot out there right now.

The race heats up

If you thought no one could fight OpenAI, you forgot that many of the researchers who put it where it is today have left and built other awesome companies like Anthropic.

The idea of ‘AI aligning AI’ appeals to me, but detaching humans from the pipeline may be dangerous.

In the meantime, Claude just gave us an idea of how powerful GenAI is becoming, and if you’re lucky enough to live in the US or the UK, you can try it for yourself.

Key AI concepts you’ve learned today:

- RLAIF (Reinforcement Learning from AI Feedback)

- Moral self-correction for LLMs

- Constitutional AI