
Behold the Power of Smaug & AI's Holy Grail, Neurosymbolic Systems

šŸ TheTechOasis šŸ

Breaking down the most advanced AI systems in the world to prepare you for your future.

5-minute weekly reads.

TLDR:

  • AI Research of the Week: Behold the Power of Smaug

  • Leaders: AI's Holy Grail, Neurosymbolic Systems, is Here

🤩 AI Research of the week 🤩

"My armor is like tenfold shields, my teeth are swords, my claws spears, the shock of my tail is a thunderbolt, my wings a hurricane, and my breath death!"

Those were the words of the main antagonist of The Hobbit films, the huge dragon that nearly killed our dear protagonist Bilbo while scaring him half to death.

Now, the open-source community might have felt the same dread when Abacus.AI released Smaug, a fine-tuned version of Alibaba's Qwen-72B that is unequivocally the new open-source king, and the first open-source model in history to reach an average of 80 points across benchmarks.

It's also undeniable proof that we finally have a canonical method to close the gap between open-source and proprietary models: DPO, which we talked about a few weeks ago.

Let's unpack the secrets of the king under the mountain.

A Thing of Beauty

You don't have to take my word that this is the best open-source model; it already ranks as such on HuggingFace's famous Open LLM Leaderboard.

And although Alibaba is the one granting the license (it's not a truly open license like Apache 2.0), it's fairly permissive: you only have to request permission for commercial use if your product has more than 100 million monthly active users.

I'm sure you can accept that.

So, what is Smaug?

Smaug is a fine-tuned, DPO-aligned version of Qwen-72B, a great Large Language Model (LLM) from the Alibaba group that is heavily influenced by Meta's LLaMA 2 model.
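If you want to poke at the dragon yourself, here is a minimal sketch using the Hugging Face transformers library. The Hub model ID below is my assumption of where the weights live, and a 72B model needs several high-memory GPUs (or a quantized variant) to run:

```python
# Rough sketch of loading and prompting Smaug locally. Assumptions:
# the Hub model ID below, and enough GPU memory to shard a 72B model
# (device_map="auto" requires the accelerate package).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Smaug-72B-v0.1"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard weights across available GPUs
    torch_dtype="auto",  # keep the checkpoint's native precision
)

prompt = "In one paragraph, explain what model alignment is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```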

As you probably know, LLaMA is a family of pre-trained generative transformers (Meta's equivalent of OpenAI's GPT or Google's Gemini) widely considered the best open-source base models in the industry.

In fact, LLaMA is the seminal model sitting behind most open-source models today.

Moreover, as Mark Zuckerberg himself acknowledged, Meta is already training the 3.0 version, which means we could be weeks away from the next big leap in open-source capabilities.

In the meantime, nothing beats Smaug in the world of open source, and, incredibly, it manages to beat some of the best proprietary models in the world too, including Google's Gemini Pro (in some benchmarks) and Mistral's Mistral-Medium, according to Abacus.AI founder Bindu Reddy.

Were these claims to hold at scale, Smaug would be a top-5 foundation model overall, almost at the level of GPT-4 and Gemini Ultra despite being more than ten times smaller!

But what are these benchmarks?

  • Winogrande: tests the AI's understanding of language ambiguity requiring common sense.

  • TruthfulQA: assesses the AI's ability to provide accurate, non-misleading answers.

  • GSM8K: challenges the AI with grade-school math problems.

  • ARC: measures the AI's reasoning over complex, knowledge-intensive questions.

  • HellaSwag: evaluates the AI's prediction of plausible scenario endings using commonsense reasoning.

  • MMLU: tests the AI's understanding across a wide range of subjects and tasks.
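These six are exactly the benchmarks averaged by the Open LLM Leaderboard, and you can reproduce the scores yourself with EleutherAI's lm-evaluation-harness. A minimal sketch, assuming the lm-eval package's simple_evaluate entry point (pip install lm-eval) and the same Smaug model ID as above:

```python
# Sketch: scoring a model on two of the six leaderboard benchmarks
# with EleutherAI's lm-evaluation-harness. The model ID is assumed.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=abacusai/Smaug-72B-v0.1,dtype=auto",
    tasks=["winogrande", "gsm8k"],  # swap in any of the six
    batch_size=4,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```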

But how is this possible? Why are we seeing such massive improvements in open-source models?

And the answer might be, in fact, DPO.

The Alignment Breakthrough

Just a few months ago, the general consensus was that these models' inductive biases (the assumptions a model makes when facing new data in order to predict accurately) were not that good, and that alignment therefore required extensive human support.

However, researchers have realized that aligning models (maximizing their utility and safety) is much easier than we thought.

The method, Direct Preference Optimization (DPO), introduced by Rafailov et al., has become the canonical approach to model alignment, with recent examples like Mixtral 8×7B or Smaug.

In layman's terms: while the standard approach, Reinforcement Learning from Human Feedback (RLHF), implied creating a separate reward model (the teacher) to tutor our model on 'how to behave', we have realized that, just like humans, models can teach themselves.

How does DPO work?

The last phase of training LLMs, the alignment phase, involves helping the model learn a decision policy that chooses better responses by maximizing a human-preference score.

DPO's breakthrough is that we can find this decision-making policy without explicitly training a reward model to calculate those scores, making the process swift and cheap compared to the traditional approach.
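Concretely, the DPO paper shows the alignment step reduces to a simple classification-style loss over preference pairs, where π_θ is the model being trained, π_ref is a frozen reference copy, y_w and y_l are the preferred and rejected responses to a prompt x, σ is the sigmoid, and β controls how far the policy may drift from the reference:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}} \left[
    \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)
  \right]
```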

In other words, it's like teaching a student math: instead of having them grind through endless quizzes while a teacher grades each one, repeating until they score perfectly, you teach them the key concepts directly, and they learn to perform well without sitting hundreds of tests.

And just as paying a teacher's salary is far more expensive than letting the student self-learn, DPO's economic requirements are orders of magnitude smaller. Researchers can therefore train models for longer and with more data, which in turn makes the models much better.
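To make the "no teacher" point concrete, here is a minimal PyTorch sketch of that loss. The four inputs are per-example summed log-probabilities that you would compute with one forward pass each through the policy and the frozen reference model:

```python
# Minimal PyTorch sketch of the DPO loss (Rafailov et al., 2023).
# Each input is a 1-D tensor of summed log-probabilities of a response
# under the trainable policy or the frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Implicit "rewards": how much more likely the policy makes each
    # response compared to the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Widen the margin between preferred and rejected responses,
    # with no separate reward model in sight.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up log-probabilities for a batch of 3 pairs:
loss = dpo_loss(torch.tensor([-12.0, -8.5, -20.1]),
                torch.tensor([-14.2, -9.9, -19.8]),
                torch.tensor([-13.0, -8.7, -20.5]),
                torch.tensor([-13.1, -9.0, -20.0]))
print(loss)
```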

For further analysis, you can read my explanation of what DPO and RLHF are here. For a more hands-on explanation, I deeply recommend this recent AI tutorial by HuggingFace and DeepLearning.ai.

The Window is Closing for OpenAI

As I said when I first wrote about this topic, Smaug is proof that the biggest competitive moat closed-source models had, the incredibly capital-intensive alignment phase, is now gone.

And although most people are focusing on the fact that it's the first open-source model to reach an average of 80 across popular benchmarks, it's much more than that.

It's proof that DPO is the real deal, and that the world is about to see an explosion of super-powerful open-source models.

In other words, unless OpenAI et al. announce a new technological breakthrough, their moat is slowly but steadily closing.

🫡 Key contributions 🫡

  • Smaug is the first open-source model to reach an average score of 80 across the most popular benchmarks, making it the best open-source model overall.

  • It was aligned using DPO, making this method the canonical approach to LLM alignment.

👾 Best news of the week 👾

🥇 Leaders 🥇

AI's Holy Grail, Neurosymbolic Systems, is Here

Few topics in the world of AI are more controversial.

For years, even decades, researchers around the world have argued over whether Deep Learning, the methods and architectures that have given us ChatGPT, Stable Diffusion, and Gemini, is enough to take us to AGI.

For that, an almost mystical and misunderstood concept, neurosymbolic AI, was thrown about as the key to unlocking AI's real power, but our lack of understanding of, and proof for, its benefits meant it was dismissed as wishful thinking; even today, almost no information about it is available.

But, to the surprise of many, these systems are finally here, and they are insanely powerful.

Thus, today we are delving into this hot topic, one that, according to some sources, frontier research labs like OpenAI or Google might be working on as we speak, probably spurred by the fact that other competitors are already bringing these models to market.

But first, what's the issue with standard Deep Learning?

The Perception and Cognition Gap

Although Deep Learning is clearly a respected field today, substantiated by the fact that our state-of-the-art vision and language systems are entirely based on it, it wasn't always that way.

Due to our poor understanding of neural networks, a fact that is still unequivocally true, and the lack of computational resources to prove they worked, the scientists working on them were seen as complete fools.

Decades later, these 'fools' are highly respected figures, almost seen as gods, like Yoshua Bengio, Geoffrey Hinton, or Yann LeCun.

But despite the impressive results we have seen on tasks like generating language or classifying objects in an image, the most powerful systems in the world seem utterly stupid and unable to handle tasks that humans regard as 'simple'.

And this is due to the perception/cognition gap.

Great perceptrons, terrible learners

Few theories have been more influential on AI than Daniel Kahneman's two systems of cognition.

Daniel Kahneman's theory distinguishes between two types of thinking:

  • System 1 is fast, instinctive, and emotional;

  • System 2 is slower, more deliberative, and logical.

Thus, System 1 handles everyday decisions effortlessly, while System 2 takes over for more complex reasoning tasks.

In later work, he also added perception as a 'System 0' mode, but the principles stay the same.

But why is this relevant?

Well, although low-level perception and intuition (Systems 0 and 1) have been pretty much solved with AI, to the point that AI systems are already flat-out better than us (at least at language processing and vision), the same doesn't apply to System 2.

Deep Learning models have really bad reasoning capabilities, which explains why they take so long to learn, or simply cannot learn, very simple reasoning tasks.

For instance, while AI systems are already used in manufacturing pipelines to detect the smallest defects in the products being built, our most advanced robotic systems have only recently learned to fold a t-shirt, something 6-year-old kids learn in no time and perform effortlessly.

Put simply, when it comes to performing what humans do unconsciously, like using our senses or taking intuitive actions, Deep Learning seems like a viable option; but when it comes to complex, 'conscious', and deliberate tasks or problem-solving exercises, it fails miserably.

And here's where neurosymbolic AI systems come in to solve this.
