Meet ‘Smaug-72B’: The new king of open-source AI

A new open-source language model has claimed the throne of the best in the world, according to the latest rankings from Hugging Face, one of the leading platforms for natural language processing (NLP) research and applications.

The model, called “Smaug-72B,” was released publicly today by the startup Abacus AI, which helps enterprises solve difficult problems in the artificial intelligence and machine learning space. Smaug-72B is technically a fine-tuned version of “Qwen-72B,” another powerful language model that was released just a few months ago by Qwen, a team of researchers at Alibaba Group.

What’s most noteworthy about today’s release is that Smaug-72B outperforms GPT-3.5 and Mistral Medium, two of the most advanced proprietary large language models developed by OpenAI and Mistral, respectively, in several of the most popular benchmarks. Smaug-72B also surpasses Qwen-72B, the model from which it was derived, by a significant margin in many of these evaluations.

According to the Hugging Face Open LLM leaderboard, which measures the performance of open-source language models on a variety of natural language understanding and generation tasks, Smaug-72B is now the first and only open-source model to have an average score of more than 80 across all major LLM evaluations.
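For readers curious how such an average is formed, here is a minimal Python sketch. It assumes the leaderboard score is a plain unweighted mean over the six benchmarks the Open LLM leaderboard used at the time (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande and GSM8K). The function name and the scores below are hypothetical placeholders for illustration, not Smaug-72B's published results.

```python
from statistics import mean

# Benchmarks assumed to make up the leaderboard average (see lead-in).
BENCHMARKS = ("ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K")


def leaderboard_average(scores: dict[str, float]) -> float:
    """Return the unweighted mean of the six benchmark scores (0-100 scale)."""
    missing = [name for name in BENCHMARKS if name not in scores]
    if missing:
        raise ValueError(f"missing benchmark scores: {missing}")
    return mean(scores[name] for name in BENCHMARKS)


# Hypothetical placeholder scores, NOT any model's real results.
example_scores = {
    "ARC": 75.0,
    "HellaSwag": 90.0,
    "MMLU": 77.0,
    "TruthfulQA": 76.0,
    "Winogrande": 85.0,
    "GSM8K": 80.0,
}

average = leaderboard_average(example_scores)
print(f"average = {average:.2f}, clears 80: {average > 80}")
```

Because the mean is unweighted, a strong result on one benchmark such as GSM8K lifts the overall average as much as an equal gain on any other.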

While the model still falls short of the 90-100 point average indicative of human-level performance, its birth signals that open source AI may soon rival Big Tech’s capabilities, which have long been shrouded in secrecy. In short, the release of Smaug-72B could fundamentally reshape how AI progress unfolds, tapping the ingenuity of those beyond just a handful of wealthy companies.

The open-source advantage

“Smaug-72B from Abacus AI is available now on Hugging Face, is on top of the LLM leaderboard, and is the first model with an average score of 80!! In other words, it is the world’s best open-source foundation model,” said Abacus AI CEO Bindu Reddy in a post on X.com.

“Our next goal will be to publish these techniques as a research paper and apply them to some of the best Mistral Models, including miqu (a 70B fine-tune of LLama-2),” she added. “The techniques we used specifically target reasoning and math skills, which explains the high GSM8K scores! Our upcoming paper will explain more.”

Smaug-72B – The Best Open Source Model In The World – Top of Hugging LLM LeaderBoard!!

— Bindu Reddy (@bindureddy) February 6, 2024

With today’s release, Smaug-72B becomes the first open-source model to achieve an average score of 80 on the Hugging Face Open LLM leaderboard, which is considered a remarkable feat in the field of natural language processing and open source AI.

Smaug-72B excels especially in reasoning and math tasks, thanks to the techniques that Abacus AI applied to the fine-tuning process. These techniques, which will be detailed in an upcoming research paper, target the weaknesses of large language models and enhance their capabilities.

Smaug-72B is not the only open-source language model that has made headlines recently. Qwen, the group behind Qwen-72B, also released Qwen 1.5, a suite of small yet powerful language models ranging from 0.5B to 72B parameters.

Qwen 1.5 outperforms popular proprietary models like Mistral-Medium and GPT-3.5, has a 32k context length, and works with various tools and platforms for fast and local inference. Qwen also open-sourced Qwen-VL-Max, a new large vision language model that rivals Gemini Ultra and GPT-4V, two of the most advanced proprietary vision language models developed by Google and OpenAI, respectively.

Implications for the future of AI

The emergence of Smaug-72B and Qwen 1.5 has sparked a lot of excitement and debate in the AI community and beyond. Many experts and influencers have praised the achievements of Abacus AI and Qwen, and expressed their admiration for their contribution to open-source AI.

“It’s hard to believe that less than a year ago, we all got excited about models like Dolly,” said Sahar Mor, an AI influencer and analyst, in a LinkedIn post, marveling at the progress of open source models in the past year.

Smaug-72B and Qwen 1.5 are currently available on Hugging Face, where anyone can download, use, and modify them. Abacus AI and Qwen have also announced plans to submit their models to the LMSYS human evaluation leaderboard, a new benchmark that evaluates the performance of language models on human-like tasks and scenarios. Both have hinted at future projects and goals as well, which include creating more open-source models and applying them to various domains and applications.

Smaug-72B and Qwen 1.5 are just the latest examples of the rapid and remarkable evolution of open-source AI this year. They represent a new wave of AI innovation and democratization that is challenging the dominance of the big tech companies and opening new possibilities and opportunities for everyone. Only time will tell how long Smaug-72B will remain at the top of the Hugging Face leaderboard, but for now, it’s safe to say that open source AI is having a big moment to start the year.
