Marketing AI Institute | Blog

Claude 3 Opus Challenges GPT-4 As Most Powerful Model

Written by Mike Kaput | Apr 2, 2024 1:16:22 PM

Claude 3 Opus now competes with GPT-4 as the most powerful AI model available.

That's according to a major leaderboard from LMSYS.org called the Chatbot Arena. The Arena uses both crowdsourcing and Elo ratings to evaluate AI model capabilities.

(And it's considered a trustworthy source by many industry experts.)
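For anyone curious what an Elo rating actually does, here is a minimal sketch of the textbook Elo update after one head-to-head comparison. It is not the Arena's actual scoring code. The function names, the starting ratings, and the K-factor of 32 are all assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie.
    k (assumed to be 32 here) controls how far one result moves the ratings.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta


# Hypothetical example: two models start level, and a voter prefers model A.
new_a, new_b = update_elo(1000.0, 1000.0, score_a=1.0)
print(new_a, new_b)
```

The takeaway: beating a higher-rated model moves your rating more than beating a lower-rated one, which is part of why the top spots can change hands quickly.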

And, as of publication, Claude 3 Opus sits in the #1 spot. These rankings shift constantly: at times while I was writing this, Claude 3 Opus held #1 on its own, and at other times it was tied with GPT-4.

What does this mean for anyone using these models in business?

More importantly, how can any of us keep up?!

I got the answers from Marketing AI Institute founder/CEO Paul Roetzer on Episode 90 of The Artificial Intelligence Show.

The achievement is impressive…

We’ve been pleased with Claude 3 Opus’ performance in our experiments. It’s quickly become a go-to model for our work. 

But it’s just the most recent winner in an ever-shifting competition.

“It is so fast-moving,” says Roetzer. “I think it shows just how dynamic this space is.”

We get new models almost every week. And models get updates and changes that alter their capabilities.

So the point here isn’t that Claude 3 Opus is objectively the best model and always will be…

It’s that you need to test all the major models. Because it’s very easy to find that your favorite model has been outclassed overnight.

…But get ready for what’s coming next

Despite the (well-deserved) accolades for Claude 3 Opus, Roetzer says we need to keep one important point in mind:

GPT-4 came out in March 2023. Its training ended six months before release.

“So everyone is now beating a model that is 18 months old,” he says.

“OpenAI has not stopped building and training a more powerful model. So if all these companies are just catching up to GPT-4 and just beating them slightly, what in the world is GPT-5 going to look like?”

So, definitely test all the current models as much as possible. 

But also understand:

Significantly better models and capabilities are coming. Potentially very soon.