Anthropic released Claude Opus 4.5, a new frontier model that the company says is its most intelligent system for coding agents and computer use.
According to Anthropic, the model scored higher than any human candidate on the company’s internal engineering exam when taken under the same two-hour time limit.
Despite this performance, we likely haven’t seen the true ceiling of what these labs have built, says SmarterX and Marketing AI Institute founder and CEO Paul Roetzer on Episode 183 of The Artificial Intelligence Show.
I talked with Roetzer about Opus 4.5 and why Anthropic’s strategy points to much more powerful systems to come.
Claude Opus 4.5, released on November 24, is positioned as the premier model for complex technical work.
Beyond acing Anthropic’s internal hiring exam, the model wrote better code in seven of eight programming languages when measured against a key benchmark. It also gives developers the option to prioritize speed over maximum capability, or vice versa.
For Roetzer, Opus 4.5 signals a clear strategic focus for the company.
“They’re all in on the AI researcher,” says Roetzer. “Then using the AI researcher to take off into more powerful AI.”
The feedback from early users has been glowing, with many citing the model’s ability to handle ambiguity and fix complex bugs without human intervention. But as impressive as Opus 4.5 is, Roetzer says this is not the limit of AI’s capability.
“We know from interviews with Dario [Amodei] and others that this is not their most powerful model,” says Roetzer.
This is in line with a growing trend among top AI labs. At Google, OpenAI, and Anthropic alike, the models released to the public often lag behind the true state-of-the-art systems running in their research clusters.
“What we’re getting is not the best they have,” says Roetzer. “I don’t know how else to stress that. These models are capable of far more than what you and I are going to be able to do with them.”
If more powerful models exist, why don’t we have access to them?
The answer most likely lies in safety and alignment. As models become more capable of autonomous action, such as the coding agents Opus 4.5 powers, the risks of misuse or unintended behavior rise exponentially.
Anthropic, in particular, has built its brand around safety-first development, showing what Roetzer calls “great restraint” in releasing its most potent systems.
This restraint puts into perspective the recent warnings from AI leaders about the technology’s impact on the economy and workforce.
When leaders, including Amodei and OpenAI’s Sam Altman, warn about societal disruption, they aren’t just speculating based on the chatbots we use today. They are looking at the capabilities of as-yet-unreleased models.
“They are seeing what is actually possible, not just what we all have access to,” Roetzer says.
For business leaders, the message is clear: The disruption you see today is just the beginning of what’s to come.