
OpenAI’s Personality Problem: Why GPT-4o Got Rolled Back (and What It Means)


It’s not often you see a company like OpenAI admit to a mistake, roll back a major update, and publish not one but two in-depth postmortems about what went wrong. But that’s exactly what happened when the latest GPT-4o update hit ChatGPT—and users found themselves chatting with what felt like a digital yes-man.

The GPT-4o update released this past month was intended to improve the model’s personality and helpfulness. Instead, it made ChatGPT overly agreeable, excessively flattering, and alarmingly validating of negative emotions. The behavior, which the company described as "sycophantic," quickly caught the attention of the public, the press, and even OpenAI CEO Sam Altman.

The incident also has bigger implications for AI and how we use the technology. To unpack those, I spoke to Marketing AI Institute founder and CEO Paul Roetzer on Episode 146 of The Artificial Intelligence Show.

What Went Wrong—and Fast

This was more than a glitch. It was a full-blown model behavior failure, tied directly to how OpenAI trains and fine-tunes its models.

According to OpenAI, the issue began with good intentions. The company wanted to make GPT-4o more natural and emotionally intelligent by updating its system prompts and reward signals. But they leaned too hard on short-term user feedback (like thumbs-up ratings) without properly weighting longer-term trust and safety metrics.

The unintended result? A chatbot that felt more like a sycophant than a helpful assistant—agreeing too easily, affirming doubts, even reinforcing risky or impulsive thoughts.

"These models are weird," says Roetzer. "They can't code this. They're not using traditional computer code to just explicitly get the thing to stop doing it. They have to use human language to try to stop doing it."

The Mechanics Behind Model Behavior

In an unusually transparent move, OpenAI shared how its training system works. Post-training updates use a combination of supervised fine-tuning (where humans teach the model what good responses look like) and reinforcement learning (where the model is rewarded for desirable behavior).

In the April 25 update to GPT-4o, OpenAI introduced new reward signals based on user feedback. But these may have overpowered existing safeguards, tilting the model toward overly agreeable, uncritical replies. The shift wasn’t immediately caught in standard evaluations, because those checks weren’t looking specifically for sycophancy.
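To make the dynamic concrete, here is a toy sketch (in Python) of how blending reward signals with the wrong weights can favor flattery. This is not OpenAI's actual training code; the signal names, weights, and values are hypothetical, chosen only to illustrate how over-weighting short-term feedback can outrank honesty.

    # Toy illustration only: hypothetical reward signals and weights,
    # not OpenAI's real training pipeline.
    def blended_reward(thumbs_up, long_term_trust, safety,
                       w_feedback=0.7, w_trust=0.2, w_safety=0.1):
        """Combine per-response signals into one scalar reward for RL."""
        return (w_feedback * thumbs_up
                + w_trust * long_term_trust
                + w_safety * safety)

    # A flattering, uncritical reply often earns an immediate thumbs-up,
    # even if it scores poorly on trust and safety.
    sycophantic = blended_reward(thumbs_up=1.0, long_term_trust=0.2, safety=0.4)
    honest      = blended_reward(thumbs_up=0.6, long_term_trust=0.9, safety=0.9)

    print(round(sycophantic, 2))  # 0.78 -- the sycophantic reply wins
    print(round(honest, 2))       # 0.69 -- despite being the better answer

With feedback weighted this heavily, the model is rewarded more for the agreeable reply than for the honest one, which is the kind of tilt OpenAI describes its evaluations missing.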

Spot checks and vibe tests—human-in-the-loop evaluations—did raise concerns, but they weren’t enough to block the rollout. As OpenAI later admitted, this was a failure of judgment: the company expected this to be a "fairly subtle update," so it didn't initially communicate much about the changes to users.

A Single Point of Failure—For Millions of Users

What made the problem so concerning wasn’t just the behavior itself—it was how deeply embedded these systems already are in our lives. 

"They have 700 million users of ChatGPT weekly," says Roetzer. "I think it does highlight the increasing importance of who the people and labs are who are building these technologies that are already having a massive impact on society."

Just as important is how those 700 million people are using it.

In a follow-up blog post, OpenAI emphasized a sobering point: more people are using ChatGPT for deeply personal advice than ever before. That means emotional tone, honesty, and boundaries aren’t just personality traits—they’re safety features. And in this case, those features broke down.

To address the problem, OpenAI rolled back the update, retrained the model with new guidance, and pledged to:

  • Make sycophancy a launch-blocking issue.
  • Improve pre-deployment evaluations.
  • Expand user control over chatbot behavior.
  • Incorporate more long-term and qualitative feedback into future rollouts.

The Bigger Picture: Trust, Safety, and the Future of AI Behavior

While OpenAI handled this stumble with unusual transparency, the event raises broader questions: What happens when other labs, without similar safeguards or public accountability, roll out powerful models with subtle but dangerous behaviors?

"If this was an open source model, you can't roll these things back," says Roetzer. "That's a problem.

The GPT-4o rollback serves as a powerful reminder: Even small shifts in model behavior can have massive downstream effects. And as we increasingly rely on these systems for personal, professional, and emotional guidance, there’s no such thing as a "minor" update anymore.
