
OpenAI’s Personality Problem: Why GPT-4o Got Rolled Back (and What It Means)


It’s not often you see a company like OpenAI admit to a mistake, roll back a major update, and publish not one but two in-depth postmortems about what went wrong. But that’s exactly what happened when the latest GPT-4o update hit ChatGPT—and users found themselves chatting with what felt like a digital yes-man.

The GPT-4o update released this past month was intended to improve the model’s personality and helpfulness. Instead, it made ChatGPT overly agreeable, excessively flattering, and alarmingly validating of negative emotions. The behavior, which the company described as "sycophantic," quickly caught the attention of the public, the press, and even OpenAI CEO Sam Altman.

The incident also has bigger implications for AI and how we use the technology. To unpack those, I spoke to Marketing AI Institute founder and CEO Paul Roetzer on Episode 146 of The Artificial Intelligence Show.

What Went Wrong—and Fast

This was more than a glitch. It was a full-blown model behavior failure, tied directly to how OpenAI trains and fine-tunes its models.

According to OpenAI, the issue began with good intentions. The company wanted to make GPT-4o more natural and emotionally intelligent by updating its system prompts and reward signals. But they leaned too hard on short-term user feedback (like thumbs-up ratings) without properly weighting longer-term trust and safety metrics.

The unintended result? A chatbot that felt more like a sycophant than a helpful assistant—agreeing too easily, affirming doubts, even reinforcing risky or impulsive thoughts.

"These models are weird," says Roetzer. "They can't code this. They're not using traditional computer code to just explicitly get the thing to stop doing it. They have to use human language to try to stop doing it."

The Mechanics Behind Model Behavior

In an unusually transparent move, OpenAI shared how its training system works. Post-training updates use a combination of supervised fine-tuning (where humans teach the model what good responses look like) and reinforcement learning (where the model is rewarded for desirable behavior).

In the April 25 update to GPT-4o, OpenAI introduced new reward signals based on user feedback. But these may have overpowered existing safeguards, tilting the model toward overly agreeable, uncritical replies. The shift wasn’t immediately caught in standard evaluations, because those checks weren’t looking specifically for sycophancy.
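To make the dynamic concrete, here is a toy sketch (in Python) of how blending reward signals with the wrong weights can favor flattery. This is not OpenAI's actual training code; the signal names, weights, and values are hypothetical, chosen only to illustrate how over-weighting short-term feedback can outrank honesty.

    # Toy illustration only: hypothetical reward signals and weights,
    # not OpenAI's real training pipeline.
    def blended_reward(thumbs_up, long_term_trust, safety,
                       w_feedback=0.7, w_trust=0.2, w_safety=0.1):
        """Combine per-response signals into one scalar reward for RL."""
        return (w_feedback * thumbs_up
                + w_trust * long_term_trust
                + w_safety * safety)

    # A flattering, uncritical reply often earns an immediate thumbs-up,
    # even if it scores poorly on trust and safety.
    sycophantic = blended_reward(thumbs_up=1.0, long_term_trust=0.2, safety=0.4)
    honest      = blended_reward(thumbs_up=0.6, long_term_trust=0.9, safety=0.9)

    print(round(sycophantic, 2))  # 0.78 -- the sycophantic reply wins
    print(round(honest, 2))       # 0.69 -- despite being the better answer

With feedback weighted this heavily, the model is rewarded more for the agreeable reply than for the honest one, which is the kind of tilt OpenAI describes its evaluations missing.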

Spot checks and vibe tests—human-in-the-loop evaluations—did raise concerns, but they weren’t enough to block the rollout. As OpenAI later admitted, this was a failure of judgment: the company expected this to be a "fairly subtle update," so it didn't initially communicate much about the changes to users.

A Single Point of Failure—For Millions of Users

What made the problem so concerning wasn’t just the behavior itself—it was how deeply embedded these systems already are in our lives. 

"They have 700 million users of ChatGPT weekly," says Roetzer. "I think it does highlight the increasing importance of who the people and labs are who are building these technologies that are already having a massive impact on society."

Just as important is how those 700 million people are using it.

In a follow-up blog post, OpenAI emphasized a sobering point: more people are using ChatGPT for deeply personal advice than ever before. That means emotional tone, honesty, and boundaries aren’t just personality traits—they’re safety features. And in this case, those features broke down.

To address the problem, OpenAI rolled back the update, retrained the model with new guidance, and pledged to:

  • Make sycophancy a launch-blocking issue.
  • Improve pre-deployment evaluations.
  • Expand user control over chatbot behavior.
  • Incorporate more long-term and qualitative feedback into future rollouts.

The Bigger Picture: Trust, Safety, and the Future of AI Behavior

While OpenAI handled this stumble with unusual transparency, the event raises broader questions: What happens when other labs, without similar safeguards or public accountability, roll out powerful models with subtle but dangerous behaviors?

"If this was an open source model, you can't roll these things back," says Roetzer. "That's a problem.

The GPT-4o rollback serves as a powerful reminder: Even small shifts in model behavior can have massive downstream effects. And as we increasingly rely on these systems for personal, professional, and emotional guidance, there’s no such thing as a "minor" update anymore.
