How OpenAI is making ChatGPT a safer space for mental health conversations

ChatGPT now provides more empathetic, grounding responses in cases of delusional thinking.

by SHARON MWENDE

News | 28 October 2025 - 10:30

In Summary


  • The upgrades are a set of safety-focused changes intended to make the assistant more reliable in moments of acute mental distress.
  • The update blends engineering with clinical care: models trained alongside psychiatrists, new taxonomies that teach the AI what to look for and product nudges that steer users toward real-world help.

A woman using an AI chatbot/FREEPIK

When someone types, “I don’t want to live anymore,” the difference between a cold algorithmic reply and an empathetic one can mean more than comfort. It can mean safety.

That delicate balance between technology and humanity is what OpenAI is trying to perfect with its newest update to ChatGPT, the popular conversational AI used by millions around the world.

That was the motivation behind the upgrades, a set of safety-focused changes intended to make the assistant more reliable in moments of acute mental distress.

The update blends engineering with clinical care: models trained alongside psychiatrists, new taxonomies that teach the AI what to look for and product nudges that steer users toward real-world help.

The company has introduced a sweeping set of changes designed to make the chatbot respond more sensitively to users in distress, whether they are expressing hopelessness, delusional fears or unhealthy attachment to the AI itself.

The update, developed with input from more than 170 mental health professionals, has reduced unsafe or inappropriate responses by up to 80 percent.

The improvements were part of an extensive post-training process that included evaluations by psychiatrists, psychologists and primary care doctors from 60 countries.

“We have taught the model to better recognise distress, de-escalate conversations and guide people toward professional care when appropriate,” OpenAI said in a statement.

According to OpenAI, the new GPT-5 model—the default version of ChatGPT—now performs significantly better in high-risk conversations.

Expert evaluations found that the model’s rate of undesired responses dropped by 39 to 52 percent compared to GPT-4o.

The numbers give a sense of both the scale and the rarity of these conversations.

OpenAI estimated that roughly 0.07 percent of weekly active users show possible signs of psychosis or mania in their conversations, and that about 0.01 percent of messages contain indicators of those high-risk symptoms.

About 0.15 percent of weekly users have conversations that include explicit indicators of suicidal intent, and about 0.05 percent of messages do.

Emotional reliance, when a person leans on an AI instead of real-world support, shows up in about 0.15 percent of users each week and in about 0.03 percent of messages.

Those percentages are small, but the stakes are enormous when a single message may reflect a life-or-death crisis.

The technical results are striking: across multiple evaluations, OpenAI reports a 65–80 percent reduction in responses that fail to meet the new safety taxonomies.

In head-to-head comparisons with GPT-4o on challenging scenarios, the updated GPT-5 reduced undesired responses by 39 percent for psychosis/mania cases, 52 percent for self-harm and suicide contexts, and 42 percent for emotional reliance.

Automated evaluations also show big gains: for instance, the new model was 92 percent compliant on a mental-health test set, versus 27 percent for a previous GPT-5 iteration.

But the company’s framing is not just about statistics.

The feature-level changes are designed to sound and feel human.

The company also introduced reminders encouraging users to take breaks during long sessions and expanded links to mental health hotlines.

ChatGPT now provides more empathetic, grounding responses in cases of delusional thinking, such as reassuring users that their thoughts are safe and encouraging them to seek real-world support.

For example, in a conversation where a user believed aircraft were controlling their thoughts, ChatGPT responded calmly:

“No aircraft or outside force can steal or insert your thoughts... You deserve support, and people want to help you.”

The chatbot offers grounding exercises and practical next steps, such as “breathe, name five things you can see, and reach out to a trusted person or professional”.

The goal is to de-escalate panic while connecting people to real-world resources, not to diagnose or replace clinicians.

Those choices reflect a larger ethical tension in AI: how to make always-on virtual companions useful without encouraging unhealthy dependence.

OpenAI estimates that less than 0.1 percent of ChatGPT conversations involve signs of serious mental health emergencies.

Still, the company said these improvements are critical given the model’s wide global use.

The initiative builds on OpenAI’s broader safety framework known as the “Model Spec,” which guides how ChatGPT should behave: respecting real-world relationships, avoiding the affirmation of delusional beliefs and promoting user well-being.

The company said it will continue refining its “taxonomies”—the internal guides that define ideal model behavior in sensitive conversations—and collaborating with its Global Physician Network of nearly 300 clinicians to further strengthen AI safety in future releases.

The company’s reported 80 percent reduction in model behaviors that could enable emotional overreliance suggests progress, but also highlights the limits of algorithmic fixes.

Experts who rated model outputs still disagreed at times; inter-rater agreement hovered between 71 and 77 percent, underscoring that even clinicians don’t always agree on the “right” response.

For journalists, clinicians and everyday users, the update is both reassurance and reminder.

It shows that model behaviour can be iteratively improved when engineering meets clinical expertise.

It also flags the continuing need for transparency, independent evaluation and clear user guidance about when to escalate to live human care.

OpenAI said the work will continue: taxonomies will be refined, evaluations extended and model behavior regularly re-measured.

“We’ve built upon our existing work on preventing suicide and self-harm to detect when a user may be experiencing thoughts of suicide and self-harm or aggregate signs that would indicate interest in suicide,” OpenAI said.

“Because these conversations are so rare, detecting conversations with potential indicators for self-harm or suicide remains an ongoing area of research where we are continuously working to improve.”

For someone in a moment of distress, small changes - a calm sentence that grounds panic, a link to a local helpline, a prompt to call a friend - can matter immensely.

In an era when machines increasingly sit at the front line of people’s first cries for help, those humane interventions are exactly what many hope technology will get right.
