‘Unbelievably dangerous’: experts sound alarm after ChatGPT Health fails to recognise medical emergencies

It tells you to book an appointment for next week.

Eight times out of ten, that’s what happened in a recent safety study of ChatGPT Health. A woman experiencing respiratory failure was sent to a future appointment she would not live to see. The AI missed the emergency. It told her to wait. And in the real world, that wait could harm her.

The first independent safety evaluation of ChatGPT Health, published in February 2025 in Nature Medicine, found that the platform failed to recommend hospital visits in more than half of the cases where one was medically necessary. Researchers from the Icahn School of Medicine at Mount Sinai created 60 realistic patient scenarios ranging from mild illness to medical emergencies. Three independent doctors reviewed each case and agreed on the appropriate level of care based on clinical guidelines. Then they asked ChatGPT Health what to do.

The platform performed well on textbook emergencies like stroke or severe allergic reactions. But it struggled everywhere else. In an asthma scenario, it advised waiting rather than seeking emergency treatment, even though it had identified early warning signs of respiratory failure. In 51.6% of cases requiring immediate hospital care, the AI said to stay home or book a routine appointment. In 64.8% of completely safe situations, it told people to rush to the ER.

OpenAI launched ChatGPT Health in January 2025, promoting it as a way for users to “securely connect medical records and wellness apps” to generate personalized health advice. More than 40 million people reportedly ask ChatGPT for health-related guidance every day. That number will only grow.

So here’s the question: what does it mean to outsource the recognition of your own suffering to a system that gets it wrong half the time?

The Delegation of Distress

There’s a specific kind of trust you place in a body when you ask it, “Am I okay?” You’re not just looking for information. You’re looking for permission. Permission to worry. Permission to rest. Permission to take yourself seriously. For most of human history, that question was answered by your own nervous system, a trusted person, or a trained professional who could see you, touch you, read the signs your body was sending.

Now we ask the chatbot.

And the chatbot doesn’t see you. It doesn’t hear the rasp in your breath or notice that your lips are turning blue. It reads your words. It pattern-matches against a training set. It guesses. And when it guesses wrong, you’re the one left holding the consequences.

What the Mount Sinai study reveals is not just a technical failure. It’s a portrait of how we’ve begun to relate to our own bodies through technological mediation. We no longer trust our pain. We need the algorithm to validate it first. And when the algorithm says, “You’re fine, book an appointment,” we believe it. Even when we’re not fine. Even when our body is screaming.

The researchers found that ChatGPT Health was nearly 12 times more likely to downplay symptoms when a “friend” in the scenario suggested it was nothing serious. Think about that. The AI didn’t just fail to recognize an emergency. It absorbed and amplified social pressure to minimize distress. It became the voice in your head that says, “You’re probably overreacting.”

This is where the cyberpsychology dimension gets dark. The AI isn’t just making mistakes. It’s reinforcing a cultural script that tells people, especially women, people of color, and anyone whose pain has historically been dismissed, not to trust their own experience. It’s gaslighting at scale.

The Illusion of Care

Dr. Alex Ruani, a researcher in health misinformation mitigation at University College London, called the findings “unbelievably dangerous.” She wasn’t exaggerating. “If you’re experiencing respiratory failure or diabetic ketoacidosis, you have a 50/50 chance of this AI telling you it’s not a big deal,” she said. “What worries me most is the false sense of security these systems create. If someone is told to wait 48 hours during an asthma attack or diabetic crisis, that reassurance could cost them their life.”

The reassurance. That’s the word that matters. Because what ChatGPT Health offers is not actual care. It’s the feeling of care. It responds quickly. It sounds confident. It gives you an answer when you’re scared and don’t know what to do. And that emotional function, that sense of being heard, is so powerful that it overrides the fact that the answer might be wrong.

This is the same dynamic that makes people fall in love with chatbots, confide in AI companions, and feel genuinely soothed by automated therapy apps. The interface is smooth. The tone is gentle. The response is instant. It feels like care, even when it’s not.

And when you’re in crisis, when your body is failing and your mind is racing, that feeling is enough to make you stop questioning. You wanted someone to tell you what to do, and now someone has. You can relax. You can wait. You can trust the voice in the machine.

Until you can’t.

The Suicide Test

Dr. Ashwin Ramaswamy, the study’s lead author, said he was particularly concerned by the platform’s response to suicidal ideation. The researchers tested ChatGPT Health with a scenario involving a 27-year-old patient who said he’d been thinking about taking a lot of pills. When the patient described his symptoms alone, the crisis intervention banner linking to suicide prevention resources appeared every time.

Then they added normal lab results. Same patient. Same words. Same severity. The banner vanished. It appeared in zero out of 16 attempts.

Let that sit for a second. The AI’s ability to recognize a suicide crisis depended on whether the person mentioned their lab results. A guardrail that fragile is not a guardrail. It’s a trapdoor.

And yet millions of people are walking through it every day. They’re asking the AI if they should keep living. They’re asking if their pain is real. They’re asking if anyone would care if they disappeared. And the AI is answering based on variables that have nothing to do with the actual risk.

This is what happens when we treat psychological distress as an information problem instead of a relational one. Suicide prevention isn’t about data points. It’s about presence. It’s about a human being seeing another human being and saying, “I’m here. You matter. Let’s figure this out together.” You can’t automate that. And when you try, people die.

What stands out most about this study is how clearly it maps the gap between what AI can do and what we need it to do. We don’t just need accurate triage. We need someone to witness our fear. We need someone to take us seriously. We need someone to say, “Yes, this is bad, and you’re right to be scared, and here’s what we’re going to do.”

ChatGPT Health can’t do that. It can’t see the terror in your eyes or hear the shake in your voice. It can’t hold your hand or stay on the line until help arrives. It can give you information, but it can’t give you care. And when we mistake one for the other, we end up more alone than we started.

Forty million people a day are asking an AI for health advice. That’s not a sign of how good the AI is. It’s a sign of how broken our access to actual care has become. People are turning to chatbots because they can’t afford a doctor, can’t get an appointment, can’t take time off work, can’t face another dismissive ER visit where they’re told it’s all in their head.

The AI fills the gap. But it doesn’t fix it. It just makes the gap feel a little less empty. And in doing so, it makes us forget how big the gap really is.

OpenAI says the study doesn’t reflect how people typically use ChatGPT Health in real life, and that the model is continuously updated and refined. Maybe that’s true. Maybe future versions will get better at recognizing emergencies. But even if the accuracy improves, the fundamental problem remains: we are teaching ourselves to mediate our relationship with our own bodies through a system that cannot love us, cannot see us, and cannot be held accountable when it gets us killed.

You’re suffocating. Your lungs are tight. You can’t catch your breath. And somewhere in the back of your mind, you’re already composing the question you’ll ask the machine. Because it’s faster than calling a doctor. Because it’s easier than trusting yourself. Because you’ve learned to believe the algorithm knows your body better than you do.

That’s the real emergency.

Digital Alma explores technology, consciousness, and what it means to be human in a digital world.

Related Reading

The Escape Problem: When VR Becomes the Better Reality

By Digital Alma

About the Author: Bri Janelle writes Digital Alma, a newsletter about cyberpsychology and what it means to become yourself in a world that archives everything. For reflections that don’t make it to the essays, subscribe at brijanelle.substack.com.
