    “Dr. Google” had its issues. Can ChatGPT Health do better?

    By ProfitlyAI, January 22, 2026


    Some doctors see LLMs as a boon for medical literacy. The average patient might struggle to navigate the vast landscape of online medical information, and in particular to distinguish high-quality sources from polished but factually dubious websites, but LLMs can do that job for them, at least in theory. Treating patients who had searched for their symptoms on Google required “a lot of attacking patient anxiety [and] reducing misinformation,” says Marc Succi, an associate professor at Harvard Medical School and a practicing radiologist. But now, he says, “you see patients with a college education, a high school education, asking questions at the level of something an early med student might ask.”

    The release of ChatGPT Health, and Anthropic’s subsequent announcement of new health integrations for Claude, indicate that the AI giants are increasingly willing to acknowledge and encourage health-related uses of their models. Such uses certainly come with risks, given LLMs’ well-documented tendencies to agree with users and to make up information rather than admit ignorance.

    But those risks also have to be weighed against potential benefits. There’s an analogy here to autonomous vehicles: When policymakers consider whether to allow Waymo in their city, the key metric is not whether its cars are ever involved in accidents but whether they cause less harm than the status quo of relying on human drivers. If Dr. ChatGPT is an improvement over Dr. Google (and early evidence suggests it may be), it could lessen the enormous burden of medical misinformation and unnecessary health anxiety that the internet has created.

    Pinning down the effectiveness of a chatbot such as ChatGPT or Claude for consumer health, however, is tricky. “It’s exceedingly difficult to evaluate an open-ended chatbot,” says Danielle Bitterman, the clinical lead for data science and AI at the Mass General Brigham health-care system. Large language models score well on medical licensing examinations, but those exams use multiple-choice questions that don’t reflect how people use chatbots to look up medical information.

    Sirisha Rambhatla, an assistant professor of management science and engineering at the University of Waterloo, tried to close that gap by evaluating how GPT-4o responded to licensing exam questions when it didn’t have access to a list of possible answers. Medical experts who evaluated the responses scored only about half of them as entirely correct. But multiple-choice exam questions are designed to be tricky enough that the answer options don’t give them away entirely, and they’re still a fairly distant approximation of the sort of thing that a user would type into ChatGPT.

    A different study, which tested GPT-4o on more realistic prompts submitted by human volunteers, found that it answered medical questions correctly about 85% of the time. When I spoke with Amulya Yadav, an associate professor at Pennsylvania State University who runs the Responsible AI for Social Emancipation Lab and led the study, he made it clear that he wasn’t personally a fan of patient-facing medical LLMs. But he freely admits that, technically speaking, they seem up to the task. After all, he says, human doctors misdiagnose patients 10% to 15% of the time. “If I look at it dispassionately, it seems that the world is gonna change, whether I like it or not,” he says.

    For people seeking medical information online, Yadav says, LLMs do seem to be a better choice than Google. Succi, the radiologist, also concluded that LLMs can be a better alternative to web search when he compared GPT-4’s responses to questions about common chronic medical conditions with the information presented in Google’s knowledge panel, the information box that sometimes appears on the right side of the search results.

    Since Yadav’s and Succi’s studies appeared online, in the first half of 2025, OpenAI has released several new versions of GPT, and it’s reasonable to expect that GPT-5.2 would perform even better than its predecessors. But the studies do have important limitations: They focus on straightforward, factual questions, and they examine only brief interactions between users and chatbots or web search tools. Some of the weaknesses of LLMs, most notably their sycophancy and tendency to hallucinate, might be more likely to rear their heads in more extensive conversations and with people who are dealing with more complex problems. Reeva Lederman, a professor at the University of Melbourne who studies technology and health, notes that patients who don’t like the diagnosis or treatment recommendations they receive from a doctor might seek out another opinion from an LLM, and the LLM, if it’s sycophantic, might encourage them to reject their doctor’s advice.

    Some studies have found that LLMs will hallucinate and exhibit sycophancy in response to health-related prompts. For example, one study showed that GPT-4 and GPT-4o will happily accept and run with incorrect drug information included in a user’s question. In another, GPT-4o frequently concocted definitions for fake syndromes and lab tests mentioned in the user’s prompt. Given the abundance of medically dubious diagnoses and treatments floating around the internet, these patterns of LLM behavior could contribute to the spread of medical misinformation, particularly if people see LLMs as trustworthy.

    OpenAI has reported that the GPT-5 series of models is markedly less sycophantic and less prone to hallucination than its predecessors, so the results of these studies might not apply to ChatGPT Health. The company also evaluated the model that powers ChatGPT Health on its responses to health-specific questions, using its publicly available HealthBench benchmark. HealthBench rewards models that express uncertainty when appropriate, recommend that users seek medical attention when necessary, and refrain from causing users unnecessary stress by telling them their condition is more serious than it really is. It’s reasonable to assume that the model underlying ChatGPT Health exhibited those behaviors in testing, though Bitterman notes that some of the prompts in HealthBench were generated by LLMs, not users, which could limit how well the benchmark translates to the real world.


