The AI will see you now as ChatGPT scores doctor’s exam pass

ChatGPT achieved passing scores on a major medical exam that usually requires years of training, laying the groundwork for AI doctors and surgeons

OpenAI’s ChatGPT has met the 60% passing threshold for the United States Medical Licensing Exam taken by all medical students and physicians-in-training, offering a glimpse of the bot’s potential to work in medical education and clinical practice.

The chatbot provided responses that were internally coherent and contained frequent insights, according to a study published yesterday in the open-access journal PLOS Digital Health by Tiffany Kung, Victor Tseng, and colleagues at AnsibleHealth.

Kung and her team tested the performance of ChatGPT on the USMLE, a set of three highly standardised and regulated exams required to practise medicine in the United States. The exams assess knowledge across a range of medical disciplines, including biochemistry, diagnostic reasoning, and bioethics.

After removing image-based questions, the researchers tested ChatGPT on 350 of the 376 publicly available questions from the June 2022 USMLE release. With indeterminate responses excluded, ChatGPT scored between 52.4% and 75.0% across the three USMLE exams; the passing threshold is around 60% each year.

ChatGPT also showed 94.6% concordance across its responses and produced at least one new, non-obvious and clinically valid insight in 88.9% of its answers. It also outperformed PubMedGPT, a model trained exclusively on biomedical domain literature, which scored 50.8% on an older dataset of USMLE-style questions.

ChatGPT's future as clinical advisor

Although the small input size limited the scope of the analysis, the authors believe their findings offer a glimpse of ChatGPT's potential to improve medical education and, eventually, clinical practice. For instance, clinicians at AnsibleHealth already use ChatGPT to reword jargon-heavy reports so that patients can understand them more easily.

The authors say reaching the passing score for this notoriously tricky expert exam, and doing so without any human reinforcement, marks a notable milestone in clinical AI.

"ChatGPT contributed substantially to the writing of [our] manuscript," says Kung. "We interacted with ChatGPT much like a colleague, asking it to synthesise, simplify, and offer counterpoints to drafts in progress. All of the co-authors valued ChatGPT's input."
