OpenAI’s ChatGPT has met the 60% passing threshold for the United States Medical Licensing Exam taken by all medical students and physicians-in-training, offering a glimpse of the bot’s potential to work in medical education and clinical practice.
The chatbot's responses were internally coherent and frequently insightful, according to a study published yesterday in the open-access journal PLOS Digital Health by Tiffany Kung, Victor Tseng, and colleagues at AnsibleHealth.
Kung and her team tested ChatGPT on the USMLE, a set of three highly standardised and regulated exams that candidates must pass to practise medicine in the United States. The exams assess knowledge across a range of medical disciplines, including biochemistry, diagnostic reasoning, and bioethics.
After removing image-based questions, the researchers tested ChatGPT on 350 of the 376 publicly available questions from the June 2022 USMLE release. With indeterminate responses excluded, ChatGPT scored between 52.4% and 75.0% across the three USMLE exams; the passing threshold is around 60% each year.
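For readers who want to check the arithmetic behind these figures, the short Python sketch below works through it. It uses only the numbers quoted above; the variable and function names are illustrative and do not come from the study, and the 60% threshold is the approximate value cited in this article rather than an official cut-off.

```python
# Figures quoted in the article; all names here are illustrative assumptions.
TOTAL_QUESTIONS = 376        # publicly available questions, June 2022 release
TESTED_QUESTIONS = 350       # questions remaining after image-based items were removed
PASS_THRESHOLD = 60.0        # approximate passing threshold, in percent
SCORE_RANGE = (52.4, 75.0)   # lowest and highest reported scores, in percent


def share_tested(tested: int, total: int) -> float:
    """Return the percentage of released questions that were actually tested."""
    return 100.0 * tested / total


def meets_threshold(score: float, threshold: float = PASS_THRESHOLD) -> bool:
    """Return True if a percentage score is at or above the threshold."""
    return score >= threshold


removed = TOTAL_QUESTIONS - TESTED_QUESTIONS
print(f"Questions removed: {removed}")
print(f"Share of release tested: {share_tested(TESTED_QUESTIONS, TOTAL_QUESTIONS):.1f}%")
for score in SCORE_RANGE:
    status = "at or above" if meets_threshold(score) else "below"
    print(f"{score:.1f}% is {status} the ~{PASS_THRESHOLD:.0f}% threshold")
```

The comparison shows that the reported range straddles the threshold: the lowest score falls below roughly 60%, while the highest clears it comfortably.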
ChatGPT also showed 94.6% concordance across its responses and produced at least one new, non-obvious and clinically valid insight in 88.9% of its answers. It also outperformed PubMedGPT, a model trained exclusively on biomedical domain literature, which scored 50.8% on an older dataset of USMLE-style questions.
ChatGPT's future as clinical advisor
Despite the limited scope of analysis due to the small input size, the authors believe their findings offer a glimpse of ChatGPT's potential to improve medical education and, eventually, clinical practice. For instance, clinicians at AnsibleHealth already use ChatGPT to reword jargon-heavy reports for better patient understanding.
The authors say reaching the passing score for this notoriously tricky expert exam, and doing so without any human reinforcement, marks a notable milestone in clinical AI.
"ChatGPT contributed substantially to the writing of [our] manuscript," says Kung. "We interacted with ChatGPT much like a colleague, asking it to synthesise, simplify, and offer counterpoints to drafts in progress. All of the co-authors valued ChatGPT's input."