Medical Xpress (05/16/23)
The latest version of the ChatGPT chatbot passed a radiology board-level exam, according to two studies published in Radiology. "Our research provides insight into ChatGPT's performance in a radiology context, highlighting the incredible potential of large language models, along with the current limitations that make it unreliable," said Rajesh Bhayana, MD, FRCPC, at University Medical Imaging Toronto. The researchers tested the GPT-3.5 version of ChatGPT on 150 multiple-choice questions conforming to the style, content, and difficulty of the Canadian Royal College and American Board of Radiology exams. The chatbot answered 69% of the questions correctly, just short of the Canadian Royal College's passing grade of 70%. It correctly answered 84% of questions demanding lower-order thinking but scored only 60% on questions requiring higher-order thinking. Higher-order questions involving description of imaging findings, calculation and classification, and application of concepts were particularly challenging, reflecting a lack of radiology-specific pretraining. Follow-up tests with the GPT-4 upgrade yielded a passing score of 81% and better performance on higher-order thinking questions involving description of imaging findings and application of concepts. However, GPT-4 missed 12 lower-order thinking questions that GPT-3.5 had answered correctly. ChatGPT's consistent use of confident language, even when incorrect, was especially worrisome, particularly if the chatbot is relied on as a sole source of information. "At present, ChatGPT is best used to spark ideas, help start the medical writing process and in data summarization," Bhayana concluded. "If used for quick information recall, it always needs to be fact-checked."