Cureus (08/18/2023) Ayub, Ibraheim; Hamann, Dathan; Hamann, Carsten R.; et al.
Researchers qualitatively analyzed the potential and constraints of OpenAI's Chat Generative Pre-trained Transformer (ChatGPT) language model as a study tool in dermatology. They asked ChatGPT to write multiple-choice questions in the style of the American Board of Dermatology Applied Exam (ABD-AE), based on eight continuing medical education articles published in the Journal of the American Academy of Dermatology. Two board-certified dermatologists assessed each question for accuracy, complexity and clarity, including whether it suited the depth of knowledge required for the ABD-AE and whether its wording and structure were clear. ChatGPT produced 40 questions from the articles; however, the reviewers rated 10 as being of low complexity, nine as vague or unclear and five as inaccurate. Just 16 of the questions generated using ChatGPT 3.5 were both accurate and appropriately complex for trainees studying for the ABD-AE. "This study identified the limited domain-specific knowledge of ChatGPT as a major limitation as dermatology requires a deep understanding of skin anatomy, physiology and pathology, which ChatGPT lacks," the authors wrote. Additional limitations include the model's failure to understand context and produce high-quality distractor options, and its lack of image-generating abilities. The researchers recommend that future investigation concentrate "on developing domain-specific language models that possess deep knowledge of dermatology." They concluded that ChatGPT cannot substitute for dermatologists and medical educators in devising questions that evaluate candidates' knowledge and reasoning capabilities.