Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385334680> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W4385334680 abstract "ABSTRACT Background Artificial intelligence (AI) large language models (LLMs) such as ChatGPT have demonstrated the ability to pass standardized exams. These models are not trained for a specific task, but instead trained to predict sequences of text from large corpora of documents sourced from the internet. It has been shown that even models trained on this general task can pass exams in a variety of domain-specific fields, including the United States Medical Licensing Examination. We asked if LLMs would perform as well on much narrower subdomain tests designed for medical specialists. Furthermore, we wanted to better understand how progressive generations of GPT (generative pre-trained transformer) models may be evolving in the completeness and sophistication of their responses even while generational training remains general. In this study, we evaluated the performance of two versions of GPT (GPT 3 and 4) on their ability to pass the certification exam given to physicians to work as osteoporosis specialists and become certified clinical densitometrists. Methods A 100-question multiple-choice practice exam was obtained from a third-party exam preparation website that mimics the accredited certification tests given by the ISCD (International Society for Clinical Densitometry). The exam was administered to two versions of GPT, the free version (GPT Playground) and ChatGPT+, which are based on GPT-3 and GPT-4, respectively (OpenAI, San Francisco, CA). The systems were prompted with the exam questions verbatim. If the response was purely textual and did not specify which of the multiple-choice answers to select, the authors matched the text to the closest answer. Each exam was graded and an estimated ISCD score was provided from the exam website. In addition, each response was evaluated by a rheumatologist CCD and ranked for accuracy using a 5-level scale. The two GPT versions were compared in terms of response accuracy and length.
Results The average response length was 11.6 ± 19 words for GPT-3 and 50.0 ± 43.6 words for GPT-4. GPT-3 answered 62 questions correctly, resulting in a failing ISCD score of 289. However, GPT-4 answered 82 questions correctly with a passing score of 342. GPT-3 scored highest on the “Overview of Low Bone Mass and Osteoporosis” category (72% correct) while GPT-4 scored well above 80% accuracy on all categories except “Imaging Technology in Bone Health” (65% correct). Regarding subjective accuracy, GPT-3 answered 23 questions with nonsensical or totally wrong responses while GPT-4 had no responses in that category. Conclusion If this had been an actual certification exam, GPT-4 would now have a CCD suffix to its name even after being trained using general internet knowledge. Clearly, more goes into physician training than can be captured in this exam. However, GPT algorithms may prove to be valuable physician aids in the diagnosis and monitoring of osteoporosis and other diseases." @default.
- W4385334680 created "2023-07-29" @default.
- W4385334680 creator A5069392685 @default.
- W4385334680 creator A5083550114 @default.
- W4385334680 creator A5086272988 @default.
- W4385334680 creator A5088785272 @default.
- W4385334680 creator A5091571142 @default.
- W4385334680 date "2023-07-28" @default.
- W4385334680 modified "2023-10-03" @default.
- W4385334680 title "Performance of progressive generations of GPT on an exam designed for certifying physicians as Certified Clinical Densitometrists" @default.
- W4385334680 cites W2033621026 @default.
- W4385334680 cites W2320571802 @default.
- W4385334680 cites W3121909116 @default.
- W4385334680 cites W3201276366 @default.
- W4385334680 cites W3201701662 @default.
- W4385334680 cites W4280517006 @default.
- W4385334680 cites W4296056389 @default.
- W4385334680 cites W4309478090 @default.
- W4385334680 cites W4309650827 @default.
- W4385334680 cites W4313583499 @default.
- W4385334680 cites W4316687669 @default.
- W4385334680 cites W4319662928 @default.
- W4385334680 cites W4319663047 @default.
- W4385334680 cites W4323035111 @default.
- W4385334680 cites W4324308091 @default.
- W4385334680 cites W4361296421 @default.
- W4385334680 cites W4365148225 @default.
- W4385334680 doi "https://doi.org/10.1101/2023.07.25.23293171" @default.
- W4385334680 hasPublicationYear "2023" @default.
- W4385334680 type Work @default.
- W4385334680 citedByCount "0" @default.
- W4385334680 crossrefType "posted-content" @default.
- W4385334680 hasAuthorship W4385334680A5069392685 @default.
- W4385334680 hasAuthorship W4385334680A5083550114 @default.
- W4385334680 hasAuthorship W4385334680A5086272988 @default.
- W4385334680 hasAuthorship W4385334680A5088785272 @default.
- W4385334680 hasAuthorship W4385334680A5091571142 @default.
- W4385334680 hasBestOaLocation W43853346801 @default.
- W4385334680 hasConcept C144024400 @default.
- W4385334680 hasConcept C154945302 @default.
- W4385334680 hasConcept C162324750 @default.
- W4385334680 hasConcept C168725872 @default.
- W4385334680 hasConcept C187736073 @default.
- W4385334680 hasConcept C2780451532 @default.
- W4385334680 hasConcept C36289849 @default.
- W4385334680 hasConcept C41008148 @default.
- W4385334680 hasConcept C46304622 @default.
- W4385334680 hasConcept C509550671 @default.
- W4385334680 hasConcept C61521584 @default.
- W4385334680 hasConcept C71924100 @default.
- W4385334680 hasConceptScore W4385334680C144024400 @default.
- W4385334680 hasConceptScore W4385334680C154945302 @default.
- W4385334680 hasConceptScore W4385334680C162324750 @default.
- W4385334680 hasConceptScore W4385334680C168725872 @default.
- W4385334680 hasConceptScore W4385334680C187736073 @default.
- W4385334680 hasConceptScore W4385334680C2780451532 @default.
- W4385334680 hasConceptScore W4385334680C36289849 @default.
- W4385334680 hasConceptScore W4385334680C41008148 @default.
- W4385334680 hasConceptScore W4385334680C46304622 @default.
- W4385334680 hasConceptScore W4385334680C509550671 @default.
- W4385334680 hasConceptScore W4385334680C61521584 @default.
- W4385334680 hasConceptScore W4385334680C71924100 @default.
- W4385334680 hasLocation W43853346801 @default.
- W4385334680 hasOpenAccess W4385334680 @default.
- W4385334680 hasPrimaryLocation W43853346801 @default.
- W4385334680 hasRelatedWork W105605535 @default.
- W4385334680 hasRelatedWork W1202880915 @default.
- W4385334680 hasRelatedWork W1778313862 @default.
- W4385334680 hasRelatedWork W2004166206 @default.
- W4385334680 hasRelatedWork W2748952813 @default.
- W4385334680 hasRelatedWork W2804031704 @default.
- W4385334680 hasRelatedWork W2899084033 @default.
- W4385334680 hasRelatedWork W3037745133 @default.
- W4385334680 hasRelatedWork W3103742218 @default.
- W4385334680 hasRelatedWork W2251308513 @default.
- W4385334680 isParatext "false" @default.
- W4385334680 isRetracted "false" @default.
- W4385334680 workType "article" @default.
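
The listing above can be read as repeated "- subject predicate object @graph." items. As a minimal sketch, assuming that line shape holds for every item (it is inferred from this page, not from any SemOpenAlex specification), the items could be grouped by predicate as follows. The names `parse_listing` and `LINE_RE` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict

# Assumed item shape: "- <subject> <predicate> <object> @<graph>."
# DOTALL lets a quoted object (such as the abstract) span several lines.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$', re.DOTALL)

def parse_listing(text):
    """Group object values by predicate for each '- s p o @g.' item."""
    grouped = defaultdict(list)
    # A new item starts at a newline followed by "- "; this assumes no
    # literal value contains a line that itself begins with "- ".
    for item in re.split(r'\n(?=- )', text.strip()):
        match = LINE_RE.match(item.strip())
        if match is None:
            continue  # skip header lines or anything that does not fit
        _subject, predicate, obj, _graph = match.groups()
        grouped[predicate].append(obj.strip('"'))
    return dict(grouped)

sample = """- W4385334680 created "2023-07-29" @default.
- W4385334680 creator A5069392685 @default.
- W4385334680 creator A5083550114 @default.
- W4385334680 citedByCount "0" @default."""

triples = parse_listing(sample)
print(triples["creator"])
print(triples["created"])
```

Grouping by predicate keeps repeated properties such as creator, cites, and hasConcept together as lists, which matches how they recur in the listing.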