Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386110374> ?p ?o ?g. }
- W4386110374 endingPage "104770" @default.
- W4386110374 startingPage "104770" @default.
- W4386110374 abstract "Large language models (LLMs) are garnering wide interest due to their human-like and contextually relevant responses. However, LLMs' accuracy across specific medical domains has yet to be thoroughly evaluated. Myopia is a frequent topic about which patients and parents commonly seek information online. Our study evaluated the performance of three LLMs, namely ChatGPT-3.5, ChatGPT-4.0, and Google Bard, in delivering accurate responses to common myopia-related queries. We curated thirty-one commonly asked myopia care-related questions, which were categorised into six domains: pathogenesis, risk factors, clinical presentation, diagnosis, treatment and prevention, and prognosis. Each question was posed to the LLMs, and their responses were independently graded by three consultant-level paediatric ophthalmologists on a three-point accuracy scale (poor, borderline, good). A majority consensus approach was used to determine the final rating for each response. 'Good' rated responses were further evaluated for comprehensiveness on a five-point scale. Conversely, 'poor' rated responses were further prompted for self-correction and then re-evaluated for accuracy. ChatGPT-4.0 demonstrated superior accuracy, with 80.6% of responses rated as 'good', compared to 61.3% in ChatGPT-3.5 and 54.8% in Google Bard (Pearson's chi-squared test, all p ≤ 0.009). All three LLM-Chatbots showed high mean comprehensiveness scores (Google Bard: 4.35; ChatGPT-4.0: 4.23; ChatGPT-3.5: 4.11, out of a maximum score of 5). All LLM-Chatbots also demonstrated substantial self-correction capabilities: 66.7% (2 in 3) of ChatGPT-4.0's, 40% (2 in 5) of ChatGPT-3.5's, and 60% (3 in 5) of Google Bard's responses improved after self-correction. The LLM-Chatbots performed consistently across domains, except for 'treatment and prevention'. 
However, ChatGPT-4.0 still performed superiorly in this domain, receiving 70% 'good' ratings, compared to 40% in ChatGPT-3.5 and 45% in Google Bard (Pearson's chi-squared test, all p ≤ 0.001). Our findings underscore the potential of LLMs, particularly ChatGPT-4.0, for delivering accurate and comprehensive responses to myopia-related queries. Continuous strategies and evaluations to improve LLMs' accuracy remain crucial. Dr Yih-Chung Tham was supported by the National Medical Research Council of Singapore (NMRC/MOH/HCSAINV21nov-0001)." @default.
- W4386110374 created "2023-08-24" @default.
- W4386110374 creator A5009380042 @default.
- W4386110374 creator A5009598072 @default.
- W4386110374 creator A5017063138 @default.
- W4386110374 creator A5032242330 @default.
- W4386110374 creator A5032933669 @default.
- W4386110374 creator A5051108261 @default.
- W4386110374 creator A5051241744 @default.
- W4386110374 creator A5057230344 @default.
- W4386110374 creator A5068722359 @default.
- W4386110374 creator A5069499692 @default.
- W4386110374 creator A5075448224 @default.
- W4386110374 creator A5085096438 @default.
- W4386110374 creator A5092683731 @default.
- W4386110374 date "2023-09-01" @default.
- W4386110374 modified "2023-09-30" @default.
- W4386110374 title "Benchmarking large language models’ performances for myopia care: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google Bard" @default.
- W4386110374 cites W2029286390 @default.
- W4386110374 cites W2138595060 @default.
- W4386110374 cites W2558931807 @default.
- W4386110374 cites W2728112287 @default.
- W4386110374 cites W3004784834 @default.
- W4386110374 cites W3031677973 @default.
- W4386110374 cites W3131571875 @default.
- W4386110374 cites W3199181343 @default.
- W4386110374 cites W3200742808 @default.
- W4386110374 cites W4220757449 @default.
- W4386110374 cites W4224611968 @default.
- W4386110374 cites W4319083882 @default.
- W4386110374 cites W4319263749 @default.
- W4386110374 cites W4319332853 @default.
- W4386110374 cites W4319662928 @default.
- W4386110374 cites W4319813031 @default.
- W4386110374 cites W4322719044 @default.
- W4386110374 cites W4324020464 @default.
- W4386110374 cites W4324129637 @default.
- W4386110374 cites W4324308091 @default.
- W4386110374 cites W4327604757 @default.
- W4386110374 cites W4327681325 @default.
- W4386110374 cites W4353015365 @default.
- W4386110374 cites W4360993412 @default.
- W4386110374 cites W4361000349 @default.
- W4386110374 cites W4361289889 @default.
- W4386110374 cites W4362641141 @default.
- W4386110374 cites W4365143687 @default.
- W4386110374 cites W4366823098 @default.
- W4386110374 cites W4366989525 @default.
- W4386110374 cites W4367175039 @default.
- W4386110374 cites W4367175507 @default.
- W4386110374 cites W4367186868 @default.
- W4386110374 cites W4367310920 @default.
- W4386110374 cites W4367669592 @default.
- W4386110374 cites W4367834585 @default.
- W4386110374 cites W4368340908 @default.
- W4386110374 cites W4376133327 @default.
- W4386110374 cites W4379231355 @default.
- W4386110374 cites W4379278559 @default.
- W4386110374 cites W4380741754 @default.
- W4386110374 doi "https://doi.org/10.1016/j.ebiom.2023.104770" @default.
- W4386110374 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/37625267" @default.
- W4386110374 hasPublicationYear "2023" @default.
- W4386110374 type Work @default.
- W4386110374 citedByCount "0" @default.
- W4386110374 crossrefType "journal-article" @default.
- W4386110374 hasAuthorship W4386110374A5009380042 @default.
- W4386110374 hasAuthorship W4386110374A5009598072 @default.
- W4386110374 hasAuthorship W4386110374A5017063138 @default.
- W4386110374 hasAuthorship W4386110374A5032242330 @default.
- W4386110374 hasAuthorship W4386110374A5032933669 @default.
- W4386110374 hasAuthorship W4386110374A5051108261 @default.
- W4386110374 hasAuthorship W4386110374A5051241744 @default.
- W4386110374 hasAuthorship W4386110374A5057230344 @default.
- W4386110374 hasAuthorship W4386110374A5068722359 @default.
- W4386110374 hasAuthorship W4386110374A5069499692 @default.
- W4386110374 hasAuthorship W4386110374A5075448224 @default.
- W4386110374 hasAuthorship W4386110374A5085096438 @default.
- W4386110374 hasAuthorship W4386110374A5092683731 @default.
- W4386110374 hasConcept C144024400 @default.
- W4386110374 hasConcept C144133560 @default.
- W4386110374 hasConcept C149923435 @default.
- W4386110374 hasConcept C151730666 @default.
- W4386110374 hasConcept C162853370 @default.
- W4386110374 hasConcept C205649164 @default.
- W4386110374 hasConcept C2524010 @default.
- W4386110374 hasConcept C2777267654 @default.
- W4386110374 hasConcept C2778755073 @default.
- W4386110374 hasConcept C28719098 @default.
- W4386110374 hasConcept C33923547 @default.
- W4386110374 hasConcept C512399662 @default.
- W4386110374 hasConcept C58640448 @default.
- W4386110374 hasConcept C71924100 @default.
- W4386110374 hasConcept C86251818 @default.
- W4386110374 hasConcept C86803240 @default.
- W4386110374 hasConceptScore W4386110374C144024400 @default.
- W4386110374 hasConceptScore W4386110374C144133560 @default.
- W4386110374 hasConceptScore W4386110374C149923435 @default.
- W4386110374 hasConceptScore W4386110374C151730666 @default.