
POS0749 (2024)
RESPECTIVE PERFORMANCES OF CHATGPT AND GOOGLE FOR THE DIAGNOSIS OF RARE DISEASES IN RHEUMATOLOGY
Keywords: Artificial Intelligence, Rare/orphan diseases
J. Lasnier-Siron1
1CHU Bordeaux, Rheumatology, Bordeaux, France

Authors: Julien Lasnier-Siron, Pierre Germain, Jean-Philippe Vernhes, Thomas Barnetche, Marie Kostine, Nadia Mehsen, Marie-Elise Truchetet, Christophe Richez, Nicolas Poursac, Thierry Schaeverbeke.


Objectives: Eight years ago, we evaluated the contribution of Google, a general-purpose search engine, to the diagnosis of rare rheumatological diseases: 30 clinical vignettes (clinical context and imaging) were submitted to eight rheumatologists whose only aid was Internet access via Google. The aim of this new study was to evaluate the performance of a natural language processing tool, ChatGPT-4 (OpenAI), by submitting the same clinical vignettes and simply asking for the diagnosis, and to compare the rate of correct answers with that obtained at the time with Google.


Methods: The texts describing the 30 clinical vignettes of rare diseases were supplemented with a description of the radiographic documents by a senior physician on the team who did not have access to the diagnoses. These vignettes were automatically translated with Google Translate before being submitted to ChatGPT with the single question "What is the diagnosis?". If the answer seemed inconsistent, the clinician could resubmit the question, discarding the previously proposed diagnosis. The time spent on ChatGPT was recorded. The final diagnosis was then verified, and the results were compared with those obtained 8 years earlier using the Google search engine.
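
A minimal sketch of how the prompting loop described above could be reproduced programmatically, assuming the openai Python client (version 1 or later) and API access to a GPT-4 model; the study itself used the ChatGPT interface, so the model name, function name and parameters here are illustrative assumptions, not the authors' tooling.

    import time
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def query_diagnosis(vignette_text: str, rejected: list[str]) -> tuple[str, float]:
        """Submit one clinical vignette and return (proposed diagnosis, elapsed seconds)."""
        prompt = f"{vignette_text}\n\nWhat is the diagnosis?"
        if rejected:
            # Re-submission: discard diagnoses already judged inconsistent by the clinician
            prompt += "\nDo not propose the following diagnoses again: " + ", ".join(rejected)
        start = time.monotonic()
        resp = client.chat.completions.create(
            model="gpt-4",  # placeholder model name; the abstract reports ChatGPT-4
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content, time.monotonic() - start

In the study the decision to resubmit rested with the clinician, so the list of rejected diagnoses would be filled in manually after each answer was reviewed.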


Results: ChatGPT provided a correct answer for 83.3% of the vignettes (25/30), including 63% (19/30) at the first submission (response time under 1 minute). In 6 cases, 2 to 4 submissions were required to reach the diagnosis, and in 5 cases the proposed diagnoses proved incorrect despite 4 to 6 submissions. The average exchange time with ChatGPT was 2.93 ± 3.16 minutes, with a median of 1 minute, and the average number of queries was 1.87 ± 1.33. Previous use of Google on the same files had led to an accurate diagnosis in 26/30 cases (86.7 ± 5%), with an average search time of 6.9 ± 4.9 minutes.
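
A minimal sketch of how the summary statistics reported above (correct-answer rate, mean ± SD, median) can be computed from per-vignette records, assuming those records were kept as simple lists; the lists below are empty placeholders, not the study data.

    from statistics import mean, median, stdev

    chatgpt_correct: list[bool] = []   # one entry per vignette (True = correct diagnosis)
    chatgpt_minutes: list[float] = []  # exchange time per vignette, in minutes
    chatgpt_queries: list[int] = []    # number of submissions per vignette

    if chatgpt_correct:
        print(f"Correct answers: {100 * sum(chatgpt_correct) / len(chatgpt_correct):.1f}%")
        print(f"Time: {mean(chatgpt_minutes):.2f} ± {stdev(chatgpt_minutes):.2f} min "
              f"(median {median(chatgpt_minutes):.0f} min)")
        print(f"Queries: {mean(chatgpt_queries):.2f} ± {stdev(chatgpt_queries):.2f}")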


Conclusion: While the diagnostic performance of the two methods is comparable, the query time with ChatGPT is less than half that with Google. This time saving undoubtedly facilitates its use in daily consultations. Finally, ChatGPT's responses were supported by arguments, making it easier for the clinician to assess their accuracy and relevance, and the AI suggested a practical course of action.

The use of natural language AI in the diagnosis of rare diseases in rheumatology yields a rate of correct answers similar to that of searches via general-purpose search engines such as Google. Nevertheless, it cuts the search time in half, making it compatible with routine use in the diagnosis of rare diseases during a consultation. Whichever method is used, the failure rate remains around 15%. ChatGPT therefore proves to be an interesting tool for rheumatologists, all the more so as the information gathered is complete and informative. It does not, of course, replace history-taking, clinical examination, imaging and clinical reasoning by the practitioner.

REFERENCES: NIL.


Acknowledgements: ChatGPT and OpenAI.


Disclosure of Interests: None declared.


DOI: 10.1136/annrheumdis-2024-eular.3918
Citation: Annals of the Rheumatic Diseases, volume 83, supplement 1, year 2024, page 1115
Session: Other diseases (Poster View)