(WHS-L4.01) EASING THE PRESSURE IN WOUND THERAPY: EVALUATING CHATGPT'S MANAGEMENT AND TREATMENT OF WOUNDS
Thursday, May 16, 2024
10:30 AM – 11:30 AM East Coast USA Time
Background: ChatGPT, a large language model utilizing generative artificial intelligence (AI), is the fastest-growing consumer application. As a convenient, emerging tool, it has been accessed for medical advice and applied in healthcare, with potential to provide patient education and to guide practitioners in case management. Wound care is a common and critical medical concern. This study comprehensively evaluates ChatGPT's capacity to answer users' questions about wound care, its efficacy in providing evidence-based recommendations, and its effectiveness in offering treatment strategies extrapolated from images of performed cases.
Methods: ChatGPT (GPT-3.5) was prompted with open-ended frequently asked questions that were web-scraped and adapted from institutional medical websites; Common Questions About Wound Care from the American Academy of Family Physicians (AAFP) were also used. Agreement metrics between the institutional answers and the chatbot's responses were calculated, and overlapping keywords were compared. References cited by the chatbot were scrutinized for legitimacy. Images from published open-access case reports pertaining to wound care were provided to GPT-4 to determine its assessments and recommendations for treatment support.
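The keyword-overlap comparison described above can be sketched in a few lines. This is a minimal illustration only; the function name, matching rule (case-insensitive substring), and the example keyword list are assumptions for demonstration, not the study's actual extraction pipeline.

```python
# Hedged sketch: compare reference keywords against a chatbot response.
# Matching rule (case-insensitive substring) and all example data are
# illustrative assumptions, not the study's actual methodology.

def keyword_overlap(reference_keywords, response_text):
    """Return (matched, total, percent) of reference keywords found
    in the response text via case-insensitive substring matching."""
    text = response_text.lower()
    matched = [kw for kw in reference_keywords if kw.lower() in text]
    pct = round(100.0 * len(matched) / len(reference_keywords))
    return len(matched), len(reference_keywords), pct

# Hypothetical example
ref = ["debridement", "moist healing", "infection", "dressing"]
resp = ("Clean the wound, apply a moist dressing, and monitor for "
        "signs of infection; surgical debridement may be needed.")
print(keyword_overlap(ref, resp))  # (3, 4, 75)
```

In practice, per-question results would be aggregated across all prompts to yield summary figures such as the 55/56 (98%) overlap reported below.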
Results: Across the 6 institutions and 22 questions provided, the chatbot's responses agreed with all 22 (100%) reference answers, and 55/56 (98%) keywords overlapped. A request for references retrieved 22 publications, of which 21 (95%) were legitimate (9 were open access). For the AAFP prompts, 8 (100%) responses agreed and 23 (74%) keywords overlapped. A total of 10 images of 2 female and 4 male patients, with a mean (SD) age of 47 (12) years, were provided from 5 published case reports. GPT-4 demonstrated competence in initial assessment and in providing global recommendations for hypothetical treatment approaches, highlighting negative pressure wound therapy settings, wound healing stages, and possible flap coverage, among other factors; however, in the majority of cases it declined to offer exact specifications for patient management when questions were open ended. Binary prompting, or offering a range of treatment options together with patient information, wound characteristics, and medical history, was a superior approach.
Conclusion: ChatGPT demonstrated a comprehensive understanding of wound care and management. Although prone to occasional hallucinations and to varying performance across prompting schemes, ChatGPT and its successor, GPT-4, have immense potential as an adjunct for patients and providers. Fine-tuning a chatbot on relevant articles and cases may be a valuable investment toward deeper AI-based involvement in wound care.