27.01.2026
Wattad A, Saadi R, Bez M, Loewenstein A, Goldstein M
Abstract
Purpose: To evaluate the ability of ChatGPT-5 to predict long-term anatomical and functional outcomes following full-thickness macular hole (FTMH) surgery, and to compare its performance with retinal specialists' predictions and real-world results.
Methods: This retrospective study included 50 eyes of 50 patients undergoing pars plana vitrectomy for FTMH (2021-2024). De-identified clinical summaries with preoperative demographics, ocular history, best-corrected visual acuity (BCVA), optical coherence tomography (OCT) parameters, foveal B-scan OCT images, and surgical details were entered into ChatGPT-5 using a standardized prompt to predict 12-month BCVA and anatomical closure. Predictions were compared with actual results and assessments from two senior retina specialists.
Results: At 12 months, closure occurred in 44/50 eyes (88%), and mean BCVA improved from 20/100 (0.7 ± 0.4 logMAR) to 20/63 (0.5 ± 0.5 logMAR) (p = 0.03). Anatomical prediction accuracy was 72-86% for the specialists and 90% for ChatGPT-5. ChatGPT-5 achieved perfect accuracy in closure cases but failed to identify non-closure, reflecting optimism bias. For functional outcomes, accuracy was 42-44% for the specialists and 66% for ChatGPT-5. ChatGPT-5 performed well when vision improved (60%) but poorly for stable (≤13%) or worsened (0%) cases. Mean BCVA prediction error was 11.4 ± 10.8 letters, with ∼60% of predictions within two lines of the true outcome.
Conclusions: ChatGPT-5 demonstrated apparently high accuracy in predicting FTMH surgery outcomes; however, this was largely driven by an optimism bias that overestimated closure and visual recovery. The model still lacks clinical judgment. Larger prospective studies are needed before clinical use.
Retina. 2026 Jan 21
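
The arithmetic behind some of the reported figures can be checked with a short sketch. It uses the standard logMAR-to-Snellen relation (denominator = 20 × 10^logMAR) and the usual ETDRS convention of 0.02 logMAR per letter with five letters per line. The function names are illustrative and do not come from the study. The "always predict closure" baseline is included only as a reference point for interpreting accuracy in a cohort with an 88% closure rate; the abstract does not describe how its accuracy figures were computed.

```python
def logmar_to_snellen_denominator(logmar: float) -> float:
    """Snellen denominator for a 20-foot numerator: 20 * 10**logMAR."""
    return 20 * 10 ** logmar


def letters_to_logmar(letters: float) -> float:
    """ETDRS convention (assumed here): each letter is worth 0.02 logMAR."""
    return 0.02 * letters


def always_closed_accuracy(n_closed: int, n_total: int) -> float:
    """Accuracy of a hypothetical predictor that says 'closed' for every eye."""
    return n_closed / n_total


# Mean BCVA values reported in the Results
print(round(logmar_to_snellen_denominator(0.7)))  # 100 -> 20/100
print(round(logmar_to_snellen_denominator(0.5)))  # 63  -> 20/63

# "Within two lines" corresponds to 10 letters, i.e. 0.2 logMAR
print(round(letters_to_logmar(2 * 5), 2))  # 0.2

# Reference baseline for a cohort in which 44 of 50 eyes closed
print(always_closed_accuracy(44, 50))  # 0.88
```

Because closure was the outcome in 88% of eyes, a predictor that never forecasts non-closure already scores 88% on this cohort, which is useful context when reading the anatomical accuracy figures.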