Paul Windisch, Carole Koechli, Fabio Dennstädt, Daniel M. Aebersold, Daniel R. Zwahlen, Robert Förster, Christina Schröder. Is one run enough? Reproducibility of flagship large language models across temperature and reasoning settings in biomedical text processing. JAMIA, 33(6):1179-1184, 2026.