D. Sculley, William Cukierski, Phil Culliton, Sohier Dane, Maggie Demkin, Ryan Holbrook, Addison Howard, Paul Mooney, Walter Reade, Meg Risdal, Nate Keating. Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation. In Forty-second International Conference on Machine Learning, ICML 2025, Vancouver, BC, Canada, July 13-19, 2025 - Position Paper Track. OpenReview.net, 2025. [doi]
Abstract is missing.