Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions

David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. In Brian Davis, Yvette Graham, John Kelleher, Yaji Sripada, editors, Proceedings of the 13th International Conference on Natural Language Generation, INLG 2020, Dublin, Ireland, December 15-18, 2020. Pages 169-182, Association for Computational Linguistics, 2020.

Authors

David M. Howcroft

Anya Belz

Miruna-Adriana Clinciu

Dimitra Gkatzia

Sadid A Hasan

Saad Mahamood

Simon Mille

Emiel van Miltenburg

Sashank Santhanam

Verena Rieser
