Stefan Evert. Significance tests for the evaluation of ranking methods. In COLING 2004, 20th International Conference on Computational Linguistics, Proceedings of the Conference, 23-27 August 2004, Geneva, Switzerland. 2004.