Evaluating topic models for digital libraries

David Newman, Youn Noh, Edmund M. Talley, Sarvnaz Karimi, Timothy Baldwin. Evaluating topic models for digital libraries. In Jane Hunter, Carl Lagoze, C. Lee Giles, Yuan-Fang Li, editors, Proceedings of the 2010 Joint International Conference on Digital Libraries, JCDL 2010, Gold Coast, Queensland, Australia, June 21-25, 2010, pages 215-224. ACM, 2010.

Abstract

Topic models could have a huge impact on improving the ways users find and discover content in digital libraries and search interfaces, through their ability to automatically learn and apply subject tags to every item in a collection, and their ability to dynamically create virtual collections on the fly. However, much remains to be done to tap this potential and to empirically evaluate the true value of a given topic model to humans. In this work, we sketch out some sub-tasks that we suggest pave the way towards this goal, and present methods for assessing the coherence and interpretability of topics learned by topic models. Our large-scale user study includes over 70 human subjects evaluating and scoring almost 500 topics learned from collections covering a wide range of genres and domains. We show how a scoring model, based on pointwise mutual information of word pairs using Wikipedia, Google and MEDLINE as external data sources, performs well at predicting human scores. This automated scoring of topics is an important first step towards integrating topic modeling into digital libraries.
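
As a rough illustration of the kind of scoring the abstract describes, the sketch below computes a topic coherence score as the average pointwise mutual information over pairs of a topic's top words, using document (or sliding-window) co-occurrence counts from an external corpus such as Wikipedia. This is a minimal sketch, not the paper's exact procedure: the function and variable names (pmi_coherence, doc_freq, co_doc_freq, num_docs) are illustrative assumptions, and the windowing, smoothing, and handling of unseen pairs may differ from what the authors used.

    import math
    from itertools import combinations

    def pmi_coherence(topic_words, doc_freq, co_doc_freq, num_docs):
        """Average PMI over all pairs of a topic's top words.

        topic_words      -- e.g. the topic's ten highest-probability words
        doc_freq[w]      -- number of external documents/windows containing w
        co_doc_freq[a,b] -- number containing both a and b (illustrative layout)
        num_docs         -- total number of external documents/windows
        """
        scores = []
        for w1, w2 in combinations(topic_words, 2):
            joint = co_doc_freq.get((w1, w2), 0) + co_doc_freq.get((w2, w1), 0)
            if joint == 0 or doc_freq.get(w1, 0) == 0 or doc_freq.get(w2, 0) == 0:
                continue  # skip unseen pairs instead of taking log of zero
            p_joint = joint / num_docs
            p1 = doc_freq[w1] / num_docs
            p2 = doc_freq[w2] / num_docs
            scores.append(math.log(p_joint / (p1 * p2)))
        return sum(scores) / len(scores) if scores else 0.0

In use, a topic learned by the model would supply topic_words, while the count dictionaries would be gathered once from the chosen external source (Wikipedia, Google hit counts, or MEDLINE in the study); the resulting score can then be compared against the human ratings of topic coherence.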