University of Illinois at Urbana Champaign
Topic models, which factor each document into different topics and represent each topic as a distribution of terms, have been widely and successfully used to better understand collections of text documents. However, documents are also associated with further information, such as the set of real-world entities mentioned in them. For example, news articles are usually related to several people, organizations, countries or locations. Since, those associated entities carry rich information, it is highly desirable to build more expressive, entity-based topic models, which can capture the term distributions for each topic, each entity, as well as each topic-entity pair.