Ranking Documents with Wordscores
In recent years, corporate juggernauts such as Google have amassed enormous
fortunes by making use of relatively simple document processing algorithms. A
huge body of literature has sprung forth detailing techniques for information
retrieval, document ranking and classification of large sequences and vectors,
and countless algorithms have been developed to take advantage of rapid
developments in this ever-changing realm. One such algorithm is Wordscores, a
simple document ranking algorithm recently developed by Michael Laver, Kenneth
Benoit and John Garry for extracting relative political policy positions from
documents. Despite its simplicity, Wordscores performs remarkably well at
ranking political documents when compared to a variety of classic document
ranking algorithms. In this paper, we examine the performance of Wordscores
and illustrate its superiority at ranking political texts. We also propose
extensions for improving its performance, detail its assumptions, and specify
the conditions that must be met in order to use it to make substantive claims.