In recent years, political methodologists have produced innumerable automated
document ranking and classification systems. Many of these ignore word sequence
information, treating entire documents as mere collections of words. A subset of
these, including those based on the well-known Naive Bayes algorithm, also
assume that word frequencies are independent of one another
\cite{domingos96}. A recently developed algorithm known as
Wordscores makes an even wider set of assumptions \cite{wordscores2003}. In
this paper, we compare Wordscores to several more moderate document ranking,
classification, and summarization algorithms. Surprisingly, we find that
Wordscores shows remarkably improved performance at carrying out a small number
of carefully selected classifications on meticulously arranged collections of
political documents. We also demonstrate its performance at gauging the effects
of news headlines on S\&P 500 daily securities prices and discuss its utility
in other applications.