That was supposed to be this:
In recent years, political methodologists have produced numerous automated
document ranking and classification systems. Many of these ignore word sequence
information, treating entire documents as mere collections of words. A subset of
these, including those based on the well-known Naive Bayes algorithm, further
assume that word frequencies are independent of one another \cite{domingos96}.
A recently developed algorithm known as Wordscores makes an even wider set of
assumptions \cite{wordscores2003}. In this paper, we compare Wordscores to
several more moderate document ranking, classification, and summarization
algorithms. Surprisingly, we find that Wordscores shows remarkably improved
performance at carrying out a small number of carefully selected
classifications on meticulously arranged collections of political documents.
We also demonstrate its performance at gauging the effects of news headlines on
S\&P 500 daily securities prices, and we discuss its utility in other
applications.
Quoting ghumphr(a)fas.harvard.edu:
Last minute comments?
In recent years, political methodologists, have produced innumerable automated
document ranking and classification systems. Many of these ignore word sequence
information, treating entire documents as mere collections of words. A subset of
these, including those based on the well-known Naive Bayes algorithm, assume
that word frequencies are unrelated and that word sequence information is
unimportant \cite{domingos96}. A recently developed algorithm known as
Wordscores makes an even wider set of assumptions \cite{wordscores2003}. In
this paper, we compare Wordscores to several more moderate document ranking,
classification, and summarization algorithms. Surprisingly, we find that
Wordscores shows remarkably improved performance at carrying out a small number
of carefully selected classifications on meticuously arranged collections of
political documents, demonstrate its performance at gauging the effects of news
headlines on S\&P 500 daily securities prices, and discuss its utility in other
applications.