On Sun, 7 May 2006, ghumphr(a)fas.harvard.edu wrote:
Last minute comments?
you need a title
In recent years, political methodologists have produced innumerable
innumerable? i don't know of any. tho there are some applications,
including wordscores.
automated document ranking and classification systems.
Many of these
ignore word sequence information, treating entire documents as mere
collections of words. A subset of these, including those based on the
well-known Naive Bayes algorithm, assume that word frequencies are
at least in political science, the 'naive bayes algorithm' is not
well-known. this terminology sounds like computer science, not the
statistical terminology used in the soc sci's.
unrelated and that word sequence information is
unimportant
\cite{domingos96}. A recently developed algorithm known as Wordscores
makes an even wider set of assumptions \cite{wordscores2003}. In this
paper, we compare Wordscores to several more moderate document ranking,
classification, and summarization algorithms. Surprisingly, we find that
Wordscores shows remarkably improved performance at carrying out a small
number of carefully selected classifications on meticulously arranged
collections of political documents, demonstrate its performance at
gauging the effects of news headlines on S\&P 500 daily securities
prices, and discuss its utility in other applications.
i'd have thought that wordscores would be a good first attempt but one
that you could do much better than. if your results are right, it would
be good to explain why you're getting the results you are. i.e., the
results are interesting and surprising but explaining the surprise is good
too.
Gary