Geoff and Chris,
Good abstract. I would second Abby's point and, in addition,
suggest you say a bit more about the substance of your results. What
kind of political documents are you using to demonstrate the
usefulness of the process? What specifically are the practical
implications? Also, the title seems very "vanilla", which is ironic
given the vanilla recursive hierarchical Dirichlet-multinomial mixture
model. I have left some possible ideas below. Good luck.
Sheldon
Exploring the Hidden Transcript - Political Documents and Structures of Meaning
Political Grammar: How Meaning is Constructed in Political Documents
On 4/24/06, Abby Williamson <abby.williamson(a)gmail.com> wrote:
Dear Geoff & Chris,
I like your beginning a lot - very clear - but I would suggest breaking your
final sentence into 2 and perhaps getting rid of the final clause of the
second to last sentence, to maintain your clarity throughout.
Best,
Abby
Surprisingly, we find that the assumptions of Wordscores notwithstanding, it
shows dramatically increased performance at carrying out a small number of
carefully selected classifications on meticulously arranged collections of
political documents, comparable to some of the latest developments in
document classification. We conclude by discussing Wordscores' use in
practical applications.
-----Original Message-----
From: gov2001-l-bounces(a)lists.fas.harvard.edu
[mailto:gov2001-l-bounces@lists.fas.harvard.edu]On Behalf Of
ghumphr(a)fas.harvard.edu
Sent: Monday, April 24, 2006 10:19 PM
To: gov2001-l(a)lists.fas.harvard.edu
Subject: [gov2001-l] Preliminary Abstract
Geoff Humphreys and Chris Long
Classifying Political Documents
In recent years, political methodologists have produced innumerable automated
document classification systems. Many of these systems, such as those based on
the well-known Naive Bayes algorithm, treat each word as a distinct entity,
ignoring complex interactions between them. While for some applications this
approach may appear reasonable, the precise arrangements of words in political
documents often convey meanings which cannot be captured so easily. In this
paper, we investigate the success of such naive algorithms by comparing
Wordscores, a Naive Bayes derivative, to several well-known algorithms and a
new classification system based on vanilla recursive hierarchical
Dirichlet-multinomial mixture models, pointing out avenues for future
advancement. Surprisingly, we find that the assumptions of Wordscores
notwithstanding, it shows dramatically increased performance, comparable to
some of the latest developments in document classification, at carrying out a
small number of carefully selected classifications on meticulously arranged
collections of political documents, and discuss its use in practical
applications.
_______________________________________________
gov2001-l mailing list
gov2001-l(a)lists.fas.harvard.edu
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l