In the tobit model for problem 2, \mu_i is just the mean, not a vector of
means.
--
Patrick Lam
Department of Government and Institute for Quantitative Social Science,
Harvard University
http://www.people.fas.harvard.edu/~plam
Hi everyone, I hope all your papers are going well.
I am trying to write an R function that will produce a fairly complicated
graph by taking in, among other things, a variable name, and that at one
point uses setx in Zelig to produce a number of values. In setx, the argument
name is actually the variable name, and here is where I run into trouble: I
can't seem to take an argument into my function and have it come out as an
argument name in another function. For example, imagine this reduced example:
myfunction <- function(var) {
  result <- setx(modelobj, data=data,
                 var=seq(from=xrange[1], to=xrange[2], length.out=100))
}
The key piece is the `var` argument: it's not the value of the variable I
need there, but the name of the variable. I have tried a character string,
assign(), paste(), substitute() and attribute(). For those who know Stata,
this would be fairly easily solved by calling a local (in that case the
local is created automatically and you just write `var'). R is not a macro
language, though, so I assume there is a better way that I just haven't
found yet. Any thoughts?
Brandon
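One approach that works here is to build the argument list programmatically and hand it to the inner function with do.call(), which lets a character string become an argument name. A sketch, assuming the modelobj, data, and xrange objects from the question and Zelig's setx():

```r
## Sketch: pass a character string through as an argument *name* via do.call().
## modelobj, data, and xrange are assumed to exist; setx() is Zelig's function.
myfunction <- function(varname) {
  args <- list(modelobj, data = data)
  args[[varname]] <- seq(from = xrange[1], to = xrange[2], length.out = 100)
  do.call(setx, args)
}
## e.g. myfunction("gdp") is equivalent to calling
## setx(modelobj, data = data, gdp = seq(from = xrange[1], to = xrange[2], length.out = 100))
```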
Dear Professor King, Miya, and Patrick,
Just wanted to mention something that happened to me: this piece of R code
using Amelia only runs on one of the HMDC computers. I have tried four other
computers and get an error message about unused arguments (concerning the
variable with NAs). I will stick to the working computer, hoping that it runs
because that machine has the right Amelia package (and not because something
is wrong that, for weird reasons, still goes our way!).
Thought I would mention it in case anyone else has had the same
issue... really weird!
model.1.new <- scrugg.f[, c("uercovch", "laguercov", "lagleftc", "lagtradeopen",
                            "lagopenn", "laguerate", "laggrow", "hrsveto",
                            "siaroff", "ggdeflag", "bbb", "blaguercov",
                            "blagleftc", "blagtradeopen", "blagopenn",
                            "blaguerate", "blaggrow", "bhrsveto", "bsiaroff",
                            "bggdef", "dum1", "dum2", "dum3", "dum4", "dum5",
                            "dum6", "dum7", "dum8", "dum10", "dum12", "dum13",
                            "dum14", "dum18", "dum19", "dum20", "dum21",
                            "counter", "year")]
## use amelia to impute missing data
model.1.new.am.list <- amelia(x=model.1.new, m=5,
                              idvars=c("dum1", "dum2", "dum3", "dum4", "dum5",
                                       "dum6", "dum7", "dum8", "dum10", "dum12",
                                       "dum13", "dum14", "dum18", "dum19",
                                       "dum20", "dum21"),
                              ts="year", cs="counter", polytime=3)$imputations
best regards
Charlotte
hi!
A quick (but fundamental) question: in exercise 1, gamma needs to be
positive, so I re-parameterized. To get the right MLE from optim(), I didn't
forget to transform back, and I get the same result as the one I found
analytically. However, things change when I look for the SE: I get different
answers analytically and with R (using the Hessian). I know this comes from
the re-parameterization, because I don't have this issue when I change the
whole function and do not re-parameterize gamma. So my questions are:
- If I re-parameterize, how do I apply the transformation to the Hessian to
get the right result?
- How does that fit with the section notes that follow? Why do we take pnorm
(the transformation applied in the ll.binom function) of "opt.1000 - 1.96*se"
and not of "se", for instance?
# binomial log-likelihood (N = # of trials for each observation)
ll.binom <- function(par, y, N){
  # reparameterize pi; only search over [0,1]
  p <- pnorm(par)
  # log-likelihood
  out <- sum(y*log(p) + (N-y)*log(1-p))
  return(out)
}
# compare to wald ci
se <- sqrt(solve(-optim(par=2, fn=ll.binom, y=samp.1000, N=10, method="BFGS",
                        control=list(fnscale=-1), hessian=TRUE)$hessian))
wald.ci <- c(pnorm(opt.1000 - 1.96*se), pnorm(opt.1000 + 1.96*se))
wald.ci # 0.7364839 0.7535663
- If I do not re-parameterize, so that my analytical and R results agree,
how can I justify not re-parameterizing gamma?!
thanks!
charlotte
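On the first question above: the standard tool is the delta method, which says the variance of a transformation g(theta.hat) is approximately g'(theta.hat)^2 times the variance of theta.hat. A sketch, using opt.1000 and se from the section-notes snippet (where the transformation is pnorm, whose derivative is dnorm):

```r
## Delta-method sketch (uses opt.1000 and se from the snippet above).
## The reparameterization is p = pnorm(theta), so dp/dtheta = dnorm(theta),
## and se(p.hat) is approximately dnorm(theta.hat) * se(theta.hat).
se.p <- dnorm(opt.1000) * se
wald.ci.delta <- c(pnorm(opt.1000) - 1.96*se.p,
                   pnorm(opt.1000) + 1.96*se.p)
```

The snippet in the notes instead transforms the two CI endpoints directly, which is also defensible (and keeps the interval inside [0,1]); the two intervals agree to first order but need not be identical.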
Hi all,
A few announcements:
1) Please take the time to fill out course evaluations if you have not done
so. They are very valuable to us.
2) For those doing a replication project, please remember that your papers
are due on May 4 at 5pm EST. Please submit your papers to the Problem Set
Dropbox under the folder "Final". We only need one copy per group, so only
one of you has to submit it. Also, we only need your papers. We do not
need any of your code or data, although you should all keep copies of it
neatly formatted for future use.
3) For extension school students who are not doing a replication project,
your final has been posted on the class website in the same folder as the
problem sets. Your final is also due on May 4 at 5pm EST. Please submit
your final writeup and R code file to the Dropbox under the "Final" folder
just like any other problem set. You are NOT allowed to collaborate on the
final. If you have any questions, please email both TFs, and NOT the class
email list.
--
Patrick Lam
Department of Government and Institute for Quantitative Social Science,
Harvard University
http://www.people.fas.harvard.edu/~plam
Hi, we are trying to get predicted values for a negative binomial model. We
have our X matrix of observed values and some beta parameters (which we will
draw from a multivariate normal). Our problem is the dispersion parameter,
sigma squared: how do we draw it? We think we should get one sigma squared
per draw of k + 1 betas (k being the number of covariates), but we are a
little confused. Would sigma squared just be the variance of the mvrnorm
distribution at each draw?
Then, how do we incorporate the sigma squared values into the link function?
Or is it the case that, when we are drawing our y's, we incorporate the mean
of the draws of sigma squared values (so the mean of 1000 values, right)?
thanks!
best,
sparsha
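One common recipe, sketched under assumptions: suppose the model was fit with MASS::glm.nb() (the fit object nb.fit and the covariate vector x.vec below are hypothetical names). You draw one dispersion value per draw of the betas, and the dispersion enters the stochastic component (the rnbinom() draw), not the link function. Note also that glm.nb() parameterizes dispersion as theta (rnbinom()'s size), not sigma squared:

```r
## Sketch: simulating predicted values from a negative binomial fit.
## nb.fit (from MASS::glm.nb) and x.vec (one row of covariate values,
## including the intercept) are assumed, hypothetical objects.
library(MASS)
sims   <- 1000
betas  <- mvrnorm(sims, coef(nb.fit), vcov(nb.fit))   # k+1 betas per draw
thetas <- rnorm(sims, nb.fit$theta, nb.fit$SE.theta)  # one dispersion per draw
                                                      # (normal approx.; drawing on the
                                                      # log scale avoids negative values)
mu     <- exp(betas %*% x.vec)                        # systematic component: link uses betas only
y.pred <- rnbinom(sims, size = thetas, mu = mu)       # stochastic component uses the dispersion
```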
I have a question that I wanted to ask Gary that has to do with
developing research questions in methodology itself, but I figured I
might as well do it here on the list in case others are interested.
I have a background in some "machine learning" (SVMs, the Ising model
and exact sampling, MCMC, k-nearest neighbor, HMMs), but have seen and
used those methods in relation to classic problems, e.g. vision, image
processing, robotics, and natural language processing.
I was wondering if you could provide insight into how to bring
complicated, difficult-to-understand methods from more "technical"
fields like computer science and statistics into political methodology.
I see two possible routes: (1) Do you stay on top of the statistics
and comp sci literature as it's developing and then say "oh this might
apply to this problem in political science"? Or (2) is it more often
the case that you see some problem in the political science lit or the
real world, and then you search for solutions to those in other
fields? How fruitful is it to look to what others have done outside
political science vs. spending the time to try to come up with your
own algorithm or model?
You seem to develop many methods on your own; do computer scientists
ever come look at your work and say, "wow, this solves a problem
we've been having"? I know we're all in universities because being
around other scholars makes everyone more productive, but how much do
the social scientists and the computer scientists interact, for
example?
To give a concrete example, I've been thinking about this as I go over
hidden Markov models in my artificial intelligence class (here at
Brown). So if you assume that some process is a Markov process with
unobserved states, one of the canonical problems is to figure out,
from some output sequence, the most likely state transitions and output
probabilities. To solve this, most people use a special case of the EM
algorithm, which is what brought these two separate classes together
for me. So I have this model that has been shown to be useful in gene
prediction, cryptanalysis, and such things, and it seems like there
could be some real political science applications but I'm not sure
what those are exactly. This is a case where I'm operating under the
first method from above-- I've found some cool solution to a problem
I'm not sure I have, although I often think the second method from
above makes more sense.
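For concreteness, the likelihood computation that underlies EM for HMMs (the forward algorithm) fits in a few lines of R; everything below (forward.lik, A, B, p0) is a toy, made-up illustration:

```r
## Forward algorithm sketch: likelihood of an observed symbol sequence under
## an HMM with transition matrix A, emission matrix B (rows = states,
## columns = symbols), and initial state distribution p0. Toy objects only.
forward.lik <- function(obs, A, B, p0) {
  alpha <- p0 * B[, obs[1]]                         # initialize with first emission
  for (t in seq_along(obs)[-1]) {
    alpha <- as.vector(alpha %*% A) * B[, obs[t]]   # propagate states, then emit
  }
  sum(alpha)                                        # total probability of obs
}
## e.g., with 2 states and 2 symbols:
## A  <- matrix(c(.9, .1, .2, .8), 2, byrow = TRUE)
## B  <- matrix(c(.5, .5, .1, .9), 2, byrow = TRUE)
## p0 <- c(.5, .5)
## forward.lik(c(1, 2, 2), A, B, p0)
```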
To give a slightly more personal explanation for my curiosity:
substantively I'm interested in what are often thought of as political
science questions, but I get this relative joy in reading statistics
and computer science articles, and I'm trying to figure out how to
bring that together. You've given us a lot of intuition and skills to
take substantive empirical questions and develop methods to solve the
wide variety of problems we might have, but how do you wake up one day
with the goal of creating a new clustering methodology or trying to
model something using an HMM when few people in political science even
know what those are?
Cheers,
Zac
Hi all,
A few reminders:
1. Like last week, we will be having only one section this week at 8-9pm.
This will likely be our last section, so we encourage you all to attend.
We will be covering a LOT of material. The general topic will be
"Introduction to Bayesian Statistics", and we will try to cover (time
permitting) missing data, hierarchical (random effects) models, and
item-response theory (ideal point estimation) models from both the Bayesian
and non-Bayesian perspectives. At the end, if there's time, I'll give a
brief overview of using BibTeX with LaTeX.
2. The party at Gary's house is this Saturday at noon. Please RSVP if you
haven't done so.
3. There will be no more problem sets for the rest of the semester. Your
replication papers are due on May 4 by 5pm. For extension school students
who are not doing the replication project, your final assignment will be
emailed out on Monday April 27. It will also be due on May 4 by 5pm.
4. Last, but most important, the course evaluations for this class are now
up online. Please take some time to fill them out. They are very valuable
and important to all of us.
--
Patrick Lam
Department of Government and Institute for Quantitative Social Science,
Harvard University
http://www.people.fas.harvard.edu/~plam
Dear all,
In thinking about recent lectures, I'm a bit confused. If I understood
correctly, Gary mentioned one day that, really, nobody is much
interested in non-causal associations. But if that's true, then I'm
unclear about the implications. In a garden-variety social science
journal article with some sort of regression, the authors will go
through the models they report, and comment on the various independent
variables that appear to have significant "effects" on the dependent
variable--which sounds like they're trying to talk about a number of
causal relationships simultaneously, consistent with the idea that
"nobody is much interested in non-causal associations".
However, the recent lectures about matching, checking for balance,
research design, post-treatment bias, counterfactuals, etc. suggest
that to talk about even just a *single* causal effect you need to bear
down, and check and do a whole bunch of things... which, as I see it,
few journal articles generally do.
So what gives? Is Gary saying that existing practice is just not up to
snuff--that they're being wildly unrealistic in trying to parse out
several causal relationships in a single article? What then is the
implication for our own best practice? Should one just pick out a
single covariate on which to focus (using matching, checking balance,
etc.)? Or should one go through any single regression model and, for
*every* (categorical) independent variable that appears to be
significant, use matching on all the other covariates?
As an example, suppose you stick in religion as a covariate in a
regression with countries as the unit of analysis, and while religion
isn't really what you're interested in, you happen to find an effect for,
say, being Catholic. To talk about the surprising, apparent "effect" of
religion on your outcome of interest, should you then use matching,
check whether counterfactuals are inside the convex hull, etc.?
Any clarification would be much appreciated.
- Malcolm