Hi! Sorry to keep bothering everyone. Patrick, thank you for your last
response; it was incredibly helpful. I have a question about deviance
residuals. Here is an example:
Deviance Residuals:
    Min      1Q  Median      3Q     Max
 -1.793  -1.037  -0.753   0.449   2.755
Compare that to another model we ran using the same distribution (negative
binomial) but more explanatory variables:
Deviance Residuals:
    Min      1Q  Median      3Q     Max
 -1.997  -0.960  -0.644   0.428   2.879
So how do we know which one has the better "fit"? Are bigger or smaller
numbers better?
Sorry to keep bugging everyone on the list!
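P.S. In case it helps to say what we're comparing: here is how we've been
pulling out overall fit numbers, as a sketch assuming the two models were
fit with MASS::glm.nb() and saved as m1 and m2 (the names are ours, not
from our actual script):

library(MASS)
deviance(m1); deviance(m2)   # residual deviance: lower means a closer fit
AIC(m1); AIC(m2)             # AIC penalizes the extra covariates
anova(m1, m2)                # likelihood-ratio test if the models are nested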
Hi all,
Quick optim question:
The paper Roman and I are replicating uses discrete-time event-history logistic
regression. We've been able to get the right point estimates and standard
errors thus far using Zelig's logit commands, but when we program our own logit
function, we run into some problems. Specifically, the Zelig and numeric MLE
estimates are nearly identical when we specify the model with all 16 covariates
but one. Adding the final covariate radically shrinks all parameter estimates.
The final covariate is on a greater scale (thousands) than the others (tenths
to tens). If we divide it by 1000 and then run the regression, we get the right
estimates for the other covariates and (unsurprisingly) an estimate 1000 times
greater than the correct value for the final covariate. My guess is that this
has something to do with how optim is performing optimization. Does anyone have
any suggestions?
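One thing we're planning to try, in case it matters for the answer: optim's
parscale control is meant to handle parameters that live on very different
scales. A sketch, assuming our own log-likelihood function logit.ll and data
X and y (names are ours, and this is untested):

p <- ncol(X)
scales <- apply(abs(X), 2, max)    # rough magnitude of each column
opt <- optim(par = rep(0, p), fn = logit.ll, y = y, X = X,
             method = "BFGS", hessian = TRUE,
             control = list(fnscale = -1,           # maximize
                            parscale = 1 / scales)) # big column -> small coef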
Many thanks,
chris
Hi all,
I'm trying to read the source code for some functions that are
included in the package "nlme". Specifically, I'd like to see the
source code for "lme" and "nlme". However, instead of a long bunch of
code (as I get for example from typing in "lm" or "survreg"), I just
get the short bits of code I've pasted in below.
Can anybody tell me how I can see the source code for these functions?
I don't really get the idea behind "UseMethod(X)"--it just seems to
hide what's really going on.
I've even looked through the various files associated with R on my
computer's hard drive, but I can't figure out where to find the
functions.
Thanks in advance for any assistance,
Malcolm
> lme
function (fixed, data = sys.frame(sys.parent()), random,
    correlation = NULL, weights = NULL, subset,
    method = c("REML", "ML"), na.action = na.fail,
    control = list(), contrasts = NULL, keep.data = TRUE)
UseMethod("lme")
<environment: namespace:nlme>
> nlme
function (model, data = sys.frame(sys.parent()), fixed, random = fixed,
groups, start, correlation = NULL, weights = NULL, subset,
method = c("ML", "REML"), na.action = na.fail, naPattern,
control = list(), verbose = FALSE)
{
UseMethod("nlme")
}
<environment: namespace:nlme>
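(lme is an S3 generic: UseMethod("lme") dispatches to a class-specific
method, here lme.formula, and the method is where the actual code lives. A
sketch of how to pull it up:)

> methods("lme")               # list the methods behind the generic
> getAnywhere("lme.formula")   # print a method even if it isn't exported
> nlme:::lme.formula           # or reach into the namespace directly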
Last thing: we get these warning messages when we run negbin in Zelig:
1: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > 2) :
  iteration limit reached
2: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > 2) :
  iteration limit reached
Does that mean there are not enough observations for the number of
explanatory variables? We don't get it when we use fewer explanatory
variables.
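A note on where that warning comes from: it is theta.ml (the routine that
estimates the negative binomial dispersion parameter theta) hitting its
iteration cap, not a hard error about sample size. It often means the theta
estimate has not converged, sometimes because theta is drifting toward very
large values (i.e., little overdispersion). The cap can be raised when
fitting directly with MASS; a sketch with placeholder names (dat, x1, x2 are
ours):

library(MASS)
fit <- glm.nb(y ~ x1 + x2, data = dat,
              control = glm.control(maxit = 100))  # default maxit is 25
fit$theta      # the estimated dispersion parameter
fit$SE.theta   # and its standard error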
Hi everyone!
JeeHye and I are working on a paper that uses a negative binomial model.
The dependent variable is the number of uses of force in a given period of
time (it might have been every three months or something like that...). So
this seems like it might be pretty NOT iid, and that is why they use the
neg bin.
So we ran this in R using Zelig with robust clustered sandwich standard
errors, and their key explanatory variable (the percent in Congress from the
same political party as the president during the period of interest) pretty
much did not come out significant. So we thought, okay, let's try Stata. We
did, and there was an option to cluster on a particular variable...they
clustered on "president". We got their results: their explanatory variable
of interest came out significant. So why was this? Why would you cluster
your standard errors on something? When is it okay to do this? When is it
not? To be honest, what does clustering your standard errors on something
even mean?
Also, how would we go about checking for overdispersion? We want to try the
Poisson model and then compare it to the neg bin, to see if there even is
overdispersion (which there must be). But how do you check for it? And are
there other ways of controlling for dependence between observations in event
count models? With other regression types there are splines and time dummy
variables; could you do something similar with an event count model?
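On the overdispersion check, here is one textbook-style comparison we found,
sketched with placeholder names (y, x1, x2, dat are ours) and offered as a
starting point rather than a definitive recipe:

library(MASS)
pois <- glm(y ~ x1 + x2, family = poisson, data = dat)
nb   <- glm.nb(y ~ x1 + x2, data = dat)
# Pearson dispersion statistic: roughly 1 under the Poisson assumption,
# well above 1 suggests overdispersion
sum(residuals(pois, type = "pearson")^2) / pois$df.residual
# Likelihood-ratio test of Poisson against neg bin; theta sits on the
# boundary of the parameter space under the null, so halve the p-value
lr <- as.numeric(2 * (logLik(nb) - logLik(pois)))
pchisq(lr, df = 1, lower.tail = FALSE) / 2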
Sorry about all the questions...I guess it would just be helpful to hear
back from some of you regarding whatever it is that you know about a subset
of them, all of them, or none of them...whatever comments you have.
Hope everyone is having a great break!
sparsha
Hi all,
There have been requests for help on possible problems with the replication
projects. Because of this, I will hold office hours this week at my usual
time, Wed 2-4. Feel free to come by. A couple of notes:
1) You should continue to seek help from the listserv, even for questions
related to your replications, as it is likely that others are having
similar problems.
2) If you're dealing with models that you are unfamiliar with, you should
always start by reading up on them in other references or textbooks.
3) If you run into coding problems, such as when using Zelig or other R
packages, always try debugging yourself first. It is likely that you made
an error somewhere in the process.
Let us know if there are other questions.
Dear list....
We are seeking to replicate an ordered probit model. Among the independent
variables are Democratic ID, Republican ID, and Independent ID, as well as
North Central, South, West, and Northeast. All of these are dummy variables.
Below the table it states, "The reference variable for party identification
is 'Independent' and for region it is 'North East'." In running our
regression we have not specified which variable is the reference variable.
Should we do so and, if so, do you know how?
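One possibility, sketched in case it helps: with separate dummy variables,
the reference category is simply whichever dummy is left out of the
regression, so omitting the Independent and North East dummies should match
the table. Alternatively, if the dummies are collapsed into factors,
relevel() sets the baseline. A sketch with made-up names (dat, party,
region, vote are ours):

dat$party  <- relevel(factor(dat$party),  ref = "Independent")
dat$region <- relevel(factor(dat$region), ref = "North East")
library(MASS)
fit <- polr(vote ~ party + region, data = dat, method = "probit")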
Also, we can get close to the estimates, with the correct significance on
the different variables, but we can't quite get there. Is it possible to
give some indication of how close is close enough?
Thanks
Kyle & Sam
I was trying to use that xtable thing on Zelig outputs, specifically for
2.1, but it doesn't work for Zelig. Is there something similar that turns
Zelig summaries into sweet LaTeX code so we don't have to do it manually?
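One untested workaround, assuming the Zelig summary stores a coefficient
matrix the way a glm summary does (z.out is a made-up name for the zelig()
fit):

library(xtable)
s <- summary(z.out)
xtable(coef(s))   # hand the coefficient matrix straight to xtable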
Pardon the interruption. I meant for this question to go to the list.
Thanks!
---------- Forwarded message ----------
From: Lauren Peritz <lauren.peritz at gmail.com>
Date: Wed, Mar 18, 2009 at 7:31 PM
Subject: Fwd: [gov2001-l] PS6 1.3
To: Patrick Lam <plam at fas.harvard.edu>
Hi Patrick,
Is there a good source for commonly used link functions? For the
t-distribution in question 1.3, I've found the pre-programmed R function.
I've also found some generalized formulae employing the gamma function.
However, I'm still confused, despite your hints below. Any other advice?
Thanks,
Lauren
# "t-function" log-likelihood
tfunct.ll <- function(par, X, y){
#par=parameters
#X=matrix of covariates with intercept
#y=dependent variable
# how do we link parameters to distribution of y? delta <- (mu - m0) *
sqrt(n)/sigma?
delta <- par[1:ncol(X)]
out<-sum(y*log(pt(X,3,delta))+(1-y)*log(pt(X,3,delta)))
return(out)
}
---------- Forwarded message ----------
From: Patrick Lam <plam at fas.harvard.edu>
Date: Sun, Mar 15, 2009 at 1:45 AM
Subject: Re: [gov2001-l] PS6 1.3
To: gov2001-l at lists.fas.harvard.edu
Hi Olena,
I'd go back and perhaps look at my section notes. Ask yourself the
following questions:
What are the parameters I'm trying to find here?
How do they relate to x and y?
How do I link my parameters to the distribution of y?
Also, if you managed to do the scobit problem correctly, then this isn't
all that much different from that problem.
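For what it's worth, one reading of those hints, offered strictly as a
sketch and not the official answer: treat the t CDF itself as the link, so
that pi_i = Pr(y_i = 1) = pt(X_i %*% beta, df = 3), with beta the only
parameters to estimate:

t.ll <- function(par, X, y) {
  beta <- par                   # one coefficient per column of X
  p <- pt(X %*% beta, df = 3)   # systematic component through the t CDF
  sum(y * log(p) + (1 - y) * log(1 - p))
}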
On Sat, Mar 14, 2009 at 9:11 PM, Olena Ageyeva <eageeva at hotmail.com> wrote:
> Has anyone figured out what the formula is?
> I tried to implement it exactly as it says in the PS, at least as I
> understood it, and I really doubt that it's right.
>
> This is what, from my understanding, it should look like:
>
> ll.t <- function(par, x, y)
> {
>   delta <- par
>   out <- sum(y*log(pt(x, 3, delta)) + (1-y)*log(pt(x, 3, delta)))
>   return(out)
> }
>
> I have tried x%*%beta instead of x as well and have gotten weird results
> which did not look right to me.
>
> What am I doing wrong?
> I have no study partner and am really tight on time. Could anyone help
> please?
> Thanks!
> Sincerely,
> Olena Ageyeva
--
Patrick Lam
Department of Government and Institute for Quantitative Social Science,
Harvard University
http://www.people.fas.harvard.edu/~plam
> One thing jumps out...you seem to be missing a fourth "0" in your
> optim() line. It should read:
> optim.out <- optim(par = c(0,0,0,0), scobit.ll, y = y, X = X,
> method="BFGS",control=list(fnscale = -1), hessian = T)
In addition, I think you may have both some missing and extra negative
signs in this part of your log-lik function:
outA <- sum(-y*log((1+exp(-mu))^a))
outB <- sum((1-y)*log(1-(1+exp(-mu))^a))
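For reference, if the intended systematic component is the scobit form
Pr(y_i = 1) = (1 + exp(-mu_i))^(-a), one sign-safe way to write those two
pieces (a sketch in our notation, not necessarily the problem set's):

p    <- (1 + exp(-mu))^(-a)       # scobit probability that y = 1
outA <- sum(y * log(p))
outB <- sum((1 - y) * log(1 - p))
ll   <- outA + outB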