Gov2001 March 2008

gov2001@lists.gking.harvard.edu

36 participants
58 discussions

[gov2001-l] including NAs in a residual vector

by ltai＠post.harvard.edu

Hello, Suppose I run an lm model on data with NAs. It won't (and shouldn't) produce residuals for entries that are excluded or that already have NAs in them. The residual vector skips the numbers for these entries. How could I produce a vector of residuals that includes NAs in the spots that have been skipped? Thanks, Laurence
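
One way to get this, sketched on made-up data (the data frame `dat` below is hypothetical): fit the model with `na.action = na.exclude`. The fit itself is unchanged, but `residuals()` and `fitted()` then return one entry per original row, with NA wherever a row was dropped.

```r
# Toy data (hypothetical) with missing values in both the outcome and a predictor
dat <- data.frame(
  y = c(1.2, NA, 3.1, 4.8, 5.1, 6.3),
  x = c(1, 2, NA, 4, 5, 6)
)

# Default na.action (na.omit): residuals only for the complete cases
fit_omit <- lm(y ~ x, data = dat)
length(residuals(fit_omit))

# na.exclude: same coefficients, but residuals() pads the dropped rows with NA
fit_excl <- lm(y ~ x, data = dat, na.action = na.exclude)
res <- residuals(fit_excl)
length(res)      # one entry per row of dat
is.na(res)       # TRUE for the rows that were dropped
```

Because `res` lines up with the rows of `dat`, it can be attached directly, e.g. `dat$res <- res`.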

16 years, 1 month

[gov2001-l] relative frequency histogram?

by keith.schnakenberg＠gmail.com

This is not for the assignment for this class, but it is an R question that I thought someone might know. I just want something like a histogram (or maybe a polygon), but with relative frequencies rather than frequencies or densities. Is there a straightforward way to make this happen?
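
A minimal sketch of one approach, using simulated data (`x` is hypothetical): compute the histogram without drawing it, rescale the counts to proportions, and then plot the modified object.

```r
set.seed(2001)
x <- rnorm(500)                          # hypothetical data

h <- hist(x, plot = FALSE)               # compute the bins without drawing
h$counts <- h$counts / sum(h$counts)     # convert counts to proportions

# plot() on a "histogram" object draws h$counts when freq = TRUE
plot(h, freq = TRUE, ylab = "Relative frequency",
     main = "Relative frequency histogram")
```

With equal-width bins this has the same shape as the frequency histogram; only the vertical axis changes.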

16 years, 1 month

[gov2001-l] Replication

by jhainmueller＠gmail.com

the closer the better. ideally the results should match exactly.

From: Aakanksha Pande [mailto:aaka.pande at gmail.com]
Sent: Thursday, March 27, 2008 4:31 PM
To: Jens Hainmueller
Subject: Re: Replication

thanks jens. i was wondering how close to the published estimates we have to be? i get almost similar estimates but not quite the same numbers.
aaka

On Thu, Mar 27, 2008 at 3:26 PM, Jens Hainmueller <jhainmueller at gmail.com> wrote:

Aaka, It's best to send such questions to the list. No clue why you are getting this error (hard to tell from the given information; maybe a variable is missing from the dataset because its name is misspelled). However, I am also not clear on what exactly you want to do. In this model everything outside the tag() is a FE and everything inside the tag() is a RE. Take a look at the documentation on how FE and RE are specified. Also the Bates book has many examples of models that combine RE and FE.
Hth, Jens

From: Aakanksha Pande [mailto:aaka.pande at gmail.com]
Sent: Thursday, March 27, 2008 3:37 PM
To: Jens Hainmueller
Subject: Re: Replication

hi jens
following up on your advice i am trying to do a FIXED effects model now in R and keep getting this error. any ideas?

    a <- zelig(eabs.100 ~ male + years.here + years.here.sq + born.dist + doctor +
               night.shift + field.visit + provided.house + inspec.2m +
               dist.min.health + toilets.avail + water.avail + dist.road +
               round1 + round2 + Monday.int + Tuesday.int + Wednesday.int +
               Thursday.int + Friday.int + phc + tag(1|dist),
               data=data, model="ls.mixed")

    Error in `[.data.frame`(d, , all.vars(as.expression(formula))) : undefined columns selected

the model is say y~x1+x2 with fe at x3 (district level)
thanks! Aaka

On Wed, Mar 26, 2008 at 10:56 AM, Jens Hainmueller <jhainmueller at gmail.com> wrote:

Zelig's ls.mixed does RE models, drawing from the lme4 package. The bibliography in the zelig documentation on this model also has many helpful references (i.e. the Bates papers and book on this, with many examples). there is plenty of documentation out there. you can also use the nlme package. I think lme4 is the latest version and they may have changed the syntax a little bit, but here is a quick example from nlme:

    library(nlme)
    # FE for age and experience, random intercept for classroom
    summary(lme(fixed=score~age+experience, random=~1|classroom, data=data2))
    # FE for experience, random coefficient for age and random intercept for classroom
    summary(lme(fixed=score~age+experience, random=~age|classroom, data=data2))

hth, jens

From: Aakanksha Pande [mailto:aaka.pande at gmail.com]
Sent: Wednesday, March 26, 2008 11:40 AM
To: Jens Hainmueller
Cc: larson.jenn at gmail.com; Gary King
Subject: Re: Replication

Thanks Jens. I am trying to replicate my work in STATA in R. I am trying to do a hierarchical linear model (with random effects) and was wondering how to do that in R?
best
Aaka
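
For the fixed-effects case asked about in this thread (y ~ x1 + x2 with district-level effects), one plain-R option is to enter the district as a factor in `lm()`, which gives each district its own intercept shift. The sketch below uses simulated data; every variable name and value in it is hypothetical and not taken from the replication dataset.

```r
set.seed(42)
# Hypothetical data: 5 districts, 40 observations each
dat <- data.frame(
  dist = factor(rep(1:5, each = 40)),
  x1   = rnorm(200),
  x2   = rnorm(200)
)
dist_effect <- c(-1, 0, 0.5, 1, 2)[as.integer(dat$dist)]
dat$y <- 1 + 0.5 * dat$x1 - 0.3 * dat$x2 + dist_effect + rnorm(200)

# District fixed effects: one dummy per district (the first level is the baseline)
fe <- lm(y ~ x1 + x2 + factor(dist), data = dat)
coef(fe)[c("x1", "x2")]
```

This differs from the tag(1|dist) specification, which treats the district intercepts as draws from a common distribution instead of estimating each one freely.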

16 years, 1 month

[gov2001-l] sampling weights

by keith.schnakenberg＠gmail.com

I am still trying to work with sampling weights. I found that I can apply the weights using library(survey); svydesign() and then do parameter estimates using svymle(). This is fine, but I still wonder if there is an easier way to do this. Is there some argument that I can pass to zelig to make it estimate the model with these sampling weights? I think the likelihood that I will make an error when I try to simulate things is much smaller if I can do it in zelig.
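
A rough sketch of how the pieces relate, on simulated data (the data frame `dat` and the weight column `w` are hypothetical). Passing the weights to `glm()` should reproduce the design-based point estimates, but its standard errors treat the weights as precision weights, so for inference the `svyglm()` fit is the one to trust. The survey part only runs if that package is installed.

```r
set.seed(1)
# Hypothetical data with a sampling-weight column w
dat <- data.frame(x = rnorm(200), w = runif(200, 0.5, 3))
dat$y <- rbinom(200, 1, plogis(0.5 * dat$x))

# Weighted logit via glm(); quasibinomial avoids the non-integer-weights warning
fit_glm <- glm(y ~ x, family = quasibinomial, data = dat, weights = w)
coef(fit_glm)

if (requireNamespace("survey", quietly = TRUE)) {
  des     <- survey::svydesign(ids = ~1, weights = ~w, data = dat)
  fit_svy <- survey::svyglm(y ~ x, design = des, family = quasibinomial())
  # Compare point estimates, then look at the design-based standard errors
  print(all.equal(coef(fit_glm), coef(fit_svy), tolerance = 1e-6))
  print(summary(fit_svy))
}
```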

16 years, 1 month

[gov2001-l] Random effects logit

by jdkuo＠fas.harvard.edu

Hey yall, The author of the paper we're replicating did a logit model in stata using xtlogit with, in his words, "random effects for country code." I haven't been able to replicate his numbers in either stata or R... In R, I've been doing:

    a <- zelig(y~x1+x2...+tag(1|ccode), data=dat, model="logit.mixed")

But I'm not sure if finding a different intercept for each country (which is what the above code does) is the same as what the author did. Can anybody clarify what the author meant? (we don't have his exact stata code - only what he provided above.)
thanks, didi & shahrzad
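
Assuming the author used xtlogit's random-effects (re) estimator, the model is a logit with a normally distributed random intercept per country, which is the same structure as tag(1|ccode). Small numerical differences can come from how the likelihood integral is approximated. The sketch below fits that model directly with lme4 on simulated data; the data, the variable names, and the choice of 12 quadrature points are all illustrative assumptions, and the model is only fitted if lme4 is installed.

```r
set.seed(7)
# Hypothetical panel: 20 countries, 30 observations each
dat <- data.frame(ccode = factor(rep(1:20, each = 30)), x1 = rnorm(600))
u <- rnorm(20, sd = 0.8)     # country-level random intercepts
dat$y <- rbinom(600, 1, plogis(-0.5 + 0.7 * dat$x1 + u[as.integer(dat$ccode)]))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Random-intercept logit; nAGQ > 1 requests adaptive Gauss-Hermite quadrature
  fit <- lme4::glmer(y ~ x1 + (1 | ccode), data = dat,
                     family = binomial, nAGQ = 12)
  print(lme4::fixef(fit))
}
```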

16 years, 1 month

[gov2001-l] multinomial logit questions

by jvaynman＠fas.harvard.edu

We are trying to estimate a multinomial logit model with robust standard errors (the cases are clustered by country). Can anyone help with these questions...

1) We have a dependent variable with three factors. It seems that Zelig takes one as the base and estimates coefficients and SE for the other two. Is there any way to know which ones it is taking? Is it possible to change which one the base should be and which should be estimated? The Zelig mlogit function does not seem to take a "robust=" argument. Does this mean it cannot do robust SE or is there some other way to set it up?

2) We also tried to use the multinom function in the nnet package. This function automatically gives us the right coefficients for the two factors we want, but the SE's are wrong, probably because they are not the robust ones. Is there a way to estimate robust SE's in the multinom function, or perhaps in another package or function? We have not found other options, but perhaps someone else has.

Thanks!
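
On the baseline question, a small sketch for nnet's multinom() only (simulated data, hypothetical names): multinom() uses the first level of the outcome factor as the reference category, so `relevel()` changes which category is the base. This does not address the robust standard errors, and it makes no claim about which level Zelig picks.

```r
set.seed(3)
# Hypothetical three-category outcome
d <- data.frame(
  y = factor(sample(c("a", "b", "c"), 300, replace = TRUE)),
  x = rnorm(300)
)
levels(d$y)[1]                    # current reference category

d$y <- relevel(d$y, ref = "b")    # make "b" the baseline instead
levels(d$y)

if (requireNamespace("nnet", quietly = TRUE)) {
  fit <- nnet::multinom(y ~ x, data = d, trace = FALSE)
  print(rownames(coef(fit)))      # the two non-baseline categories
}
```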

16 years, 1 month

[gov2001-l] subsetting syntax error

by keith.schnakenberg＠gmail.com

I am trying to subset my sample based on the logical operators below. If anybody could shed some light on why I am getting syntax errors, I would be grateful.

    sample=brfss[HADHYST2==2 && CTYCODE != 777 && CTYCODE != 999 $$ AGE >= 40]
    Error: syntax error

Thanks, Keith
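
The syntax error itself most likely comes from `$$`, which is not an R operator. Three further points would bite once that is fixed: `&&` is not elementwise (use `&` for vectors), the column names have to be looked up inside the data frame, and row selection needs a trailing comma. A sketch on a made-up stand-in for brfss:

```r
# Hypothetical stand-in for the brfss data frame
brfss <- data.frame(
  HADHYST2 = c(2, 2, 1, 2, 2),
  CTYCODE  = c(25, 777, 17, 999, 9),
  AGE      = c(45, 50, 60, 70, 38)
)

# Elementwise &, columns evaluated inside the data frame, comma to select rows
keep <- with(brfss, HADHYST2 == 2 & CTYCODE != 777 & CTYCODE != 999 & AGE >= 40)
smp  <- brfss[keep, ]

# subset() does the same and also drops rows where the condition is NA
smp2 <- subset(brfss, HADHYST2 == 2 & !(CTYCODE %in% c(777, 999)) & AGE >= 40)
smp2
```

The result is stored as `smp` here because `sample` is also the name of a base R function.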

16 years, 1 month

[gov2001-l] Logical Statements

by brown4＠fas.harvard.edu

Low priority question (I've found a work-around), but I'd like to know why I'm having problems with this R code:

    data3$epop.inequal.11[data3$ordepop == 1 & data3$ordinequality == 1] <- 1
    data3$epop.inequal.11[data3$ordepop != 1 | data3$ordinequality != 1] <- 0
    data3$epop.inequal.11[data3$ordepop == NA | data3$ordinequality == NA] <- NA

What I'm trying to do with this is create a new dummy variable based on the values of two ordinal variables. The two that it's drawing from are scaled 1-3 (low, medium, high), and this particular dummy variable is supposed to be a 1 when both of the ordinal variables = 1, otherwise 0. It's correctly re-coding the 1s and 0s, but the second line is causing some of the missing values to be coded as missing and others to be coded as 0s. The following code works, but it's not as straightforward or as easy to describe. Why is the first set of commands working mostly correctly but not fully?

    data3$epop.inequal.11[data3$ordepop == 1 & data3$ordinequality == 1] <- 1
    data3$epop.inequal.11[data3$ordinequality != 1] <- 0
    data3$epop.inequal.11[data3$ordepop != 1 & data3$ordinequality == 1] <- 0
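
The behaviour follows from how R treats NA in comparisons: `x == NA` is always NA (never TRUE), so the third line selects nothing, while `NA | TRUE` is TRUE, so in the second line a row with one missing variable and one variable not equal to 1 is coded 0. A sketch with `is.na()` on made-up data (this `data3` is a hypothetical stand-in):

```r
# Hypothetical stand-in for data3
data3 <- data.frame(
  ordepop       = c(1, 1, 2, NA, NA, 3),
  ordinequality = c(1, 2, 1, 1, 2, NA)
)

NA == NA      # NA: a comparison with NA is never TRUE
NA | TRUE     # TRUE, which is why some missing rows were coded 0
NA | FALSE    # NA

# Code the dummy from the AND condition
both_low <- data3$ordepop == 1 & data3$ordinequality == 1
data3$epop.inequal.11 <- as.numeric(both_low)

# Then force NA wherever either input is missing, testing with is.na()
data3$epop.inequal.11[is.na(data3$ordepop) | is.na(data3$ordinequality)] <- NA
data3$epop.inequal.11
```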

16 years, 1 month

[gov2001-l] SPSS data

by jill.hohenstein＠kcl.ac.uk

Hi all, We are working on our replication project and have loaded an SPSS file into R. We are attempting to select rows based on values of three variables. The first two have values of 1 and 2. We can easily select for the value of 1. The third variable has values of 0 and 1. When we try to select for any value (either 0 or != 1), we get an error message that reads:

    Error in if (plomin[i, 11] == TRUE) { : missing value where TRUE/FALSE needed

This is what our code looked like:

    vec <- c()
    k <- 0
    for (i in 1:nrow(plomin)) {
      if (plomin[i,7] == 1 & plomin[i,3] == 1 & plomin[i,11] == 0) {
        k <- k + 1
        vec[k] <- i
      }
    }
    mat.corr.gesl <- plomin[vec, c(20,21)]

Does anyone have any idea why we should be getting this message? Or, alternatively, is there a better way of selecting for values of our variables? Thanks!
Best wishes, Jill and Jeremy (Hodgen)
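
That message usually means the condition inside `if` evaluated to NA, which would happen if column 11 has missing values in some rows. A vectorized sketch that avoids the loop, on a made-up stand-in for plomin (column names and values are hypothetical): build the logical condition for all rows at once and let `which()` keep only the positions that are TRUE.

```r
# Hypothetical stand-in for the imported SPSS data
plomin <- data.frame(
  v7      = c(1, 1, 2, 1, 1),
  v3      = c(1, 1, 1, 2, 1),
  v11     = c(0, NA, 0, 0, 1),
  score_a = c(10, 11, 12, 13, 14),
  score_b = c(20, 21, 22, 23, 24)
)

# The condition is NA wherever it cannot be decided because v11 is missing
cond <- plomin$v7 == 1 & plomin$v3 == 1 & plomin$v11 == 0
cond

# which() returns only the TRUE positions and skips NA
rows <- which(cond)
mat.corr.gesl <- plomin[rows, c("score_a", "score_b")]
mat.corr.gesl
```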

16 years, 1 month

[gov2001-l] by argument in Zelig

by keith.schnakenberg＠gmail.com

Let's say I have a variable called "variable," which is a dummy variable, and I want to do a logistic regression only for cases in which variable=1. I give Zelig the following command:

    m1 <- zelig(Y ~ x1 + x2 + x3, model="logit", data="mydata", by="variable")

And Zelig tells me: "Error in dat[, by] : incorrect number of dimensions"

Can anybody shed some light on what I'm doing wrong? Thanks, Keith
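
The error is consistent with `data` being given as the character string "mydata" and not the data frame itself: a string has no columns, so `dat[, by]` fails. Separately, as far as I understand it, `by` fits a separate model for each level of the variable and does not restrict the sample to one level. A plain `glm()` sketch of the restricted fit, on simulated data (everything in `mydata` below is hypothetical):

```r
set.seed(11)
# Hypothetical data frame; `variable` is the 0/1 dummy from the question
mydata <- data.frame(
  x1 = rnorm(300), x2 = rnorm(300), x3 = rnorm(300),
  variable = rbinom(300, 1, 0.5)
)
mydata$Y <- rbinom(300, 1, plogis(0.4 * mydata$x1 - 0.2 * mydata$x2))

# Pass the data frame object (mydata), not the string "mydata",
# and restrict the estimation sample to variable == 1
m1 <- glm(Y ~ x1 + x2 + x3, family = binomial,
          data = mydata, subset = variable == 1)
nobs(m1) == sum(mydata$variable == 1)
```

The same restriction can be made before fitting by passing `mydata[mydata$variable == 1, ]` as the data argument.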

16 years, 1 month
