Hey Jens,
When we are simulating the pooled expected probabilities of yi*
falling into any of the four categories, we need to simulate the betas
in order to get 1000 predicted yi*s. However, if we use the expected
taus to determine which category each yi* falls into, it seems like we
would underestimate the variance of the expected probabilities, because
the taus are also uncertain. Do we need to simulate the taus as well
and use a new set of taus for every set of betas?
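For what it's worth, here is a minimal sketch of the joint-draw idea on made-up data (all names and numbers illustrative). It uses MASS::polr, whose vcov() covers both the betas and the cutpoints, so every simulated beta automatically comes with its own set of taus:

```r
## Illustrative sketch: draw betas and taus jointly from one ordered-probit fit
library(MASS)

set.seed(02138)
n <- 500
x <- rnorm(n)
ystar <- 0.8 * x + rnorm(n)  # latent yi*
y <- cut(ystar, breaks = c(-Inf, -1, 0, 1, Inf),
         labels = 1:4, ordered_result = TRUE)
fit <- polr(y ~ x, method = "probit", Hess = TRUE)

## vcov(fit) spans the betas AND the cutpoints (fit$zeta), so one draw from
## the multivariate normal gives a new beta and a new set of taus together:
sims <- mvrnorm(1000, mu = c(coef(fit), fit$zeta), Sigma = vcov(fit))

## Category probabilities at x0 = 1 under each simulated (beta, tau) pair:
x0 <- 1
mu <- sims[, "x"] * x0
p1 <- pnorm(sims[, "1|2"] - mu)
p2 <- pnorm(sims[, "2|3"] - mu) - pnorm(sims[, "1|2"] - mu)
p3 <- pnorm(sims[, "3|4"] - mu) - pnorm(sims[, "2|3"] - mu)
p4 <- 1 - pnorm(sims[, "3|4"] - mu)
colMeans(cbind(p1, p2, p3, p4))  # pooled expected probabilities
```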
PS--In part b of question 1, you ask us to simulate first differences
as COOP goes from 0 to 1. Did you mean 1 to 2 (COOP never equals 0)?
Jon and Jane
Hi all,
I have two questions about dealing with missing values in R - any advice would
be appreciated!
1. Is there a way to force model.matrix() to include cases with NAs? I want to
automatically create dummy variables for an entire data set with a lot of
missing values (by default, model.matrix drops all those cases with missing
values).
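On question 1, one idiom that seems to work is to build the model frame yourself with na.action = na.pass before calling model.matrix(), so the NA rows are carried through instead of dropped (toy data for illustration):

```r
## Toy data: a factor with a missing value
df <- data.frame(f = factor(c("a", "b", NA, "a")))

## Default behavior drops the NA row:
nrow(model.matrix(~ f, data = df))

## Pass the NAs through model.frame() yourself to keep all rows,
## with NA entries in the dummy columns for the missing cases:
mf <- model.frame(~ f, data = df, na.action = na.pass)
mm <- model.matrix(~ f, mf)
nrow(mm)
```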
2. I am trying to run a logit model in Zelig that includes weights. It works
fine if there are no missing values, but if there are any missing values
(either for the outcome or a predictor), I can't run the model with weights. I
hope there is an easy way to handle this that doesn't involve pre-deleting all
cases in the data set (or the weight vector) with missing observations, as we
need to loop over several models and have a lot of covariates. The following
code demonstrates the problem:
library(Zelig)
#make up some data
y <- rep(c(1, 0), 60)
x <- rep(c(1, 1, 0), 40)
wt <- rep(c(1, 2, 3, 4), 30)
d <- data.frame(y, x, wt)
#run analyses on data without any missing values
zelig(y ~ x, model = "logit", data = d) #simple model works
zelig(y ~ x, model = "logit", weights = wt, data = d) #adding weights works
#now, add a missing value, and re-run analyses
d2 <- d
d2[4, 1] <- NA
zelig(y ~ x, model = "logit", data = d2) #works
zelig(y ~ x, model = "logit", weights = wt, data = d2) #fails
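One workaround sketch, shown with plain glm() (which Zelig's logit wraps), under the assumption that the failure comes from the weights vector not being subset along with the rows dropped for missingness: keep the weights inside the data frame and drop incomplete rows in one step, which also loops easily over model specifications.

```r
## Rebuild the example data with one missing outcome
y <- rep(c(1, 0), 60)
x <- rep(c(1, 1, 0), 40)
wt <- rep(c(1, 2, 3, 4), 30)
d2 <- data.frame(y, x, wt)
d2[4, 1] <- NA

## Subset rows and weights together, since wt lives in the same data frame
ok <- complete.cases(d2[, c("y", "x")])
fit <- glm(y ~ x, family = binomial, data = d2[ok, ], weights = wt)
coef(fit)
```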
Thanks in advance and I'm sorry if some received this twice,
Dan
No, you cannot test the exclusion restriction; if you invented such a test, it
would be Nobel Prize material. That is why IV papers always end up in
endless discussions about whether an instrument is valid.
What you can test is:
First-stage identification: regress the instrumented variable on the
instrument and the covariates and see whether the instrument has some
predictive power (some say an F statistic above 10 is good; others say it
depends). But if your instrument is not at all correlated with the
instrumented variable, you enter the ugly world of weak instruments and you
have to switch estimators (see Imbens and Rosenbaum, "Robust, accurate
confidence intervals with a weak instrument: quarter of birth and education").
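A minimal sketch of such a first-stage check on simulated data (all variable names and effect sizes are illustrative, not from anyone's actual model):

```r
## First-stage check: does the instrument z predict the endogenous variable d?
set.seed(42)
n <- 1000
z <- rnorm(n)                        # instrument
w <- rnorm(n)                        # covariate
d <- 0.5 * z + 0.3 * w + rnorm(n)    # instrumented (endogenous) variable

first <- lm(d ~ z + w)
summary(first)$coefficients["z", ]   # instrument's predictive power

## Rule-of-thumb F statistic for a single instrument: square of its t value
f_z <- (summary(first)$coefficients["z", "t value"])^2
f_z
```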
Sensitivity to violations of the exclusion restriction: induce some
correlation between the instrument and the outcome (conditional on the other
variables) and see how much violation of the exclusion restriction you need
to make the effect go away. There are a couple of such tests available (see
Wand, J. 2002. "Evaluating the Consequences of Assumptions Using
Simulations," The Political Methodologist, vol. 11, no. 1, p. 21; or Robins,
J.M., Scharfstein, D., and Rotnitzky, A. 1999. "Sensitivity Analysis for
Selection Bias and Unmeasured Confounding in Missing Data and Causal
Inference Models," in Statistical Models in Epidemiology: The Environment
and Clinical Trials, Halloran, M.E. and Berry, D., eds. New York:
Springer-Verlag, pp. 1-94).
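A minimal sketch of that sensitivity idea on simulated data (illustrative numbers; the true effect of d on y is set to 1, and gamma is the induced direct effect of the instrument on the outcome):

```r
## Sensitivity sketch: let the instrument z affect the outcome directly with
## strength gamma (a violation of the exclusion restriction) and watch the
## simple IV (Wald) estimate drift away from the true effect of 1.
set.seed(123)
n <- 10000
z <- rnorm(n)                 # instrument
u <- rnorm(n)                 # unobserved confounder
d <- 0.5 * z + u              # endogenous treatment

iv_est <- function(gamma) {
  y <- d + gamma * z + u + rnorm(n)   # gamma > 0 violates exclusion
  cov(z, y) / cov(z, d)               # simple IV estimate of d's effect
}

sapply(c(0, 0.1, 0.25), iv_est)  # bias grows with the size of the violation
```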
jens
From: Viridiana Ríos [mailto:viridianarios at gmail.com]
Sent: Tuesday, April 08, 2008 7:34 PM
To: Jens Hainmueller; Jose Luis Romo Cruz
Subject: Instrument validity
Hi Jens,
I am trying to use a new instrument that theoretically should work better. I
run my new model and apparently, results are better than in the original
paper. However, there is a possibility that the instrument is in fact
correlated with y. Is there a causality test or something I can use to prove
the validity of my instrument?
Best,
Viridiana Ríos
617-997-2471
Greetings 2001,
Now everyone who needs a replication assignment should have their
replication assignment. A couple of notes:
Everyone who is either (1) writing a replication paper for the course or (2)
a distance student taking the course for credit should have received a
replication project. If you expected to receive a project to replicate but
did not receive one, send me an email.
Also, if something seems fishy, like you received a different replication
assignment by email than in class, or you were assigned your own project to
re-replicate, send me an email.
If any information you need to replicate the replications is missing from
the packet (the code, the writeup, etc.), contact the replication authors.
Your assignment is to replicate the replications. Next Monday, April 7, you
will bring to class (or submit to the dropbox if necessary) at least one
hardcopy of a 1-2 page memo. The goal of the memo is to offer constructive
feedback to the replication authors. Point out places where their code
didn't work or their write-up was unclear, and especially ways they can
improve the original papers. The one hard copy of this memo you bring to
class will be the copy we grade. You should also give each of the
replication authors a copy of the memo. You may give them either a hard
copy or an electronic copy.
Each person was assigned a replication project. Don't be confused if you
were emailed along with another person or two; I was trying to be efficient
with my many emails today. This is an individual project. That is not to
say that you cannot work with others; discussing projects with others is a
great way to find potential developments. But you are not required to work
on this replication with your coauthors, and the memo ought to be your own
work.
This week we will have section as usual. We'll be focusing on some of the
models presented in today's class and last week's class. To keep everyone
on track with the lecture material, there will be a problem set assigned
this Thursday. We know you have a thing or two on your plates already,
though, so this problem set will be lighter than the normal problem set.
Congratulations, you've already cleared a huge hurdle in the replication
process. Now all you have to do is generate a publishable paper. Piece of
cake.
Jenn