hi!
Quick (but fundamental) question:
In exercise 1, gamma needs to be positive, so I re-parameterized. To get
the right MLE from optim(), I didn't forget to transform back again, and I
get the same result as the one I found analytically. However, things change
when I look for the SE: I get different answers analytically and in R
(using the Hessian). I know this comes from the re-parameterization, because
I don't have this issue when I rewrite the whole function without
re-parameterizing gamma. So my questions are:
- If I re-parameterize, how do I apply the transformation to the Hessian to
get the right result?
- How does that fit with the section notes that follow? Why do we take
pnorm (the first transformation applied in the ll.binom function) of
"opt.1000 - 1.96*se" and not of "se", for instance?
# binomial log-likelihood (N = # of trials for each observation)
ll.binom <- function(par, y, N){
  # reparameterize pi: only search over [0,1]
  p <- pnorm(par)
  # log-likelihood
  out <- sum(y*log(p) + (N-y)*log(1-p))
  return(out)
}
# compare to Wald CI
se <- sqrt(solve(-optim(par=2, fn=ll.binom, y=samp.1000, N=10, method="BFGS",
                        control=list(fnscale=-1), hessian=TRUE)$hessian))
wald.ci <- c(pnorm(opt.1000 - 1.96*se), pnorm(opt.1000 + 1.96*se))
wald.ci # 0.7364839 0.7535663
- If I do not re-parameterize at all, so that my analytical and my R
results agree, how can I justify not re-parameterizing gamma?!
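A minimal delta-method sketch (hypothetical numbers, not the exercise's) of how the transformation interacts with the SE versus the CI endpoints:

```r
# If p = pnorm(theta), the delta method gives se(p) = dnorm(theta.hat)*se(theta):
# the SE maps through the derivative of the transformation and is never pushed
# through pnorm itself. CI endpoints, by contrast, can be transformed directly,
# which is what the section notes' code does with opt.1000 +/- 1.96*se.
theta.hat <- 0.67   # hypothetical MLE on the unconstrained scale
se.theta  <- 0.02   # hypothetical SE from the inverse negative Hessian
se.p <- dnorm(theta.hat) * se.theta                    # delta-method SE for p
ci.p <- pnorm(theta.hat + c(-1.96, 1.96) * se.theta)   # transformed endpoints
```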
thanks!
charlotte
Hi All,
April 6, 2009, 2pm is the due date for your replication of another student
group's replication. You should email me and Patrick, as well as all of the
members of the student group whose work you replicated, a *pdf* memo pointing
out ways to make their paper and analysis better. You will be evaluated
based on how helpful, not how destructive, you are.
Readings for this week are: (1) King, Gary and Langche Zeng. "The Dangers of
Extreme Counterfactuals," Political Analysis, 14, 2, (2007): 131-159. (2)
Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart. "Matching as
Nonparametric Preprocessing for Reducing Model Dependence in Parametric
Causal Inference," Political Analysis, 15 (2007): 199-236.
My office hours this Thursday will only be from 2 till 3 pm.
Thanks,
Miya Woolfalk
--
Miya Woolfalk
Ph.D. Student
Harvard University
Government and Social Policy
Hey folks,
Don't know if anyone else is trying to use the "weights" argument in Zelig.
Sam and I are using weighted survey data to try to estimate an oprobit
model. For one of the models we are estimating, when we run Zelig with a
vector of observation weights entered, we get coefficient estimates that are
very close to our author's, but our standard errors are extremely small (far
smaller than the author's). We aren't sure why this is or how to correct the
problem. For another very similar model, including a vector of weights just
results in the following error message:
zelig.us2 <- zelig(as.factor(jobcomm) ~ demid + repid + hhunion + female +
nonwhite + income + educ + educinc + age + hhlayoffs + ncentral + south +
west, model="oprobit", weights=starting2$weight, data=starting2)
Error in function (formula, data, weights, start, ..., subset, na.action, :
attempt to find suitable starting values failed
In addition: Warning messages:
1: In glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) :
algorithm did not converge
2: In glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) :
fitted probabilities numerically 0 or 1 occurred
If anyone has any thoughts as to what might be going on, we would love to
hear them.
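One possibility on the tiny SEs: if those are survey weights (one observation per row, weighted by sampling probability) rather than frequency weights, treating them as frequency weights inflates the apparent sample size and shrinks the SEs. A hedged sketch on made-up data (your actual call would use starting2 and starting2$weight) using the survey package's svydesign() and svyolr(), which compute design-based standard errors; I'm assuming svyolr() accepts a polr-style method = "probit" argument, which is worth double-checking:

```r
library(survey)
set.seed(1)
# made-up stand-in data; replace with starting2 and its weight variable
x <- rnorm(300)
dat <- data.frame(y = cut(x + rnorm(300), c(-Inf, -1, 0, 1, Inf)),
                  x = x,
                  w = runif(300, 0.5, 2))
des <- svydesign(ids = ~1, weights = ~w, data = dat)  # sampling-weight design
fit <- svyolr(y ~ x, design = des, method = "probit") # design-based SEs
summary(fit)
```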
Thanks for your help,
Kyle
Hi all,
Just a reminder that your replication draft is due tomorrow. We want 6
things from your group. (1) A pdf of the article you are replicating. (2)
A *PDF* of the draft of your paper. This draft must include the names (and
email addresses) of your coauthors, your replication of the main
tables/figures in the paper you are replicating, and a brief discussion of
how you intend to improve the existing analysis. (3) A file containing the
data you used to perform the replication. (4) A code book (or key) for the
data used in your replication. (5) A txt or r file containing all of the
code used in your replication (the code should be well commented so others
can easily follow along). (6) Anything else that may be required for
another group to successfully replicate your replication of the paper you
have selected.
Please put all 6 things into one zip file and email it to both Miya and me
by 2pm tomorrow.
--
Patrick Lam
Department of Government and Institute for Quantitative Social Science,
Harvard University
http://www.people.fas.harvard.edu/~plam
Hi all,
I'm struggling with this problem and I am wondering if any of you could help me out:
So my response variable y is in the open interval (0,1), and I want to model the data using the Beta distribution. My question deals with how to reparameterize the shape parameters a and b and link them to my covariates (x's) using a logit function.
The best thing I've come up with so far is:
1) Define a new parameter mu, such that logit(mu) = X %*% coefficients
2) Define a new parameter s = a + b. We also know that var(y) = (mu*(1-mu))/(1+s).
3) Through this, I can solve for a and b; this is my reparameterization.
The problem I keep running into is that, solving for s from the equation var(y) = (mu*(1-mu))/(1+s), s turns out to be negative, since mu is always less than 1 and var(y) is also less than 1. Since the Beta distribution is only defined for s > 0, my optim routine just stops. Could someone who has experience with this tell me what piece of insight I am missing? There must be something, since my s values are all negative.
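For what it's worth, a minimal sketch (made-up function and argument names, purely illustrative) of the standard mean/precision version of this reparameterization, where s = a + b is estimated as a free parameter on the log scale instead of being solved for from the sample variance:

```r
# a = mu*s and b = (1-mu)*s, with s > 0 enforced by searching over log(s)
ll.beta <- function(par, y, X){
  beta <- par[1:ncol(X)]
  s  <- exp(par[ncol(X) + 1])         # s = a + b, kept positive via exp()
  mu <- 1 / (1 + exp(-(X %*% beta)))  # inverse logit link for the mean
  a  <- mu * s
  b  <- (1 - mu) * s
  sum(dbeta(y, a, b, log = TRUE))
}
```

Because s is a parameter to be maximized over rather than a quantity derived from var(y), it can never come out negative, and optim() searches an unconstrained space.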
Thanks a bunch!
Clarence
Problem set 6 has been graded. If you did not turn in a paper copy, grades
are available in the dropbox. Just a couple notes:
1. Don't forget to set the variables to their medians (Zelig defaults to
the means). Also, when you have interaction terms, you need to set the
interaction term to the interaction of the medians, not the median of the
interactions. For example, it should be median(target)*median(coop), not
median(target*coop). Luckily, Zelig does this correctly if you specify the
lower-order terms.
2. Scientific notation on tables is not good practice.
3. The cutpoints are also parameters in the ordered probit model.
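On note 1, a toy illustration with made-up numbers of why the ordering matters:

```r
# the interaction of the medians is generally not the median of the interactions
target <- c(1, 2, 10)
coop   <- c(5, 1, 2)
median(target) * median(coop)  # 2 * 2 = 4
median(target * coop)          # median of c(5, 2, 20) = 5
```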
--
Patrick Lam
Department of Government and Institute for Quantitative Social Science,
Harvard University
http://www.people.fas.harvard.edu/~plam
Hi:
We are replicating an article that uses a random effects model.
The authors show a likelihood ratio test that Stata reports in its output.
A Google search taught me that the null model in this test "corresponds to
the last iteration from Fitting constant-only model."
Can I ask Zelig to show me such a thing?
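I don't know Zelig's output options offhand, but if logLik() works on the fitted object (Zelig usually stores the underlying fit), you can compute the test by hand. A generic sketch on simulated data, not your random effects model:

```r
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, pnorm(0.5 + 0.8 * x))
fit.full <- glm(y ~ x, family = binomial("probit"))
fit.null <- glm(y ~ 1, family = binomial("probit"))   # constant-only model
lr <- 2 * (as.numeric(logLik(fit.full)) - as.numeric(logLik(fit.null)))
pchisq(lr, df = 1, lower.tail = FALSE)  # p-value; df = number of restrictions
```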
Thanks!
Eitan
Sorry, so obviously you want smaller residuals... but is this how we should
think about fit? If the residuals are smaller with one model specification
(say, 10 explanatory variables), does that automatically make it better than
the one with 12 explanatory variables? What other diagnostics can we run (on
negative binomial models, or in general for all model types) to check which
specification is actually "better"?
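One common answer: residual size alone always favors the bigger model, so penalized criteria trade fit against complexity. A generic sketch with simulated data (glm.nb() from MASS fits a negative binomial; here x2 is pure noise):

```r
library(MASS)
set.seed(1)
x1 <- rnorm(200); x2 <- rnorm(200)  # x2 is unrelated to y
y <- rnbinom(200, size = 2, mu = exp(0.5 + 0.3 * x1))
m.small <- glm.nb(y ~ x1)
m.big   <- glm.nb(y ~ x1 + x2)
AIC(m.small, m.big)  # lower is better; extra parameters are penalized
```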
Hello All,
After running into some difficulties with missing data, we are looking into
replicating one of our backup choices for papers. This paper employs a
two-stage tobit model. We gather the model is similar to a two-stage least
squares model in that one uses instrumental variables to deal with the
situation in which at least one independent variable (xi) is correlated with
the stochastic component. We also understand the tobit model allows
one to deal with censored data (e.g. non-negative y).
We are wondering if anyone has experience running the tobit model in Zelig,
or programming in R a two-stage model using this concept of instrumental
variables. If so, perhaps we can ask you some specific questions that have
arisen in our replication.
The authors of our paper used Stata, which, unfortunately for us, has a
pre-programmed IV tobit function that provides little clue as to what it
actually does (something of a 'black box').
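For whatever it's worth, a rough two-stage sketch under loud assumptions: simulated data, a hypothetical instrument z, AER's tobit() for the censored second stage, and no correction of the second-stage standard errors (which a proper IV tobit, like Stata's, would provide):

```r
library(AER)  # provides tobit(), a wrapper around survival's survreg()
set.seed(1)
z <- rnorm(200)                      # instrument
u <- rnorm(200)                      # common error -> endogeneity
x <- 0.7 * z + u + rnorm(200)        # endogenous regressor
y <- pmax(1 + 0.5 * x + u, 0)        # outcome left-censored at zero
x.hat <- fitted(lm(x ~ z))           # first stage: project x on instrument
fit <- tobit(y ~ x.hat, left = 0)    # second stage: tobit on fitted values
```

The naive second-stage SEs here ignore that x.hat is itself estimated, so they should be treated as optimistic; bootstrapping is one standard fix.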
Thank you in advance!
Lauren
(+ John Polley)