Hi all,
Alisha and I are interested in saving the object produced by a for
loop so that we are not obligated to re-run the loop every time we
want to work with our data, as it takes a really long time and uses a
lot of memory. Has anyone needed to do this?
Thanks a lot,
Diane
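One common way to avoid re-running a slow loop is to write its result to disk once and read it back in later sessions. A minimal sketch, assuming the loop fills a list called `results` (a hypothetical name) and using a temporary file in place of a real path:

```r
# Hypothetical stand-in for the slow loop: fill a preallocated list.
results <- vector("list", 5)
for (i in seq_along(results)) {
  results[[i]] <- i^2
}

# Write the finished object to disk once. tempfile() is used only so the
# sketch runs anywhere; a real path such as "results.rds" would persist
# across sessions.
path <- tempfile(fileext = ".rds")
saveRDS(results, file = path)

# In a later session, read the object back instead of re-running the loop.
results_again <- readRDS(path)
identical(results, results_again)
```

save() and load() are an alternative when several objects need to be stored in one file; load() restores them under the names they were saved with.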
Hi all,
Sorry for the second email. We would also like to be able to bind two
datasets together in an efficient way, but apparently rbind doesn't
work for datasets. Is there an equivalent command for matrices?
Thanks,
Diane
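For reference, rbind() in base R stacks both matrices and data frames, provided the pieces have matching columns (and, for data frames, matching column names). A small sketch with made-up toy data, not the problem-set data:

```r
# Two toy data frames with the same column names (illustrative values only).
a <- data.frame(id = 1:2, x = c(0.5, 1.5))
b <- data.frame(id = 3:4, x = c(2.5, 3.5))
ab <- rbind(a, b)          # 4 rows, columns id and x

# The same call stacks matrices with the same number of columns.
m1 <- matrix(1:4, nrow = 2)
m2 <- matrix(5:8, nrow = 2)
m12 <- rbind(m1, m2)       # 4 x 2 matrix

# When many pieces come out of a loop, collect them in a list and bind once
# at the end, rather than growing the object on every iteration.
pieces <- list(a, b, a)
all_rows <- do.call(rbind, pieces)
```

If rbind() fails on two data frames, mismatched column names or types are a likely cause, though that is only a guess without seeing the error.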
When describing the balance on covariates, should we look at unidimensional
imbalance or at the difference in means? Can we just look at one of
these values to check balance, or should we compare them to something (i.e.,
the range)?
Thanks
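To make the two quantities concrete, here is a rough sketch that computes both for a single covariate. The data are simulated, the bin count of 10 is an arbitrary assumption, and the L1 here is a hand-rolled univariate version (half the sum of absolute differences in binned relative frequencies), so it will not match what the cem package reports with its own binning:

```r
set.seed(1)
# Simulated covariate values for two groups (not the problem-set data).
treated <- rnorm(200, mean = 0.5)
control <- rnorm(300, mean = 0)

# Difference in means: compares one summary of the two distributions.
diff_means <- mean(treated) - mean(control)

# Univariate L1: bin both groups on a common grid and compare the
# relative frequencies bin by bin.
pooled <- c(treated, control)
breaks <- seq(min(pooled), max(pooled), length.out = 11)   # 10 bins (assumed)
f <- table(cut(treated, breaks, include.lowest = TRUE)) / length(treated)
g <- table(cut(control, breaks, include.lowest = TRUE)) / length(control)
L1 <- 0.5 * sum(abs(f - g))   # 0 = identical histograms, 1 = no overlap

diff_means
L1
```

The difference in means is on the covariate's own scale, which is one reason to compare it to something like the range or standard deviation, while L1 is bounded between 0 and 1 but depends on the bins chosen.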
Quick Question,
For #3 on the Problem Set, different version numbers in R generate
different L1 values. I know that we were told to install the newest
version (2.8.1), but some of us are apprehensive about updating our
versions of R. If we use the older version of R, will it make a
difference for any of our reported estimates when we use the MatchIt
and cem packages? Should we all install the new version now before
continuing with the problem set?
Thanks in advance.
-Bernard
-----------------------
Bernard L. Fraga
Ph.D. Student, Harvard University
Government and Social Policy
bfraga at fas.harvard.edu
-----------------------
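When comparing results across machines, it can help to report exactly which versions are in use. A small sketch; the helper `pkg_version` is a hypothetical name introduced here, and it returns NA for packages that are not installed:

```r
# Version of R itself.
R.version.string

# Hypothetical helper: installed version of a package as a string,
# or NA if the package is not installed on this machine.
pkg_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

versions <- sapply(c("MatchIt", "cem"), pkg_version)
versions
```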
Dear Miya and Patrick,
Quick question on 3.c.
I am not sure how to interpret "alter the propensity score model in any
way you see fit".
Do we have to change the *type* of distance measure used (and see
if it improves the balance), or improve the specific type we chose in 3.a?
If it is the latter, I do not see how we can do that without
knowing the data set and the names of the variables. Don't we need a bit of
theory to play with the IVs in the propensity score model?
Thanks
PS: Also, can we use matchit for 3.b?
Charlotte
Hello all,
Sometimes you have an event whose outcome is only observed if another event
happens. Say for example that you only observe whether someone votes
Democratic if they decide to vote at all. You could think of this as a pair
of probits, where the party variable is P(D=1 | V=1).
In such cases, we often think that the first stage is endogenous to the
second stage; that is, you're less likely to go out and vote at all if you
think that your party is going to lose. This is correlated data, and there
is a set of techniques for dealing with it, like bivariate probits (which I
think I understand pretty well) and generalized estimating equations (which
I am reading up on).
Christopher Zorn from Emory has a review of this stuff in the 2001 *APSR*.
He mentions that GEE models in particular "offer a number of advantages for
researchers interested in modeling correlated data, including applicability
to continuous, dichotomous, polychotomous, ordinal and event-count response
variables."
My question: I have data where the first "stage" is best modeled using an
event-history framework. The second "stage" is a dichotomous response
variable. I don't see duration variables listed in Zorn's list of
applications, and so I'm wondering: is there any work on techniques for
correlated data where one or more stages are duration models?
Best,
JP
John-Paul Ferguson
PhD Candidate, Economic Sociology
MIT Sloan School of Management
50 Memorial Drive, E52-533
Cambridge, MA 02142
617.253.3940 (w)
617.549.8482 (c)
I have a simple question for PS7.
I figured out in problem 2.3 how to install pscl, namely install.packages("pscl"), but even though I see the BioChemists.rda data, and have even copied this onto the directory where I store my R files, I can't seem to load it.
I've tried load, read.data, data, and none of these seem to work.
What have people found that works?
Many thanks.
Best.
Tom
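Two things that often trip people up here: load() restores objects under the names they were saved with (it returns only those names, not the data), and data that ships inside an installed package is usually reached with data() rather than by copying the file. A sketch under those assumptions, using a toy data frame and a temporary file for the first part; the last step runs only if pscl is installed:

```r
# Toy data frame standing in for real data (illustrative values only).
toy <- data.frame(art = c(0, 2, 1), kid5 = c(0, 1, 0))
path <- tempfile(fileext = ".rda")
save(toy, file = path)
rm(toy)

# load() puts the object back under its original name and returns
# a character vector of the names it restored.
loaded_names <- load(path)
loaded_names
exists("toy")

# For a dataset bundled with an installed package, data() with the
# package argument avoids needing to know where the .rda file lives.
if (requireNamespace("pscl", quietly = TRUE)) {
  data("bioChemists", package = "pscl")
  str(bioChemists)
}
```

When loading a copied file directly, load() needs either the full path or a working directory (see getwd()) that contains the file.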
I still cannot get a clear idea of the convex hull. We create it based on the data we observed, and if we ask the model a question that is distant from the observed data, the answer falls either inside or outside the convex hull. If it is outside, does that mean the model cannot answer the question? Or does it mean the result is too far from the model to be certain about it? I do not understand how we can estimate how right our predicted results are if we have just one model and nothing else. What do we compare the model with? From my understanding, any result we get from the same model should be in its convex hull. Where am I wrong?
Sincerely,
Olena Ageyeva
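One way to see the idea: the hull is built from the observed covariate values, not from the model or its predictions, and the check is whether the covariate values of the counterfactual question fall inside it. A two-dimensional sketch with simulated covariates; `in_hull` is a hypothetical helper written for this illustration, relying on base R's chull(), which works only in two dimensions:

```r
set.seed(2)
# Simulated observed covariates: 50 points in the unit square.
X <- cbind(x1 = runif(50), x2 = runif(50))

# Hypothetical helper: a point is inside the hull of the data if adding it
# does not make it one of the hull's vertices.
in_hull <- function(point, data) {
  combined <- rbind(data, point)
  hull_idx <- grDevices::chull(combined)
  !(nrow(combined) %in% hull_idx)
}

inside  <- in_hull(colMeans(X), X)   # the covariate means lie inside the hull
outside <- in_hull(c(2, 2), X)       # far from all observed data
inside
outside
```

A counterfactual inside the hull is an interpolation between observed cases; one outside is an extrapolation, where the model will still return a prediction but the prediction rests more heavily on the model's functional form than on nearby data.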
Hi all,
I'm a bit confused about what we should use as the data and what we should use as the counterfactual in problem 3.1.
This is what I have:
data = biochem
Then, for the counterfactual, I have the two individuals from 2.4 defined as:
single.subject <- c("art" = mean(Exp.Y.single),
                    "kid5" = median(bioChemists$kid5),
                    "phd" = median(bioChemists$phd),
                    "ment" = median(bioChemists$ment),
                    "fem.binary" = median(bioChemists$fem.binary),
                    "mar.binary" = 0)
married.subject <- c("art" = mean(Exp.Y.married),
                     "kid5" = median(bioChemists$kid5),
                     "phd" = median(bioChemists$phd),
                     "ment" = median(bioChemists$ment),
                     "fem.binary" = median(bioChemists$fem.binary),
                     "mar.binary" = 1)
then I rbind these two observations to be the counterfactual.
Could someone let me know if I'm on the right path?
Thanks,
Clarence
Clarence Lee | Doctoral Student | Harvard Business School | 857.998.2034