Hello list members!
I am writing to ask about methods of pooling Amelia outputs for standard
deviation, Cohen's d, and model fit statistics such as F-statistic and
R-squared.
Specifically: (1) For SD, can I use mi.meld() to pool SDs estimated from
individual imputed datasets, similarly to pooling standard errors for
regression coefficients?
(2) For Cohen's d, can I use zelig-ls to pool the t-statistic for the dummy
predictor, and then transform the pooled t-statistic into Cohen's d?
Alternatively, can I calculate Cohen's d in each imputed dataset and then
take the mean of the ds? Or, as a third approach, calculate Cohen's d
from the pooled mean and SD? These approaches do not always lead to
identical results. Which one is best? Or is there a better approach
still?
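The second approach in (2), computing d in each imputed dataset and averaging, can be sketched in base R as below. This is only an illustration: the helper names cohens_d and pool_d are hypothetical, the imputed datasets are assumed to be a plain list of data frames (such as the imputations element of an Amelia output), and averaging is shown because it matches the point-estimate step of Rubin's rules, not because it is established here as the best of the three options.

```r
# Cohen's d for a two-group comparison, using the pooled within-group SD.
cohens_d <- function(y, g) {
  g <- as.factor(g)
  stopifnot(nlevels(g) == 2)
  y1 <- y[g == levels(g)[1]]
  y2 <- y[g == levels(g)[2]]
  n1 <- length(y1)
  n2 <- length(y2)
  sp <- sqrt(((n1 - 1) * var(y1) + (n2 - 1) * var(y2)) / (n1 + n2 - 2))
  (mean(y2) - mean(y1)) / sp
}

# Hypothetical helper: compute d in every imputed dataset, then average.
# `imps` is a list of completed data frames; `outcome` and `group` are
# column names (assumptions for this sketch).
pool_d <- function(imps, outcome, group) {
  d <- vapply(imps,
              function(df) cohens_d(df[[outcome]], df[[group]]),
              numeric(1))
  list(d_each = d,             # one d per imputed dataset
       d_pooled = mean(d),     # pooled point estimate
       between_var = var(d))   # between-imputation variance of d
}

# Toy stand-in for two imputed datasets (made-up numbers, illustration only).
imps <- list(
  data.frame(y = c(0, 2, 4, 6), g = c(0, 0, 1, 1)),
  data.frame(y = c(0, 2, 2, 4), g = c(0, 0, 1, 1))
)
res <- pool_d(imps, outcome = "y", group = "g")
print(res$d_each)
print(res$d_pooled)
```

The between_var element shows how much d varies across imputations, which is one way to see why the three approaches can disagree.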
(3) For R-squared - I understand that Dr. King recommends not to focus on
model fit statistics - but just out of curiosity: mice has a function that
uses the procedure proposed by Harel (2009):
http://www.tandfonline.com/doi/pdf/10.1080/02664760802553000
1) In each ‘complete’ dataset:
   • calculate R2
   • take its square root, R
   • use the Fisher z-transformation to evaluate the normalized estimate
     and its variance (Q(i), V(i))
2) With the m sets of estimates and variances:
   • combine the results using Rubin’s rules
   • compute the confidence interval (CI) for Q as Q ± z(α/2)√T, where Q
     is the combined estimate and T its total variance
   • inverse-transform to the proportion scale
   • square the results.
Is this approach superior to directly taking the mean of the R-squared
values estimated from the imputed datasets?
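The steps quoted in (3) can be sketched in base R as follows. The function name pool_r2 is hypothetical, and two details are assumptions rather than something stated in this post: the within-imputation variance of the Fisher z value is taken to be the usual large-sample 1/(n - 3), and the lower confidence limit is truncated at zero before back-transforming so that squaring cannot flip its order.

```r
# Sketch: pool R-squared across m imputed datasets via Fisher's z and
# Rubin's rules. `r2` holds the m R-squared values and `n` is the sample
# size of each completed dataset.
pool_r2 <- function(r2, n, conf = 0.95) {
  m <- length(r2)
  stopifnot(m >= 2, all(r2 >= 0 & r2 < 1), n > 3)

  q <- atanh(sqrt(r2))        # Fisher z-transform of R = sqrt(R2)
  v <- rep(1 / (n - 3), m)    # assumed within-imputation variance

  qbar  <- mean(q)                   # combined estimate
  ubar  <- mean(v)                   # mean within-imputation variance
  b     <- var(q)                    # between-imputation variance
  t_var <- ubar + (1 + 1 / m) * b    # total variance (Rubin's rules)

  z  <- qnorm(1 - (1 - conf) / 2)
  lo <- max(qbar - z * sqrt(t_var), 0)   # truncate at 0 (assumption)
  hi <- qbar + z * sqrt(t_var)

  # Back-transform to the correlation scale, then square.
  c(est = tanh(qbar)^2, lower = tanh(lo)^2, upper = tanh(hi)^2)
}

# Made-up R-squared values, for illustration only.
pooled <- pool_r2(c(0.21, 0.25, 0.23), n = 200)
print(pooled)
```

Because the averaging happens on the z scale, the pooled estimate will generally differ slightly from the plain mean of the R-squared values.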
(4) For the F-statistic - Is there any recommendation other than taking the
mean of Fs from the imputed datasets?
My apologies for the many questions! Thank you in advance for any of your
help! :)
Best wishes,
Gu
--
Gu Li, MS
PhD Candidate
University of Cambridge
Department of Psychology
Free School Lane, Cambridge, CB2 3RQ
United Kingdom
Dear Amelia users/creators,
I want to write a stack of Amelia imputed data sets into a Stata format for some specific analyses and tests that I find easier in Stata.
I know that write.amelia enables this when the separate argument is set to FALSE, and I have tried the following code:
write.amelia(am.output, format="dta", file.stem="outdata", separate=FALSE,
orig.data=TRUE)
However, I get an error message: “Error in write.dta(dataframe= list…) empty string is not valid in Stata's documented format”.
Stack Overflow has a thread on this error for write.dta, which suggests overwriting a data frame; however, I cannot do this with the Amelia output: http://stackoverflow.com/questions/27574055/converting-r-file-to-stata-with…
Any advice?
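One possible workaround, sketched here without having been run against an actual Amelia object, is to stack the imputed data frames by hand and replace empty strings before writing. The helper stack_imputations and the "(blank)" placeholder are hypothetical. The sketch also assumes that the imputed data frames are available as a plain list (for example am.output$imputations) and that empty strings in character columns or factor levels are what trigger the write.dta error.

```r
# Hypothetical helper: stack a list of imputed data frames into one long
# data frame with an imputation index, replacing empty strings on the way.
stack_imputations <- function(imps, placeholder = "(blank)") {
  fix_empty <- function(df) {
    for (nm in names(df)) {
      col <- df[[nm]]
      if (is.factor(col)) {
        # Rename an empty factor level instead of dropping it.
        levels(col)[levels(col) == ""] <- placeholder
      } else if (is.character(col)) {
        col[!is.na(col) & col == ""] <- placeholder
      }
      df[[nm]] <- col
    }
    df
  }
  stacked <- lapply(seq_along(imps), function(i) {
    cbind(imp = i, fix_empty(imps[[i]]), stringsAsFactors = FALSE)
  })
  do.call(rbind, stacked)
}

# Toy stand-in for two imputed datasets (made-up values).
imps <- list(
  data.frame(id = 1:2, city = c("a", ""), stringsAsFactors = FALSE),
  data.frame(id = 1:2, city = c("a", "b"), stringsAsFactors = FALSE)
)
stacked <- stack_imputations(imps)
print(stacked)

# With a real Amelia object, the call would look something like this:
# stacked <- stack_imputations(am.output$imputations)
# foreign::write.dta(stacked, file = "outdata.dta")
```

Whether a placeholder string or NA is the right replacement depends on what the empty strings mean in the data, so that choice is left as an argument.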
Grateful for this great MI package, and for any suggestions!
Sophie
Sophie Moullin
Sociology & Social Policy PhD Student
Princeton University
smoullin(a)princeton.edu
After updating to the newest version of Amelia (1.7.4), I tried
overimputing a dataset that has incorrect values in one of its variables.
All of the error observations are measured identically (as zeros, where
they should be positive). The code I originally used is below, and it
triggers a warning of the type: "Some observations estimated with negative
measurement error variance. Set to gold standard."
dat <- data.frame(A, B, C, VS)
mopd <- moPrep(dat, VS ~ VS, subset = VS < 0.0001)
I looked through the code on GitHub to see what causes this warning (other
than, of course, the negative error variance) and, more importantly, how to
activate the gold.standard (which for my purposes is the rest of the values
for VS) and presumably fix this issue. After trying quite a few different
possible codings, I can't get it to work. I either receive the same error,
or a host of errors surrounding how I've included gold.standard in the
code. I would think it should be easy, since I'm basically bifurcating my
data (all data under some amount is the subset measured with error; all
data over the amount can be considered gold-standard data), but can't
figure it out. Thanks for any help you can give,
Sean
Hi there,
I have two questions about setting observation-level priors on nominal variables.
I am trying to do an overimputation on a dichotomous variable, say y1.
My 1st question:
I am aware that, using the arguments “priors” and “overimp”, I can specify observation-level priors with a 4-column matrix (row, column, prior.mean, prior.sd) or a 5-column matrix (row, column, lower confidence range, upper confidence range, confidence level). I am attempting the 4-column matrix, but I am not sure how to specify prior.mean and prior.sd when my prior is the dichotomous variable itself. I read somewhere that prior.mean can be set to y1 itself; is that right? Is prior.sd similar to the proportion of variance attributable to measurement error? I would appreciate advice on how to specify prior.sd in this case.
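To make the 4-column layout concrete, here is a base R sketch of how such matrices could be assembled. The helper build_overimp_priors is hypothetical, and the choice of prior.sd as sqrt(error.proportion * var(y1)) is only one possible assumption linking the two quantities, not something confirmed by the Amelia documentation. The sketch also treats y1 as a numeric 0/1 column; whether Amelia accepts priors on a variable declared as nominal would need to be checked.

```r
# Hypothetical helper: build the 4-column priors matrix
# (row, column, prior mean, prior sd) and the matching 2-column
# overimp matrix (row, column) for one variable.
build_overimp_priors <- function(data, varname, rows, error.proportion) {
  col <- match(varname, names(data))
  stopifnot(!is.na(col), error.proportion > 0, error.proportion < 1)

  y <- data[[varname]]
  rows <- rows[!is.na(y[rows])]   # only observed cells can be overimputed

  # Assumption: the prior sd is derived from the share of the variable's
  # variance attributed to measurement error.
  prior_sd <- sqrt(error.proportion * var(y, na.rm = TRUE))

  list(
    priors  = cbind(row = rows, column = col, mean = y[rows], sd = prior_sd),
    overimp = cbind(row = rows, column = col)
  )
}

# Made-up data, for illustration only.
dat <- data.frame(x = c(1, 2, 3, 4), y1 = c(0, 1, 0, 1))
out <- build_overimp_priors(dat, "y1", rows = c(1, 3), error.proportion = 0.25)
print(out$priors)
print(out$overimp)

# The matrices would then be passed on roughly like this:
# amelia(dat, m = 5, priors = out$priors, overimp = out$overimp)
```

Here the prior mean for each overimputed cell is the observed y1 value itself, which corresponds to the "prior.mean can be set to y1" idea mentioned above.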
My 2nd question:
I am also aware that priors can be generated using the function “moPrep” from the Amelia package. The argument “error.proportion” of “moPrep” is rather easy to understand (the proportion of variance attributable to measurement error). But what is the difference between setting priors using “moPrep” and using “priors”? Should the output be the same?
Please kindly advise. Many thanks!
Huiying