Hi Gu,

See below for responses.

On Thu, Jul 14, 2016 at 8:44 AM, Gu Li <ligu.sysu@gmail.com> wrote:

Dear all,

I am re-sending my questions to see if you have any thoughts. I have spoken to several colleagues and found that they have had similar problems. Any help would be very much appreciated!

Best,
Gu

2016-06-23 13:13 GMT+01:00 Gu Li <ligu.sysu@gmail.com>:
Hello list members!

I am writing to ask about methods of pooling Amelia outputs for the standard deviation, Cohen's d, and model fit statistics such as the F-statistic and R-squared.

Specifically: (1) For SD, can I use mi.meld() to pool SDs estimated from individual imputed datasets, similarly to pooling standard errors for regression coefficients?

If you just want to report the descriptive SD of a variable, you can take the average of the within-imputation SDs. The more complicated formula used for the SEs of regression coefficients and means is there to capture the uncertainty of an estimate, whereas the sample SD is itself just an estimate.
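A minimal sketch of that averaging, written in Python with only the standard library purely for illustration (it is not part of Amelia; the function name and the data values are hypothetical):

```python
from statistics import mean, stdev


def pooled_descriptive_sd(imputed_columns):
    """Average the sample SD of one variable across the m imputed datasets."""
    return mean(stdev(column) for column in imputed_columns)


# Hypothetical values of the same variable in m = 3 imputed datasets.
imputed_columns = [
    [2.0, 4.0, 4.0, 6.0],
    [2.0, 4.0, 5.0, 5.0],
    [1.0, 4.0, 4.0, 7.0],
]

sd_pooled = pooled_descriptive_sd(imputed_columns)
print(round(sd_pooled, 3))
```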

(2) For Cohen's d, can I use zelig-ls to pool the t-statistic for the dummy predictor, and then transform the pooled t-statistic into Cohen's d? Alternatively, can I calculate Cohen's d for each imputed dataset and then take the mean of the d values? Or, as a third approach, can I calculate Cohen's d from the pooled mean and SD? These approaches do not always lead to identical results, so which one is best? Or is there another, better approach?

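Rubin's rules generally say that, to estimate a pooled statistic, you should take the average of the within-imputation statistics and then use the variance formula to get a pooled variance estimate for that statistic. For Cohen's d, that corresponds to your second approach: calculate d in each imputed dataset and average.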
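As a rough illustration of those rules, here is a minimal Python sketch (standard library only, not Amelia's mi.meld()). It assumes you already have the statistic and its sampling variance from each imputed dataset; the function name and the numbers are hypothetical. The variance formula used is the standard one, T = W + (1 + 1/m) B, where W is the average within-imputation variance and B is the between-imputation variance of the estimates.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool m per-dataset estimates and their sampling variances.

    Returns (pooled estimate, total variance T = W + (1 + 1/m) * B).
    """
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # W: average within-imputation variance
    between = variance(estimates)  # B: sample variance of the estimates (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical Cohen's d values and their sampling variances from m = 3 imputations.
d_values = [0.40, 0.50, 0.60]
d_variances = [0.04, 0.05, 0.03]

d_pooled, d_total_var = rubin_pool(d_values, d_variances)
print(d_pooled, d_total_var ** 0.5)  # pooled d and its pooled standard error
```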

(3) For R-squared - I understand that Dr. King recommends not focusing on model fit statistics, but just out of curiosity: mice has a function that uses the procedure proposed by Harel (2009): http://www.tandfonline.com/doi/pdf/10.1080/02664760802553000

1) In each ‘complete’ dataset:
   • calculate R^2
   • take its square root, R
   • use the Fisher z-transformation to obtain the normalized estimate and its variance (Q(i), V(i))
2) With the m sets of estimates and variances:
   • combine the results using Rubin’s rules
   • the confidence interval (CI) for Q is QT ± z(α/2)√(QT)
   • inverse-transform back to the proportion scale
   • square the results.

Is this approach superior to directly taking the mean of the R-squared values estimated from the imputed datasets?

I'm not very familiar with this approach but it sounds reasonable. I'm sure the two procedures will lead to very similar estimates of the R^2.
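For anyone who wants to compare the two, the quoted steps can be sketched roughly as follows in Python (standard library only). This is not the mice implementation. It assumes the usual large-sample variance of Fisher's z, 1/(n - 3), where n is the sample size, and a normal critical value as in the quoted steps. The function name is hypothetical, and clamping the lower limit at zero before back-transforming is my own choice.

```python
import math
from statistics import NormalDist, mean, variance


def pool_r_squared(r_squared, n, alpha=0.05):
    """Pool R^2 over m imputed datasets via Fisher's z and Rubin's rules.

    r_squared: R^2 from each imputed dataset; n: number of observations.
    Returns (pooled R^2, lower CI limit, upper CI limit).
    """
    m = len(r_squared)
    # Step 1: R^2 -> R -> Fisher z in each dataset.
    z = [math.atanh(math.sqrt(r2)) for r2 in r_squared]
    within = 1.0 / (n - 3)  # assumed variance of Fisher z, same in every dataset
    # Step 2: combine with Rubin's rules on the z scale.
    q_bar = mean(z)
    total = within + (1 + 1 / m) * variance(z)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = max(q_bar - crit * math.sqrt(total), 0.0)  # clamp so squaring stays monotone
    upper = q_bar + crit * math.sqrt(total)
    # Back-transform to the correlation scale, then square.
    return tuple(math.tanh(v) ** 2 for v in (q_bar, lower, upper))


# Hypothetical R^2 values from m = 3 imputed datasets with n = 200 observations.
pooled, ci_low, ci_high = pool_r_squared([0.30, 0.28, 0.33], n=200)
print(pooled, ci_low, ci_high)
```

Comparing `pooled` with the plain mean of the R^2 values on your own data is a quick way to check how far apart the two procedures actually are.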

(4) For the F-statistic - Is there any recommendation other than taking the mean of Fs from the imputed datasets?

Taking the average is probably an OK way to do this, but more generally you might want to use likelihood ratio tests to assess model fit. For those, you can use the procedure of Meng and Rubin (1992, Biometrika). Here's a link:

http://biomet.oxfordjournals.org/content/79/1/103

Hope that helps!

Cheers,

Matt

~~~~~~~~~~~

Matthew Blackwell

Assistant Professor of Government

Harvard University

url: http://www.mattblackwell.org

My apologies for the many questions! Thank you in advance for any of your help! :)

Best wishes,
Gu

--
Gu Li, MS
PhD Candidate
University of Cambridge
Department of Psychology
Free School Lane, Cambridge, CB2 3RQ
United Kingdom

--
Amelia mailing list served by HUIT
[Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
More info about Amelia: http://gking.harvard.edu/amelia