Amelia June 2016

amelia@lists.gking.harvard.edu

4 participants
4 discussions

Pooling standard deviation, Cohen's d, F-statistic, and R-squared

by Gu Li

Hello list members!

I am writing to ask about methods of pooling Amelia outputs for standard deviation, Cohen's d, and model fit statistics such as the F-statistic and R-squared. Specifically:

(1) For SD, can I use mi.meld() to pool SDs estimated from the individual imputed datasets, similarly to pooling standard errors for regression coefficients?

(2) For Cohen's d, can I use zelig-ls to pool the t-statistic for the dummy predictor, and then transform the pooled t-statistic into Cohen's d? Alternatively, can I calculate Cohen's d in each imputed dataset and then take the mean of the ds? Or, as a third approach, calculate Cohen's d from the pooled mean and SD? These approaches do not always lead to identical results; which one is best? Or is there yet another, better approach?

(3) For R-squared - I understand that Dr. King recommends not focusing on model fit statistics - but just out of curiosity: mice has a function that uses the procedure proposed by Harel (2009): http://www.tandfonline.com/doi/pdf/10.1080/02664760802553000

1) In each 'complete' data,
   • calculate R2
   • take its square root - R
   • use Fisher z-transformation to evaluate the normalized estimate and its variance (Q(i), V(i))
2) With the m sets of estimates and variances,
   • combine results using Rubin's rules
   • the confidence interval (CI) for Q is QT ± z(α/2)√(QT)
   • inverse transform for the proportion scale
   • square your results.

Is this approach superior to taking the mean of the estimated R-squareds from the imputed datasets directly?

(4) For the F-statistic - is there any recommendation other than taking the mean of the Fs from the imputed datasets?

My apologies for the many questions! Thank you in advance for any help! :)

Best wishes,
Gu

--
Gu Li, MS
PhD Candidate
University of Cambridge
Department of Psychology
Free School Lane, Cambridge, CB2 3RQ
United Kingdom
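The steps quoted in question (3) can be sketched in base R. This is only an illustration under stated assumptions: `pool_r2` is a hypothetical helper (not a function from Amelia or mice), the variance of each Fisher z value is taken to be the usual large-sample approximation 1/(n - 3), which the post does not specify, and the interval is formed as the pooled estimate plus or minus z(α/2) times the square root of the total variance, the standard Rubin's-rules form. Clamping a negative lower bound at zero before squaring is a choice made for this sketch.

```r
# Hypothetical helper: pool R-squared values from m completed data sets
# via Fisher's z and Rubin's rules. Assumes the same sample size n in each.
pool_r2 <- function(r2, n, alpha = 0.05) {
  m <- length(r2)
  q <- atanh(sqrt(r2))            # Fisher z of R in each completed data set
  v <- rep(1 / (n - 3), m)        # ASSUMED within-imputation variance of z
  q_bar <- mean(q)                # pooled estimate on the z scale
  w <- mean(v)                    # within-imputation variance
  b <- if (m > 1) var(q) else 0   # between-imputation variance
  t_var <- w + (1 + 1 / m) * b    # total variance (Rubin's rules)
  z_crit <- qnorm(1 - alpha / 2)
  ci_z <- q_bar + c(-1, 1) * z_crit * sqrt(t_var)
  ci_z <- pmax(ci_z, 0)           # sketch-only choice: R cannot be negative
  # Back-transform to the correlation scale, then square.
  list(estimate = tanh(q_bar)^2,
       lower    = tanh(ci_z[1])^2,
       upper    = tanh(ci_z[2])^2)
}

# Made-up R-squared values from five imputed data sets, n = 200 each.
r2_by_imputation <- c(0.21, 0.25, 0.23, 0.27, 0.24)
pooled <- pool_r2(r2_by_imputation, n = 200)
print(pooled)
```

Because the averaging happens on the z scale, the pooled value will generally differ slightly from the plain mean of the R-squared values.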

7 years, 9 months

writing to .dta: empty string

by Sophie C. Moullin

Dear Amelia users/creators,

I want to write a stack of Amelia imputed data sets into Stata format for some specific analyses and tests that I find easier in Stata. I know that write.amelia enables this when the separate argument is set to FALSE, and have tried the following code:

    write.amelia(am.output, format="dta", file.stem="outdata", separate=FALSE, orig.data=TRUE)

However, I get an error message: “Error in write.dta(dataframe= list…) empty string is not valid in Stata's documented format”.

Stack Overflow has a thread on this error for write.dta, which suggests overwriting a data frame; however, I cannot do this with the Amelia output: http://stackoverflow.com/questions/27574055/converting-r-file-to-stata-with…

Any advice? Grateful for this great MI package, and for any suggestions!

Sophie

Sophie Moullin
Sociology & Social Policy PhD Student
Princeton University
smoullin(a)princeton.edu
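One way to apply a data-frame fix of the kind the Stack Overflow thread suggests is to clean each imputed data frame before stacking, since the imputations in Amelia output are stored as a list of data frames (am.output$imputations). The following is a minimal base-R sketch, assuming the error comes from "" values in character columns or factor levels, which has not been confirmed for this data set. The helpers `fill_empty_strings` and `stack_imputations`, the placeholder text, and the `imp` index column are hypothetical names chosen for illustration.

```r
# Replace "" in character columns and factor levels with a placeholder.
fill_empty_strings <- function(df, placeholder = "missing") {
  for (nm in names(df)) {
    col <- df[[nm]]
    if (is.character(col)) {
      col[!is.na(col) & col == ""] <- placeholder
    } else if (is.factor(col)) {
      levels(col)[levels(col) == ""] <- placeholder
    }
    df[[nm]] <- col
  }
  df
}

# Clean every imputed data frame, add an imputation index, and stack them.
stack_imputations <- function(imps, placeholder = "missing") {
  cleaned <- lapply(seq_along(imps), function(i) {
    cbind(imp = i, fill_empty_strings(imps[[i]], placeholder))
  })
  do.call(rbind, cleaned)
}

# Toy stand-in for am.output$imputations (two "imputed" data frames).
imps <- list(
  data.frame(id = 1:2, grp = c("a", ""), stringsAsFactors = FALSE),
  data.frame(id = 1:2, grp = c("a", ""), stringsAsFactors = FALSE)
)
stacked <- stack_imputations(imps)
print(stacked)

# With real output, the stacked frame could then be written from R, e.g.:
# foreign::write.dta(stack_imputations(am.output$imputations), "outdata.dta")
```

Whether this clears the error depends on the assumption above; if the empty strings carry meaning in the analysis, a different placeholder may be more appropriate.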

7 years, 9 months

Continued Problems with Subsetting/Gold-Standard Data in Overimputation

by Sean Kates

After updating to the newest version of Amelia (1.7.4), I tried overimputing a dataset that has incorrect values in one of its variables. All of the erroneous observations are recorded identically (as zeros, where they should be positive). The code I originally used is below, and it triggers a warning of the type: "Some observations estimated with negative measurement error variance. Set to gold standard."

    dat <- data.frame(A, B, C, VS)
    mopd <- moPrep(dat, VS ~ VS, subset = VS < .0001)

I looked through the GitHub code to see what causes this warning (other than, of course, the negative error variance) and, more importantly, how to activate gold.standard (which for my purposes is the rest of the values for VS) and presumably fix this issue. After trying quite a few different possible codings, I can't get it to work. I either receive the same warning, or a host of errors about how I've included gold.standard in the code.

I would think it should be easy, since I'm basically bifurcating my data (all data under some amount is the subset measured with error; all data over that amount can be considered gold-standard data), but I can't figure it out.

Thanks for any help you can give,
Sean

7 years, 11 months

Overimputation - Setting observation-level priors on nominal variables

by HuiYing Chua

Hi there,

I have two questions related to setting observation-level priors on nominal variables. I am trying to do an overimputation on a dichotomous variable, say y1.

My 1st question: I am aware that, using the arguments “priors” and “overimp”, I can specify observation-level priors with a 4-column matrix (row, column, prior.mean, prior.sd) or a 5-column matrix (row, column, lower confidence range, upper confidence range, confidence level). I am attempting the 4-column matrix, but I am not sure how to specify prior.mean and prior.sd when my prior is the dichotomous variable itself. I read somewhere that prior.mean can be set to y1 itself? Is prior.sd similar to the proportion of variance attributable to measurement error? I would need advice on how to specify prior.sd in this case.

My 2nd question: I am also aware of generating priors using the command “moPrep” from the Amelia package. The argument “error.proportion” of “moPrep” is rather easy to understand (proportion of variance attributable to measurement error). But what is the difference between setting priors using “moPrep” and using “priors”? Should the output be the same?

Please kindly advise. Many, many thanks!

Huiying
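The two matrices described in the first question can be built in base R as follows. This is only a sketch of the matrix layout, not an answer on what prior.sd should be: the toy data, the `error_proportion` value, and the rule prior.sd = sqrt(error_proportion * var(y1)) are all assumptions made for illustration, and the example does not address whether priors on a variable declared nominal are accepted.

```r
# Toy data: y1 is the dichotomous variable to be overimputed.
dat <- data.frame(y1 = rep(c(0, 1), 10), x = seq_len(20) / 10)

rows <- which(!is.na(dat$y1))      # cells treated as mismeasured (here: all)
col  <- match("y1", names(dat))    # column index of y1 in the data

# ASSUMPTION: share of y1's observed variance attributed to measurement error.
error_proportion <- 0.1
prior_sd <- sqrt(error_proportion * var(dat$y1))

# 4-column priors matrix: row, column, prior.mean, prior.sd.
# The prior mean is set to the observed y1 value itself.
pr <- cbind(row = rows, column = col,
            prior.mean = dat$y1[rows], prior.sd = prior_sd)

# 2-column overimp matrix: the (row, column) cells to overimpute.
oi <- cbind(row = rows, column = col)

head(pr)
head(oi)
```

These objects are what would be passed to the “priors” and “overimp” arguments mentioned in the post.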

7 years, 11 months
