I have encountered this question in my own research recently, although with large-T, small-N data. For what it's worth, this is what I've pieced together from the manual, the papers, and my other reading. Can anyone tell me if I'm way off on this?
There will always be some uncertainty about our estimates, because
they are simulations that represent possible values of data that we do
not have. The only sure-fire means of validating the imputations is to
have the actual values, which would eliminate the need for imputation.
Ultimately, you have to make a judgment about the credibility of the
imputation model itself — does it create reasonable estimates?
Amelia II offers two diagnostic tools for judging imputed values, compare and overimpute (both explained in Prof. King's recommended readings).
The former lets you compare the distributions of a variable's reported and imputed values. Ask yourself: should the missing values have the same distribution as the reported values, or should their distribution differ in central tendency, dispersion, and/or skew? The graphs produced by compare let you check whether these expectations are fulfilled. The distribution of imputed values does not have to match the distribution of reported values, but the differences between the two should be explainable.
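For what it's worth, in the R version of Amelia the calls look roughly like this. This is only a sketch from my own workflow; the data frame, variable, and object names (mydata, year, country, gdp, a.out) are placeholders, so check the manual for your version:

    library(Amelia)
    # 'mydata' is a data frame with a time index and a unit index;
    # all names below are placeholders, not taken from the manual
    a.out <- amelia(mydata, m = 5, ts = "year", cs = "country")
    # overlay the density of the imputed values on the density of the
    # observed values for one variable
    compare.density(a.out, var = "gdp")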
Overimpute treats observed values in your data as if they were missing, then allows you to compare the simulated values with the actual ones. For this test, you are basically trying to see whether your imputation model renders predictions that approximate the actual values.
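Again in R, a minimal sketch with the same placeholder names as above:

    # treat each observed value of "gdp" as missing in turn, re-impute it, and
    # plot the imputation intervals against the values actually observed
    overimpute(a.out, var = "gdp")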
Under some circumstances, your model will not predict extreme values well. This has happened to me sometimes. My principal concern here is that these extreme values do not constitute much of your sample, or do not represent data points with undue influence on your results. If such observations do influence your model, then you have a concern with which I have not yet dealt.
In addition, you can use your preferred spreadsheet or statistical package to graph reported and imputed values within panels, and ask yourself whether the imputed values make sense. If the variable is expected to take the form of a random walk, do the imputed values also suggest such a walk? If the variable is one that maintains stable trends within panels over time, do the imputed values roughly approximate this stable trend?
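In R, for example, something along these lines will mark the imputed points on one panel's series. All the names here are placeholders from my own data; recent versions of Amelia also include a tscsPlot() function that draws something similar with confidence bands:

    # eyeball one panel: plot the completed series from the first imputed
    # data set and mark the values that were originally missing
    # ("Canada", "gdp", "country", and "year" are placeholder names)
    imp1 <- a.out$imputations[[1]]
    rows <- mydata$country == "Canada"
    plot(imp1$year[rows], imp1$gdp[rows], type = "l",
         xlab = "year", ylab = "gdp (observed and imputed)")
    miss <- rows & is.na(mydata$gdp)      # which of these values were imputed
    points(imp1$year[miss], imp1$gdp[miss], col = "red", pch = 19)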
There is always judgment involved, and it is important that you are
able to make a case for the reader to believe in your imputations.
This is how I've interpreted the materials recommended by Prof. King, but I am not a leading expert in missing data imputation. If I am completely mistaken, someone please tell me.
Joe
Joseph Nathan Cohen
Assistant Professor of Sociology
City University of New York, Queens College
Powdermaker 252CC
65-30 Kissena Blvd
Flushing, NY 11367
e-mail: joseph.cohen(a)qc.cuny.edu
web: www.josephncohen.com
----- Original Message ----
From: "Blais, Martin" <blais.martin(a)uqam.ca>
To: king(a)harvard.edu; amelia(a)lists.gking.harvard.edu
Cc: "Raymond, Sarah" <raymond.sarah(a)uqam.ca>
Sent: Tuesday, January 29, 2008 8:56:30 PM
Subject: RE : [amelia] longitudinal data imputation
Thank you, I'll have a look at it!
Martin
-------- Original Message --------
From: Gary King [mailto:king@harvard.edu]
Date: Tue 2008-01-29 18:32
To: Blais, Martin; amelia(a)lists.gking.harvard.edu
Cc: Raymond, Sarah
Subject: Re: [amelia] longitudinal data imputation
Have a look at the paper by Honaker and King on this subject at the web site. That and the manual should do it.
---
Sent from my phone; please excuse the terse note.
Gary King
http://gking.harvard.edu
-----Original Message-----
From: "Blais, Martin" <blais.martin(a)uqam.ca>
Date: Tue, 29 Jan 2008 17:06:03
To: amelia@lists.gking.harvard.edu
Cc: "Raymond, Sarah" <raymond.sarah(a)uqam.ca>
Subject: [amelia] longitudinal data imputation
Hello
We are using longitudinal data (3 time points) to evaluate the effects of an intervention program. We have missing data (about 80% missing cases at T2 for the control group, and some missing cases for both the experimental and control groups at T3) and want to impute the missing data using Amelia II.
I am looking for detailed procedures for imputing longitudinal missing data with Amelia II, to answer simple questions like: Should the data be organised in long or wide format? Are there any tutorials (websites, technical papers) available for this purpose besides the Amelia II documentation?
Any references on imputation of longitudinal data (mostly about the options and best practices in the context of program evaluation) are also welcome.
Very many thanks for any help,
Martin Blais, Ph.D.
Professor
Département de sexologie
Université du Québec à Montréal
C.P. 8888, succ. Centre-ville
Montréal (Québec)
Canada H3C 3P8
Vox: (514) 987-3000 ext. 4031
Fax : (514) 987-6787
-
Amelia mailing list served by Harvard-MIT Data Center
[Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
Hi,
I got the following error message from Amelia:
> ameliaoutput10 <- amelia(merge24, p2s = 2,
    lgstc = c(8, 9, 33, 40, 82:96),
    ords = c(10, 37, 38, 39, 49, 50, 51, 52, 53, 72:81),
    logs = c(3:7, 25, 26, 29, 36, 41:47, 54:71),
    ts = 1, cs = 2, polytime = 2, intercs = TRUE, archive = TRUE)
amelia starting
beginning prep functions
running bootstrap
-- Imputation 1 --
setting up EM chain indicies
1(347357!) 2(51802!) 3(45797!) 4(40634!) 5(46795!) 6(40806!) 7(40043!)
8(35876!) 9(33598!)10(34136!)
11(28776!)12(25746!)13(23438!)14(21293!)15(30009!)16(32754!)17(24128!)18(23706!)19
Loading required package: foreign
Error in e[e > tol] <- 1/e[e > tol] :
NAs are not allowed in subscripted assignments
Calls: amelia -> emarch -> emfred -> amsweep -> mpinv
Execution halted
Any suggestions?
Thanks,
Anders