[amelia] Missing values not all imputed for ordinal variable

19 Jul 2007

Hi all,

I'm using Amelia to impute missing values in a large-ish dataset
(~100,000 observations and 18 variables).  Several of the variables are
ordinal, two are id variables, and two are nominal.  After running Amelia,
one or more of the ordinal variables still has missing values in the five
imputed datasets it returns.  If I let Amelia treat the ordinal
variables as continuous, it imputes all of the missing values.  I receive no
error messages and one warning, because one of my nominal variables has more
than 10 categories.  (That variable represents US states.)

Does anyone know offhand what might be causing this?  Here is my call to
Amelia:

aout <- amelia(data = subdata,
               p2s = 2,
               m = num.imputed.datasets,
               noms = c("state", "delyr"),
               ords = c("meduc6", "gpc", "mager8", "mrace4", "tobacco",
                        "alcohol", "chyper", "phyper", "eclamp",
                        "sex", "congen", "singleton"),
               idvars = c("recwt", "ourid"),
               write.out = FALSE,
               tolerance = 0.0005)

Here is the (presumably unrelated) warning message:

Warning message:
The number of catagories in one of the variables marked nominal has greater
than 10 categories. Check nominal specification.
 in: amcheck(data = data, m = m, idvars = numopts$idvars, priors = priors,

And here is a summary of the imputed data

...
summary(aout[[1]])
     delyr         outcome             state           mager8
 Min.   :1997   Min.   :0.000000   Min.   : 1.00   Min.   :1.000
 1st Qu.:1999   1st Qu.:0.000000   1st Qu.:12.00   1st Qu.:3.000
 Median :2000   Median :0.000000   Median :27.00   Median :4.000
 Mean   :2000   Mean   :0.007362   Mean   :27.30   Mean   :4.033
 3rd Qu.:2001   3rd Qu.:0.000000   3rd Qu.:41.00   3rd Qu.:5.000
 Max.   :2002   Max.   :1.000000   Max.   :56.00   Max.   :9.000

     mrace4          meduc6           gpc             sex
 Min.   :1.000   Min.   :1.000   Min.   :17.00   Min.   :0.0000
 1st Qu.:1.000   1st Qu.:3.000   1st Qu.:38.00   1st Qu.:0.0000
 Median :1.000   Median :3.000   Median :39.00   Median :1.0000
 Mean   :1.318   Mean   :3.399   Mean   :38.54   Mean   :0.5117
 3rd Qu.:1.000   3rd Qu.:4.000   3rd Qu.:40.00   3rd Qu.:1.0000
 Max.   :4.000   Max.   :5.000   Max.   :47.00   Max.   :1.0000

     chyper               phyper           eclamp             tobacco
 Min.   :-8.674e-19   Min.   :0.0000   Min.   :0.000e+00   Min.   :0.000e+00
 1st Qu.: 0.000e+00   1st Qu.:0.0000   1st Qu.:0.000e+00   1st Qu.:0.000e+00
 Median : 0.000e+00   Median :0.0000   Median :0.000e+00   Median :0.000e+00
 Mean   : 7.651e-03   Mean   :0.0397   Mean   :2.981e-03   Mean   :1.228e-01
 3rd Qu.: 0.000e+00   3rd Qu.:0.0000   3rd Qu.:0.000e+00   3rd Qu.:0.000e+00
 Max.   : 1.000e+00   Max.   :1.0000   Max.   :1.000e+00   Max.   :1.000e+00
                                       NA's   :1.283e+03   NA's   :1.549e+04

    alcohol             recwt            lbw
 Min.   :0.000000   Min.   :1.000   Min.   :5.425
 1st Qu.:0.000000   1st Qu.:1.000   1st Qu.:8.008
 Median :0.000000   Median :1.000   Median :8.115
 Mean   :0.009487   Mean   :1.000   Mean   :8.083
 3rd Qu.:0.000000   3rd Qu.:1.000   3rd Qu.:8.212
 Max.   :1.000000   Max.   :1.178   Max.   :8.842

     congen          singleton          ourid         
 Min.   :0.00000   Min.   :0.0000   Min.   :      80  
 1st Qu.:0.00000   1st Qu.:1.0000   1st Qu.: 4966824  
 Median :0.00000   Median :1.0000   Median : 9951050  
 Mean   :0.01376   Mean   :0.9687   Mean   : 9989422  
 3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:15000000  
 Max.   :1.00000   Max.   :1.0000   Max.   :20000000  

Notice that eclamp and tobacco still have missing values.
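
For anyone wanting to reproduce the check without reading the summary by eye, here is a minimal base-R sketch that counts the remaining NAs per variable in each imputed dataset.  The list `imps` is a hypothetical toy stand-in for the imputed datasets (aout[[1]], aout[[2]], ...); the names `na.counts` and `still.missing` are illustrative only.

```r
# Hypothetical stand-in for the imputed datasets (aout[[1]], ..., aout[[m]]).
imps <- list(
  data.frame(eclamp = c(0, NA, 1), tobacco = c(NA, NA, 0), sex = c(0, 1, 1)),
  data.frame(eclamp = c(0, 0, 1),  tobacco = c(1, NA, 0),  sex = c(0, 1, 1))
)

# One row per imputed dataset, one column per variable;
# each cell is the number of values still missing.
na.counts <- t(sapply(imps, function(d) colSums(is.na(d))))

# Keep only the variables that still have missing values in any dataset.
still.missing <- colnames(na.counts)[colSums(na.counts) > 0]

print(na.counts)
print(still.missing)
```

With the real output, `imps` would presumably be replaced by the imputed data frames held in `aout`.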

I suppose I can just impute the ordinal variables as continuous and sort them
back into categories afterwards, but that doesn't really seem optimal.
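
The workaround would look roughly like the sketch below, assuming the ordinal categories are stored as numeric codes.  The helper `snap.to.levels` is hypothetical (it is not part of Amelia), and this is only after-the-fact rounding to the nearest valid code, not a description of how Amelia handles `ords` internally.

```r
# Hypothetical helper: map continuous imputed values to the nearest
# valid category code. Values that are still NA are left as NA.
snap.to.levels <- function(x, levels) {
  out <- rep(NA, length(x))
  ok <- !is.na(x)
  # For each non-missing value, find the index of the closest level.
  idx <- vapply(x[ok], function(v) which.min(abs(levels - v)), integer(1))
  out[ok] <- levels[idx]
  out
}

# Toy continuous imputations for a variable coded 0..4.
imputed <- c(-0.3, 0.2, 0.7, 1.4, 3.9, NA)
snapped <- snap.to.levels(imputed, levels = 0:4)
print(snapped)
```

Out-of-range values such as -0.3 are pulled to the nearest end of the scale, which is one reason this rounding can distort the distribution of rare categories.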

Thanks in advance,

Dennis
-
Amelia mailing list served by Harvard-MIT Data Center
[Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
