Re: Question about Multiple Imputation - Amelia

11 Feb 2003

below...

On Tue, 11 Feb 2003, Paul Miller wrote:

...

 Dear Dr. King,

 I'm writing because I have a question about multiple imputation that I
 thought you might be able to answer. I am working on an analysis
 involving an interaction between two latent variables. Specifically, I
 want to use people's reports of their spouses' affection, people's
 reports of their spouses' negativity, and the interaction between these
 variables (all latent) to predict people's ratings of their partners'
 responsiveness. In testing this model, I've decided to use a 2SLS
 approach developed by Ken Bollen. So my analysis will make use of
 several interaction terms, most of which will function as instrumental
 variables.

 In the past, I've used Joe Schafer's program Norm to do multiple
 imputation. However, I was thinking that I might give Amelia a try this
 time since it is supposed to work well with small samples. The
 documentation for Norm indicates that imputations based on a
 multivariate normal model do not preserve interactions among variables.
 However, in "Analyzing Incomplete Political Science Data. . ." you
 indicate that the user should "Include interaction terms that will be
 used in the analysis model and that might explain the data
 distribution." This is the case despite the fact that Amelia also
 appears to be based on a multivariate normal model. To me this suggests
 that it is acceptable to use multiple imputation for an analysis that
 will use interaction terms, provided that these terms are included in
 the dataset that is submitted to the multiple imputation program. Is
 this correct? Or have I misunderstood what you wrote?

 Paul 

What you'd do is to include (say) X, Z and the product of the two as
variables that are input into Amelia, and let it impute.  This approach
has two difficulties.  The first is that the imputations for Z times the
imputations for X are not constrained by the program to equal the
imputations for the product term.  If I were you, I'd run the program and
plot "the imputations of X times the imputations of Z" against the
imputations of "Z*X" to check.  If they're far off, you might consider
changing the
specification some.  But either way, I'd suggest that you only take out of 
Amelia the imputed X and the imputed Z and then do with them as you see 
fit, including taking their product.  I.e., I'd suggest that you discard 
the imputations of the product.
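
A minimal sketch of the check described above, assuming you have already
exported one completed dataset from the imputation program into arrays.
The function name, argument names, and toy numbers are all hypothetical;
nothing here calls Amelia or Norm.

```python
import numpy as np

def product_discrepancy(x_imp, z_imp, xz_imp, imputed_mask):
    """Compare x_imp * z_imp with the separately imputed product term.

    Hypothetical helper: the arrays are assumed to come from a single
    completed dataset. imputed_mask is True on the rows where the
    product term was imputed rather than observed.
    """
    x_imp = np.asarray(x_imp, dtype=float)
    z_imp = np.asarray(z_imp, dtype=float)
    xz_imp = np.asarray(xz_imp, dtype=float)
    mask = np.asarray(imputed_mask, dtype=bool)

    implied = x_imp[mask] * z_imp[mask]   # product of the imputed parts
    imputed = xz_imp[mask]                # product as imputed directly
    diff = imputed - implied
    return {
        "n": int(mask.sum()),
        "mean_abs_diff": float(np.mean(np.abs(diff))),
        "max_abs_diff": float(np.max(np.abs(diff))),
    }

# Toy numbers, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0]
z = [2.0, 0.5, 1.0, 3.0]
xz = [2.0, 1.5, 3.0, 11.0]
mask = [False, True, False, True]
result = product_discrepancy(x, z, xz, mask)
print(result)
```

Large differences relative to the scale of X*Z would be the "far off" case.
Whatever the result, the analysis dataset would then use x * z computed
from the imputed X and Z, not the imputed product column.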

The other difficulty is that if X is missing for one observation and Z is 
observed, then X*Z will be missing even though part of it is really 
observed.  This means there is some loss of information, but no one has 
figured out a method to deal with this (perhaps because no one has really 
tried!).
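
To see how many rows are affected, here is a small sketch, assuming the
raw data are held in arrays with NaN marking missing values. The variable
names and values are made up for illustration.

```python
import numpy as np

# Toy data: NaN marks a missing value.
x = np.array([1.0, np.nan, 3.0, np.nan])
z = np.array([2.0, 4.0, np.nan, np.nan])

# The product is missing whenever either factor is missing.
xz = x * z

# Rows where the product is missing even though one factor was observed.
partly_observed = np.isnan(xz) & (~np.isnan(x) | ~np.isnan(z))
n_partly_observed = int(partly_observed.sum())
print(n_partly_observed, "rows lose partially observed information")
```

This only counts the affected rows; it does not recover the lost
information.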

Both of these issues should affect Joe's program and Amelia in the same
way.  Amelia should run a good deal faster and it should work more
frequently.  But other than that, they give basically the same
imputations.

Incidentally, if you're conducting the survey yourself, I'd suggest you 
have a look at the paper at http://gking.harvard.edu/vign/ which may be of 
use to you.

Best of luck,
Gary

     : Gary King, King(a)Harvard.Edu    http://GKing.Harvard.Edu :
     : Center for Basic Research      Direct    (617) 495-2027 :
     :   in the Social Sciences       Assistant (617) 495-9271 :
     : 34 Kirkland Street, Rm. 2      HU-MIT DC (617) 495-4734 :
     : Harvard U, Cambridge, MA 02138    eFax   (928) 832-7022 :