below...
On Tue, 11 Feb 2003, Paul Miller wrote:
Dear Dr. King,
I'm writing because I have a question about multiple imputation that I
thought you might be able to answer. I am working on an analysis
involving an interaction between two latent variables. Specifically, I
want to use people's reports of their spouses' affection, people's
reports of their spouses' negativity, and the interaction between these
variables (all latent) to predict people's ratings of their partners'
responsiveness. In testing this model, I've decided to use a 2SLS
approach developed by Ken Bollen. So my analysis will make use of
several interaction terms, most of which will function as instrumental
variables.
In the past, I've used Joe Schafer's program Norm to do multiple
imputation. However, I was thinking that I might give Amelia a try this
time since it is supposed to work well with small samples. The
documentation for Norm indicates that imputations based on a
multivariate normal model do not preserve interactions among variables.
However, in "Analyzing Incomplete Political Science Data..." you
indicate that the user should "Include interaction terms that will be
used in the analysis model and that might explain the data
distribution." This is the case despite the fact that Amelia also
appears to be based on a multivariate normal model. To me this suggests
that it is acceptable to use multiple imputation for an analysis that
will use interaction terms, provided that these terms are included in
the dataset that is submitted to the multiple imputation program. Is
this correct? Or have I misunderstood what you wrote?
Paul
What you'd do is to include (say) X, Z and the product of the two as
variables that are input into Amelia, and let it impute. This approach
has two difficulties. The first is that the imputations for Z times the
imputations for X are not constrained by the program to equal the
imputations for the product term. If I were you, I'd run the program and
plot "the imputations of X times the imputations of Z" by the imputations
of "Z*X" to check. If they're far off, you might consider changing the
specification some. But either way, I'd suggest that you only take out of
Amelia the imputed X and the imputed Z and then do with them as you see
fit, including taking their product. I.e., I'd suggest that you discard
the imputations of the product.
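
A minimal sketch of that check, in Python with NumPy. It assumes one
completed dataset has already been exported from the imputation program
into plain arrays; the function name, argument names, and the toy numbers
are hypothetical and are not part of Amelia or Norm.

```python
import numpy as np

def compare_product_imputations(x_imp, z_imp, xz_imp, product_was_missing):
    """Compare the imputed product term with the product of the imputed parts.

    x_imp, z_imp, xz_imp: columns from ONE completed (imputed) dataset.
    product_was_missing: True where X*Z was missing before imputation.
    Returns the rebuilt product and the mean absolute gap on imputed rows.
    """
    x_imp = np.asarray(x_imp, dtype=float)
    z_imp = np.asarray(z_imp, dtype=float)
    xz_imp = np.asarray(xz_imp, dtype=float)
    mask = np.asarray(product_was_missing, dtype=bool)

    # Product implied by the imputed X and imputed Z.
    implied = x_imp * z_imp

    # Gap between the implied product and the separately imputed product,
    # measured only on rows where the product was actually imputed.
    gap = implied[mask] - xz_imp[mask]
    mean_abs_gap = float(np.mean(np.abs(gap))) if gap.size else 0.0

    # Discard xz_imp and carry the rebuilt product into the analysis.
    return implied, mean_abs_gap

# Toy example: the product was missing (and imputed) only in the second row.
implied, mean_abs_gap = compare_product_imputations(
    x_imp=[1.0, 2.0, 3.0],
    z_imp=[2.0, 2.0, 2.0],
    xz_imp=[2.0, 5.0, 6.0],
    product_was_missing=[False, True, False],
)
```

Plotting implied[mask] against xz_imp[mask] for each completed dataset
gives the scatter described above; the rebuilt product is what goes into
the analysis model.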
The other difficulty is that if X is missing for one observation and Z is
observed, then X*Z will be missing even though part of it is really
observed. This means there is some loss of information, but no one has
figured out a method to deal with this (perhaps because no one has really
tried!).
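
To make the second difficulty concrete, here is a small illustration in
Python with NumPy, using made-up values in which NaN marks a missing
entry. It only shows how missingness propagates into the product column;
it is not a fix for the information loss.

```python
import numpy as np

# Toy data: NaN marks a missing value. The numbers are invented.
x = np.array([1.0, np.nan, 3.0, 4.0])
z = np.array([2.0, 5.0, np.nan, 1.0])

# The product is missing whenever either factor is missing.
xz = x * z

# Rows where the product is missing even though one factor was observed.
partially_observed = np.isnan(xz) & (~np.isnan(x) | ~np.isnan(z))
print(int(partially_observed.sum()), "rows lose partly observed information")
```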
Both of these issues should affect both Joe's program and Amelia, in the
same way. Amelia should run a good deal faster and it should work more
frequently. But other than that, they give basically the same
imputations.
Incidentally, if you're conducting the survey yourself, I'd suggest you
have a look at the paper at
http://gking.harvard.edu/vign/ which may be of
use to you.
Best of luck,
Gary
Gary King, King(a)Harvard.Edu, http://GKing.Harvard.Edu
Center for Basic Research in the Social Sciences
34 Kirkland Street, Rm. 2, Harvard U, Cambridge, MA 02138
Direct (617) 495-2027, Assistant (617) 495-9271
HU-MIT DC (617) 495-4734, eFax (928) 832-7022