Fabricio,
The imputation model should contain all the information that the analysis model contains.
So, indeed, if there are going to be interactions in the analysis, then the interactions
should be constructed as variables in the imputation dataset. From a forecasting
perspective, if you are putting interactions into your analysis, you expect them
to predict something, so they belong in the imputation.
From a stricter perspective, imputation attempts to create a rectangularized dataset
(no incomplete rows) that contains exactly the same information and statistical
relationships that exist in the incomplete data. That is, nothing is added; we just have
a set of completed datasets that have the same sufficient statistics as the incomplete
observed data. If we want the interactions in the imputed datasets to have the same
relationships with all variables as in the incomplete data, then we need the interactions
constructed in the imputation dataset, to "preserve interaction effects," as you
put it.
A pragmatic issue arises if there is missingness in either of the variables that go into
the interactions. In that case, it is considered best practice to create the
interactions, impute the data, and then recreate the interactions. If either variable is
missing, the interaction is also missing, even though only the missing constituent needs
to be imputed. Reconstructing the interaction after imputation therefore yields
interactions that are logically consistent with the data and that use more of the
observed information.
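To make the create, impute, recreate sequence concrete, here is a minimal sketch in plain Python. It is not Amelia code: the `mean_impute` function is only a stand-in for the imputation step (Amelia's actual algorithm is different), and the variable names `educ`, `latino`, and `educ_x_latino` and the toy values are hypothetical. The point is only that an imputed interaction column will generally not equal the product of its imputed constituents until it is rebuilt.

```python
from statistics import mean


def add_interaction(rows, a, b, name):
    """Set row[name] = row[a] * row[b]; None if either constituent is missing."""
    for row in rows:
        if row[a] is None or row[b] is None:
            row[name] = None
        else:
            row[name] = row[a] * row[b]
    return rows


def mean_impute(rows, columns):
    """Stand-in imputer (NOT Amelia's algorithm): fill gaps with the observed mean."""
    filled = [dict(row) for row in rows]
    for col in columns:
        col_mean = mean(row[col] for row in rows if row[col] is not None)
        for row in filled:
            if row[col] is None:
                row[col] = col_mean
    return filled


# Hypothetical toy data; None marks a missing value.
data = [
    {"educ": 12.0, "latino": 1.0},
    {"educ": None, "latino": 1.0},
    {"educ": 16.0, "latino": 0.0},
    {"educ": 10.0, "latino": None},
]

# 1. Create the interaction so the imputation step sees it as a variable.
add_interaction(data, "educ", "latino", "educ_x_latino")

# 2. Impute every column, the interaction included.
completed = mean_impute(data, ["educ", "latino", "educ_x_latino"])
imputed_interaction = completed[1]["educ_x_latino"]  # imputed on its own

# 3. Recreate the interaction from the completed constituents.
add_interaction(completed, "educ", "latino", "educ_x_latino")
rebuilt_interaction = completed[1]["educ_x_latino"]

print(imputed_interaction, rebuilt_interaction)
```

In the second row only `educ` was missing, so after step 3 the interaction uses the observed `latino` value together with the imputed `educ`, instead of a separately imputed product.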
Currently, we don't have Amelia create interactions from declared arguments (in the
way that we can have Amelia create logged terms, or break up nominal variables into
dummies from setting arguments in the call to the amelia function). There is, however,
the ability to set up these transformations (like interactions) and record them, and then
reconstruct them after imputation. See section 4.8 "Post-Imputation
Transformations" in the manual for the utility functions Amelia provides for creating
these:
http://r.iq.harvard.edu/docs/amelia/amelia.pdf
Best,
James.
--
James Honaker, Senior Research Scientist
//// Institute for Quantitative Social Science, Harvard University
________________________________
From: amelia-bounces(a)lists.gking.harvard.edu [amelia-bounces(a)lists.gking.harvard.edu] on
behalf of Fabrício Mendes Fialho [fabriciofialho(a)gmail.com]
Sent: Wednesday, May 08, 2013 12:57 PM
To: amelia(a)lists.gking.harvard.edu
Subject: [amelia] Including interaction in Amelia MI
Hi all,
Craig Enders, in his Applied Missing Data Analysis (Guilford Press, 2010), suggests that,
if your regression model includes interaction terms, such terms must also be included in
the imputation model as a way to preserve interaction effects (p. 265). Is there a way to
set an interaction term *within* Amelia code or should I generate interactions before the
imputation?
Suppose I am interested in the differential effect of education on attitudes for different
racial groups (say, African-Americans, Asians, Latinos, and Whites). Can I set the
ethnicity*education interaction straight into Amelia code or should I create different
variables for African-Americans*education, Latino*education, and so on to catch the
race*education interaction?
Another possible question: Shouldn't I (or is it not necessary to) include
interactions in the imputation?
Best,
Fabricio.