Hi all,
I have a question about multiple imputation for a data set that will later be
analysed using selection models. I am re-analysing chapter 5 of Przeworski
et al., Democracy and Development, on the effects of political regime on
demography, where the authors use selection models (a dynamic probit version
of the Heckman model) to account for regime selection effects.
There is a massive amount of missing data in their analysis (sometimes they
use less than 1/3 of the observations). However, my main concern is that
the missing data mechanism and the selection process are not sufficiently
independent of each other.
From my understanding, both selection models and multiple imputation are
trying to account for missing data, but perhaps in different ways. Yet it
is not clear to me how to compare them. One way is this: multiple imputation
tries to help us use all the information available in the data set, without
actually creating new information. In fact, the missing cells should be
filled in based on the information available from the other cells. On the
other hand, selection models are supposed to generate new data, as if there
were no selection process going on and the assignment to the treatment and
control groups were actually random. If this is the right way to think about
it, then maybe running selection models after multiple imputation with
Amelia would not be a problem. But I am not sure... any suggestions?
Help and advice really appreciated,
Best,
Antonio.