Hi all,

I have a question about multiple imputation for a data set that will later be analysed using selection models. I am re-analysing chapter 5 of Przeworski et al., Democracy and Development, on the effects of political regime on demography, where the authors use selection models (a dynamic probit version of the Heckman model) to account for regime selection effects. There is a massive amount of missing data in their analysis (sometimes they use less than 1/3 of the observations). However, my main concern is that the missing data mechanism and the selection process are not sufficiently independent of each other.


From my understanding, both selection models and multiple imputation try to account for missing data, but perhaps in different ways. Yet it is not clear to me how to compare them. One way is this: multiple imputation tries to help us use all the information available in the data set, without actually creating new information. In fact, the missing cells are filled in based on the information available in the other cells. Selection models, on the other hand, are supposed to generate new data, as if there were no selection process going on, so that assignment to the treatment and control groups is effectively random. If this is the right way to think about it, then maybe running selection models after multiple imputation with Amelia would not be a problem. But I am not sure... any suggestions?
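
To make concrete what I mean by running the selection model after imputation: as I understand it, the model would be fitted separately on each imputed data set and the results then pooled with Rubin's rules. Below is a minimal sketch of that pooling step only, in Python rather than with Amelia itself. The function name pool_rubin and the numbers are hypothetical, made up purely for illustration; they are not results from the Przeworski et al. data.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: the coefficient estimate from each imputed data set
    variances: the squared standard error of that estimate in each data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")

    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical coefficient estimates and variances from m = 3 imputed data sets.
q_bar, total = pool_rubin([0.5, 0.6, 0.7], [0.04, 0.04, 0.04])
print(q_bar, total ** 0.5)  # pooled estimate and pooled standard error
```

The sketch says nothing about whether the imputation model and the selection model are compatible, which is the part I am unsure about.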

Help and advice really appreciated,

Best,

Antonio.