How to handle "conditional variables" in survey data with intentional NAs? - Amelia

29 Feb 2008

Hello

I guess that is a common problem when imputing data, but I am rather
confused by it. The availability (= not-NA-ness) of my survey data looks
like this (-> screenshot). It shows the number of non-NA entries for each
variable in the data frame.

Areas A and C have rather complete answer sets. B is white because these
questions have only been asked conditional on answers given beforehand
(in A). But a part of the white area (in D) -could- be imputed.

How does the imputation process look like? What to do with B? I could
think of two variants that are both more or less unclear to me:

A) Should I cut the dataframe and impute in two steps? But how do I then
reintegrate the results from the first imputation?

i) remove B+D, impute A+C
ii) reinsert B/D and impute D.

B) How do I specify the process for one MI step without cutting?

Thank you very much for your thoughts on this.

Best regards,
Marcus

-- 

Marcus M. Dapp | PhD student | ETH Zurich | www.ib.ethz.ch/people/mdapp
Prof. Thomas Bernauer, International Relations | www.ib.ethz.ch

On the shoulders of giants?  http://science.creativecommons.org