On Wed, 12 Mar 2003, Julia Lynch wrote:
Gary,
I'm stumped. Can I ask what you would do in the following situation?
I have an index composed of 4 opinion items. For each of the 4 items, I
have DK responses from 5-20% of respondents, but typically the DK is on only
1 of the 4 items (i.e. DK responses are not highly correlated across the
items of the index). Do I:
a. delete listwise and lose 20% of my respondents, losing efficiency and
introducing massive bias (my DK respondents are poor, female, low educ, low
political interest, etc.) -- clearly not my preferred option
b. impute the missing data for each item and construct the index using the
imputed values. I wouldn't normally want to impute for an opinion variable,
but if the respondent was able to answer 3 other closely related questions,
why should I believe that s/he couldn't also answer the fourth? But does
EMis still work if I've transformed the imputed data, e.g., by smushing it into
an index?
c. impute the index score. Again, some of these DKs are legitimate, but
others aren't, and this would get around the issue of transforming the
imputed data. But it would also mean throwing out information from the
three items that the respondent DID answer. (OK, not throwing out, because
I'd use that info to impute, but still...)
d. compute the index score for respondents with one DK out of 4 items by
setting the value of missing items at the mean of the remaining items. This
seems to me to be taking less than full advantage of the other information
in the dataset about how these people might have responded.
e. treat each item separately as a categorical variable. Messy and not
nearly as much fun as working with this index.
I'm not crazy about any of these choices. What do you think? Any advice
much appreciated...
Julie
Great question.
Definitely b. You will add a lot of power to the imputation model by
using three observed answers to impute a fourth. Much better than using
the index and having to treat all 4 items (i.e., the index value) as
either missing or biased because it's now based on only 3 of the questions.
You can always smush or do whatever you think is appropriate with the
simulations/imputations, such as creating an index.
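
As a rough illustration of that last point (not part of the original exchange), here is a minimal Python sketch of building the index separately in each imputed dataset and then combining the results. It assumes the item-level imputations already exist, e.g. from EMis; the function names and the toy data are hypothetical, the index is taken to be the simple mean of the 4 items, and the quantity of interest is just the mean index score, combined across imputations with the usual Rubin's rules.

```python
from math import sqrt
from statistics import mean, variance


def index_from_items(row):
    """Hypothetical index: the plain average of one respondent's items."""
    return mean(row)


def pool_index_mean(completed_datasets):
    """Combine the mean index score across m >= 2 completed datasets.

    Each dataset is a list of respondents; each respondent is a list of
    item values with the DKs already filled in by the imputation step.
    """
    m = len(completed_datasets)
    estimates, within_vars = [], []
    for data in completed_datasets:
        # The index is built AFTER imputation, once per completed dataset.
        scores = [index_from_items(row) for row in data]
        estimates.append(mean(scores))
        # Squared standard error of the mean within this dataset.
        within_vars.append(variance(scores) / len(scores))
    within = mean(within_vars)       # average within-imputation variance
    between = variance(estimates)    # variance across the imputations
    total = within + (1 + 1 / m) * between
    return mean(estimates), sqrt(total)


# Toy data: two completed datasets that differ only in one imputed cell
# (the 4th item of the first respondent).
imputation_1 = [[1, 2, 3, 4], [2, 2, 2, 2]]
imputation_2 = [[1, 2, 3, 2], [2, 2, 2, 2]]
estimate, std_error = pool_index_mean([imputation_1, imputation_2])
```

The same pattern applies to any other analysis: run it on the index in each completed dataset, then combine the estimates, rather than averaging the imputed item values first.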
Gary
Gary King, King(a)Harvard.Edu
http://GKing.Harvard.Edu
Center for Basic Research in the Social Sciences
34 Kirkland Street, Rm. 2
Harvard U, Cambridge, MA 02138
Direct (617) 495-2027
Assistant (617) 495-9271
HU-MIT DC (617) 495-4734
eFax (928) 832-7022
Julia Lynch
Assistant Professor
Department of Political Science
University of Pennsylvania
202 Stiteler Hall
Philadelphia, PA 19104
tel 215 898 4240
fax 215 573 2073
email jflynch(a)sas.upenn.edu