I agree with the main point below. Refusals to answer that are actual
answers ("I don't have an opinion about the national helium reserve")
should not be imputed. But things like income can be imputed, since
everyone has an income (positive or negative) even if they won't tell
us. Some of this is discussed in our APSR article.
Optimally, we would know which was which before we do imputation, and
we'd only impute where the value actually exists but is missing.
In this case, it's pretty straightforward. We just use Amelia to deal
with all the real cases and somehow model the others separately. (One
way to model the others would be to use a pair of conditional models:
e.g., first a logit to model whether or not someone ventures an opinion
on the national helium reserve, and then, conditional on having an
opinion, something like an ordinal probit to analyze the opinions.)
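A minimal sketch of the pair of conditional models described above, in
plain Python. It only evaluates the response probabilities implied by a
logit for "has an opinion" followed by an ordered probit for the stated
opinion; it does not estimate anything. The function names, the two
linear predictors, and the cutpoints are assumptions made for this
illustration and do not come from Amelia or from the APSR article.

```python
import math

def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_part_probs(opinion_index, placement_index, cutpoints):
    """Return (P(no opinion), [P(answer 1), ..., P(answer K)]).

    opinion_index   : linear predictor of the logit (hypothetical x'gamma)
    placement_index : linear predictor of the ordered probit (hypothetical x'beta)
    cutpoints       : K-1 increasing thresholds of the ordered probit
    """
    # Stage 1: logit for whether the respondent ventures an opinion at all.
    p_opinion = logistic(opinion_index)
    # Stage 2: ordered probit, conditional on having an opinion.
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [normal_cdf(b - placement_index) for b in bounds]
    conditional = [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]
    return 1.0 - p_opinion, [p_opinion * c for c in conditional]

def log_likelihood(response, opinion_index, placement_index, cutpoints):
    """Log-likelihood of one respondent; response is None for 'no opinion', else 1..K."""
    p_none, p_answers = two_part_probs(opinion_index, placement_index, cutpoints)
    return math.log(p_none if response is None else p_answers[response - 1])
```

Summing log_likelihood over respondents gives the objective one would
maximize over the coefficients and cutpoints. Because the two stages
share no parameters in this sketch, they could also be fit separately: a
logit on everyone, then an ordered probit on those who gave an opinion.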
There is another, more difficult situation where the category "refuse to
answer" is a mix of DK's that hide a real answer and values that don't
exist. This would be harder to model, and I don't know of a paper that
models data like this. Doing it seems laborious but conceptually
straightforward (some kind of mixture model), but whether it would be
worth the extra trouble of programming it is another question.
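One way such a mixture could be written down, purely as a sketch. All
symbols are assumptions of this illustration and are not taken from any
paper: \pi_i is the probability that respondent i has no opinion, \rho_i
is the probability that a respondent who does have an opinion declines
to report it, and p_{ik} is the probability of answer k given that an
opinion exists.

```latex
\begin{align*}
\Pr(y_i = \text{refuse}) &= \pi_i + (1 - \pi_i)\,\rho_i, \\
\Pr(y_i = k) &= (1 - \pi_i)(1 - \rho_i)\,p_{ik}, \qquad k = 1, \dots, K.
\end{align*}
```

The probabilities sum to one over the refusal category and the K
answers. Since both components produce the same observed "refuse", \pi_i
and \rho_i would presumably have to be tied to different covariates, or
restricted in some other way, before they could be told apart.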
Incidentally, if you're doing cross-cultural research, you might have a
look at our Anchoring Vignettes idea at
http://gking.harvard.edu/vign/
Gary
On Wed, 25 Sep 2002, Randy Stevenson wrote:
Gary,
Thanks for your note about panel attrition. Sorry to bother you again,
but preparing all of this survey data has forced me to think about some
common problems in light of the imputation technology, and I thought
this one would interest you.
The question is what to do with don't know responses in surveys when the
DK is potentially meaningful (i.e., it doesn't simply hide a real
response). Below is an idea on how imputation may help with this
problem. I haven't found any discussion of this possibility in any
literature and was wondering if you had thought of this or thought it
was an idea worth pursuing. At this point it is basically my intuition
and I would obviously have to pursue a more rigorous exploration if it's
a worthwhile idea.
Consider, for example, DK responses to a question asking people to place
themselves on a left/right scale. There are a variety of reasons why
people might answer DK in this situation. I would group these into two
main categories. The first are situations in which we expect that there
is really an underlying answer but that it is not revealed. This
category includes more than just situations in which the person wants to
hide the answer. It also includes cases (1) in which the person just
does not want to invest the cognitive energy to come up with a response,
and (2) in which they don't understand the question or don't know the
meaning of the words. Finally, people who answer don't know instead of
putting themselves in the middle of the scale also belong here. The key
idea here is that if we could probe the person's attitude further
(explaining the meaning of the question or assuring them they could
reveal it to us), we could obtain a meaningful answer.
A second set of reasons for DK responses arises when there is no
underlying answer that is being denied us. This can happen when the
respondent is really uncertain where they fit on the scale with which
they are presented. Maybe they haven't thought about the issue at all.
Maybe, in a case like left/right self-placement, they understand what
the scale means but can't reconcile their conflicting policy views in a
way that gives them a placement and are unwilling to say they are middle
of the road, because they don't think they are (for example, they could
be policy extremists on some left and right policies, and this doesn't
jibe with what they think a centrist on a left/right scale is).
In the first set of cases it seems to me that it is perfectly
appropriate to impute the values of the DK category, but in the second
it is not. In the second case, we would want to include DK as a valid
response and model how this response contributes to the dependent
variable in the explanatory model. I would suggest assuming that, since
these are non-attitudes, they cannot have an impact on any kind of
behavior that we are modelling and should be poorly predicted from
other variables in the imputation model.
If we adopt these assumptions, then it seems to me there is a reasonable
way to proceed, which is as follows:
(1) Impute all the DK's.
(2) In the analysis model, interact the imputed variable with a dummy
marking the DK's.
(3) If category 2 dominates the DK's, then the interaction should
indicate no relationship between the variable and the DV for the imputed
cases.
(4) If category 1 dominates, then the interaction will be insignificant.
The real trick for this to be true is that the imputation model for
people with non-attitudes cannot systematically impute values that then
predict the dependent variable in the same way that the values of people
with real attitudes (whether missing or not) do.
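The interaction step can be sketched as follows, in plain Python with no
external libraries. The data are made-up toy numbers, the imputed values
are simply taken as given (nothing here calls Amelia), and the helper
names design_row and ols are hypothetical. The toy DV is continuous so
that ordinary least squares suffices; the multinomial models mentioned
in this message would need a different estimator, but the design matrix
would be built the same way.

```python
def design_row(x, is_dk):
    """Columns: intercept, x, DK dummy, x * DK dummy (the interaction)."""
    d = 1.0 if is_dk else 0.0
    return [1.0, x, d, x * d]

def ols(X, y):
    """Least squares via the normal equations and Gauss-Jordan elimination.

    Assumes X has full column rank; a singular X'X would divide by zero.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Toy data (invented for illustration): respondents who answered have a DV
# that rises with x; DK respondents have imputed x unrelated to the DV.
answered = [(x, 1.0 + 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0, 5.0)]
dk_imputed = [(x, 4.0) for x in (1.0, 2.0, 3.0, 4.0, 5.0)]

X = [design_row(x, False) for x, _ in answered] + \
    [design_row(x, True) for x, _ in dk_imputed]
y = [v for _, v in answered] + [v for _, v in dk_imputed]

coef = ols(X, y)
slope_answered = coef[1]        # effect of x among those who answered
slope_dk = coef[1] + coef[3]    # effect of x among the imputed DK cases
```

In this toy setup the interaction coefficient cancels the main effect,
so slope_dk is essentially zero, which is the pattern step (3) describes.
The sketch produces point estimates only. Judging whether the interaction
is insignificant, as in step (4), needs standard errors, and with
multiple imputation the fit would be repeated on each imputed data set
and the results combined.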
Anyway, if you have any thoughts or have done any work on this, I would
love to know. Again, Ray and I are currently adopting a methodology to
reanalyze a large number of election studies in a lot of countries to
provide a more definitive picture of cross-national differences in the
sources of voting behavior (and especially economic voting), and so we
are trying to adopt the best practice on all these issues. I thought
there would be more consensus on this issue, but, surprisingly, the DK
literature seems to be more concerned with pointing out the trouble
DK's can cause than with proposing remedies.
Finally, since we will be estimating models with a fair number of
variables and since we are using multinomial models for multiparty
elections, including a full set of dummies for every categorical or
quasi-continuous variable (with a don't know category included) doesn't
seem like a feasible approach.
Thanks,
Randy
___________________________________________
Randy T. Stevenson
Albert Thomas Associate Professor of Political Science
Dept. of Political Science /MS 24
Rice University
6100 Main St.
Houston, Texas 77005
phone: 713 348-2104
fax: 713 348-5273
email: stevenso(a)ruf.rice.edu
-
amelia mailing list served by Harvard-MIT Data Center
List Address: amelia(a)latte.harvard.edu
Subscribe/Unsubscribe:
http://lists.hmdc.harvard.edu/?info=amelia