Re: Another question - Amelia

26 Sep 2002

I agree with the main point below.  Refusals to answer that are actual
answers (e.g., "I don't have an opinion about the national helium
reserve") should not be imputed.  But things like income can be imputed,
since everyone has an income (positive or negative) even if they won't
tell us.  Some of this is discussed in our APSR article.

Optimally, we would know which was which before we do imputation, and
we'd only impute where the value actually exists, though it is missing.
In this case, it's pretty straightforward.  We just use Amelia to deal with
all the real cases and somehow model the others separately.  (One way to
model the others would be to use a pair of conditional models.  E.g.,
first a logit to model whether or not someone ventures an opinion on the
natl helium reserve, and then, conditional on having an opinion, you could
use something like an ordinal probit to analyze the opinions.)
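
A minimal sketch of the log-likelihood for such a pair of conditional
models, in plain Python.  This is not Amelia code.  The function names,
the single covariate x, the toy data, and the parameter values are all
made up for illustration, and nothing is estimated here; the functions
only evaluate the likelihood that an optimizer would maximize.

```python
import math

def norm_cdf(z):
    """Standard normal CDF; math.erf handles +/- infinity."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_loglik(gave_opinion, x, beta):
    """Stage 1: logit for whether a respondent ventures any opinion."""
    total = 0.0
    for d, xi in zip(gave_opinion, x):
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * xi)))
        total += math.log(p if d else 1.0 - p)
    return total

def ordered_probit_loglik(opinion, x, gamma, cuts):
    """Stage 2: ordered probit over respondents who gave an opinion.

    opinion[i] is None for "no opinion", else a category 0..len(cuts).
    """
    bounds = [-math.inf] + list(cuts) + [math.inf]
    total = 0.0
    for y, xi in zip(opinion, x):
        if y is None:  # no opinion exists, so there is nothing to model
            continue
        mu = gamma * xi
        total += math.log(norm_cdf(bounds[y + 1] - mu) - norm_cdf(bounds[y] - mu))
    return total

# Toy data (made up): one covariate, 3-category opinion, None = no opinion.
x = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]
opinion = [None, 0, 1, None, 2, 2]
gave_opinion = [y is not None for y in opinion]

# Arbitrary parameter values, chosen only to show the calls.
stage1 = logit_loglik(gave_opinion, x, beta=(0.5, 0.8))
stage2 = ordered_probit_loglik(opinion, x, gamma=1.0, cuts=(-0.5, 0.5))
total = stage1 + stage2
```

Because the two stages share no parameters in this sketch, the total
log-likelihood is just the sum of the two parts, and each part can be
maximized separately.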

There is another more difficult situation where the category "refuse to
answer" is a mix of DK's and variables that don't exist.  This would be
harder to model and I don't know of a paper that models data like this.  
To do it seems hard but straightforward (some kind of mixture model), but
whether it would be worth the extra trouble programming it is another
question.

Incidentally, if you're doing cross-cultural research, you might have a 
look at our Anchoring Vignettes idea at http://gking.harvard.edu/vign/

Gary

On Wed, 25 Sep 2002, Randy Stevenson wrote:

...

 Gary,

 Thanks for your note about panel attrition.  Sorry to bother you again,
 but preparing all of this survey data has forced me to think about some
 common problems in light of the imputation technology and I thought this
 one would interest you.

 The question is what to do with don't know responses in surveys when the
 DK is potentially meaningful (i.e., it doesn't simply hide a real
 response).  Below is an idea on how imputation may help with this
 problem.  I haven't found any discussion of this possibility in any
 literature and was wondering if you had thought of this or thought it
 was an idea worth pursuing. At this point it is basically my intuition
 and I would obviously have to pursue a more rigorous exploration if it's
 a worthwhile idea.

 Consider, for example, DK responses to a question asking people to place
 themselves on a left/right scale.  There are a variety of reasons why
 people might answer DK in this situation.  I would group these into two
 main categories.  The first are situations in which we expect that there
 is really an underlying answer but that it is not revealed. This
 category includes more than just situations in which the person wants to
 hide the answer.  It also includes cases (1) in which the person just
 does not want to invest the cognitive energy to come up with a response,
 or (2) in which they don't understand the question or don't know the
 meaning of the words.  Finally, people who answer don't know instead of
 putting themselves in the middle of the scale also belong here. The key
 idea here is that if we could probe the person's attitude further
 (explaining the meaning of the question or assuring them they could
 reveal it to us), we could obtain a meaningful answer.

 A second set of reasons for DK responses arises when there is no
 underlying answer that is being denied us.  This can happen because the
 respondent is really uncertain where they fit on the scale with which
 they are presented.  Maybe they haven't thought about the issue at all.
 Maybe, in a case like left/right self-placement, they understand what
 the scale means but can't reconcile their conflicting policy views in a
 way that gives them a placement, and are unwilling to say they are
 middle of the road because they don't think they are (for example, they
 could be policy extremists on some left and right policies, and this
 doesn't jibe with what they think a centrist on a left/right scale is).

 In the first set of cases it seems to me that it is perfectly
 appropriate to impute the values of the DK category, but in the second
 it is not.  In the second case, we would want to include DK as a valid
 response and model how this response contributes to the dependent
 variable in the explanatory model.  I would suggest that we can adopt an
 assumption that since these are non-attitudes, we would expect that they
 cannot have an impact on any kind of behavior that we are modelling and
 that they should be poorly predicted from other variables in the
 imputation model.

 If we adopt these assumptions, then it seems to me there is a reasonable
 way to proceed which is as follows:

 (1) Impute all the DK's.
 (2) In the analysis model, interact the imputed variable with a dummy
 marking the DK's.
 (3) If category 2 dominates the DK's, then the interaction should
 indicate no relationship between the variable and the DV for the imputed
 cases.
 (4) If category 1 dominates, then the interaction will be insignificant.

 The real trick for this to be true is that the imputation model for
 people with non-attitudes cannot systematically impute values that then
 predict the dependent variable in the same way that people with real
 attitudes (whether missing or not) do.
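
The four steps and the caveat above can be illustrated with a small
simulation.  This is only a sketch under strong assumptions, none of
which come from the message itself: the data are simulated, the DV is
continuous rather than a vote choice, every DK is a non-attitude
(category 2), and a single regression-based imputation from one
background covariate z stands in for Amelia's multiple imputations.
In an OLS of the DV on the variable, the DK dummy, and their product,
the interaction coefficient equals the slope among the DK's minus the
slope among everyone else, so the sketch computes the two slopes directly.

```python
import random

def slope(xs, ys):
    """OLS slope of ys on xs (model includes an intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

rng = random.Random(0)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]        # background covariate
x = [zi + rng.gauss(0, 1) for zi in z]         # left/right placement
dk = [rng.random() < 0.3 for _ in range(n)]    # True = respondent said DK
# Assumption: DK's are non-attitudes, so their DV ignores placement.
y = [(rng.gauss(0, 1) if d else xi + rng.gauss(0, 1)) for xi, d in zip(x, dk)]

# Step 1: impute placement for the DK's from z, fitted on observed cases.
obs_z = [zi for zi, d in zip(z, dk) if not d]
obs_x = [xi for xi, d in zip(x, dk) if not d]
b = slope(obs_z, obs_x)
a = sum(obs_x) / len(obs_x) - b * sum(obs_z) / len(obs_z)
resid = [xi - (a + b * zi) for xi, zi in zip(obs_x, obs_z)]
sd = (sum(r * r for r in resid) / (len(resid) - 2)) ** 0.5
x_used = [(a + b * zi + rng.gauss(0, sd)) if d else xi
          for xi, zi, d in zip(x, z, dk)]

# Steps 2-4: slope for real answers, slope for imputed DK's, and their
# difference, which is the coefficient on (variable x DK dummy).
slope_real = slope([v for v, d in zip(x_used, dk) if not d],
                   [v for v, d in zip(y, dk) if not d])
slope_dk = slope([v for v, d in zip(x_used, dk) if d],
                 [v for v, d in zip(y, dk) if d])
interaction = slope_dk - slope_real
```

In this setup the slope for the imputed cases should be close to zero and
the interaction close to minus the main effect, which is the pattern
described in step (3).  Note that the imputation here conditions only on
z.  An imputation model that also conditions on the DV could build a
relationship into the imputed values, which is exactly the concern in the
paragraph above.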

 Anyway, if you have any thoughts or have done any work on this, I would
 love to know.  Again, Ray and I are currently adopting a methodology to
 reanalyze a large number of election studies in a lot of countries to
 provide a more definitive picture of cross-national differences in the
 sources of voting behavior (and especially economic voting), and so we
 are trying to adopt the best practice on all these issues.  I thought
 there would be more consensus on this issue, but, surprisingly, the DK
 literature seems to be more concerned with pointing out the trouble DK's
 can cause than with proposing remedies.

 Finally, since we will be estimating models with a fair number of
 variables and since we are using multinomial models for multiparty
 elections, including a full set of dummies for every categorical or
 quasi-continuous variable (with a don't know category included) doesn't
 seem like a feasible approach.

 Thanks,

 Randy
 ___________________________________________
 Randy T. Stevenson
 Albert Thomas Associate Professor of Political Science
 Dept. of Political Science /MS 24
 Rice University
 6100 Main St.
 Houston, Texas 77005

 phone: 713 348-2104
 fax: 713 348-5273
 email: stevenso(a)ruf.rice.edu 

-
amelia mailing list served by Harvard-MIT Data Center
List Address: amelia(a)latte.harvard.edu
Subscribe/Unsubscribe: http://lists.hmdc.harvard.edu/?info=amelia