Follow-up question to the answers by King and Blackwell.
First of all, thanks for the help.
The follow-up is whether we can/should use linear interpolation as a
check against the multiple imputations. More below.
Going back to my story (skip this paragraph if you remember): I have a
time-series, cross-sectional dataset. 10 years, 50 countries. 6
independent vars. Of the 6, 3 have 65% missingness. Yet these 3
independent vars have significant relationships
with the rest of the vars, and Amelia 2 was able to give me a
decent-looking imputation, i.e. the diagnostics look fine.
Of the 3 independent vars with missingness, 2 have this feature: out
of 10 possible years of data, each of these two variables has values in
only 2 years. Further, the years with values for these 2
variables are 8 years apart. Remember, though, that the other vars
in the dataset have data for all the in-between years, that
these other vars have significant relationships with the vars
with missingness, and that Amelia has been able to give me good results
so far. Finally, because these 2 vars with missingness really came from 2
cross-sectional datasets (at t and t+8), I didn't treat the data as
time-series in Amelia 2 (that is, I performed the MIs without a time
indicator).
A suggestion has been to linearly interpolate between t and t+8 for
these 2 vars with 65% missingness, as a check on the Amelia
results. Is this necessary in principle? I see it as an inexpensive
"just to make sure" test. Any thoughts, or particular cautions I should
take? I'm implementing the previous advice I got here, by the way.
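To make the proposed check concrete, here is a minimal sketch in Python of
what "interpolate between t and t+8, then compare" could look like for one
country and one variable. The function names and all numbers are
hypothetical, chosen only for illustration; none of them come from the
actual dataset or from Amelia. The sketch assumes the per-year means of the
imputed values have already been computed across the m imputed datasets.

```python
def interpolate_series(observed):
    """Linearly interpolate between the observed (year, value) pairs.

    `observed` maps year -> value for the years that have data.
    Returns a dict covering every year from the first to the last
    observed year (empty if fewer than two years are observed).
    """
    years = sorted(observed)
    filled = {}
    for left, right in zip(years, years[1:]):
        span = right - left
        for year in range(left, right + 1):
            weight = (year - left) / span
            filled[year] = (1 - weight) * observed[left] + weight * observed[right]
    return filled


def compare_to_imputations(observed, imputed_means):
    """Return {year: imputed mean minus interpolated value}."""
    line = interpolate_series(observed)
    return {year: imputed_means[year] - line[year]
            for year in sorted(imputed_means) if year in line}


# Hypothetical values for one country: data at year 1 and year 9 (8 years apart).
observed = {1: 10.0, 9: 26.0}
# Hypothetical per-year means of the imputed values across the m datasets.
imputed_means = {3: 15.0, 5: 17.0, 7: 23.5}

gaps = compare_to_imputations(observed, imputed_means)
for year, gap in gaps.items():
    print(year, round(gap, 2))
```

Note that the straight line uses only the two endpoints and none of the
other variables, so a gap between it and the imputations is something to
look into rather than evidence, by itself, that either one is wrong.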
My position, as mentioned before, is that the best way
to know whether MI can work is to try it and then diagnose the
results. That is, we should not dismiss a dataset for MI treatment
just because it appears to have a worrisome pattern of missingness - in
fact, these are some of the best opportunities to put MI to work. Is
this on the money?
Thanks.
Quoting Gary King <king(a)harvard.edu>:
increasing the number of imputations will help with simulation error if
you have lots of missingness.
but the big problem in this situation is model-dependence. you don't
want your answers to depend heavily on your choices of an imputation
model. but the more missingness you have, the more model dependent your
inferences will be. this is true whether you use Amelia II or any
other method. there isn't much you can do about this other than either
(a) go out and collect some of the missing observations, and/or (b)
remove imputations that require inferences outside of or far from the
convex hull (see the first 2 papers at
http://gking.harvard.edu/projects/cause.shtml)
Gary
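As a rough illustration of the point about simulation error and the number
of imputations, the sketch below evaluates Rubin's relative-efficiency
formula, 1 / (1 + gamma / m), in variance terms, where gamma is the
fraction of missing information and m is the number of imputations. Using
0.65 for gamma is an assumption made only for illustration: the fraction of
missing information is not the same thing as the raw missingness rate, and
it may well differ when the other variables predict the missing ones. The
function name is hypothetical.

```python
def relative_efficiency(gamma, m):
    """Rubin's relative efficiency of m imputations vs. infinitely many.

    gamma: fraction of missing information, between 0 and 1.
    m: number of imputed datasets (at least 1).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be in [0, 1]")
    if m < 1:
        raise ValueError("m must be at least 1")
    return 1.0 / (1.0 + gamma / m)


# Assumed gamma of 0.65, for illustration only.
for m in (5, 10, 20):
    print(m, round(relative_efficiency(0.65, m), 3))
```

This speaks only to simulation error; it says nothing about the
model-dependence issue raised in the same message.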
On Thu, 24 Apr 2008, Gustavo de las Casas wrote:
> Is there a cut-off for rate of missingness, past which we should
> employ other methods (i.e. Not Amelia 2)? Or does it depend on the
> diagnostic results?
>
> More specifically, if my imputations:
> a) don't give me error 34 (which says there is not enough data to
> do imputations properly); and
> b) my diagnostics seem kosher (distributions of imputed/actual
> observations overlap nicely, there is convergence, etc.),
>
> can I relax about the rate of missingness in the original data?
>
> Simply: I got a time-series, cross-sectional dataset. 10 years, 50
> countries. 6 independent vars. Of the 6, 3 have 65% missingness.
> Yet, these 3 independent vars with 65% missingness have significant
> relationships with the rest of the vars, and Amelia 2 was able to
> give me a decent-looking imputation. [I can offer the misschk
> results from Stata if necessary to answer this question.]
>
> Is there a cut-off in the fraction of missingness past which I must
> worry? Or would Amelia have already told me so?
>
> King also mentions that upping the imputations (to, say, 10) can
> help deal with higher rates of missingness. Something I should do
> just to make sure?
>
> You can also direct me to somewhere in the literature where you
> think this is specifically addressed. Thanks much.
> -
> Amelia mailing list served by Harvard-MIT Data Center
> [Un]Subscribe/View Archive:
> http://lists.gking.harvard.edu/?info=amelia