Hello,
Some colleagues and I are trying to use Amelia to impute missing data
in a time-series-cross-section (TSCS) dataset (what some might call a
panel or pooled dataset). We have been able to run the imputation
successfully, but we are getting somewhat odd results within
countries.
Our imputations produce means, standard deviations, etc. that are
pretty consistent with our original data. That is,
they seem to be pretty good "on average." But for this particular
project, we are most interested in using these data for descriptive
purposes, and especially to show trends over time within countries
and groups of countries.
The problem we are encountering is that we are getting some very odd
results within each country's trend. For example, a country may have
values that look like this in one of the imputed datasets:
30
[20.16830738]
[47.24110787]
[-15.5455354]
[-35.74856172]
45.9
(Brackets indicate imputed values. These particular data are for
telephone mainlines in Barbados from 1960 to 1965.)
Later years show a clear increasing trend with less missing data, but
these imputations are all over the place with respect to the time
trend of the country.
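
One way to make "all over the place" concrete is to flag imputed values that are impossible or that fall outside the observed values bracketing them. The following is only a minimal sketch, not part of Amelia: the function name `flag_implausible` is hypothetical, the years are assigned on the assumption that the six values above run from 1960 to 1965 in order, and the "outside observed range" check is a crude heuristic rather than a formal test.

```python
# Values copied from the message above; True marks an imputed value.
# Assumption: the six values correspond to 1960..1965 in order.
series = [
    (1960, 30.0, False),
    (1961, 20.16830738, True),
    (1962, 47.24110787, True),
    (1963, -15.5455354, True),
    (1964, -35.74856172, True),
    (1965, 45.9, False),
]


def flag_implausible(series, lower_bound=0.0):
    """Return (year, value, reasons) for imputed values that look suspect.

    An imputed value is flagged if it is below `lower_bound` (telephone
    mainlines cannot be negative) or if it lies outside the range spanned
    by the observed values in the same window (a rough heuristic only).
    """
    observed = [value for _, value, imputed in series if not imputed]
    lo, hi = min(observed), max(observed)
    flagged = []
    for year, value, imputed in series:
        if not imputed:
            continue
        reasons = []
        if value < lower_bound:
            reasons.append("below lower bound")
        if not lo <= value <= hi:
            reasons.append("outside observed range")
        if reasons:
            flagged.append((year, value, reasons))
    return flagged


for year, value, reasons in flag_implausible(series):
    print(year, round(value, 2), ", ".join(reasons))
```

On the Barbados values this flags all four imputations, two of them as negative.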
Our dataset has 8 variables (plus the country and year variables) and
7742 observations.
Here are the options we used in the imputation:
_AMcs=2 (our country ID)
_AMts=1 (year variable)
_AMtstep=1
_AMlagvs=10 (we arbitrarily chose one of our variables to lag)
We have tried setting _AMusets=1 and _AMusets=0 and have gotten
similar results either way.
Does anyone have any suggestions or thoughts on this? Are we using
Amelia incorrectly? Is it perhaps the wrong tool for this particular
job?
Thanks,
Strom
--
Strom C. Thacker
Associate Professor of International Relations
Director, Latin American Studies Program
Boston University
152 Bay State Road
Boston, MA 02215
Tel: 617.353.7160
Fax: 617.353.9290
sthacker(a)bu.edu
http://www.bu.edu/sthacker/
Hello:
Have standards evolved regarding how to report results derived from imputed
data generated by Amelia or other EM / MI methods?
At a minimum, it seems that the analyst should report (or at least have
readily available in appendix form):
* the proportion of data missing on each variable in the original
dataset, and the extent to which each variable is jointly observed with
other variables
* the variables used in the imputation models, and documentation that
findings are robust to different imputation models
* the number of datasets imputed and the estimated corresponding gain
in efficiency (per Rubin 1987)
* any ridge priors employed, and any changes to the convergence
criteria used (e.g., changing the tolerance for convergence, or limiting the
number of iterations of EM)
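
For the efficiency item, the usual figure from Rubin (1987) is the relative efficiency of using m imputations instead of infinitely many, (1 + gamma/m)^-1, where gamma is the fraction of missing information. A minimal sketch follows; the function name `relative_efficiency` and the example values are illustrative only, and gamma has to be estimated separately for each quantity of interest.

```python
def relative_efficiency(gamma, m):
    """Relative efficiency of m imputations versus infinitely many.

    gamma: fraction of missing information, between 0 and 1.
    m: number of imputed datasets (at least 1).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie between 0 and 1")
    if m < 1:
        raise ValueError("m must be at least 1")
    return 1.0 / (1.0 + gamma / m)


# Illustrative values only: 50% missing information, m = 3, 5, 10.
for m in (3, 5, 10):
    print(m, round(relative_efficiency(0.5, m), 4))
```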
What do Amelia's authors think of these criteria? Are there others we
should be aware of? Or, alternatively, is this overkill?
Patrick Egan
----------------------------------------------------------------------
Patrick J. Egan
Ph.D. Candidate
Department of Political Science
University of California, Berkeley
http://socrates.berkeley.edu/~pjegan
Hi all,
I do not have a licence for Stata to combine the imputations from Amelia. I have thought about using an aggregate function in SPSS to generate means and standard deviations for cases across all five files (my gut instinct is that this is way too simple). I have also thought about using NORM to combine the parameters from the five files (again, this seems too simple). Any advice on how to combine the Amelia data files without Stata would be much appreciated.
Kind regards,
Paul
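
For reference, the standard way to combine multiply imputed results is Rubin's rules: run the same analysis separately on each imputed file, then pool the resulting estimates and standard errors, rather than averaging the cases across files. The sketch below assumes you already have one point estimate and one standard error per file for a single quantity; the function name `combine_rubin` is hypothetical and the numbers are made up for illustration.

```python
from statistics import mean, variance


def combine_rubin(estimates, std_errors):
    """Pool one quantity across m imputed datasets using Rubin's rules.

    estimates: point estimate from each imputed dataset.
    std_errors: the matching standard errors.
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or len(std_errors) != m:
        raise ValueError("need matching estimates and standard errors, m >= 2")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total ** 0.5


# Made-up results from five imputed files, for illustration only.
est, se = combine_rubin([1.0, 1.2, 0.8, 1.1, 0.9], [0.2, 0.2, 0.2, 0.2, 0.2])
print(round(est, 4), round(se, 4))
```

The pooled standard error is larger than any single-file standard error here because it adds the between-imputation variance, which is what averaging the data across files would hide.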