Hi,
I'm working with a very large dataset that has a relatively small amount of
missingness in a few of the variables (any one variable has at most roughly
10% missing values). Amelia won't run on the full dataset because R runs out
of memory, even after I pare the dataset down to only the variables used in
the analysis. I can get Amelia to run on 5% subsets of the data, but even 10%
subsets are too large.
So, is the best approach here to randomly split the data into twenty 5%
chunks and impute separately within each chunk? If so, how should I
recombine the imputed subsets to perform my analysis? A rough sketch of what
I mean is below.
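For concreteness, here is a minimal sketch of the split-and-impute step I
have in mind; "mydata", the chunk count, and the number of imputations are
just placeholders, and I'm not sure this is statistically the right thing
to do:

    library(Amelia)

    m <- 5            # imputations per chunk (placeholder)
    n.chunks <- 20
    ## assign each row at random to one of 20 roughly equal chunks
    chunk.id <- sample(rep(1:n.chunks, length.out = nrow(mydata)))

    ## run Amelia separately within each chunk
    chunk.fits <- lapply(1:n.chunks, function(k)
      amelia(mydata[chunk.id == k, ], m = m))

    ## recombine: for the j-th imputation, stack the j-th imputed
    ## dataset from every chunk back into one full data frame
    imputed <- lapply(1:m, function(j)
      do.call(rbind, lapply(chunk.fits, function(fit) fit$imputations[[j]])))

Is stacking the chunk-level imputations like this a valid way to get m
completed datasets for analysis, or does it bias the imputation model?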
Thanks in advance for your help,
Dennis