Hi Mari,
With 87 variables, you are asking Amelia to estimate almost 4,000
parameters (means, variances, and covariances). That it would take a
few hours is unsurprising, especially if there is high correlation in
the data. Any variable set to "nominal" is converted into a series of
dichotomous variables, which can add a lot of variables to your 87 and
increase the number of parameters drastically. This will lead to
slower imputations (inverting big matrices is slow) and longer chains
(takes longer to estimate more parameters). You probably want to think
about paring down the imputation model to a set of variables that (a)
make the MAR assumption plausible and/or (b) will be in your analysis
model.
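To make the arithmetic concrete, here is a rough sketch of where "almost 4,000" comes from and how quickly nominal variables inflate it. The function names and the example of ten five-category nominal variables are made up for illustration (they are not taken from Amelia or from your data), and the k - 1 indicator coding is an assumption about how the expansion works.

```python
def n_parameters(p):
    """Means plus distinct variances/covariances for p columns: p + p(p+1)/2."""
    return p + p * (p + 1) // 2


def expanded_columns(n_other, nominal_levels):
    """Column count after each nominal variable with k categories is
    replaced by k - 1 indicator columns (assumed coding)."""
    return n_other + sum(k - 1 for k in nominal_levels)


print(n_parameters(87))  # 3915

# Hypothetical: 10 of the 87 variables are nominal with 5 categories each.
p = expanded_columns(77, [5] * 10)
print(p, n_parameters(p))  # 117 7020
```

Under those assumptions, ten modest nominal variables would nearly double the number of parameters.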
> variables with skewed distributions. However, when I tried to set Amelia to
> also transform all nominal and ordinal variables, I received this error
> message:
> Error in La.svd(x, nu, nv) : error code 1 from Lapack routine 'dgesdd'
This is almost certainly due to near-singularities in the data.
Certain categories of your "nominal" variables likely have
correlations very close to 1 or -1. You want to make sure that you are
not including any duplicate variables in your data.
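One way to hunt for such pairs before imputing is to scan the pairwise correlations. The sketch below is only an illustration in plain Python: the helper names, the 0.99 cutoff, and the toy columns are all assumptions, and it expects each column as a complete list of numbers (so you would run it on the rows where both columns are observed).

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # constant column: correlation is undefined
    return sxy / sqrt(sxx * syy)


def near_duplicates(columns, threshold=0.99):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, xa), (name_b, xb) in combinations(columns.items(), 2):
        r = pearson(xa, xb)
        if abs(r) >= threshold:  # a nan correlation is never flagged here
            flagged.append((name_a, name_b, r))
    return flagged


# Toy data: "male" is just 1 - "female", so the two are perfectly correlated.
toy = {
    "female": [1, 0, 1, 1, 0, 0],
    "male": [0, 1, 0, 0, 1, 1],
    "age": [23, 35, 31, 40, 28, 52],
}
print(near_duplicates(toy))  # [('female', 'male', -1.0)]
```

Constant columns are skipped by this check but are worth dropping too, since they carry no information for the imputation model.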
> I decided to set only some of the ordinal and nominal variables to
> transform, and Amelia began its work. Well, it's been working for 84 hours
> and it's only on imputation 2! Imputation 1 had a chain length of 266 and
> imputation 2 is currently at 229.
To be honest, this seems slow for the chain lengths that you report.
What kind of machine are you running this on?
> I also set another large file, with 286 variables and 3,305 cases, to be
> imputed by Amelia on a different computer. I started the imputation process
> at roughly the same time (84 hours ago) and it appears that it has finally
> made it to imputation 3, with chain lengths of over 2,000 for the first two
> imputations.
The number of variables here is probably too large for the number of
cases. With 286 variables, you are looking at over 40k parameters,
which is far more than your roughly 3.3k cases.
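A quick way to see the imbalance is to put parameters and cases side by side for both files. This is just a back-of-the-envelope sketch: the function name is made up, and the counts ignore any extra columns created by nominal variables, so they are lower bounds.

```python
def n_parameters(p):
    """Means plus distinct variances/covariances for p columns."""
    return p + p * (p + 1) // 2


# (number of variables, number of cases) as reported for the two large files
files = {"87-variable file": (87, 16491), "286-variable file": (286, 3305)}

for label, (p, n) in files.items():
    k = n_parameters(p)
    print(f"{label}: {k} parameters, {n} cases, {k / n:.2f} parameters per case")
# 87-variable file: 3915 parameters, 16491 cases, 0.24 parameters per case
# 286-variable file: 41327 parameters, 3305 cases, 12.50 parameters per case
```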
> Also, I had downloaded the newer Amelia program this past Saturday but it
> would not run these larger files--it appeared to freeze when I hit "impute."
> So I went back to the earlier version of Amelia I had downloaded in January,
> and this version is imputing the large data files, but obviously very
> slowly.
It is likely not frozen, but simply hard at work on your imputations.
The new GUI does not show the output by default, but we might change
that given your experience.
Hope that helps.
Cheers,
matt.
On Wed, Feb 23, 2011 at 7:03 AM, Mari <maricunnington(a)gmail.com> wrote:
> Hello,
>
> I am using Amelia to impute missing values in two large data files and a
> third smaller file. The smaller file, which contains 19 variables and 866
> cases, imputed fairly easily and quickly both with and without
> transformations. The two larger files are another story completely. One of
> them, which has 87 variables and 16,491 cases, was imputed over the course
> of several hours when I set Amelia only to transform the continuous
> variables with skewed distributions. However, when I tried to set Amelia to
> also transform all nominal and ordinal variables, I received this error
> message:
> Error in La.svd(x, nu, nv) : error code 1 from Lapack routine 'dgesdd'
> I decided to set only some of the ordinal and nominal variables to
> transform, and Amelia began its work. Well, it's been working for 84 hours
> and it's only on imputation 2! Imputation 1 had a chain length of 266 and
> imputation 2 is currently at 229.
>
> I also set another large file, with 286 variables and 3,305 cases, to be
> imputed by Amelia on a different computer. I started the imputation process
> at roughly the same time (84 hours ago) and it appears that it has finally
> made it to imputation 3, with chain lengths of over 2,000 for the first two
> imputations.
>
> Also, I had downloaded the newer Amelia program this past Saturday but it
> would not run these larger files--it appeared to freeze when I hit "impute."
> So I went back to the earlier version of Amelia I had downloaded in January,
> and this version is imputing the large data files, but obviously very
> slowly.
>
> Could there be something wrong with my settings causing this extreme
> slowness? Or does it simply take this long to impute large data files?
>
> Thanks,
>
> Mari Cunnington
> Doctoral Student
> Teachers College, Columbia University
>
> --
> -Mari
>
-
Amelia mailing list served by Harvard-MIT Data Center
[Un]Subscribe/View Archive:
http://lists.gking.harvard.edu/?info=amelia
More info about Amelia:
http://gking.harvard.edu/amelia