Thanks! -Don
On Tue, May 27, 2008 at 8:04 PM, Gary King <king(a)harvard.edu> wrote:
What you suggest would probably work, but I would first look at your data
to see if you can find the collinearities. Maybe you're including a
variable that is nearly equal to another, or have a set of dummy variables
without leaving out the constant term, etc. You could take one of the
imputed data sets and run a correlation matrix to get a feel for things. If
you have two highly collinear variables, then dropping one of the two will not
cost you much predictive power, so it's a reasonable procedure.
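For example, a minimal sketch of that correlation-matrix check (simulated data here for illustration; in practice you would run `cor()` on one completed data set, e.g. `imputed$imputations[[1]]` from the amelia() output):

```r
# Sketch: spot near-collinear variables with a correlation matrix,
# as one would on a single imputed data set.
# (Simulated data; substitute d <- imputed$imputations[[1]] in practice.)
set.seed(1)
x1 <- rnorm(100)
d <- data.frame(x1 = x1,
                x2 = x1 + rnorm(100, sd = 0.01),  # nearly equal to x1
                x3 = rnorm(100))

# Round for readability; pairwise deletion handles any NAs.
cm <- round(cor(d, use = "pairwise.complete.obs"), 2)
print(cm)
```

Entries near 1 (or -1) off the diagonal, such as the x1/x2 cell here, flag the pairs worth inspecting.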
Gary
http://gking.harvard.edu
Donald Braman wrote:
I'm attempting to use AmeliaII in the following way:
vars_to_impute <- gundata[, c("progun", "egalitarianism", "individualism",
                              "crfear", "victim", "female", "RACE",
                              "income", "URBANKID", "URBANNOW",
                              "RELIGION", "iss", "democrat",
                              "conservative")]
imputed <- amelia(data = vars_to_impute, p2s = 2,
                  noms = c("RACE", "RELIGION", "URBANKID", "URBANNOW"),
                  outname = "imputation",
                  ords = c("democrat", "conservative"))
The problem I run into is that about half of the imputation
attempts fail due to non-invertible covariance matrices. I'm curious whether
there is any way to deal with this aside from removing the most highly
collinear variables. For example, given that it produces imputed data about
1/5 of the time, would it be acceptable to use only the successful imputations?
E.g., can I just set m=100 and use as many imputations as I like from the
resulting set of ~25 successful imputations?
If I do need to remove collinear variables, do you know of a simple way
to check for collinearity among a given set of variables?
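One simple check, sketched below, is to list every variable pair whose absolute correlation exceeds a cutoff (simulated data for illustration; in practice run it on one completed data set, and note the 0.9 cutoff is an arbitrary choice, not a rule):

```r
# Sketch: flag variable pairs with |correlation| above a cutoff,
# to identify candidates for dropping before imputation.
# (Simulated data; in practice use one imputed data set.)
set.seed(2)
z <- rnorm(200)
d <- data.frame(a = z,
                b = z + rnorm(200, sd = 0.05),  # nearly equal to a
                c = rnorm(200))

cm <- cor(d, use = "pairwise.complete.obs")
cutoff <- 0.9  # arbitrary threshold for "highly collinear"

# Upper triangle only, so each pair is reported once.
idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)
pairs <- data.frame(var1 = rownames(cm)[idx[, 1]],
                    var2 = colnames(cm)[idx[, 2]],
                    r    = round(cm[idx], 3))
print(pairs)
```

Any pair that appears in the output is a candidate for dropping one of its two members, as suggested above.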