Hi Andres,
As a quick workaround, you can always pass "incheck = FALSE" to amelia()
to skip the input checks (the amcheck step) and force amelia to run on
your data.
In general, the collinearity problems for imputations are either due
to highly correlated variables or variables that have little or no
overlap in their observed values---what you call "orthogonal." In both
cases, the imputation model is not very well identified and this can
lead to numerical issues with the algorithm. Given the error messages
you are seeing, it seems likely that, among the fully observed data,
there is perfect collinearity in those variables.
It might be overkill to error out of amelia in this case, but the check
is supposed to prevent the type of non-invertible covariance matrix
errors that are common in those situations. I might change this to a
warning instead of an error in future versions of Amelia.
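A minimal sketch of how little overlap can produce this, using made-up data rather than anything from your dataset. It assumes that "fully observed data" means the rows with no missing values, which is what lm() keeps under its default na.action; the names d, a, b, c, and flagged are purely illustrative.

```r
set.seed(1)
n <- 100

# Three independent variables: no correlation to speak of between them.
d <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))

# Nearly "orthogonal" missingness: a and b are observed together
# only in rows 50 to 52.
d$a[53:n] <- NA
d$b[1:49] <- NA

n_complete <- sum(complete.cases(d))  # rows that survive listwise deletion

# Same style of check as in amcheck: regress pure noise on all variables.
# lm() silently drops incomplete rows, so only n_complete rows are used.
y <- rnorm(n)
fit <- lm(y ~ ., data = d)

# With fewer complete rows than coefficients (intercept + 3 slopes),
# the design matrix is rank deficient and lm() reports NA coefficients.
flagged <- names(coef(fit))[is.na(coef(fit))]
print(n_complete)
print(flagged)
```

Here no variable is a linear function of the others in the full data; the NA coefficient appears only because too few rows are jointly observed.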
Cheers,
Matt
~~~~~~~~~~~
Matthew Blackwell
Assistant Professor of Government
Harvard University
On Sun, Jan 11, 2015 at 9:59 PM, Andres Colubri
<andres(a)broadinstitute.org> wrote:
Hello,
I was using Amelia 1.7.2 to impute a clinical dataset, and recently updated
to 1.7.3.
With 1.7.3, I started getting the error "The variable X is perfectly
collinear with another variable in the data."
In order to understand this error better, I removed variables in my model
until the error went away. For the sake of the argument, let's call the
remaining non-collinear variables X, Y, and Z. Then I put back one of the
removed variables, called W, and several imputation runs gave different
collinearity errors, sometimes:
"The variable X is perfectly collinear with another variable in the data."
, sometimes
"The variable Z is perfectly collinear with another variable in the data."
, and sometimes
"The variable W is perfectly collinear with another variable in the data."
This implies that some of the pairs are perfectly correlated, right?
However, looking at all the pairwise scatter plots, that doesn't seem
to be the case.
I found the up-to-date amelia repo here:
https://github.com/cran/Amelia
and an inspection of the 1.7.3 commit reveals the additional check that
generates the error:
    if (is.data.frame(x)) {
      lmcheck <- lm(I(rnorm(AMn)) ~ ., data = x[, idcheck, drop = FALSE])
    } else {
      lmcheck <- lm(I(rnorm(AMn)) ~ .,
                    data = as.data.frame(x[, idcheck, drop = FALSE]))
    }
    if (any(is.na(coef(lmcheck)))) {
      bad.var <- names(x[, idcheck])[which(is.na(coef(lmcheck))) - 1]
      bar.var <- paste(bad.var, collapse = ", ")
      stop(paste("The variable ", bad.var,
                 "is perfectly collinear with another variable in the data.\n"))
    }
(in R/amcheck.r)
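To illustrate what a check of this style reacts to, here is a minimal sketch on made-up data (not my clinical dataset). It assumes the simplest case with no missing values, and the construction W = X + Z is only a hypothetical example of an exact linear dependence. Unlike the package code, it reads the flagged names directly from names(coef(...)) instead of offsetting the index by one for the intercept.

```r
set.seed(42)
n <- 200

x <- data.frame(X = rnorm(n), Y = rnorm(n), Z = rnorm(n))
x$W <- x$X + x$Z  # exact linear combination of two other columns

# Largest absolute pairwise correlation: well below 1, so no single
# scatter plot shows a perfect line.
cors <- cor(x)
max_pair_cor <- max(abs(cors[upper.tri(cors)]))

# Regress random noise on all variables, as the check does.
lmcheck <- lm(I(rnorm(n)) ~ ., data = x)

# lm() cannot estimate a coefficient for a column that is a linear
# combination of earlier columns, so it returns NA for it.
bad <- names(coef(lmcheck))[is.na(coef(lmcheck))]
print(max_pair_cor)
print(bad)
```

So an NA coefficient signals that one variable is an exact linear combination of the others, which does not require any single pair to be perfectly correlated.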
One potential issue with my dataset, though, is that the missingness of
some groups of variables is nearly "orthogonal": almost all the available
values for some variables correspond to the missing values in others.
Could this be the reason for the error?
Any insight will be greatly appreciated!
Andres
--
Amelia mailing list served by HUIT
[Un]Subscribe/View Archive:
http://lists.gking.harvard.edu/?info=amelia
More info about Amelia:
http://gking.harvard.edu/amelia