Hello,
I was using Amelia 1.7.2 to impute a clinical dataset, and recently updated
to 1.7.3.
With 1.7.3, I started getting the error "The variable X is perfectly
collinear with another variable in the data."
To understand this error better, I removed variables from my model until
the error went away. For the sake of argument, let's call the remaining
non-collinear variables X, Y, and Z. Then I put back one of the removed
variables, called W, and repeated imputation runs gave different
collinearity errors, sometimes:
"The variable X is perfectly collinear with another variable in the data."
, sometimes
"The variable Z is perfectly collinear with another variable in the data."
, and sometimes
"The variable W is perfectly collinear with another variable in the data."
This implies that some pairs of variables are perfectly correlated, right?
However, looking at the scatter plots for every pair of variables, that
doesn't seem to be the case.
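One thing worth ruling out is that pairwise scatter plots can only reveal collinearity between two variables. A minimal sketch on made-up data (not the clinical dataset; the construction W = X + Y is purely hypothetical) shows a variable that is an exact linear combination of two others without being perfectly correlated with either:

```r
# Toy data, NOT the clinical dataset. W is built as an exact linear
# combination of X and Y (a hypothetical construction for illustration).
set.seed(1)
n <- 200
X <- rnorm(n)
Y <- rnorm(n)
Z <- rnorm(n)
W <- X + Y

d <- data.frame(X, Y, Z, W)

# Largest absolute pairwise correlation: well below 1, so no scatter
# plot of two variables looks like a straight line.
cors <- cor(d)
max_offdiag <- max(abs(cors[upper.tri(cors)]))

# Yet the four columns only span three dimensions.
rank_d <- qr(as.matrix(d))$rank

# Same style of check as the one quoted from amcheck.r: regress noise
# on all columns and look for coefficients that lm() reports as NA.
fit <- lm(rnorm(n) ~ ., data = d)
na_coefs <- names(coef(fit))[is.na(coef(fit))]

print(max_offdiag)
print(rank_d)
print(na_coefs)
```

If something like this holds in the real data, comparing `qr(as.matrix(d))$rank` with `ncol(d)` on the complete rows would show it where the scatter plots cannot.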
I found the up-to-date Amelia repo here:
https://github.com/cran/Amelia
and an inspection of the 1.7.3 commit reveals the additional check that
generates the error:
if (is.data.frame(x)) {
  lmcheck <- lm(I(rnorm(AMn))~ ., data = x[,idcheck, drop = FALSE])
} else {
  lmcheck <- lm(I(rnorm(AMn))~ ., data = as.data.frame(x[,idcheck, drop = FALSE]))
}
if (any(is.na(coef(lmcheck)))) {
  bad.var <- names(x[,idcheck])[which(is.na(coef(lmcheck))) - 1]
  bar.var <- paste(bad.var, collapse = ", ")
  stop(paste("The variable ",bad.var,"is perfectly collinear with another variable in the data.\n"))
}
(in R/amcheck.r)
One potential issue with my dataset, though, is that the missingness of
some groups of variables is nearly "orthogonal". By this I mean that almost
all the available values for some variables correspond to the missing
values in others. Could this be the reason for the error?
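A minimal sketch of how this pattern could interact with an `lm()`-based check, assuming the default `na.action` (`na.omit`), under which `lm()` drops every row containing a missing value. The data frame, the variable names A to D, and the missingness pattern are all made up for illustration:

```r
# Toy data with nearly "orthogonal" missingness, NOT the clinical dataset.
set.seed(2)
n <- 100
d <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n), D = rnorm(n))

# Hypothetical pattern: A and B are observed in rows 1..53,
# C and D in rows 51..100, so only rows 51..53 are fully observed.
d$A[54:n] <- NA
d$B[54:n] <- NA
d$C[1:50] <- NA
d$D[1:50] <- NA

n_complete <- sum(complete.cases(d))

# With na.omit, the regression only sees the complete rows. Here that is
# fewer rows than coefficients (intercept + 4 slopes), so some
# coefficients cannot be estimated and come back as NA, although no
# column is a linear combination of the others in the full data.
fit <- lm(rnorm(n) ~ ., data = d)
n_used <- nobs(fit)
n_na_coefs <- sum(is.na(coef(fit)))

print(n_complete)
print(n_used)
print(n_na_coefs)
```

Whether this is what happens inside Amelia on the real data is an assumption; comparing `sum(complete.cases(x))` with the number of variables in the model would be a quick way to check.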
Any insight will be greatly appreciated!
Andres