Hello,

I was using Amelia 1.7.2 to impute a clinical dataset, and recently updated to 1.7.3.

With 1.7.3, I started getting the error "The variable X is perfectly collinear with another variable in the data."

To understand this error better, I removed variables from my model until the error went away. For the sake of argument, let's call the remaining non-collinear variables X, Y, and Z. Then I put back one of the removed variables, call it W, and repeated imputation runs gave different collinearity errors, sometimes:

"The variable X is perfectly collinear with another variable in the data."

, sometimes

"The variable Z is perfectly collinear with another variable in the data."

, and sometimes

"The variable W is perfectly collinear with another variable in the data."

This implies that some of the pairs are perfectly correlated, right? However, looking at scatter plots of all the pairwise combinations, that doesn't seem to be the case.
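One thing I realized while checking this: pairwise scatter plots can only rule out collinearity between two variables, not a linear dependence that involves three or more at once. Here is a minimal sketch with made-up data (the variables x, y, z, w below are simulated, not my clinical variables) where no pair is close to perfectly correlated, yet a regression of random noise on all of them, similar in spirit to the check in Amelia, still returns an NA coefficient:

```r
set.seed(1)
n <- 200

# Three independent simulated variables (hypothetical, for illustration only)
x <- rnorm(n)
y <- rnorm(n)
z <- rnorm(n)

# w is an exact linear combination of the other three
w <- x + y + z

dat <- data.frame(x, y, z, w)

# Largest absolute pairwise correlation: well below 1
cors <- cor(dat)
max_offdiag <- max(abs(cors[upper.tri(cors)]))
print(round(cors, 2))

# Regress random noise on all variables; lm() reports NA for an aliased column
fit <- lm(rnorm(n) ~ ., data = dat)
print(coef(fit))
aliased <- names(which(is.na(coef(fit))))
print(aliased)
```

So the absence of a perfect pairwise correlation does not by itself rule out perfect collinearity, although I have not found such a combination in my data.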

I found the up-to-date amelia repo here:

https://github.com/cran/Amelia

and an inspection of the 1.7.3 commit reveals the additional check that generates the error:

    if (is.data.frame(x)) {
      lmcheck <- lm(I(rnorm(AMn))~ ., data = x[,idcheck, drop = FALSE])
    } else {
      lmcheck <- lm(I(rnorm(AMn))~ ., data = as.data.frame(x[,idcheck, drop = FALSE]))
    }

    if (any(is.na(coef(lmcheck)))) {
      bad.var <- names(x[,idcheck])[which(is.na(coef(lmcheck))) - 1]
      bar.var <- paste(bad.var, collapse = ", ")
      stop(paste("The variable ",bad.var,"is perfectly collinear with another variable in the data.\n"))
    }

(in R/amcheck.r)

One potential issue with my dataset, though, is the following: the missingness of some groups of variables is nearly "orthogonal". By this I mean that almost all the rows where some variables are observed are rows where other variables are missing. Could this be the reason for the error?
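To make the question concrete, here is a minimal sketch with made-up data (the variables a, b, c, d and the missingness pattern are hypothetical, not my clinical dataset). It assumes R's default na.action (na.omit), under which lm() drops every row that has a missing value. With nearly "orthogonal" missingness, very few complete rows remain, and a regression of random noise on all the variables then has more coefficients than rows, so some coefficients come back as NA even though the variables are independent:

```r
set.seed(2)
n <- 100

# Four independent simulated variables
dat <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))

# Hypothetical "nearly orthogonal" missingness:
# a and b are observed only in rows 1..53, c and d only in rows 51..100
dat$a[54:n] <- NA
dat$b[54:n] <- NA
dat$c[1:50] <- NA
dat$d[1:50] <- NA

# Only rows 51..53 are fully observed
n_complete <- sum(complete.cases(dat))
print(n_complete)

# lm() silently drops incomplete rows (default na.action = na.omit),
# leaving 3 rows for 5 coefficients (intercept + 4 slopes)
fit <- lm(rnorm(n) ~ ., data = dat)
print(coef(fit))

# Number of coefficients that could not be estimated
n_na <- sum(is.na(coef(fit)))
print(n_na)
```

If the check in amcheck.r behaves like this sketch on my data, the error would reflect the small number of complete cases rather than true collinearity, but I have not confirmed that this is what actually happens inside Amelia.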

Any insight will be greatly appreciated!

Andres