Hello,
I was using Amelia 1.7.2 to impute a clinical dataset, and recently updated
to 1.7.3.
With 1.7.3, I started getting the error "The variable X is perfectly collinear
with another variable in the data."
In order to understand this error better, I removed variables in my model
until the error went away. For the sake of argument, let's call the
remaining non-collinear variables X, Y, and Z. Then I put back one of the
removed variables, called W, and several imputation runs gave different
collinearity errors, sometimes:
"The variable X is perfectly collinear with another variable in the data."
, sometimes
"The variable Z is perfectly collinear with another variable in the data."
, and sometimes
"The variable W is perfectly collinear with another variable in the data."
This implies that some of the pairs are perfectly correlated, right?
However, looking at all the pairwise scatter plots, that doesn't seem
to be the case.
I found the up-to-date amelia repo here:
https://github.com/cran/Amelia
and an inspection of the 1.7.3 commit reveals the additional check that
generates the error:
    if (is.data.frame(x)) {
      lmcheck <- lm(I(rnorm(AMn))~ ., data = x[,idcheck, drop = FALSE])
    } else {
      lmcheck <- lm(I(rnorm(AMn))~ ., data = as.data.frame(x[,idcheck, drop = FALSE]))
    }
    if (any(is.na(coef(lmcheck)))) {
      bad.var <- names(x[,idcheck])[which(is.na(coef(lmcheck))) - 1]
      bar.var <- paste(bad.var, collapse = ", ")
      stop(paste("The variable ",bad.var,"is perfectly collinear with another variable in the data.\n"))
    }
(in R/amcheck.r)
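A check of this kind flags any exact linear dependence among the columns, not only perfectly correlated pairs. The following is a minimal sketch on simulated data (the variables x, y, z, w here are made up for illustration and are not the clinical variables above): w is the sum of three independent columns, so no pairwise scatter plot looks suspicious, yet lm() returns an NA coefficient. In base R's lm(), the column reported as NA is typically the one that comes later in the column order.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rnorm(n)
z <- rnorm(n)
w <- x + y + z                      # exact linear combination of three columns
d <- data.frame(x, y, z, w)

# Largest absolute pairwise correlation: well below 1, so scatter plots look fine.
cors <- cor(d)
max_pairwise <- max(abs(cors[upper.tri(cors)]))

# Same style of check as the quoted snippet: regress pure noise on every column.
fit <- lm(I(rnorm(n)) ~ ., data = d)
aliased <- names(coef(fit))[is.na(coef(fit))]

# The design matrix (intercept + 4 columns) has rank 4, not 5.
design_rank <- qr(model.matrix(~ ., data = d))$rank

print(max_pairwise)
print(aliased)
print(design_rank)
```

If something like this is going on, comparing the rank of the design matrix with its number of columns may be more informative than inspecting scatter plots.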
One potential issue with my dataset, though, is the following: the
missingness of some groups of variables is nearly "orthogonal". By this I
mean that almost all the available values for some variables correspond to
the missing values in others. Could this be the reason for the error?
Any insight will be greatly appreciated!
Andres
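One way to probe the missingness hypothesis is sketched below. It assumes (this is not confirmed by the snippet above) that the check sees the data with its missing values still in place. By default lm() drops every row that has any NA, so when two groups of variables are almost never observed together, very few complete rows remain and some coefficients cannot be estimated, even though no variable is a linear combination of the others. The data frame and its column names are hypothetical.

```r
set.seed(2)
n <- 100
d <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), e = rnorm(n))

# Nearly "orthogonal" missingness: c and e are missing where a and b are
# observed, and vice versa. Only rows 51 to 53 are fully observed.
d[1:50, c("c", "e")] <- NA
d[54:100, c("a", "b")] <- NA

n_complete <- sum(complete.cases(d))

# lm() listwise-deletes incomplete rows (default na.action), leaving
# 3 observations for 5 coefficients, so some coefficients come back NA.
fit <- lm(I(rnorm(n)) ~ ., data = d)
n_na_coef <- sum(is.na(coef(fit)))

print(n_complete)
print(n_na_coef)
```

Counting the complete cases in the real dataset with sum(complete.cases(...)) and comparing that to the number of variables would show whether this mechanism is plausible there.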
Hi,
In my study, I want to use the imputed data to estimate a model by the generalized method of moments. I would like to know whether combining the imputed datasets is the better option, since you get more information, or whether we can simply keep one of the imputed datasets, as with the Zelig function, which, in my opinion, allows only a simple linear regression.
Thank you so much for your kind help.
Mouna Kessentini
Ph.D. Student
University of Tunisia & University Paris 8
On Saturday, 3 January 2015 at 18:00, "amelia-request(a)lists.gking.harvard.edu" <amelia-request(a)lists.gking.harvard.edu> wrote:
Message: 1
Date: Fri, 2 Jan 2015 12:04:43 -0600
From: Heidy Colón-Lugo <heidycolon(a)gmail.com>
To: amelia(a)lists.gking.harvard.edu
Subject: [amelia] question about combining datasets
I am having a hard time finding an example on how to combine the
imputed datasets from Amelia in Zelig. As a result, I'm using the 10th
imputed dataset as the one that I use to run my regression models (see
below). Is this ok? Or is combining the datasets a better option since you
get more information? And if so, what is the code to combine the datasets?
outa <- amelia(psub, m=10, noms=noms, ords=ords, idvars=idvars)
write.amelia(obj=outa, file.stem="outdata", extension=NULL, format="csv")
outdata10 <- read.table("S:/.../Imp outdata/outdata10", header=T, sep=",", na.strings="NA", dec=".", strip.white=T)
I appreciate all of the help I can get.
Respectfully,
Heidy Colon-Lugo
PhD Candidate
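For reference, the usual way to combine analyses across imputed datasets is Rubin's rules: fit the same model on each completed dataset, average the estimates, and build the standard error from both the within-imputation and the between-imputation variance. The sketch below uses base R only. The list `imputations` is a simulated stand-in (independent draws, not real imputations) and the model y ~ x is hypothetical; with an Amelia fit such as `outa` above, the completed datasets are stored in `outa$imputations`, and Amelia's mi.meld() performs this pooling step.

```r
set.seed(3)
m <- 10    # number of imputed datasets
n <- 150   # rows per dataset

# Hypothetical stand-in for the m completed datasets.
imputations <- lapply(seq_len(m), function(i) {
  x <- rnorm(n)
  data.frame(x = x, y = 1 + 2 * x + rnorm(n))
})

# Fit the same model on every completed dataset.
fits <- lapply(imputations, function(d) lm(y ~ x, data = d))
q  <- sapply(fits, coef)                             # coefficients, one column per dataset
se <- sapply(fits, function(f) sqrt(diag(vcov(f))))  # standard errors, same layout

# Rubin's rules.
q_bar   <- rowMeans(q)                     # pooled point estimates
within  <- rowMeans(se^2)                  # average within-imputation variance
between <- apply(q, 1, var)                # between-imputation variance
total   <- within + (1 + 1 / m) * between  # total variance

pooled <- cbind(estimate = q_bar, std.error = sqrt(total))
print(pooled)
```

Using a single imputed dataset ignores the between-imputation term, so its standard errors tend to understate the uncertainty due to the missing data.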
Good day,
I have a question about missmap() that might be fairly simple, but it is
not so for me. When I use missmap() the legend and the plot overlap. How
can I move the legend upwards so that the overlapping doesn’t happen?
I have used the locator() function to locate the point where I want the
legend to be, but nothing happens when I write this: missmap(psub,
main="Missingness Map of Raw Data", xlim=c(87.66, 18),
ylim=c(3289.589,3327)).
I’m using R Studio if that makes any difference.
I appreciate any suggestions.
Thank you,
*Heidy Colón-Lugo, M.S.*
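One possible workaround, sketched here with base graphics only, is to draw a simple missingness map by hand so that the legend position is fully under your control. This is not missmap() itself and does not reproduce its row ordering or labels; the function name plot_missing and the example data are hypothetical. The idea is to enlarge the top margin and let legend() draw outside the plot region with a negative inset.

```r
# Hypothetical small dataset with missing values in two columns.
set.seed(4)
d <- as.data.frame(matrix(rnorm(60), nrow = 15))
d[sample(15, 5), 2] <- NA
d[sample(15, 7), 4] <- NA

plot_missing <- function(d, main = "Missingness Map") {
  miss <- is.na(as.matrix(d))
  old <- par(mar = c(5, 4, 6, 2))   # extra room at the top for title and legend
  on.exit(par(old))
  # image() puts the rows of z along the x axis, so transpose to get
  # variables on the x axis and observations on the y axis.
  image(x = seq_len(ncol(miss)), y = seq_len(nrow(miss)), z = t(miss) * 1,
        col = c("wheat", "darkred"), zlim = c(0, 1),
        axes = FALSE, xlab = "", ylab = "")
  axis(1, at = seq_len(ncol(miss)), labels = colnames(miss), las = 2)
  title(main, line = 4)
  # Negative inset plus xpd = NA places the legend in the top margin,
  # above the plot region, so it cannot overlap the map.
  legend("top", inset = c(0, -0.12), legend = c("Observed", "Missing"),
         fill = c("wheat", "darkred"), horiz = TRUE, bty = "n", xpd = NA)
  invisible(miss)
}
```

Calling plot_missing(d) draws the map; adjusting the second element of `inset` moves the legend further up or down.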