Hi Matt,

Now it works! I will conduct more tests using my current set of independent variables, and other sets as well.

Thanks so much for your work on Amelia; it is an amazing tool!

Andres

On Mon, Jan 12, 2015 at 1:58 PM, Matt Blackwell <mblackwell@gov.harvard.edu> wrote:

Apologies, that argument should be "incheck = FALSE".
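
For reference, a minimal sketch of where the argument goes. The toy data frame, its column names, and the wrapper name impute_unchecked are hypothetical and only there to make the call concrete; m and p2s are ordinary amelia() arguments (number of imputations and screen output level). Because incheck = FALSE skips the input checks, any real problem in the data may surface later as a numerical error instead.

```r
# Hypothetical toy data; names and sizes are illustrative only.
set.seed(42)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$x2 <- dat$x2 + 0.5 * dat$x1          # mild correlation, not collinearity
dat$x1[sample(n, 30)] <- NA              # 30 missing values in x1
dat$x3[sample(n, 30)] <- NA              # 30 missing values in x3

# Hypothetical wrapper: run amelia() with the input checks turned off.
impute_unchecked <- function(data, m = 5) {
  Amelia::amelia(data, m = m, incheck = FALSE, p2s = 0)
}

# Only attempt the imputation if the Amelia package is installed.
if (requireNamespace("Amelia", quietly = TRUE)) {
  out <- impute_unchecked(dat, m = 3)
  length(out$imputations)                # one completed dataset per imputation
}
```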

I think your approach might have a problem if within each group there
are variables that are almost completely missing. Either separating
the groups or imputing them together is going to have the same
problem: the imputations will be poorly estimated since there is
little data to estimate the relevant parameters. My intuition is that
you should probably estimate them all together in order to maximize
the amount of information that you have available to each group.
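
To illustrate why little overlap in the observed values can trip a regression-based collinearity check even when no pair of variables is perfectly correlated, here is a small base-R sketch. It assumes the check behaves like a plain lm() call with the default handling of missing values (rows with any NA are dropped); the data, the variable names, and the exact missingness pattern are invented for illustration and are not taken from the dataset discussed in this thread.

```r
set.seed(1)
n <- 40

# Hypothetical data: x and y form one group, z and w another.
dat <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n), w = rnorm(n))
dat[23:n, c("x", "y")] <- NA   # x, y observed only in rows 1 to 22
dat[1:19, c("z", "w")] <- NA   # z, w observed only in rows 20 to 40

# Only rows 20 to 22 are observed on every variable.
complete_rows <- sum(complete.cases(dat))

# Regress pure noise on all variables, as a stand-in for the check.
noise <- rnorm(n)
fit <- lm(noise ~ ., data = dat)

# With fewer complete rows than coefficients, lm() cannot estimate
# all of them and reports the rest as NA.
flagged <- names(coef(fit))[is.na(coef(fit))]

# Pairwise correlations on the available data stay below 1 in absolute value.
pw <- cor(dat, use = "pairwise.complete.obs")
max_abs_cor <- max(abs(pw[upper.tri(pw)]))

complete_rows
flagged
max_abs_cor
```

In this sketch the NA coefficients come from having too few jointly observed rows, not from any pair of perfectly correlated variables, which is consistent with scatter plots that look unremarkable.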

Cheers,
Matt

On Mon, Jan 12, 2015 at 1:40 PM, Andres Colubri <andres@broadinstitute.org> wrote:
> Hello Matt,
>
> Thanks a lot for your prompt reply, and for suggesting the amcheck=FALSE
> option. However, I just tested it and it doesn't seem to disable the error
> checking. Should I add it to the amelia(...) call, i.e.: out <- amelia(data,
> ..., amcheck=FALSE)?
>
> Anyway, I ran separate imputations on the groups that have little overlap
> in their observed values, and the collinearity error went away. Do you think
> that an acceptable solution in this case could be to generate a complete
> imputed dataset by merging the imputations obtained for each group
> independently?
>
> Andres
>
>
> On Mon, Jan 12, 2015 at 11:11 AM, Matt Blackwell
> <mblackwell@gov.harvard.edu> wrote:
>>
>> Hi Andres,
>>
>> As a quick workaround, you can always use "amcheck=FALSE" to skip the
>> error checking and force amelia to run on your data.
>>
>> In general, the collinearity problems for imputations are either due
>> to highly correlated variables or variables that have little or no
>> overlap in their observed values---what you call "orthogonal." In both
>> cases, the imputation model is not very well identified and this can
>> lead to numerical issues with the algorithm. Given the error messages
>> you are seeing, it seems like it might be the case that among the
>> fully observed data, there is perfect collinearity in those variables.
>> It might be overkill to error out of amelia in this case, but it's
>> supposed to prevent the type of non-invertible covariance matrix
>> errors that are common in those situations. I might change this to a
>> warning instead of an error in future versions of Amelia.
>>
>> Cheers,
>> Matt
>>
>> ~~~~~~~~~~~
>> Matthew Blackwell
>> Assistant Professor of Government
>> Harvard University
>> url: http://www.mattblackwell.org
>>
>> On Sun, Jan 11, 2015 at 9:59 PM, Andres Colubri
>> <andres@broadinstitute.org> wrote:
>> > Hello,
>> >
>> > I was using Amelia 1.7.2 to impute a clinical dataset, and recently
>> > updated to 1.7.3.
>> >
>> > With 1.7.3, I started getting the error "The variable X is perfectly
>> > collinear with another variable in the data."
>> >
>> > In order to understand this error better, I removed variables from my
>> > model until the error went away. For the sake of argument, let's call
>> > the remaining non-collinear variables X, Y, and Z. Then I put back one
>> > of the removed variables, called W, and several imputation runs gave
>> > different collinearity errors. Sometimes:
>> >
>> > "The variable X is perfectly collinear with another variable in the
>> > data."
>> >
>> > sometimes:
>> >
>> > "The variable Z is perfectly collinear with another variable in the
>> > data."
>> >
>> > and sometimes:
>> >
>> > "The variable W is perfectly collinear with another variable in the
>> > data."
>> >
>> > This implies that some of the pairs are perfectly correlated, right?
>> > However, looking at all the pairwise scatter plots, that doesn't seem
>> > to be the case.
>> >
>> > I found the up-to-date amelia repo here:
>> >
>> > https://github.com/cran/Amelia
>> >
>> > and an inspection of the 1.7.3 commit reveals the additional check that
>> > generates the error:
>> >
>> > if (is.data.frame(x)) {
>> >   lmcheck <- lm(I(rnorm(AMn))~ ., data = x[,idcheck, drop = FALSE])
>> > } else {
>> >   lmcheck <- lm(I(rnorm(AMn))~ ., data = as.data.frame(x[,idcheck, drop = FALSE]))
>> > }
>> >
>> > if (any(is.na(coef(lmcheck)))) {
>> >   bad.var <- names(x[,idcheck])[which(is.na(coef(lmcheck))) - 1]
>> >   bar.var <- paste(bad.var, collapse = ", ")
>> >   stop(paste("The variable ",bad.var,"is perfectly collinear with another variable in the data.\n"))
>> > }
>> >
>> > (in R/amcheck.r)
>> >
>> > One potential issue with my dataset is the following, though: the
>> > missingness of some groups of variables is nearly "orthogonal". By
>> > this I mean that almost all the available values for some variables
>> > correspond to the missing values in others. Could this be the reason
>> > for the error?
>> >
>> > Any insight will be greatly appreciated!
>> >
>> > Andres
>> >
>> > --
>> > Amelia mailing list served by HUIT
>> > [Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
>> > More info about Amelia: http://gking.harvard.edu/amelia
>> > Amelia mailing list
>> > Amelia@lists.gking.harvard.edu
>> >
>> > To unsubscribe from this list or get other information:
>> >
>> > https://lists.gking.harvard.edu/mailman/listinfo/amelia