Hi Nick,
Yeah, a non-invertible matrix usually means that one variable in the
data is an exact linear combination of others. Sometimes this is as
simple as having the same variable twice, but more likely it is having
a full set of dummy variables without leaving one category out. Other
times, it is including a linear scale together with all the components
of the scale. You might want to poke around your data to see if there
is anything like that.
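If it helps, here is a rough sketch of one way to look for this in R. It uses a made-up data frame (the names toy, a, b, z and scale_total are hypothetical, not from your data) and compares the rank of the data matrix with its number of columns. It assumes the columns are numeric and complete, so with missing values you would want to try it on the complete rows first.

```r
# Toy data: scale_total is an exact linear combination of a and b
set.seed(1)
toy <- data.frame(a = rnorm(20), b = rnorm(20), z = rnorm(20))
toy$scale_total <- toy$a + toy$b

# Add an intercept column, as a regression-type model would
X <- cbind(1, as.matrix(toy))
colnames(X)[1] <- "(Intercept)"

qrX <- qr(X)
qrX$rank   # numerical rank of X
ncol(X)    # number of columns; a larger value than the rank means dependence

# Columns that qr() pivots past the rank are linear combinations
# of the columns that come before them
redundant <- colnames(X)[qrX$pivot[-seq_len(qrX$rank)]]
redundant
```

Any name that shows up in redundant is a candidate to drop or recode before imputing.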
Cheers,
matt.
On Mon, Apr 26, 2010 at 12:05 PM, Nick Menzies <nmenzies(a)fas.harvard.edu> wrote:
Hi Matt - I think you are right, some of the variables I was marking as
nominal had a large number of categories. When I reclassified these
variables I could get Amelia to run. My problem now seems to be that the
resulting matrices are non-invertible, though I assume that is a problem
with my starting dataset rather than anything to do with Amelia (let me know
if it might be otherwise). I ran and pasted the results of your code below,
and am using R version 2.10.1 and Amelia 1.2-16.
tmp <- sapply(dataHIV, var, na.rm = TRUE)
sum(tmp == 0)
[1] 0
sum(is.na(tmp))
[1] 0
Thanks - Nick
On Mon, Apr 26, 2010 at 11:52 AM, Matt Blackwell <blackwel(a)fas.harvard.edu> wrote:
> Hi Nick,
> We need just a bit more information to figure out what is going on. Which
> version of R and Amelia are you using?
> Amelia seems to be having a problem checking whether your data has
> any variables that do not vary. A crash at this step is usually not a
> good sign. I would make the following observations. First, Gary is
> right that you may have too many variables. Second, it seems that you
> are marking a lot of these variables as nominal. Note that this adds a
> dummy variable for each category of the nominal variable. Thus, if you
> had 10 variables, all with 4 categories, that would put an additional
> 20 covariates into the imputation model. With the number of nominal
> variables you have, that could get problematic.
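> To get a feel for how large the expansion would be, a sketch along these
> lines might help. The data frame toy and its columns are made up for
> illustration, and the count assumes one category per variable is left out
> as the reference (which is how 10 variables with 4 categories become 30
> dummies, 20 more than the 10 original columns).

```r
# Hypothetical data frame with two nominal variables and one numeric one
toy <- data.frame(
  region   = c("N", "S", "E", "W", "N", "S"),
  religion = c("a", "b", "c", "a", "b", "c"),
  age      = c(21, 35, 42, 28, 55, 30)
)
noms <- c("region", "religion")

# Number of observed categories in each nominal variable (NAs ignored)
n_cat <- sapply(toy[noms], function(x) length(unique(x[!is.na(x)])))
n_cat

# Assumption: one reference category is dropped per variable, so each
# variable contributes (categories - 1) dummy columns
n_dummies <- sum(n_cat - 1)
n_dummies
```

> Sorting n_cat on the real data would show which nominal variables
> contribute the most columns.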
> It might also be helpful if you could post the output of the following R
> code:
tmp <- sapply(dataHIV, var, na.rm = TRUE)
sum(tmp == 0)
sum(is.na(tmp))
> This will give us a sense of whether any variables in the data are
> invariant or have an undefined variance.
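> If either count comes back nonzero, something like the sketch below would
> name the columns involved rather than just counting them. The data frame
> toy is made up for illustration. Unlike the code above, it skips var() for
> non-numeric columns and marks them as NA instead, which is a choice made
> for this example.

```r
# Hypothetical data frame: `const` never varies, `code` is stored as text
toy <- data.frame(
  x     = c(1.2, 3.4, NA, 5.6),
  const = c(7, 7, 7, 7),
  code  = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Variance of each numeric column; non-numeric columns are marked NA
# instead of being passed to var()
tmp <- sapply(toy, function(x) {
  if (is.numeric(x)) var(x, na.rm = TRUE) else NA_real_
})

zero_var  <- names(tmp)[!is.na(tmp) & tmp == 0]  # columns that do not vary
undefined <- names(tmp)[is.na(tmp)]              # variance not defined
zero_var
undefined
```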
> Cheers,
> matt.
> On Sun, Apr 25, 2010 at 9:31 AM, Nick Menzies <nmenzies(a)fas.harvard.edu> wrote:
> > Hi Matt, I am having trouble implementing Amelia in a dataset of 50,000
> > observations and ~200 variables.
> > After running the code:
> > a.out1 <- amelia(dataHIV, m = 5,
> >   noms = c("v000", "v001", "v003", "v004", "v006", "v014", "v016",
> >     "v101", "v102", "v116", "v113", "v119", "v120", "v121", "v122",
> >     "v123", "v124", "v125", "v127", "v128", "v129", "religion",
> >     "currentmaritalshort", "morethanoneunion", "region", "cluster",
> >     "v103", "ethny", "radioteleall", "transportall", "country"),
> >   ords = c("v013", "v106", "v105"),
> >   idvars = c("caseid", "idnumber", "hivid", "hcaseid", "acaseid"))
> > I get the output:
> > Error in if (sum(non.vary == 0)) { :
> > argument is not interpretable as logical
> > In addition: Warning message:
> > In FUN(X[[4L]], ...) : NAs introduced by coercion
> > I implemented the suggestion made in the post below -- I have downloaded
> > and am using the most recent Amelia II version -- but am still unable to
> > run the program, even when I trim the dataset down to a small number of
> > variables and delete the invariant variables. I also get warnings that
> > some nominal variables have many categories, but assume this is unrelated.
> > http://lists.gking.harvard.edu/lists/amelia_at_lists_gking_harvard_edu/2010…
> > Do you have any suggestions?
> > Thanks for your help - Nick
> > --
>
> > Nick Menzies
> > nick.menzies(a)gmail.com
> > 404 217 1076
-
Amelia mailing list served by Harvard-MIT Data Center