Hi -
I ran across this problem when helping someone attempting to do imputation
on a large dataset. To conserve memory, she would like to write each
imputed dataset to a file and then discard it before moving on to the next
imputation. I think that should be possible using the keep.data = F and
write.out = T arguments. However, setting these options appears to erase
each imputed dataset before it is written to a file, so the file contains
NA instead:
    if (keep.data) {
        impdata[[i]] <- impfill(x.orig = data, x.imp = ximp,
            noms = prepped$noms, ords = prepped$ords)
        names(impdata)[i] <- paste("m", i, sep = "")
    }
    else {
        impdata[[i]] <- NA
    }
    if (write.out) {
        write.csv(impdata[[i]], file = paste(prepped$outname,
            i, ".csv", sep = ""))
    }
Cheers,
Mike
---
Michael Kellermann
Ph.D Candidate, Department of Government
Harvard University
kellerm(a)fas.harvard.edu
http://people.fas.harvard.edu/~kellerm/
-
Amelia mailing list served by Harvard-MIT Data Center
[Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
Dear list members,
I am a complete novice with Amelia. I can't find what the missing data
value should be when I look in the input dataset section of the
documentation and there doesn't seem to be anywhere in the program to
specify what value the missing values take. For example, should the
missing observations be given an n/a, ., or -999 value?
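For reference, R itself represents missing values as NA. A small sketch,
assuming the data are read into R from a CSV file before being passed to
Amelia: the na.strings argument of read.csv() can map codes such as n/a, .,
or -999 to NA on import. The column names and values here are made up for
illustration.

```r
# Hypothetical file contents using three different missing-value codes.
csv_text <- "id,income,age\n1,52000,34\n2,-999,41\n3,.,n/a\n"

# Treat all of these codes as missing when reading the data.
raw <- read.csv(text = csv_text,
                na.strings = c("NA", "n/a", ".", "-999"))

# Count missing values per column to confirm the recoding.
colSums(is.na(raw))
```

Note that a numeric code such as -999 is only safe to list if it can never
occur as a legitimate value in any column.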
Thanks for your time,
Jamie
-
Hi everyone, for the Mac users among you, here's a useful reference.
(Many thanks Raffael.)
Gary
On Fri, 20 Jul 2007, Raffael.Himmelsbach(a)unil.ch wrote:
> Dear Gary,
> I wrote a little tutorial for people using Amelia under OSX
> with R. It isn't a full tutorial, more like some additional
> comments to the manual. If you have time, have a look at it.
>
> http://www.publicdimension.ch/?page_id=7
>
> Kind regards,
>
> Raffael
-
Hi all,
I'm using Amelia to try and impute missing values on a large-ish dataset
(~100,000 observations and 18 variables). Several of the variables are
ordinal, two are id variables, and two are nominal. After running Amelia,
in the five imputed datasets that it outputs, one or more of the ordinal
variables still has missing values. If I allow Amelia to treat the ordinal
variables as continuous, it imputes all of the missing values. I receive no
error messages and one warning because one of my nominal variables has more
than 10 categories. (That var represents US States.)
Does anyone know offhand what might be causing this? Here is my call to
Amelia:
    aout <- amelia(data = subdata,
                   p2s = 2,
                   m = num.imputed.datasets,
                   noms = c("state", "delyr"),
                   ords = c("meduc6", "gpc", "mager8", "mrace4", "tobacco",
                            "alcohol", "chyper", "phyper", "eclamp",
                            "sex", "congen", "singleton"),
                   idvars = c("recwt", "ourid"),
                   write.out = FALSE,
                   tolerance = 0.0005)
Here is the (presumably unrelated) warning message:
Warning message:
The number of catagories in one of the variables marked nominal has greater
than 10 categories. Check nominal specification.
in: amcheck(data = data, m = m, idvars = numopts$idvars, priors = priors,
And here is a summary of the imputed data:
> summary(aout[[1]])
     delyr         outcome             state           mager8
 Min.   :1997   Min.   :0.000000   Min.   : 1.00   Min.   :1.000
 1st Qu.:1999   1st Qu.:0.000000   1st Qu.:12.00   1st Qu.:3.000
 Median :2000   Median :0.000000   Median :27.00   Median :4.000
 Mean   :2000   Mean   :0.007362   Mean   :27.30   Mean   :4.033
 3rd Qu.:2001   3rd Qu.:0.000000   3rd Qu.:41.00   3rd Qu.:5.000
 Max.   :2002   Max.   :1.000000   Max.   :56.00   Max.   :9.000

     mrace4          meduc6           gpc             sex
 Min.   :1.000   Min.   :1.000   Min.   :17.00   Min.   :0.0000
 1st Qu.:1.000   1st Qu.:3.000   1st Qu.:38.00   1st Qu.:0.0000
 Median :1.000   Median :3.000   Median :39.00   Median :1.0000
 Mean   :1.318   Mean   :3.399   Mean   :38.54   Mean   :0.5117
 3rd Qu.:1.000   3rd Qu.:4.000   3rd Qu.:40.00   3rd Qu.:1.0000
 Max.   :4.000   Max.   :5.000   Max.   :47.00   Max.   :1.0000

     chyper               phyper           eclamp              tobacco
 Min.   :-8.674e-19   Min.   :0.0000   Min.   :0.000e+00   Min.   :0.000e+00
 1st Qu.: 0.000e+00   1st Qu.:0.0000   1st Qu.:0.000e+00   1st Qu.:0.000e+00
 Median : 0.000e+00   Median :0.0000   Median :0.000e+00   Median :0.000e+00
 Mean   : 7.651e-03   Mean   :0.0397   Mean   :2.981e-03   Mean   :1.228e-01
 3rd Qu.: 0.000e+00   3rd Qu.:0.0000   3rd Qu.:0.000e+00   3rd Qu.:0.000e+00
 Max.   : 1.000e+00   Max.   :1.0000   Max.   :1.000e+00   Max.   :1.000e+00
                                       NA's   :1.283e+03   NA's   :1.549e+04

    alcohol             recwt            lbw
 Min.   :0.000000   Min.   :1.000   Min.   :5.425
 1st Qu.:0.000000   1st Qu.:1.000   1st Qu.:8.008
 Median :0.000000   Median :1.000   Median :8.115
 Mean   :0.009487   Mean   :1.000   Mean   :8.083
 3rd Qu.:0.000000   3rd Qu.:1.000   3rd Qu.:8.212
 Max.   :1.000000   Max.   :1.178   Max.   :8.842

     congen           singleton          ourid
 Min.   :0.00000   Min.   :0.0000   Min.   :      80
 1st Qu.:0.00000   1st Qu.:1.0000   1st Qu.: 4966824
 Median :0.00000   Median :1.0000   Median : 9951050
 Mean   :0.01376   Mean   :0.9687   Mean   : 9989422
 3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:15000000
 Max.   :1.00000   Max.   :1.0000   Max.   :20000000
Notice that eclamp and tobacco still have missing values.
I suppose I can just continuously impute the ordinal variables and sort them
back into categories afterwards, but that doesn't really seem optimal.
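A rough sketch of that workaround, assuming the ordinal variables were
imputed as continuous and each has a known set of numeric category codes.
The helper snap_to_levels is hypothetical and is not necessarily how Amelia
handles ordinal variables internally; it simply maps each imputed value to
the nearest valid category.

```r
# Hypothetical helper: map each continuous value to the closest allowed code.
snap_to_levels <- function(x, levels) {
  levels <- as.numeric(levels)
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x)  # leave any remaining missing values untouched
  out[ok] <- vapply(x[ok],
                    function(v) levels[which.min(abs(levels - v))],
                    numeric(1))
  out
}

# Example: a 0/1 variable with slightly off-scale continuous imputations.
imputed <- c(-8.674e-19, 0.4, 0.6, 1.2, NA)
snapped <- snap_to_levels(imputed, levels = c(0, 1))
```

Ties between two equally close categories go to the lower code here, because
which.min() returns the first minimum.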
Thanks in advance,
Dennis
-
Dear Authors
I am preparing a document in LaTeX where I am citing Amelia and Amelia
II a couple times. I would love to use the fancy Amelia and Amelia II
logo you have in the documentation. How can I do this?
Thanks
L
-
My imputation gets very close to completion, and then suddenly gives me the
following error message:
{Amelia Error Code: NA
c(1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102,
1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102,
1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102, 1102,
1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104,
1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104,
1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104, 1104,
1106, 1106, 1106, 1106, 1106, 1106, 1106, 1106, 1106, 1106, 1106, 1106, }
This basic pattern is then repeated dozens of times, with different numbers
following the "c(". I'm trying to impute a variable with over 90% of its
values missing. Could that be part of the problem?
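A quick way to check how much is missing in each variable before running the
imputation, as a sketch in base R. The data frame dat and the 0.9 threshold
are made up for illustration; this only diagnoses missingness and does not
explain the error code.

```r
# Hypothetical toy data: y is observed in only 1 of 20 rows.
dat <- data.frame(x = 1:20, y = c(3.2, rep(NA, 19)))

# Fraction of missing values in each column.
frac_missing <- colMeans(is.na(dat))

# Names of the variables that are more than 90% missing.
flagged <- names(frac_missing)[frac_missing > 0.9]
```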
Eric McGhee
Assistant Professor
Department of Political Science
1284 University of Oregon
1415 Kincaid Street
Eugene, OR 97403-1284
(541) 346-4861
Hi,
This is a basic question to which I'm sure there is a "Uh Duh!" answer:
Where does AmeliaView save my imputed data sets? I have run the
imputation successfully, run through some diagnostics, and then... no
data sets saved anywhere that I can see or find on my computer.
thanks,
Christopher Kam
Assistant Professor
Department of Political Science
UBC
-
Hi,
I have been able to upload data into Amelia II and get the imputations
started, but at the end of the run I get an error message:
{Amelia Error Code: c(1992, 1987, 1992, .....
You have recieved an error. You can close this window and reset
various options to correct the error.
This error code appears to be repeated for every variable in the file.
I cannot find this error code in the documentation (the listed error
codes are numeric), and I have not seen a thread on the listserv that
describes this problem.
Any thoughts as to what might be the problem here would be much appreciated.
Best,
Christopher Kam
Assistant Professor
Department of Political Science
University of British Columbia
-