Amelia November 2011

amelia@lists.gking.harvard.edu

8 participants
7 discussions

by Viridiana Rios

Hi Amelia community,

I am imputing 185,112 observations. I have 2,571 different cases (i.e. "countries" in the language of the Amelia documentation), 120 time periods and 6 variables. I am allowing for interactions, imposing a bound, and using a prior for each of the cases.

It seems that my memory (Windows; 3583Mb) cannot handle it. I am getting the error:

"Reached total allocation of 3583Mb: see help(memory.size)"

Any ideas on how to get around this?

Thank you for your help,

Viridiana Rios
PhD Candidate in Government
Harvard University
http://www.gov.harvard.edu/people/viridiana-rios-contreras
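A rough back-of-envelope calculation may help show why this model could exceed the available memory. The sketch below assumes that "interactions" means unit-specific time polynomials (Amelia's intercs = TRUE) and that the polynomial order is 2; neither detail is stated in the post, so both are assumptions, and the column count is only approximate.

```r
# Back-of-envelope size of the expanded imputation matrix.
# ASSUMPTION: "interactions" means unit-specific time polynomials
# (intercs = TRUE) of order `polytime`; the post does not give the order.
n_obs    <- 185112   # observations, from the post
n_units  <- 2571     # cross-sectional units, from the post
n_vars   <- 6        # variables, from the post
polytime <- 2        # hypothetical polynomial order

# Roughly: each unit gets its own intercept plus `polytime` time terms.
n_extra_cols <- n_units * (polytime + 1)
n_cols <- n_vars + n_extra_cols

bytes_per_double <- 8
data_mb <- n_obs * n_cols * bytes_per_double / 2^20   # one dense copy of the data
cov_mb  <- n_cols^2 * bytes_per_double / 2^20         # one covariance matrix

cat(sprintf("columns: %d\n", n_cols))
cat(sprintf("dense data matrix: %.0f Mb\n", data_mb))
cat(sprintf("covariance matrix: %.0f Mb\n", cov_mb))
```

Under these assumptions a single dense copy of the expanded data is already several times larger than 3583Mb, so reducing the number of unit-specific terms is likely to matter more than tuning memory limits.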

12 years, 5 months

Minor bug in ameliabind and patch

by Jeffrey Arnold

I came across a small bug in ameliabind. The returned amelia object does not include an overvalues element. The following code (a modified version of the example in moPrep) produces the error:

    > library(Amelia)
    Loading required package: foreign
    ##
    ## Amelia II: Multiple Imputation
    ## (Version 1.5-4, built: 2011-08-21)
    ## Copyright (C) 2005-2011 James Honaker, Gary King and Matthew Blackwell
    ## Refer to http://gking.harvard.edu/amelia/ for more information
    ##
    > data(africa)
    > m.out <- moPrep(africa, trade ~ trade, error.proportion = 0.1)
    >
    > ## Without ameliabind
    > a.out1 <- amelia(m.out, m=2, ts = "year", cs = "country")
    -- Imputation 1 --
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    -- Imputation 2 --
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
    > a.out1$overvalues
      [1]  29.69  31.31  35.22  40.11  37.76  41.11  37.71  40.21  43.09  43.64
     [11]  42.62  40.11  42.72  44.63  41.83  41.56  37.92  35.19  38.37  38.75
     [21]  26.81  24.35  25.29  27.28  30.38  34.89  31.93  36.88  32.10  31.18
     [31]  37.12  33.79  35.06  31.67  35.48  38.55  38.42  46.48  48.24  48.23
     [41]  50.01  51.78  49.32  54.25  61.55  59.59  61.93  63.53  65.02  46.01
     [51]  37.32  31.99  38.48  37.71  34.63  80.10  75.00 112.70  99.64 106.93
     [61] 110.85 104.79  96.04 120.14 134.11 123.77 109.71 107.25 112.79  93.50
     [71]  80.36  81.05  83.81  97.91  90.79  65.20  67.23  91.44  78.37  80.62
     [81]  94.29  75.34  73.88  72.26  83.56  80.70  78.12  85.34  70.63  58.45
     [91]  55.49  51.87  58.76  58.91  56.38  86.04  82.45  91.26  92.85  81.72
    [101]  81.27  70.48  81.86  86.80  69.78  64.16  64.41  68.30  73.64  86.02
    [111]  75.16  59.47  60.63  72.47  71.86
    >
    > ## With ameliabind
    > a.out2 <- do.call(ameliabind,
    +                   replicate(2, amelia(m.out, m=1, ts = "year", cs = "country"),
    +                             simplify=FALSE))
    -- Imputation 1 --
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
    -- Imputation 1 --
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
     21
    > print(a.out2$overvalues)
    NULL
    > print(try(plot(a.out2)))
    Error in x[[jj]][iseq] <- vjj : replacement has length zero
    [1] "Error in x[[jj]][iseq] <- vjj : replacement has length zero\n"
    attr(,"class")
    [1] "try-error"

This patch should fix the bug. Additionally, this patch adds a feature to ameliabind. Rather than throwing an error if there is only one amelia object as an argument, it simply returns that object. This way something like do.call(ameliabind, replicate(n, amelia(...))) can work for n = 1, e.g. for debugging and testing, without altering the code.

    diff -rupNb -x .git -x '*.pdf' -x '*.Rout' Amelia/man/ameliabind.Rd /home/jeff/workspace/AmeliaTest2/Amelia/man/ameliabind.Rd
    --- Amelia/man/ameliabind.Rd 2011-08-21 09:00:21.000000000 -0400
    +++ /home/jeff/workspace/AmeliaTest2/Amelia/man/ameliabind.Rd 2011-11-11 15:59:48.498846781 -0500
    @@ -10,7 +10,7 @@
     ameliabind(...)
     }
     \arguments{
    -  \item{...}{two or more objects of class \code{amelia} with the same
    +  \item{...}{one or more objects of class \code{amelia} with the same
         arguments and created from the same data.}
     }
     \details{
    diff -rupNb -x .git -x '*.pdf' -x '*.Rout' Amelia/R/emb.r /home/jeff/workspace/AmeliaTest2/Amelia/R/emb.r
    --- Amelia/R/emb.r 2011-08-21 09:00:21.000000000 -0400
    +++ /home/jeff/workspace/AmeliaTest2/Amelia/R/emb.r 2011-11-09 12:11:39.763371625 -0500
    @@ -865,7 +865,7 @@ am.inv <- function(a,tol=.Machine$double
     ##
     ## ameliabind - combines multiple Amelia outputs
     ##
    -## INPUTS: >1 amelia output
    +## INPUTS: >=1 amelia output
     ##
     ## OUTPUTS: a merged amelia output with all inputted lists
     ##
    @@ -873,12 +873,11 @@ am.inv <- function(a,tol=.Machine$double
     ameliabind <- function(...) {
       args <- list(...)
     
    -  if (length(args) < 2)
    -    stop("We need at least two amelia outputs to bind")
    -
    -  if (any(lapply(args, class)!="amelia"))
    +  if (any(!sapply(args, is, "amelia")))
         stop("All arguments must be amelia output.")
     
    +  if (length(args) > 1) {
    +
       ## test that data is the same. we'll just compare the missMatrices.
       ## this will allow datasets with the same size and missingness
       ## matrix to be combined unintentionally, but this seems unlikely.
    @@ -905,6 +904,9 @@ ameliabind <- function(...) {
                   mu = matrix(NA, nrow = k, ncol = newm),
                   covMatrices = array(NA, dim = c(k,k,newm)),
                   code = integer(0),
    +              ## overvalues for all args assumed to be the same
    +              ## so only use overvalues from first object
    +              overvalues = args[[1]]$overvalues,
                   message = character(0),
                   iterHist = list(),
                   arguments = list())
    @@ -935,6 +937,9 @@ ameliabind <- function(...) {
       }
       class(out) <- "amelia"
       class(out$imputations) <- c("mi","list")
    +  } else {
    +    out <- args[[1]]
    +  }
       return(out)
     }

Jeff

---
Jeffrey Arnold
Department of Political Science
University of Rochester
http://jrnold.me
jeffrey.arnold(a)gmail.com
jeffrey.arnold(a)rochester.edu
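The two behaviours described in this post (carrying a shared element over from the first input, and passing a single input straight through) can be illustrated without Amelia. The following is only a toy sketch: toy_bind, make_toy and the "toy" class are hypothetical names invented for illustration and are not part of the package.

```r
# Toy stand-in for the binding pattern; `toy_bind` and the "toy" class
# are hypothetical and are not part of Amelia.
toy_bind <- function(...) {
  args <- list(...)
  if (!all(vapply(args, inherits, logical(1), what = "toy")))
    stop("All arguments must be toy objects.")
  # A single argument is returned as the object itself, not as a list.
  if (length(args) == 1) return(args[[1]])
  out <- list(
    draws  = unlist(lapply(args, `[[`, "draws")),
    # shared element assumed identical across inputs; take it from the first
    shared = args[[1]]$shared
  )
  class(out) <- "toy"
  out
}

make_toy <- function() {
  structure(list(draws = runif(1), shared = "same"), class = "toy")
}

# The same calling code works for n = 1 and n = 2.
one <- do.call(toy_bind, replicate(1, make_toy(), simplify = FALSE))
two <- do.call(toy_bind, replicate(2, make_toy(), simplify = FALSE))
```

Note that the single-input branch returns args[[1]] rather than args, since list(...) wraps even a lone argument in a list of length one.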

12 years, 5 months

multiple lags or just one?

by Scott Otterson

Hello,

When Amelia is given a lags or leads argument, does it include in its model only one lag or lead, advanced or delayed by one sample? Or does it instead consider multiple lags or leads and pick the best subset with some algorithm?

Thanks,

Scott

Dr. Scott Otterson
Abteilung Energiemeteorologie und Netzintegration
Fraunhofer IWES
Tel: +49 (0)561 7294-252
E-Mail: scott.otterson(a)fraunhofer.iwes.de
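For readers unsure what "one lag or lead, delayed or advanced by one sample" would look like as data, here is a minimal base-R sketch on a hypothetical toy panel. It only illustrates the concept in the question; it is not a statement about what Amelia does internally, and all names are made up.

```r
# Hypothetical toy panel; column names are illustrative only.
panel <- data.frame(
  unit = rep(c("a", "b"), each = 4),
  time = rep(1:4, times = 2),
  y    = c(10, 11, 12, 13, 20, 21, 22, 23)
)

# Shift a vector by one position, padding with NA.
lag1  <- function(x) c(NA, head(x, -1))
lead1 <- function(x) c(tail(x, -1), NA)

# Apply within each unit so values never cross unit boundaries.
# Rows must already be sorted by time within unit, as they are here.
panel$y_lag  <- ave(panel$y, panel$unit, FUN = lag1)
panel$y_lead <- ave(panel$y, panel$unit, FUN = lead1)
print(panel)
```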

12 years, 5 months

Using Amelia and the Central Limit Theorem

by Fernando Mayer

Hello Amelia list,

Nearly one year ago I posted a question on this list (it can be seen here [1]) about a dataset for which I was making imputations with Amelia. The objective of such imputations is only to complete missing data and stop there. No further analysis should be made.

Now I have a similar dataset which I'm trying to impute, with the same objective. In summary, I'm using Amelia to impute with, say, m = 15, and using the mean and variance of these 15 imputations as my final result (more specifically, the mean is the variable of interest).

However, I've noticed that when I make two or more Amelia runs with m = 15, I can get very different final results (the means), most likely due to the high variability of the data itself. I understand this is normal: since Amelia uses bootstrapped data to generate each imputation, the results are expected to differ. However, the high discrepancy I'm getting between different Amelia runs is a problem, since I'm using only the mean of m imputations as the final result. So, if I use one Amelia run, my result is totally dependent on what happened in that single run.

What I'm trying to do now is to get more "consistent" results, in the sense that my final result does not depend on only one Amelia run. To achieve this, I thought of using the Central Limit Theorem (CLT) to get my final mean, as follows:

1) Run the same Amelia model 1000 times, with m = 15.
2) Within each of the 1000 runs, extract the mean of the m imputations, so I have 1000 means (assuming that the Amelia runs are independent of each other, so I treat the means as iid random variables).
3) Calculate the mean of the distribution of these 1000 means, which should be normally distributed by the CLT (with E(\bar{X}) = \mu and Var(\bar{X}) = \sigma^2/n).

I've made a few sets of 1000 Amelia runs following this pseudo-code, and the final result is very similar among them (i.e. they have almost identical normal distributions and very similar final means).

To me this sounds more reasonable than using a single Amelia run to extract a mean, but I would like to hear your opinion about this, and in particular whether this is a valid methodology for what I'm trying to do with Amelia.

Thank you very much in advance.

[1] http://lists.gking.harvard.edu/lists/amelia/2010_09/msg00012.html

---
Fernando Mayer
URL: http://sites.google.com/site/fernandomayer
e-mail: fernandomayer [@] gmail.com
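The three steps above can be mimicked with a small simulation. This is only an idealized sketch: each "imputed value" is replaced by an independent normal draw, and mu and sigma are hypothetical numbers, so it shows the arithmetic of averaging run means rather than anything about Amelia itself.

```r
set.seed(1)
m      <- 15     # imputations per run, as in the post
n_runs <- 1000   # number of runs, as in the post
mu     <- 50     # hypothetical true value of the imputed cell
sigma  <- 20     # hypothetical spread of a single imputed draw

# Stand-in for "run Amelia and average the m imputed values":
# here each imputed value is just an independent normal draw.
run_mean <- function() mean(rnorm(m, mean = mu, sd = sigma))

run_means <- replicate(n_runs, run_mean())

sd_single_run <- sd(run_means)                  # spread of one run's mean, about sigma / sqrt(m)
grand_mean    <- mean(run_means)                # the "final" mean from step 3
se_grand_mean <- sd_single_run / sqrt(n_runs)   # much smaller than sd_single_run

cat(sprintf("sd of a single run mean: %.2f\n", sd_single_run))
cat(sprintf("grand mean: %.2f (se %.2f)\n", grand_mean, se_grand_mean))
```

In this idealized setting, averaging 1000 run means of 15 draws each gives the same number as averaging all 15,000 draws at once, so the stability comes from the larger total number of imputations rather than from the CLT step as such.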

12 years, 5 months

Multiple Imputation

by Sera Choma

Hello,

I was wondering if it is possible to "pause" multiple imputation, save my workspace, and resume it later? The imputation is taking quite long and I am wary of leaving my laptop on for too long. I once imputed data for about 24 hours.

Also, is it standard for the first three imputations to take less time than the 4th and 5th?

Thank you for your help.

Sera
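One possible workaround, sketched here under assumptions, is to run the imputations one at a time and save each result to disk as soon as it finishes, so that a later session only has to run the ones still missing. In the sketch, run_one is a hypothetical stand-in for a single expensive call such as amelia(data, m = 1, ...), and the checkpoint directory is made up. This does not pause in the middle of one imputation; it only resumes between imputations.

```r
# `run_one` is a hypothetical stand-in for one expensive imputation,
# e.g. a call such as amelia(data, m = 1, ...).
run_one <- function(i) list(id = i, value = sqrt(i))

checkpoint_dir <- file.path(tempdir(), "imputation_chunks")  # hypothetical location
dir.create(checkpoint_dir, showWarnings = FALSE)

run_with_checkpoints <- function(m, dir) {
  for (i in seq_len(m)) {
    path <- file.path(dir, sprintf("imp_%02d.rds", i))
    if (file.exists(path)) next      # finished in an earlier session
    saveRDS(run_one(i), path)        # write each result as soon as it is done
  }
  # Read every chunk back in order.
  lapply(seq_len(m), function(i) {
    readRDS(file.path(dir, sprintf("imp_%02d.rds", i)))
  })
}

first_pass <- run_with_checkpoints(3, checkpoint_dir)  # e.g. session stopped after 3
all_runs   <- run_with_checkpoints(5, checkpoint_dir)  # resumes, runs only 4 and 5
```

With real amelia output, the separately saved objects could then be combined along the lines of do.call(ameliabind, all_runs), as discussed elsewhere in this month's archive.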

12 years, 5 months

Re: [amelia] Re: Error message "formal argument "empri" matched by multiple actual arguments"

by Matt Blackwell

Hi Deryl,

Yeah, this is just an artifact of the AmeliaView output. In the function call, you'll see that you have "empri = 0" and "empri = 45". I assume that you added the "45" argument after switching to R, so you can just remove the "empri = 0," and re-run. Everything should work then.

Cheers,
matt.
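The message itself comes from R's argument matching rather than from Amelia, and can be reproduced with any function. In this small sketch toy_impute is a hypothetical function invented only to trigger the same error.

```r
# `toy_impute` is a hypothetical function used only to reproduce the message.
toy_impute <- function(x, m = 5, empri = 0) {
  sprintf("m = %d, empri = %g", m, empri)
}

# Supplying the same named argument twice fails before the function body runs.
msg <- tryCatch(
  toy_impute(x = 1, empri = 0, m = 5, empri = 45),
  error = function(e) conditionMessage(e)
)
print(msg)

# Keeping a single value for the argument works.
ok <- toy_impute(x = 1, m = 5, empri = 45)
print(ok)
```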

12 years, 5 months

Error message "formal argument "empri" matched by multiple actual arguments"

by Deryl Hatch

I am a newcomer to the Amelia mailing list and to Amelia. I have a very large dataset (obs = 4696) with 232 variables (45 of which I am marking as idvars to remove them from the imputation model). I am receiving an error message related to my use of the "empri" argument:

Error in amelia.default(x = SENSE.3, m = 5, p2s = 2, idvars = c("X", "srvagain", :
  formal argument "empri" matched by multiple actual arguments

Because I am a novice user of R, I am not sure where this may be coming from. The code I used was copied from the AmeliaView() option, then modified, so perhaps there are syntax artifacts left over from copying that auto-produced code that I need to change?

Thanks for any indications.

-Deryl H.
PhD student, Univ. of Texas at Austin

12 years, 5 months
