Amelia October 2020

amelia@lists.gking.harvard.edu

3 participants
6 discussions

Applying the imputation model to new data

by Paul Dunmore

My use case is to build a model on a training data set and then demonstrate its performance on a separate test data set; both data sets may contain missing data. Amelia seems to assume that missing values will be imputed on the same data set used to estimate the imputation model itself.

Is there an interface, or a reasonably discrete section of the code in the package, which allows an imputation model developed from a training set to be used to impute missing values in a test set?

--
Paul Dunmore
100 Marine Parade
Paraparaumu 5032
New Zealand

3 years, 6 months

Re: [amelia] Principal Components of Amelia Datasets

by Matt Blackwell

Hi Stuart,

Unfortunately, we don't have any code for implementing PCA with Amelia output. I was more providing a high-level idea for how one could implement this. If separate PCA analyses don't work, then maybe combine the data first and then run PCA on the stacked data (weighting each row by 1/64).

Cheers,
Matt

~~~~~~~~~~~
Matthew Blackwell
Associate Professor of Government
Harvard University
url: http://www.mattblackwell.org

On Mon, Oct 12, 2020 at 10:18 PM Dr Stuart Reece <asreece(a)bigpond.net.au> wrote:
> Thanks Matt.
>
> Can you please provide code to work out the PCA in each imputed dataset?
>
> Actually I went through and did this by hand in all 64 imputed datasets (for 50% missing data), and then the code for analyzing it would not work at all. Extremely frustrating.
>
> I tried this with missMDA and FactoMineR and PCA, but it only gave one dataset at the end and the results were not robust.
>
> But I really liked the Amelia framework and wanted to use it, but could not make the code run after constructing PCAs in each dataset as noted earlier.
>
> Thanks for your advice,
>
> Stuart.
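The stacked-data idea in the reply above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the list `imputations` is simulated here as a stand-in for `a.out$imputations`, the variable names x1 to x3 are made up, and PCA is done by an eigendecomposition of a weighted correlation matrix from `cov.wt()` rather than by any Amelia function.

```r
# Simulated stand-in for a.out$imputations: m completed data sets of n rows.
set.seed(1)
m <- 5
n <- 40
base <- matrix(rnorm(n * 3), n, 3,
               dimnames = list(NULL, c("x1", "x2", "x3")))
imputations <- lapply(seq_len(m), function(i) {
  as.data.frame(base + matrix(rnorm(n * 3, sd = 0.1), n, 3))
})

# Stack all imputations and record which imputation each row came from.
stacked <- do.call(rbind, imputations)
stacked$imp_number <- rep(seq_len(m), each = n)

# Give every stacked row weight 1/m, then normalise the weights to sum to 1.
# With equal weights this matches an unweighted PCA of the stacked data;
# the weights are kept explicit to mirror the description in the email.
vars <- c("x1", "x2", "x3")
w <- rep(1 / m, nrow(stacked))
wc <- cov.wt(stacked[, vars], wt = w / sum(w), cor = TRUE)

# PCA on the weighted correlation matrix: one set of loadings for all data sets.
eig <- eigen(wc$cor, symmetric = TRUE)
loadings <- eig$vectors
rownames(loadings) <- vars

# Score every row with the shared centring, scaling, and loadings.
z <- scale(as.matrix(stacked[, vars]),
           center = wc$center, scale = sqrt(diag(wc$cov)))
scores <- z %*% loadings
colnames(scores) <- paste0("PC", seq_along(vars))
stacked <- cbind(stacked, scores)
```

Splitting `stacked` by `imp_number` afterwards gives back one data frame per imputation, each carrying component scores built from the same loadings.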

3 years, 6 months

Re: [amelia] Query on Amelia::combine.output()

by Matt Blackwell

Hi Stuart,

Sorry, no, ameliabind is also designed like the combine.output I described: it combines multiple runs of Amelia in one output object of class "amelia".

Hope that helps!

Cheers,
Matt

~~~~~~~~~~~
Matthew Blackwell
Associate Professor of Government
Harvard University
url: http://www.mattblackwell.org

On Mon, Oct 12, 2020 at 10:13 PM Dr Stuart Reece <asreece(a)bigpond.net.au> wrote:
> Thanks Matt.
>
> Yes, I have used the do.call(rbind, ...) code many times.
>
> But I thought ameliabind was to do.call(rbind, ...) as dplyr's bind_rows is to do.call(rbind, ...).
>
> ameliabind doesn't work that way?
>
> I could not find the syntax listed anywhere online, but I don't mind using do.call(rbind, ...).
>
> Thank you so much,
>
> Stuart.

3 years, 6 months

Graphing from Amelia Iterative Datasets

by stuart.reece＠bigpond.com

Hi Amelia Users,

I was also wondering how one makes graphs from iterative Amelia datasets? Or even changes a variable into a factor across all datasets? Or is this best done prior to running Amelia?

I mostly use ggplot2.

Thanks again,

Stuart Reece.

3 years, 6 months

Re: [amelia] Principal Components of Amelia Datasets

by Matt Blackwell

Hi Stuart,

Probably the most straightforward way to do this would be to apply PCA to each of the imputed data sets and then use those in whatever analysis models you want. As an alternative, you could use the stacked dataset of all imputations (see my earlier email) and run PCA giving each of the rows of the stacked data (1/m) weight, where m is the number of imputed datasets. This would ensure that all of the imputed data sets use the same factor loadings.

Cheers,
Matt

~~~~~~~~~~~
Matthew Blackwell
Associate Professor of Government
Harvard University
url: http://www.mattblackwell.org

On Fri, Oct 9, 2020 at 10:38 PM <stuart.reece(a)bigpond.com> wrote:
> Hi Amelia Users.
>
> I was wondering if anyone would advise how I can add principal components to imputed datasets, and how to correctly combine them from all the imputations?
>
> I was not able to find anything on this online.
>
> Thanks so much,
>
> Stuart Reece.
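The first option in the reply above, a separate PCA per imputed data set, might look something like this in base R. It is a sketch, not code from the Amelia authors: `imputations` is simulated as a stand-in for `a.out$imputations`, the variables x1 to x3 are invented, and the sign alignment against the first data set is an added assumption, since the sign of a principal component is arbitrary and can differ between data sets.

```r
# Simulated stand-in for a.out$imputations: m completed data sets of n rows.
set.seed(3)
m <- 4
n <- 50
base <- matrix(rnorm(n * 3), n, 3,
               dimnames = list(NULL, c("x1", "x2", "x3")))
base[, "x2"] <- base[, "x1"] + rnorm(n, sd = 0.5)
imputations <- lapply(seq_len(m), function(i) {
  as.data.frame(base + matrix(rnorm(n * 3, sd = 0.1), n, 3))
})

# Run a separate PCA on each imputed data set.
pcs <- lapply(imputations, function(d) prcomp(d, center = TRUE, scale. = TRUE))

# Flip component signs so they agree with the first data set's loadings,
# then attach the scores to each data set.
ref <- pcs[[1]]$rotation
with_scores <- Map(function(d, p) {
  flip <- sign(colSums(p$rotation * ref))
  flip[flip == 0] <- 1
  cbind(d, sweep(p$x, 2, flip, `*`))
}, imputations, pcs)
```

Each element of `with_scores` can then go into the analysis model, with the per-imputation results combined afterwards. Unlike the stacked approach, the loadings here differ from one data set to the next.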

3 years, 6 months

Re: [amelia] Query on Amelia::combine.output()

by Matt Blackwell

Hi Stuart,

Ah, `combine.output()` actually just takes multiple Amelia runs that were done separately and combines them into one object, as if you did them all together. This is helpful if you want to run additional imputations after a first batch. Here is some code that will take Amelia output and create a stacked data frame of all imputations with a column for imputation numbers:

    library(Amelia)
    data(africa)
    imps <- 5
    a.out <- amelia(africa, cs = "country", ts = "year", m = imps)

    # Stack the imputed data sets and label each row with its imputation number
    stacked_df <- do.call(rbind, a.out$imputations)
    stacked_df$imp_number <- rep(1:imps, each = nrow(africa))

Having said all of that, you probably don't want to do this. Instead, you probably want to apply your analysis model to each of the imputed data sets and then combine the coefficients/model parameters using the Rubin rules described in the various Amelia papers.

Cheers,
Matt

~~~~~~~~~~~
Matthew Blackwell
Associate Professor of Government
Harvard University
url: http://www.mattblackwell.org

On Fri, Oct 9, 2020 at 10:33 PM <stuart.reece(a)bigpond.com> wrote:
> Hi Amelia Users.
>
> I am running a Windows computer i9-9900K CPU 3.6GHz, 64MB RAM, 64-bit system.
>
> I have the R Studio 1.3.1093 based on R 4.0.2, just re-installed today.
>
> Amelia works on my data and runs models very nicely. The parallel routines work really well, which I very much appreciate on my 16 CPUs.
>
> However I use complex geospatial models and would love to model the complete imputed geospatial data in R::splm.
>
> So combining all the imputations into one df would be a fantastic assistance.
>
> I think combine.output should do this very nicely.
>
> I think the syntax for combine.output is probably like that of ameliabind, really simple. Can't find the syntax online.
>
> But whenever I run combine.output, with whatever syntax, I always get the same error message which reads:
>
> CombAmelia1616 <- Amelia::combine.output(a.r.CS.Raw.ETOPFA.LIR.02.16, a.r.CS.Raw.ETOPFA.LIR.02.16e)
>
> Error: 'combine.output' is not an exported object from 'namespace:Amelia'
>
> I was wondering please if something is wrong?
>
> Also, could someone please confirm that the correct syntax for combine.output is the same as ameliabind, super simple?
>
> Thanks so much,
>
> Stuart Reece.
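The recommendation at the end of the reply above (fit the analysis model on each imputed data set, then combine with Rubin's rules) can be illustrated with a short base R sketch. The data and the `lm()` model are invented for the example, and `imputations` is a simulated stand-in for `a.out$imputations`. Amelia's own `mi.meld()` function is intended for this kind of combination, so this hand-rolled version is only meant to show the arithmetic.

```r
# Simulated stand-in for a.out$imputations: m completed data sets.
set.seed(2)
m <- 5
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
imputations <- lapply(seq_len(m), function(i) {
  data.frame(y = y, x = x + rnorm(n, sd = 0.2))
})

# Fit the same analysis model on every imputed data set.
fits <- lapply(imputations, function(d) lm(y ~ x, data = d))
q <- sapply(fits, coef)                       # coefficients, one column per imputation
u <- sapply(fits, function(f) diag(vcov(f)))  # squared standard errors

# Rubin's rules: pooled estimate, within and between variance, total variance.
q_bar <- rowMeans(q)
w_bar <- rowMeans(u)
b     <- apply(q, 1, var)
t_var <- w_bar + (1 + 1 / m) * b
se    <- sqrt(t_var)

pooled <- cbind(estimate = q_bar, std.error = se)
```

The pooled standard error is never smaller than the average within-imputation standard error, because the between-imputation variance adds the uncertainty due to the missing data.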

3 years, 6 months
