Hi Amelia, I attach your signed form, and also include here a photo of the
gas meter. Can you let us know about the upgrade? Thanks Gary King, 9
Beecher Road, Brookline MA 02445
--
*Gary King* - Albert J. Weatherhead III University Professor - Director,
IQSS <http://iq.harvard.edu/> - Harvard University
GaryKing.org <http://garyking.org/> - King(a)Harvard.edu - @KingGary
<https://twitter.com/kinggary> - 617-500-7570 - Assistant
<king-assist(a)iq.harvard.edu>: 617-495-9271
Hello users of Amelia,
We are moving the discussion of this mailing list to our GitHub. For discussing Amelia, asking questions, or suggesting features, please use our GitHub discussions, found at https://github.com/IQSS/Amelia/discussions. To report any bugs, please fill out this form: https://github.com/IQSS/Amelia/issues/new.
Thank you,
Zagreb Mukerjee
Research Data Scientist
IQSS
Hello Amelia users
I have tried to use the amelia package, and I got the following error. I am
dealing with a macro panel dataset for 126 countries and 16 years. I want
to impute using the entire dataset, the values of time to export, time to
import (measured in hours), and cost of exports and imports in USD.
The proportion of missingness is quite large, between about 45% and 65% for
these variables. For that reason, I use the empri option, as described on
page 31 of the Amelia manual.
am <- amelia(final_datos1, m = 5, intercs = TRUE,
             empri = .01 * nrow(final_datos1), cs = "country", ts = "year")
After running this, I get the following warning:
Warning message:
In amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, :
The variable year is perfectly collinear with another variable in the
data.
It seems to me that this is because, for some combinations of country and
year, all of the variables have missing values.
Any help is appreciated.
--
Member, Editorial Committee, *The Economic and Labour Relations Review* (a
SAGE journal)
http://elr.sagepub.com/
Member, Editorial Committee, African Journal of Economic and Management
Studies
http://emeraldgrouppublishing.com/products/journals/editorial_team.htm?id=a…
https://www.researchgate.net/profile/Antonio_Andres (Research Gate profile)
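One way to investigate the collinearity warning above is to check the data before calling amelia(): count the country-year rows that are missing on every variable to be imputed, and look for columns that are an exact linear function of year. The following is only a minimal base-R sketch on made-up data. The data frame `panel` and the columns `time_export`, `cost_export`, and `year_index` are hypothetical stand-ins, not the poster's variables, and the check only catches pairwise collinearity.

```r
# Toy panel standing in for final_datos1 (all names are hypothetical).
set.seed(1)
panel <- expand.grid(country = c("A", "B", "C"), year = 2001:2004,
                     stringsAsFactors = FALSE)
panel$time_export <- rnorm(nrow(panel), mean = 50, sd = 10)
panel$cost_export <- rnorm(nrow(panel), mean = 900, sd = 100)
panel$year_index  <- panel$year - 2000   # exact linear function of year

# Make two country-years missing on every variable to be imputed.
impute_vars <- c("time_export", "cost_export")
panel[c(2, 7), impute_vars] <- NA

# 1. Rows with no observed value among the variables to be imputed.
all_missing   <- rowSums(!is.na(panel[impute_vars])) == 0
n_all_missing <- sum(all_missing)

# 2. Numeric columns that are perfectly correlated with year.
numeric_cols <- setdiff(names(panel)[sapply(panel, is.numeric)], "year")
cors <- sapply(panel[numeric_cols],
               function(v) cor(panel$year, v, use = "pairwise.complete.obs"))
collinear_with_year <- names(cors)[abs(abs(cors) - 1) < 1e-8]

print(panel[all_missing, c("country", "year")])
print(collinear_with_year)
```

If a column such as the hypothetical `year_index` shows up here, dropping it or listing it as an ID variable before imputation may be worth trying. Whether that resolves the warning in the original data is an assumption.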
I have a multiply imputed dataset (imputed using Amelia) and would like to write a function that runs several regressions (same outcome, different predictor variables) in one go. The reason is that I need to repeat the same set of regressions for several different outcome variables.
I have defined the function but when I run it, I get an error message:
Error in model.frame.default(formula = outcome ~ country, data = as.data.frame(.), : variable lengths differ (found for 'country')
I think it is because the imputed datasets are an "amelia" object and not a typical data frame, so it is hard to index the variable "country". But I'm not sure how to do this.
A reproducible example is below. I would really appreciate any advice.
library(Amelia)
library(Zelig)
# Use africa dataset
data(africa)
# Impute data
imp.out <- amelia(x = africa, cs = "country", ts = "year", logs = "gdp_pc", m=5)
summary(imp.out)
# Define function to run regression predicting an outcome from country, gdp_pc, civlib, and population
reg_function <- function(outcome, data) {
  # Run regressions using the zelig function
  country <- zelig(outcome ~ country, model = "normal", data = data, cite = FALSE)
  gdp_pc <- zelig(outcome ~ gdp_pc, model = "normal", data = data, cite = FALSE)
  population <- zelig(outcome ~ population, model = "normal", data = data, cite = FALSE)
  # Put results into a vector
  results <- c(combine_coef_se(country)[2, 1], combine_coef_se(country)[2, 2],
               combine_coef_se(gdp_pc)[2, 1], combine_coef_se(gdp_pc)[2, 2],
               combine_coef_se(population)[2, 1], combine_coef_se(population)[2, 2])
  # Return results in a matrix
  return(matrix(results, nrow = 1, ncol = 6,
                dimnames = list(c(""),
                                c("Est_country", "SE_country",
                                  "Est_gdp_pc", "SE_gdp_pc",
                                  "Est_population", "SE_population"))))
}
# Run regression for outcome variables "year", "infl", and "trade"
# This is where I get the error messages that the variable lengths differ for 'country'
res_year <- reg_function(year, imp.out$imputations)
res_infl <- reg_function(infl, imp.out$imputations)
res_trade <- reg_function(trade, imp.out$imputations)
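One likely cause of the error is that the formula `outcome ~ country` is taken literally: it refers to a variable named `outcome`, not to the name passed to the function, and `imp.out$imputations` is a list of data frames rather than a single data frame. The following is a minimal sketch of an alternative, not a drop-in fix for the Zelig code. It assumes plain `lm()` fits in place of `zelig()`, builds each formula from character strings with `reformulate()`, and pools the slope with a hand-written Rubin's-rules helper. The names `pool_rubin` and `reg_function2` are hypothetical, and a simulated list of data frames stands in for `imp.out$imputations` so the block runs without Amelia.

```r
# Simulated stand-in for imp.out$imputations: a list of m completed data frames.
# (These are independent simulated data sets, used only to make the sketch run.)
set.seed(42)
make_imp <- function() {
  n <- 100
  gdp_pc <- rnorm(n, 1000, 200)
  population <- rnorm(n, 5e6, 1e6)
  data.frame(gdp_pc = gdp_pc, population = population,
             infl  = 5 + 0.002 * gdp_pc + rnorm(n),
             trade = 60 + 0.01 * gdp_pc + rnorm(n, sd = 5))
}
imputations <- replicate(5, make_imp(), simplify = FALSE)

# Rubin's rules for one coefficient: q = estimates, se = standard errors.
pool_rubin <- function(q, se) {
  m <- length(q)
  within  <- mean(se^2)   # average within-imputation variance
  between <- var(q)       # between-imputation variance
  c(Est = mean(q), SE = sqrt(within + (1 + 1 / m) * between))
}

# Fit outcome ~ predictor in every imputed data set and pool the slope.
reg_function2 <- function(outcome, predictors, imputations) {
  res <- sapply(predictors, function(p) {
    fits <- lapply(imputations, function(d) {
      fit <- lm(reformulate(p, response = outcome), data = d)
      coef(summary(fit))[2, 1:2]   # slope estimate and its standard error
    })
    fits <- do.call(rbind, fits)   # m x 2 matrix
    pool_rubin(fits[, 1], fits[, 2])
  })
  t(res)                           # one row per predictor
}

infl_res  <- reg_function2("infl",  c("gdp_pc", "population"), imputations)
trade_res <- reg_function2("trade", c("gdp_pc", "population"), imputations)
print(infl_res)
```

With real Amelia output the call would presumably look like `reg_function2("infl", c("gdp_pc", "population"), imp.out$imputations)`, with the outcome passed as a quoted string. A factor predictor such as `country` has several coefficients, so taking only row 2 of the coefficient table would report just one of its dummy variables.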
Hello,
I am working with data from a cluster-randomized, longitudinal control trial and need to impute missingness within individuals over time. It doesn’t look like there is an option in Amelia to specify clusters. I have set the cs to be the participant ID, and currently am just including the cluster variable as a time-invariant categorical covariate. Is there a better way to specify my Amelia model to account for the clustering?
Thank you,
Emma
--
Emma Gause, MS, MA
Research Scientist
Harborview Injury Prevention & Research Center
Firearm Injury and Policy Research Program
egause(a)uw.edu
My use case is to build a model on a training data set and then
demonstrate its performance on a separate test data set; both data sets
may contain missing data.
Amelia seems to assume that missing values will be imputed on the same
data set used to estimate the imputation model itself. Is there an
interface, or a reasonably discrete section of the code in the package,
which allows an imputation model developed from a training set to be
used to impute missing values in a test set?
--
Paul Dunmore
100 Marine Parade
Paraparaumu 5032
New Zealand
Hi Stuart,
Unfortunately, we don't have any code for implementing PCA with Amelia
output. I was more providing a high-level idea for how one could implement
this. If separate PCA analyses don't work, then maybe combine the data
first and then run PCA on the stacked data (weighting each row by 1/64).
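The stacked approach can be sketched roughly as follows. This is an illustration under stated assumptions, not tested code from the Amelia authors: a simulated list of data frames stands in for the `imputations` element of Amelia output, the weighted covariance comes from base R's `cov.wt()`, and the principal components are taken as its eigenvectors. All variable names are hypothetical.

```r
# Simulated stand-in for a.out$imputations (hypothetical data, m = 4).
set.seed(7)
m <- 4
n <- 50
base <- matrix(rnorm(n * 3), ncol = 3,
               dimnames = list(NULL, c("x1", "x2", "x3")))
imputations <- lapply(seq_len(m), function(i) {
  as.data.frame(base + matrix(rnorm(n * 3, sd = 0.1), ncol = 3))
})

# Stack the completed data sets and record which imputation each row came from.
stacked <- do.call(rbind, imputations)
imp_number <- rep(seq_len(m), each = n)

# Weighted covariance with weight 1/m per row (cov.wt rescales weights to sum to 1).
wcov <- cov.wt(stacked, wt = rep(1 / m, nrow(stacked)), center = TRUE)

# Principal components: eigenvectors of the weighted covariance matrix.
eig <- eigen(wcov$cov, symmetric = TRUE)
loadings <- eig$vectors
rownames(loadings) <- colnames(stacked)

# Scores for every stacked row, using the one shared set of loadings.
centered <- sweep(as.matrix(stacked), 2, wcov$center)
scores <- centered %*% loadings
colnames(scores) <- paste0("PC", seq_len(ncol(scores)))

# Split the scores back into one data frame per imputed data set.
scores_by_imp <- split(as.data.frame(scores), imp_number)
```

Because every row gets the same weight, the loadings here match an unweighted PCA of the stacked rows. The explicit weights mainly document the intent.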
Cheers,
Matt
~~~~~~~~~~~
Matthew Blackwell
Associate Professor of Government
Harvard University
url: http://www.mattblackwell.org
On Mon, Oct 12, 2020 at 10:18 PM Dr Stuart Reece <asreece(a)bigpond.net.au>
wrote:
> Thanks Matt.
>
>
>
> Can you please provide code to work out the PCA in each imputed dataset???
>
> Actually I went through and did this by hand in all 64 imputed datasets
> (for 50% missing data) – and then the code for analyzing it would not work
> at all….
>
> Extremely frustrating….
>
>
>
> I tried this with missMDA and factoMineR and PCA – but it only gave one
> dataset at the end and the results were not robust….
>
>
>
> But I really liked the Amelia framework and wanted to use it – but could
> not make the code run after constructing PCA’s in each dataset as noted
> earlier.
>
>
>
> Thanks for your advice,
>
>
>
> Stuart.
> *From:* Matt Blackwell [mailto:mblackwell@gov.harvard.edu]
> *Sent:* Tuesday, 13 October 2020 11:59 AM
> *To:* stuart.reece(a)bigpond.com
> *Cc:* amelia(a)lists.gking.harvard.edu; Gary King; James Honaker; Stuart
> Reece
> *Subject:* Re: Principal Components of Amelia Datasets
>
>
>
> Hi Stuart,
>
>
>
> Probably the most straightforward way to do this would be to apply PCA to
> each of the imputed data sets and then use those in whatever analysis
> models you want. As an alternative, you could use the stacked dataset of
> all imputations (see my earlier email) and run PCA, giving each row of the
> stacked data a weight of 1/m, where m is the number of imputed datasets.
> This would ensure that all of the imputed data sets use the same factor
> loadings.
>
>
>
> Cheers,
> Matt
>
>
>
> ~~~~~~~~~~~
>
> Matthew Blackwell
>
> Associate Professor of Government
>
> Harvard University
>
> url: http://www.mattblackwell.org
>
>
>
>
>
> On Fri, Oct 9, 2020 at 10:38 PM <stuart.reece(a)bigpond.com> wrote:
>
> Hi Amelia Users.
>
>
>
> I was wondering if anyone would advise how I can add principal components
> to imputed datasets – and how to correctly combine them from all the
> imputations ???
>
>
>
> I was not able to find anything on this online….
>
>
>
> Thanks so much,
>
>
>
> Stuart Reece.
>
>
Hi Stuart,
Sorry, no, ameliabind is also designed like the combine.output I described:
it combines multiple runs of Amelia in one output object of class
"amelia". Hope that helps!
Cheers,
Matt
~~~~~~~~~~~
Matthew Blackwell
Associate Professor of Government
Harvard University
url: http://www.mattblackwell.org
On Mon, Oct 12, 2020 at 10:13 PM Dr Stuart Reece <asreece(a)bigpond.net.au>
wrote:
> Thanks Matt.
>
> Yes I have used the do.call(rbind code many times.
>
>
>
> But I thought ameliabind was to do.call(rbind like dplyr’s bind_rows was
> to the do.call rbind….
>
> ameliabind doesn’t work that way????
>
> I could not find the syntax listed anywhere online… but I don’t mind using
> do.call..(rbind
>
> Thankyou so much,
>
> Stuart.
>
> *From:* Matt Blackwell [mailto:mblackwell@gov.harvard.edu]
> *Sent:* Tuesday, 13 October 2020 11:56 AM
> *To:* stuart.reece(a)bigpond.com
> *Cc:* amelia(a)lists.gking.harvard.edu; Gary King; James Honaker; Stuart
> Reece
> *Subject:* Re: Query on Amelia::combine.output()
>
>
>
> Hi Stuart,
>
>
>
> Ah, `combine.output()` actually just takes multiple Amelia runs that were
> done separately and combines them into one object, as if you had run them
> all together. This is helpful if you want to run additional imputations
> after a first batch. Here is some code that will take Amelia output and
> create a stacked data frame of all imputations with a column for imputation
> numbers:
>
>
>
> library(Amelia)
> data(africa)
> imps <- 5
> a.out <- amelia(africa, cs = "country", ts = "year", m = imps)
>
> stacked_df <- do.call(rbind, a.out$imputations)
> stacked_df$imp_number <- rep(1:imps, each = nrow(africa))
>
>
>
> Having said all of that, you probably don't want to do this. Instead, you
> probably want to apply your analysis model to each of the imputed data sets
> and then combine the coefficients/model parameters using the Rubin rules
> described in the various Amelia papers.
>
>
>
> Cheers,
> Matt
>
>
>
> ~~~~~~~~~~~
>
> Matthew Blackwell
>
> Associate Professor of Government
>
> Harvard University
>
> url: http://www.mattblackwell.org
>
>
>
>
>
> On Fri, Oct 9, 2020 at 10:33 PM <stuart.reece(a)bigpond.com> wrote:
>
> Hi Amelia Users.
>
>
>
> I am running a Windows computer i9-9900K CPU 3.6GHz, 64MB RAM, 64-bit
> system.
>
>
>
> I have the R Studio 1.3.1093 based on R 4.0.2, just re-installed today.
>
>
>
> Amelia works on my data and runs models very nicely. The parallel
> routines work really well, which I very much appreciate on my 16 CPUs.
>
>
>
> However I use complex geospatial models and would love to model the
> complete imputed geospatial data in R::splm.
>
>
>
> So combining all the imputations into one df would be a fantastic
> assistance.
>
>
>
> I think combine.output should do this very nicely.
>
>
>
> I think the syntax for combine.output is probably like that of ameliabind
> – really simple…. Can’t find the syntax online….
>
>
>
> But whenever I run combine.output – with whatever syntax – I always get
> the same error message which reads:
>
>
>
> CombAmelia1616 <- Amelia::combine.output(a.r.CS.Raw.ETOPFA.LIR.02.16,
> a.r.CS.Raw.ETOPFA.LIR.02.16e)
>
> Error: 'combine.output' is not an exported object from 'namespace:Amelia'
>
>
>
> I was wondering please if something is wrong??
>
>
>
> Also – could someone please confirm that the correct syntax for
> combine.output is the same as ameliabind – super simple????
>
>
>
> Thanks so much,
>
>
>
> Stuart Reece.
>
Hi Amelia Users,
I was also wondering how one makes graphs from iterative Amelia datasets???
Or even changes a variable into a factor across all datasets???
Or is this best done prior to running Amelia??
I mostly use ggplot2.
Thanks again,
Stuart Reece.
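One possible way to handle both questions after imputation is to treat the imputations as an ordinary list of data frames: apply the same recoding to each with `lapply()`, then stack them with an imputation identifier for plotting. The following is a minimal sketch on simulated data. The list `imputations` stands in for the `imputations` element of Amelia output, and the columns `region` and `income` are hypothetical.

```r
# Simulated stand-in for a.out$imputations (hypothetical columns, m = 3).
set.seed(3)
m <- 3
n <- 20
region <- sample(1:3, n, replace = TRUE)   # fully observed, same in every data set
imputations <- lapply(seq_len(m), function(i) {
  data.frame(region = region, income = rnorm(n, mean = 100, sd = 15))
})

# Apply the same recoding to every completed data set.
imputations <- lapply(imputations, function(d) {
  d$region <- factor(d$region, levels = 1:3,
                     labels = c("north", "central", "south"))
  d
})

# Stack into one long data frame with an imputation identifier.
stacked <- do.call(rbind, imputations)
stacked$imp_number <- factor(rep(seq_len(m), each = n))

# Quick numeric summary by imputation; the same grouping works for plots.
mean_income_by_imp <- tapply(stacked$income, stacked$imp_number, mean)
print(mean_income_by_imp)
```

The stacked data frame can then be passed to ggplot2, for example `ggplot(stacked, aes(income, colour = imp_number)) + geom_density()`, to compare the imputed distributions. Recoding a nominal variable before imputation instead is also possible, but that changes how the imputation model treats the variable, so which order is better depends on the data.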
Hi Stuart,
Probably the most straightforward way to do this would be to apply PCA to
each of the imputed data sets and then use those in whatever analysis
models you want. As an alternative, you could use the stacked dataset of
all imputations (see my earlier email) and run PCA, giving each row of the
stacked data a weight of 1/m, where m is the number of imputed datasets.
This would ensure that all of the imputed data sets use the same factor
loadings.
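The first option, a separate PCA in each imputed data set, might look roughly like this. It is a sketch under assumptions: a simulated list of data frames stands in for the `imputations` element of Amelia output, base R's `prcomp()` does the PCA, and the names used are hypothetical.

```r
# Simulated stand-in for a.out$imputations (hypothetical data, m = 3).
set.seed(11)
m <- 3
n <- 40
base <- matrix(rnorm(n * 3), ncol = 3,
               dimnames = list(NULL, c("x1", "x2", "x3")))
imputations <- lapply(seq_len(m), function(i) {
  as.data.frame(base + matrix(rnorm(n * 3, sd = 0.1), ncol = 3))
})

# Run a separate PCA on each completed data set.
pca_list <- lapply(imputations, prcomp, center = TRUE, scale. = TRUE)

# Append the first two component scores to each data set.
imputations_pc <- Map(function(d, p) cbind(d, p$x[, 1:2]),
                      imputations, pca_list)

str(imputations_pc[[1]])
```

One caveat with this route is that the sign and ordering of components can differ from one imputed data set to the next, so the loadings are worth comparing before the downstream estimates are combined.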
Cheers,
Matt
~~~~~~~~~~~
Matthew Blackwell
Associate Professor of Government
Harvard University
url: http://www.mattblackwell.org
On Fri, Oct 9, 2020 at 10:38 PM <stuart.reece(a)bigpond.com> wrote:
> Hi Amelia Users.
>
>
>
> I was wondering if anyone would advise how I can add principal components
> to imputed datasets – and how to correctly combine them from all the
> imputations ???
>
>
>
> I was not able to find anything on this online….
>
>
>
> Thanks so much,
>
>
>
> Stuart Reece.
>