Hi Heidy,
The procedure for computing descriptive statistics is exactly the same as for producing
combined estimates from a model. But if you do not care about getting correct standard
errors for such descriptives, you could always just use the first imputed dataset:
first <- a.out$imputations[[1]]
This has the benefit of being a dataset that respects the choices you made for
nominal/ordinal variables and is not an average across the imputations. It is also a
valid (if potentially inefficient) way to get point estimates of complete-data summary
statistics.
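If you do want combined descriptives with correct standard errors, here is a minimal
sketch of Rubin's rules applied to a variable's mean. It assumes the imputations are a
list of completed data frames, as in a.out$imputations. The simulated list and the helper
name combine_mean are hypothetical stand-ins so the example runs without Amelia, not part
of either package. Amelia's mi.meld() function performs this kind of combination for
estimates and standard errors you supply.

```r
# Stand-in for a.out$imputations: a list of m completed data frames.
# (Simulated independently here so the sketch runs without Amelia installed;
# real imputations would share their observed values.)
set.seed(1)
m <- 5
n <- 200
imputations <- lapply(seq_len(m), function(i) {
  data.frame(xx = runif(n), yn = rnorm(n))
})

# Hypothetical helper: combine the mean of one variable across imputations
# using Rubin's rules.
combine_mean <- function(imps, var_name) {
  m <- length(imps)
  # Point estimate and squared standard error within each imputed dataset
  q <- sapply(imps, function(d) mean(d[[var_name]]))
  u <- sapply(imps, function(d) var(d[[var_name]]) / nrow(d))
  q_bar <- mean(q)                  # combined point estimate
  u_bar <- mean(u)                  # within-imputation variance
  b <- var(q)                       # between-imputation variance
  total <- u_bar + (1 + 1 / m) * b  # total variance
  c(estimate = q_bar, std.error = sqrt(total))
}

res <- combine_mean(imputations, "xx")
print(res)
```

With a real Amelia run you would pass a.out$imputations in place of the simulated list.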
Hope that helps!
Cheers,
Matt
~~~~~~~~~~~
Matthew Blackwell
Assistant Professor of Government
Harvard University
On Sun, Jan 4, 2015 at 9:01 PM, Heidy Colón-Lugo <heidycolon(a)gmail.com>
wrote:
Thank you very much for the example code. I truly appreciate it, and it does
answer part of my question. My other question is whether the Amelia and/or
Zelig packages have a function like *complete()* in the *mice* package, where
the original dataset with the missing data is filled in with predicted
values. In the end I am trying to get one complete dataset that I can
use to make one table of descriptive statistics.
Thank you very much for taking the time to answer my questions!
Sincerely,
Heidy
On Sun, Jan 4, 2015 at 6:25 PM, Honaker, James <jhonaker(a)iq.harvard.edu>
wrote:
> Heidy,
>
> Zelig accepts the Amelia output object in place of a dataframe. Thus
> just impute your dataset in Amelia, and take the output and use that as
> your dataset in Zelig. It will give you results using Rubin's rules for
> combining the M different imputed datasets automatically. The following
> code shows how to do this, in either Zelig 4.2 (the latest on CRAN) or
> Zelig 5.0, which is a beta version of an entirely rewritten version of
> Zelig (which you can test out using the iqss R repository as in the code
> example)
>
> #This installs Zelig 5.0
>
> #install.packages("Zelig", repos="http://r.iq.harvard.edu", type="source")
>
>
> #This installs Zelig 4.2
>
> install.packages("Zelig", repos="http://lib.stat.cmu.edu/R/CRAN/")
>
>
> library(Zelig)
>
> library(Amelia)
>
>
> n<-1000
>
> xx<-runif(n)
>
> zz<-runif(n)
>
> ss<-runif(n)
>
> yn<-xx +0.5 + rnorm(n,mean=0,sd=1)
>
>
> xx[1:20]<-NA
>
> data<-data.frame(xx,zz,yn)
>
>
> # This is the Amelia Object:
>
> am.out<-amelia(data)
>
>
> # Note the Amelia Output Object used in place of a Dataframe:
>
> z.out<- zelig(yn ~ xx + zz, data = am.out, model = "ls")
>
> print(summary(z.out))
>
>
> #or in Zelig 5.0 you can also do:
>
> #z.out$summarize()
>
>
> Let me know if this just generates more questions or you run into any
> problems. I believe an official announcement about Zelig 5
> will come out in the next couple weeks, but if you want to see more in the
> meantime, you can see an extensive project description and overview here at
> a new project page for the forthcoming release:
>
> http://zeligproject.org
>
> Best,
> James.
>
> The output I get from this example in Zelig 4 and Zelig 5 is below:
> *Zelig 4.2*
> > print(summary(z.out))
>
> Model: ls
> Number of multiply imputed data sets: 5
>
> Combined results:
>
> Call:
> lm(formula = formula, weights = weights, model = F, data = data)
>
> Coefficients:
>                  Value Std. Error    t-stat      p-value
> (Intercept)  0.5115606 0.08420543  6.075150 1.249595e-09
> xx           1.3082128 0.10851395 12.055710 2.058970e-33
> zz          -0.3022590 0.11147591 -2.711429 6.699652e-03
>
> For combined results from datasets i to j, use summary(x, subset = i:j).
> For separate results, use print(summary(x), subset = i:j).
>
> *Zelig 5.0*
> * …*
> Model: Combined Imputations
>             Estimate Std.Error z value  Pr(>|z|)
> (Intercept)  0.47795   0.08475  5.6395 1.705e-08 ***
> xx           1.03132   0.11261  9.1581 0.000e+00 ***
> zz          -0.03573   0.10864 -0.3289 7.422e-01
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Next step: Use 'setx' method
> NULL
>
>
> --
> James Honaker, Senior Research Scientist
> //// Institute for Quantitative Social Science, Harvard University
> ------------------------------
> *From:* amelia-bounces(a)lists.gking.harvard.edu [
> amelia-bounces(a)lists.gking.harvard.edu] on behalf of Heidy Colón-Lugo [
> heidycolon(a)gmail.com]
> *Sent:* Friday, January 02, 2015 1:04 PM
> *To:* amelia(a)lists.gking.harvard.edu
> *Subject:* [amelia] question about combining datasets
>
> I am having a hard time finding an example on how to combine the
> imputed datasets from Amelia in Zelig. As a result, I'm using the 10th
> imputed dataset as the one that I use to run my regression models (see
> below). Is this ok? Or is combining the datasets a better option since you
> get more information? And if so, what is the code to combine the datasets?
>
> outa<-amelia(psub, m=10, noms=noms, ords=ords, idvars=idvars)
> write.amelia(obj=outa, file.stem="outdata", extension=NULL, format="csv")
> outdata10<-read.table("S:/.../Imp outdata/outdata10", header=T, sep=",",
>                       na.strings="NA", dec=".", strip.white=T)
>
> I appreciate all of the help I can get.
> Respectfully,
> Heidy Colon-Lugo
> PhD Candidate
>