Mike gets, I think, major brownie points for
this. 8) This
means that everyone should at least try multiple imputation
for their papers and report whether their coefficients and
quantities of interest display any bias before imputation.
8)
----- Original Message -----
From: "Michael Richard Kellermann" <kellerm(a)fas.harvard.edu>
To: <gov2001-l(a)lists.fas.harvard.edu>
Sent: Thursday, December 16, 2004 9:11 PM
Subject: Re: [gov2001-l] mix package
This code should work for the mix package on R 2.0.1:
library(mix)
immig <- read.table("C:/immigration.dat", header = TRUE)
# This turns gender into a 1,2 categorical variable
# We treat the three ordinal variables as continuous rather
than
# categorical; otherwise, it has to estimate 686 cells,
which is way
# too many given the amount of data.
immig$gender <- immig$gender + 1
# This puts gender on the left of the matrix
immig <- cbind(immig$gender, immig[,1:4])
names(immig) <- c("gender", names(immig[,2:5]))
# This does a whole bunch of stuff to identify the missing
data
# I think the data needs to be a matrix, not a frame
s <- prelim.mix(as.matrix(immig), 1)
# This calculates initial estimates of the means and
var-covar matrix
thetahat <- em.mix(s)
# This sets the random number seed
rngseed(1234567)
# This generates the imputed dataset. It cranks through the
da.mix
# function 5000 times, draws a dataset, and then picks up
where it left
# off. A new dataset is drawn after every 5000 steps.
newtheta <- da.mix(s, thetahat, steps = 5000)
getparam.mix(s, newtheta)
x1 <- as.data.frame(imp.mix(s, newtheta, immig))
newtheta <- da.mix(s, newtheta, steps = 5000)
x2 <- as.data.frame(imp.mix(s, newtheta, immig))
newtheta <- da.mix(s, newtheta, steps = 5000)
x3 <- as.data.frame(imp.mix(s, newtheta, immig))
newtheta <- da.mix(s, newtheta, steps = 5000)
x4 <- as.data.frame(imp.mix(s, newtheta, immig))
newtheta <- da.mix(s, newtheta, steps = 5000)
x5 <- as.data.frame(imp.mix(s, newtheta, immig))
# This coerces ipip back into an ordinal variable
x1$ipip <- round(x1$ipip)
x2$ipip <- round(x2$ipip)
x3$ipip <- round(x3$ipip)
x4$ipip <- round(x4$ipip)
x5$ipip <- round(x5$ipip)
# This gets rid of any 6s that creep into the imputed data
# If you end up with any 0s after the round step, get rid of
# those too.
x1$ipip[x1$ipip >5] <- 5
x2$ipip[x2$ipip >5] <- 5
x3$ipip[x3$ipip >5] <- 5
x4$ipip[x4$ipip >5] <- 5
x5$ipip[x5$ipip >5] <- 5
#This turns the imputed datasets into a list for Zelig
immig.mi <- list(x1,x2,x3,x4,x5)
Cheers,
Mike
On Wed, 15 Dec 2004, Olivia Lau wrote:
You should follow the steps in the example at the
bottom of
help(imp.mix)...so yes, you do need to do some
pre-processing before you
use imp.mix().
Also, note that I forgot to mention something in section
last week: If we
fix tau1 = 0, we need to fix tau2 > 0. So you need to
exp(par[k+1]) it to
contstrain it to be greater than zero, but I'm sure that
everyone has
figured that out by now... 8)
Olivia.
On Wed, 15 Dec 2004, Jens Hainmueller wrote:
> is 'imp.mix' the right command to use in the mix package
> for the
> imputations? it's the only imputation type command i
> cound find there.
>
> 'imp.mix' apparently requires pre-processing with
> 'da.mix' or 'dabipf.mix'
> do we need to use any of those too?
>
> thanks!
>
> in da mix,
> j.
>
>
> _______________________________________________
> gov2001-l mailing list
> gov2001-l(a)lists.fas.harvard.edu
>
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l
>
_______________________________________________
gov2001-l mailing list
gov2001-l(a)lists.fas.harvard.edu
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l
_______________________________________________
gov2001-l mailing list
gov2001-l(a)lists.fas.harvard.edu
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l
_______________________________________________
gov2001-l mailing list
gov2001-l(a)lists.fas.harvard.edu