Hi Matthew, a few notes below...
Gary
--
*Gary King* - Albert J. Weatherhead III University Professor - Director,
IQSS <http://iq.harvard.edu/> - Harvard University
GaryKing.org <http://garyking.org/> - King(a)Harvard.edu - @KingGary
<https://twitter.com/kinggary> - 617-500-7570 - Assistant
<king-assist(a)iq.harvard.edu>: 617-495-9271
On Mon, May 25, 2020 at 7:29 PM Matthew Simonson <
simonson.m(a)northeastern.edu> wrote:
> Hello Amelia Team,
>
> Three questions about a dataset I'm working with.
>
> 1) My dataset consists of a 10-wave survey in which some survey questions
> were only asked in the final 2 waves. As a block matrix it looks like this
>
> A M
> B C
>
> where all values in block M are structurally missing and not of interest.
> I want to impute the missing values in blocks A, B, and C. I could
> either a) run amelia on the full dataset, b) split it into two datasets,
> one with A and B, the other with only C, or c) split it into two
> overlapping datasets, one with A and B, the other with B and C. I'm
> hesitant to use the full dataset because including block M increases the
> missingness from 25% to 55%, but I don't know if there are theoretical
> objections to the other two approaches. What do you suggest?
>
There isn't a single correct answer here, but your suggestions are
reasonable. I'd mainly just make sure that the imputation model fits the
data; that will enable you to pick an approach. (You might also have a look
at this somewhat related paper
<https://gking.harvard.edu/files/abs/not-abs.shtml>, but I'd still use
Amelia rather than coding it up separately.)
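
As a small piece of arithmetic on the numbers quoted above: if 25% is the missingness rate across blocks A, B, and C combined, and every cell in block M is missing, then the overall rate is 0.25 * (1 - f) + f, where f is the share of all cells that fall in M. The sketch below solves that for f. The function name is made up for illustration, and the reading of the 25% figure is an assumption, not something stated in the question.

```python
def share_of_cells_in_m(rate_without_m, rate_with_m):
    """Solve rate_with_m = rate_without_m * (1 - f) + f for f.

    f is the share of all cells in the full matrix that belong to block M,
    assuming every cell of M is missing (assumption taken from the question).
    """
    return (rate_with_m - rate_without_m) / (1.0 - rate_without_m)


# Figures quoted in the question: 25% without block M, 55% with it.
f = share_of_cells_in_m(0.25, 0.55)
print(round(f, 3))  # 0.4
```

Under those assumptions, block M would account for about 40% of the cells in the full matrix.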
>
> 2) In my analysis, I examine the interaction between treatment and
> ideology. I run treatment*(ideology>4) in one regression, but I also try other
> models with ideology>5, 6, etc. to see if the cutoff makes a difference.
> For Amelia, would it be sufficient to include a continuous
> treatment*ideology column in my data, or do I need to dichotomize this
> column in Amelia as well (and hence run Amelia multiple times, once for
> each cutoff)?
>
You definitely want to include at least as much information in the
imputation model as in your analysis model.
>
>
> 3) In order to get bootstrapped confidence intervals in my analysis, I
> bootstrap the original data 1000 times and then run Amelia (with m=5) on
> each bootstrapped dataset before analyzing. Although the original dataset
> works just fine, the bootstrapped versions throw errors about 1/3 of the
> time: first a few hundred "chol(): given matrix is not symmetric" warnings
> followed by a "inv_sympd(): matrix is singular or not positive definite"
> error. Usually all 5 imputations fail for a given bootstrapped data set,
> but sometimes only some of them do. Suggestions?
>
I'd hunt this down. I'm guessing that you have included some dummy
variables that don't exclude the baseline (so they all sum to 1), or
something close to that. You probably have either too small an n or
perfect collinearity. I'd track this down since it might be a data error
that affects everything else too.
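
A minimal sketch of the kind of check this suggests, written in plain Python rather than R and using made-up toy data: it tests whether a set of dummy columns sums to 1 in every row (no baseline excluded), and counts how many bootstrap resamples contain a column that has become constant, which is one way a resample can end up perfectly collinear. The column names and helper functions are hypothetical and are not part of Amelia.

```python
import random


def constant_columns(rows, names):
    """Return names of columns that take a single value across all rows."""
    return [name for j, name in enumerate(names)
            if len({row[j] for row in rows}) <= 1]


def dummies_sum_to_one(rows, names, dummy_names):
    """True if the listed dummy columns sum to 1 in every row."""
    idx = [names.index(d) for d in dummy_names]
    return all(sum(row[j] for j in idx) == 1 for row in rows)


def bootstrap(rows, rng):
    """Resample rows with replacement, keeping the original sample size."""
    return [rng.choice(rows) for _ in rows]


# Hypothetical toy data: three region dummies with no baseline excluded.
names = ["north", "south", "west", "treated"]
data = [
    (1, 0, 0, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 1),
    (0, 1, 0, 0),
    (0, 0, 1, 1),
]

# Check 1: does the full dummy set sum to 1 in every row?
full_set_problem = dummies_sum_to_one(data, names, ["north", "south", "west"])

# Check 2: how many resamples contain at least one constant column?
rng = random.Random(0)  # fixed seed so the run is reproducible
flagged = 0
for _ in range(200):
    sample = bootstrap(data, rng)
    if constant_columns(sample, names):
        flagged += 1

print(full_set_problem, flagged)
```

Running the same two checks on the resamples that fail, and comparing them with the ones that succeed, may help narrow down which columns are responsible.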
Best of luck with your work.
Gary
>
> Thanks,
>
> Matthew Simonson
> Doctoral Student, Northeastern University, Boston
> * The COVID-19 Consortium for Understanding the Public’s Policy
> Preferences Across States *
> *Research Areas: Networks, Civil Wars, COVID-19, Causal Inference*
> *www.msimonson.com <http://www.msimonson.com/>*
> *www.covidstates.org <http://www.covidstates.org>*
>
>
>
Hi All,
For my analysis, I have a number of event count variables (22) I want to
add up to make a composite.
My original plan was to square-root transform the counts during imputation
and then add them up to make the composite after imputation.
Alternatively, I could add them up prior to imputation and then square-root
transform the composite during imputation.
Which is better? (Sorry if it's a bad question; I have little experience
with MI.)
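
One practical difference between the two orders does not depend on Amelia: a sum formed before imputation is missing whenever any one of its items is missing. The sketch below illustrates that with made-up toy rows (4 items instead of 22, `None` marking a missing value). It is only an illustration of the bookkeeping, not a recommendation for either approach.

```python
def composite(items):
    """Sum of item counts; None (missing) if any item is missing."""
    return None if any(v is None for v in items) else sum(items)


# Hypothetical toy rows: 4 count items per respondent, None = missing.
rows = [
    [2, 0, 1, 3],
    [1, None, 0, 0],
    [0, 4, None, None],
    [5, 1, 2, 0],
]

composites = [composite(r) for r in rows]
missing_items = sum(v is None for r in rows for v in r)
missing_composites = sum(c is None for c in composites)

print(composites)                         # [6, None, None, 8]
print(missing_items, missing_composites)  # 3 2
```

In this toy case 3 of 16 item values are missing, but 2 of 4 composites are. If only the composite is carried into the imputation, the observed items in those two rows are not used individually.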
--
Brandon McCormick
Doctoral Student
Clinical Psychology - Psychology and Law
The University of Alabama <https://www.ua.edu/>
101 McMillan
Tuscaloosa, AL 35401
Phone 205-460-8678
bfmccormick(a)crimson.ua.edu