Hi David,
You might want to try no lags/leads, intercs = FALSE, and polytime = 1.
This is a fairly simple imputation model that allows for a "global" linear
time-trend, but it only adds one variable to the imputation model.
Hopefully, then, you should see Amelia converge much more quickly. If not,
there might be another problem to diagnose. My intuition is that including
lags, leads, and unit-specific time trends might be taxing on the data
(with only 12 time periods per unit).
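
A minimal sketch of that specification in R, run on simulated data since the real data set is not shown here. The data frame `infants`, the `ssb` and `birthwt` columns, and the sample size are placeholders invented for illustration; only the amelia() arguments (cs, ts, polytime, intercs, and leaving lags/leads unset) reflect the suggestion above.

```r
library(Amelia)

set.seed(1)

# Hypothetical stand-in for the real data: 50 children x 12 monthly visits.
n_child <- 50
infants <- expand.grid(month = 1:12, childID = 1:n_child)

# Made-up level-2 covariate, constant within child.
infants$birthwt <- rep(rnorm(n_child, mean = 3.4, sd = 0.5), each = 12)

# Made-up level-1 exposure (0/1) whose prevalence rises over the year.
p_ssb <- 0.01 + 0.10 * (infants$month - 1) / 11
infants$ssb <- rbinom(nrow(infants), size = 1, prob = p_ssb)

# Knock out roughly 17% of the exposure values at random.
infants$ssb[runif(nrow(infants)) < 0.17] <- NA

# Simple model: one global linear time trend, no unit-specific trends,
# and no lags or leads (both arguments are simply left at their defaults).
a.out <- amelia(
  infants,
  m        = 5,
  cs       = "childID",
  ts       = "month",
  polytime = 1,
  intercs  = FALSE,
  p2s      = 0        # suppress console output
)

# Exit code 1 indicates a normal run.
a.out$code
a.out$message
```

Note that this sketch imputes the 0/1 variable as if it were continuous, which is a simplification. If the simple model converges quickly, one option would be to add lags, leads, or unit-specific trends back one at a time to see which of them slows the run down.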
Cheers,
matt.
~~~~~~~~~~~
Matthew Blackwell
Assistant Professor of Political Science
University of Rochester
url: http://www.mattblackwell.org
On Thu, Jul 11, 2013 at 8:18 AM, Natalia & David, Freedman & Pinto <3.14david(a)gmail.com> wrote:
Dear list server group,
Because I've previously analyzed already-imputed data (DXA
data from NHANES) and performed some simple imputations in cross-sectional
data, I've volunteered (been nominated?) to help co-workers perform
multiple imputation in a longitudinal, multilevel data set.
The sample is ~1500 infants who were visited every month for the first
year of life, with the main exposure being parent-reported sugar-sweetened
beverage (SSB) consumption over the last week (yes/no); the outcome is
obesity at age 6 y. Overall, about 17% of the data for SSB is missing,
with the amount of missing data increasing in the later monthly visits.
While SSB consumption generally increases from 1% to 11% over the 12
months, about 10% of the sample has a ‘yes’ that is followed by a ‘no’ for
SSB intake. There is also a large amount of missing data on other level-1
variables, such as solid food introduction, and for level-2 covariates such
as family income, birth weight, mother’s weight status, etc.
In Amelia, I've been treating each child as a cross-sectional unit
(cs='childID') and using month of visit for the time-series variable
(ts='month'). I've included SSB in the lags and leads options. An initial
attempt at using polytime=2 (with or without the intercs=T option) failed
to converge even after an hour.
So, my question is whether this approach, based on using cs=, ts=, lags=,
and leads= is adequate for dealing with multilevel data of this type? Or
should I really be using polytime and intercs=T in Amelia, or using
mice.impute.2l.norm in MICE? None of the intercorrelations in the data are
very strong, with the highest being about r=0.20.
I’ve been running the imputations in Ubuntu with
options(amelia.parallel='multicore', amelia.ncpus=4).
Thanks very much for any help/suggestions.
David Freedman, Division of Nutrition, CDC Atlanta
--
Amelia mailing list served by HUIT
[Un]Subscribe/View Archive:
http://lists.gking.harvard.edu/?info=amelia
More info about Amelia:
http://gking.harvard.edu/amelia