Hi David,

You might want to try dropping the lags and leads and setting intercs = FALSE and polytime = 1. This is a fairly simple imputation model: it allows for a "global" linear time trend and adds only one variable to the imputation model. With that, you should hopefully see Amelia converge much more quickly. If not, there might be another problem to diagnose. My intuition is that including lags, leads, and unit-specific time trends asks too much of the data (with only 12 time periods per unit).
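
In case it helps, here is a minimal sketch of that call. It assumes your data frame is named `dat` (a placeholder, as is the helper name `impute_simple`) and that the ID and time columns are `childID` and `month`, as in your message. m = 5 is just Amelia's default, not a recommendation.

```r
# Settings for the simpler model: one global linear time trend,
# no unit-specific trends, and no lags or leads.
amelia_args <- list(
  m        = 5,          # number of imputed data sets (Amelia's default)
  cs       = "childID",  # cross-section (unit) identifier, as in your message
  ts       = "month",    # time identifier, as in your message
  polytime = 1,          # linear time trend
  intercs  = FALSE       # do not interact the trend with the cross-section
)

# Hypothetical wrapper: `dat` stands in for your data frame.
impute_simple <- function(dat) {
  do.call(Amelia::amelia, c(list(x = dat), amelia_args))
}
```

Calling impute_simple(dat) should then run the imputation with only those settings. If that converges quickly, you could add the lags and leads back one at a time to see which piece is slowing things down.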

Cheers,

matt.

~~~~~~~~~~~

Matthew Blackwell

Assistant Professor of Political Science

University of Rochester

url: http://www.mattblackwell.org

On Thu, Jul 11, 2013 at 8:18 AM, Natalia & David, Freedman & Pinto <3.14david@gmail.com> wrote:

Dear list server group,

Because I've previously analyzed already-imputed data (DXA data from NHANES) and performed some simple imputations in cross-sectional data, I've volunteered (been nominated?) to help co-workers perform multiple imputation in a longitudinal, multilevel data set.

The sample is ~1500 infants who were visited every month for the first year of life, with the main exposure being parent-reported sugar-sweetened beverage (SSB) consumption over the last week (yes/no); the outcome is obesity at age 6 y. Overall, about 17% of the data for SSB is missing, with the amount of missing data increasing in the later monthly visits. While SSB consumption generally increases from 1% to 11% over the 12 months, about 10% of the sample has a ‘yes’ that is followed by a ‘no’ for SSB intake. There is also a large amount of missing data on other level-1 variables, such as solid food introduction, and on level-2 covariates such as family income, birth weight, mother’s weight status, etc.

In Amelia, I've been treating each child as a cross-sectional unit (cs='childID') and using month of visit for the time-series variable (ts='month'). I've included SSB in the lags and leads options. An initial attempt at using polytime=2 (with or without the intercs=T option) failed to converge even after an hour.

So, my question is whether this approach, based on using cs=, ts=, lags=, and leads=, is adequate for dealing with multilevel data of this type. Or should I really be using polytime and intercs=T in Amelia, or using mice.impute.2l.norm in MICE? None of the intercorrelations in the data are very strong, with the highest being about r=0.20.

I’ve been running the imputations in Ubuntu with options(amelia.parallel='multicore', amelia.ncpus=4).

Thanks very much for any help/suggestions.

David Freedman, Division of Nutrition, CDC Atlanta

--
Amelia mailing list served by HUIT
[Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
More info about Amelia: http://gking.harvard.edu/amelia