Isaac,

In a time-series cross-sectional setting, I might suggest 5% (to 10%) of the n in each cross-section (which will typically be smaller than 1% of the total n).  So for your series of 21 observations per cross-sectional unit, empri=1 (or 2) should again aid stability and not shrink the coefficients significantly.  This advice comes from a mix of intuition, exploration, and experience with use cases, but of course it could really vary in some settings.

Off list I received some follow-up email about my earlier note, which made it clear that I wasn't very clear.  The "tolerance" argument I also suggested adjusting changes how the EM algorithm judges whether it has converged.  This is a separate setting you might adjust in addition to the empirical/ridge prior.  Larger values mean that the model parameters (on the z-transformed data) can show larger changes between EM steps and still be considered converged.
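To make this concrete, here is a rough sketch of how both settings might look in a single amelia() call for your data; the data-frame and column names (mydata, year, unit) are placeholders, not something from this thread:

```r
library(Amelia)

# 21 time points per cross-sectional unit; 5% of 21 rounds to about 1,
# so empri = 1 is a reasonable small ridge prior in this setting.
n_per_cs <- 21
empri_val <- round(0.05 * n_per_cs)

# Placeholder names: 'mydata', 'year', and 'unit' are illustrative only.
a.out <- amelia(mydata, m = 5,
                ts = "year", cs = "unit",
                empri = empri_val,   # small ridge prior for numerical stability
                tolerance = 0.001)   # looser EM convergence threshold
```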

James
--
James Honaker, Senior Research Scientist
//// Institute for Quantitative Social Science, Harvard University


-----Original message-----
From: Isaac Petersen <dadrivr@gmail.com>
To:
"Honaker, James" <jhonaker@iq.harvard.edu>
Cc:
Amelia Mailing List <amelia@lists.gking.harvard.edu>
Sent:
Tue, Jul 2, 2013 21:13:19 GMT+00:00
Subject:
Re: [amelia] Is single imputation faster in parallel? Need help speeding up imputation.

Thanks, James.  Your response was very helpful.  Just to clarify on the ridge prior:

My matrix to be imputed is 12,285 rows by 62 columns, composed of 585 cross-sectional units and 21 time points per unit.  Would a good ridge prior be 1 percent of 21 (where 21 is the number of rows, i.e., time points, within each cross-sectional unit)?

Thanks for clarifying.
-Isaac

On Tue, Jul 2, 2013 at 10:41 AM, Honaker, James <jhonaker@iq.harvard.edu> wrote:
Isaac,

In addition to the newer "multicore" abilities you mention, a small empirical prior will speed up convergence.  The "empri" argument sets an empirical/ridge prior.  A value of half to 1 percent of the sample size would be small, would aid numerical stability, and would be unlikely to noticeably change results (unless you are using time-series cross-sectional data, in which case you might use 1 percent of the sample within any cross-sectional unit).

The "tolerance" changes the point at which the EM algorithm is judged to have converged, and setting that larger, (like .001, or even .005) is probably quite safe.  We were very conservative with this tolerance choice, and should reexamine other options to set it dynamically.

Best,
James.

--
James Honaker, Senior Research Scientist
//// Institute for Quantitative Social Science, Harvard University

From: amelia-bounces@lists.gking.harvard.edu [amelia-bounces@lists.gking.harvard.edu] on behalf of Isaac Petersen [dadrivr@gmail.com]
Sent: Tuesday, July 02, 2013 9:55 AM
To: Amelia Mailing List
Subject: [amelia] Is single imputation faster in parallel? Need help speeding up imputation.

I'm looking to speed up the run time of a single imputation on a large data set with repeated measures that takes many hours.  Will running the imputation in parallel with the parallel="multicore" option and 6 cores speed up the run time of a single imputation, or will it only speed up the run time of multiple imputations (by running them simultaneously)?  What are my best options for making the single imputation run faster while minimizing any sacrifices in imputation accuracy?

Many thanks!
-Isaac