Hi Wayne,
The advice in the manual is a little scattered, but it is mostly
correct. Users should start with an empri value of about 0.5-1% of the
number of observations. If this does not stabilize the algorithm,
increasing the value will help, but going any higher than 10% of the
observations might start to affect inference (since the prior shrinks
the covariances toward 0). We suggest that users try to stay below
this bound, if possible. We have updated the manual to unify this
advice and remove the typos. Thanks for the feedback.
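For illustration, the rule of thumb above could be sketched in R roughly
as follows. This is only a sketch under stated assumptions: ridge_prior
is a hypothetical helper (it is not part of Amelia), and the 10% cutoff
is just the rough upper bound mentioned above, not a hard limit.

```r
# Hypothetical helper, NOT part of Amelia: turn a fraction of the number
# of observations into a value that could be passed to amelia()'s
# `empri` argument.
ridge_prior <- function(n, fraction = 0.01, max_fraction = 0.10) {
  stopifnot(n > 0, fraction >= 0)
  # Assumption: treat ~10% of the observations as a soft upper bound and
  # warn (rather than stop) when the requested prior exceeds it.
  if (fraction > max_fraction) {
    warning("ridge prior above ", 100 * max_fraction,
            "% of the observations may start to affect inference")
  }
  fraction * n
}

n <- 2000
ridge_prior(n, 0.005)  # starting value at 0.5% of the observations
ridge_prior(n, 0.01)   # starting value at 1% of the observations

# With Amelia and its freetrade data loaded, the result could be used as:
# library(Amelia); data(freetrade)
# a.out <- amelia(freetrade, ts = "year", cs = "country",
#                 empri = ridge_prior(nrow(freetrade), 0.01))
```

If the algorithm is still unstable, the fraction can be raised gradually
while keeping an eye on the warning.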
Cheers,
matt.
On Mon, Jun 29, 2009 at 7:28 PM, Wayne Thornton<thornton(a)fas.harvard.edu> wrote:
PROBLEM: (1) The guidance on setting the parameter "empri" in the user
manual is not consistent, and thus might be confusing.
(2) There is a small typo in the text addressing the
parameter "empri" in the User Manual and the "Amelia" file in the R
package.
BACKGROUND:
1. Inconsistent Guidance
(a) User Manual, sec. 7.2 (p. 51) and the file "Amelia" in the R package
(at \ library \ Amelia \ help \ Amelia, which I read as a Windows *.txt
file...) state:
empri: number indicating level of the empirical (or ridge) prior.
This prior shinks the covariances of the data, but keeps the
means and variances the same for problems of high
missingness, small N's or large correlations among the
variables. Should be kept small; a reasonable upper bound
is around 10% of the rows of the data.
(b) User Manual, sec. 5.6.1 (p.21) reads:
"A recommendation of 0.5 to 1 percent of the number of observations, n, is a
reasonable starting value, and often useful in large datasets to add some
numerical stability. For example, in a dataset of two thousand observations,
this would translate to a prior value of 10 or 20 respectively. A prior of
up to 5 percent is moderate in most applications.
For our data, it is easy to code up a 1 percent ridge prior:
a.out.time2 <- amelia(freetrade, ts = "year", cs = "country",
                      polytime = 2, intercs = TRUE, p2s = 0,
                      empri = 0.01 * nrow(freetrade))...."
Since the example in sec. 5.6.1 uses a value equal to 1% of the number of
rows of data, I have favored this interpretation...
(My experimenting indicates that using a value up to 5% of the number of
rows of data works better than trying to use a value of 0.1 to 1% or up to
5% of the number of observations.)
2. Typo
The User Manual, sec. 7.2 (p. 51) and the file "Amelia" in the R package
have the same typo: "shinks" instead of "shrinks"...
RECOMMENDATIONS:
1. Recommend that the "Amelia" file in the R package and both sections of
the User Manual reflect the best guidance, and be consistent.
2. Fix the little typo identified in 2. above
Wayne A. Thornton
thornton(a)fas.harvard.edu
-
Amelia mailing list served by Harvard-MIT Data Center