The combining rules -- either by averaging or by taking 1/m simulations
from each of the m imputed samples -- are intended for quantities of
interest. I know you want to do F tests and R^2 and stuff, but these are
not really quantities of interest in the sense that if you knew them
perfectly you wouldn't really learn much about the world; instead, they're
test statistics. Following the combining rules will work for these
things, but you won't get exactly the kinds of consistency you want,
or even exactly the correct distributions in small samples. You're not
going to be badly misled by your approach, but if you can switch to
quantities of interest I think you'll make more progress.
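[For readers following along: the "combining rules" referred to here are Rubin's rules for pooling a quantity of interest across the m imputed datasets. A minimal sketch in Python, using made-up coefficient estimates rather than any numbers from this thread:]

```python
import math

def combine(estimates, variances):
    """Rubin's rules for a scalar quantity of interest estimated on m
    imputed datasets: pool the point estimates and their variances."""
    m = len(estimates)
    q_bar = sum(estimates) / m                 # pooled point estimate
    w = sum(variances) / m                     # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation
    t = w + (1 + 1 / m) * b                    # total variance
    return q_bar, t

# Five hypothetical coefficient estimates (one per imputed dataset)
# with their squared standard errors -- purely illustrative values.
est = [0.52, 0.48, 0.55, 0.50, 0.47]
var = [0.010, 0.011, 0.009, 0.012, 0.010]
q_bar, t = combine(est, var)
se = math.sqrt(t)
```

The pooled standard error is larger than the average within-imputation standard error because the between-imputation spread reflects the extra uncertainty due to the missing data.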
Gary
Gary King, King(a)Harvard.Edu
http://GKing.Harvard.Edu
Center for Basic Research in the Social Sciences
34 Kirkland Street, Rm. 2, Harvard U, Cambridge, MA 02138
Direct (617) 495-2027 / Assistant (617) 495-9271 / HU-MIT DC (617) 495-4734 / eFax (617) 812-8581
On Wed, 5 Jan 2005, Peter Kimball wrote:
Dear Dr. King and other list members,
We are testing some models on survey data that look sort of like this:
Model A: Y = a + (ba1*xa1 + ba2*xa2 + ... + ban*xan) + e
Model B: Y = a + (ba1*xa1 + ... + ban*xan) + (bb1*xb1 + ... + bbn*xbn) + e
(and so on for Models C and D)
This is to say that we are starting by regressing Y on a block A of predictors (xa1
through xan), adding block B of predictors (xb1 through xbn), block C, and block D. The
test statistics of primary interest are the p values for the F tests for the change in
R-squared between models A and B, B and C, and C and D. Of lesser interest are the F
values for the change in R-squared themselves, the values of R-squared for the models, and
the regression coefficients at each stage and the F values and p values associated with
each coefficient.
If we were doing this with one data set with no missing values, all of this would be
straightforward and we would have a table of regression coefficients, F values,
R-squareds, and p values. Well, we are using Amelia to impute missing values, and the
presentation of the results in the article we are trying to write is posing some questions
which are new to us. Basically, if we run the models on the five sets created by Amelia
and average all the numbers over the five sets, we get a set of numbers which don't
really mesh.
For example, suppose that Block B has only one new predictor in it. In that case, on any
single data set, the F value for the change in R-squared from model A to model B will be
the same as the F value for the regression coefficient for that predictor in model B. But
the average of those F values is NOT exactly the same as the F value that we get if we
start with the averaged R-squareds for models A and B and compute the test based
on that difference with the appropriate degrees of freedom.
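[The mismatch described above is a consequence of F being a nonlinear function of the R-squareds, so the F of the averages need not equal the average of the F's. A toy numeric sketch, with made-up R-squared values and sample sizes rather than anything from this analysis:]

```python
def f_change(r2_a, r2_b, n, k_b, q=1):
    """F test for the change in R^2 when q predictors are added,
    where n is the sample size and k_b the number of predictors
    in the larger model B."""
    return ((r2_b - r2_a) / q) / ((1 - r2_b) / (n - k_b - 1))

n, k_b = 100, 5
r2_a = [0.20, 0.30]   # hypothetical model-A R^2 on two imputed datasets
r2_b = [0.25, 0.40]   # hypothetical model-B R^2 on the same datasets

f_each = [f_change(a, b, n, k_b) for a, b in zip(r2_a, r2_b)]
mean_f = sum(f_each) / len(f_each)
f_of_means = f_change(sum(r2_a) / 2, sum(r2_b) / 2, n, k_b)
# mean_f and f_of_means come out different, even though each is a
# legitimate-looking "averaged" summary of the same two analyses.
```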
(I should say that we have not actually averaged the p values - we have averaged the F
statistics and computed the p values from the F distribution. Is that wrong?)
So, what should we really be doing here?
Can I assume that this kind of phenomenon is normal for Amelia, or is it a sign that we
are doing something very wrong?
If it is normal, is it a mistake to try to present the whole table of numbers that we
would normally present if we were using a data set with no missing values? Should we
somehow say, "in using this method, we are concentrating on a small number of test
statistics which are important; presenting all the other data would be useless and
misleading"?
Or is it ok to present them, with a note to the reader that "these numbers are
averages and therefore it is normal that they don't mesh together the way they would
if they had been computed from a single data set with no missing values"?
Should we be actually averaging the p values, or is it ok to average the F values and
compute the p values based on the averages?
Any other advice?
Sincerely,
Peter A. Kimball
Data Analyst/Statisti
-
Amelia mailing list served by Harvard-MIT Data Center
[Un]Subscribe/View Archive:
http://lists.hmdc.harvard.edu/?info=amelia