I have a question about combining imputed data from Amelia. I understand the rationale for
running your end analysis on each imputed data set separately, and then combining the
model results. However, what if the analysis is more complicated than a simple linear model? For
example, I am working with five imputed data sets of time series variables (12
independent water quality variables and 1 dependent variable, all time series). I am
decomposing each series using loess and extracting the trend only. Then, I am using
prewhitening and cross correlation to identify lags of variables that may be useful
predictors. Finally, I am differencing each series and creating and comparing ARIMA models
with external regressors to find the best model. I am having a hard time understanding how
going through each of these steps with each imputed data set separately (and then trying to
combine the best models) would not create more variability and reduce confidence in the
final model, compared to averaging the imputed data sets before doing any
analysis.
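For reference, the combining step I have in mind is the usual pooling of a single estimate across imputations (Rubin's rules, as I understand them). Below is a minimal Python sketch of that arithmetic, not Amelia's own code; the function name `pool_rubin` and the example numbers are hypothetical and are there only for illustration.

```python
def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed-data fits (Rubin's rules).

    estimates: the coefficient estimate from each imputed data set
    variances: the squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two fits and matching lengths")

    # Pooled point estimate: simple mean of the per-imputation estimates.
    q_bar = sum(estimates) / m

    # Within-imputation variance: mean of the per-fit variances.
    u_bar = sum(variances) / m

    # Between-imputation variance: sample variance of the estimates.
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)

    # Total variance: within plus inflated between-imputation variance.
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


# Hypothetical numbers: one regressor's coefficient from three imputed fits.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is the extra variability I am asking about. If I average the data sets first, there is only one fit, so that term never enters the result.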
In short, if my imputed data sets are not very different from one another, and the range of
values for each of my predictors is relatively small, could it be better to
average the data first instead of trying to combine the best model from each data set?
I would greatly appreciate any comments or suggestions.
Thank you for your help.