Dear Matt,

 This does not answer your question directly, but I can suggest another way
 to deal with missing values in the framework of PCA. It is possible to
 modify the PCA algorithm so that it can handle missing values. In my lab, we
 work on this topic and have developed an R package named missMDA. This
 package is dedicated to handling missing values in principal component
 methods such as principal component analysis.

 The rationale of the proposed method is as follows.
 First, an EM algorithm (named EM-PCA) is used to obtain estimates of the
 scores and of the loadings despite the missing values. The algorithm
 alternates two steps: one step to estimate the parameters via PCA, and one
 step to impute the missing values using the PCA model (also called the
 reconstruction formula). Consequently, at the end of the algorithm, a
 completed data set is obtained as well as the scores and loadings (if you
 run a PCA on the completed data set, you obtain the same loadings and
 scores).
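
 The two alternating steps can be sketched in a few lines. What follows is
 a minimal illustration in plain Python, not the missMDA implementation: it
 assumes a single principal component, uses no regularization, and the
 function name em_pca_rank1 and the toy data are made up for this example.

```python
import math

def em_pca_rank1(data, max_iter=500, tol=1e-9):
    """Fill the None cells of a row-major table using a one-component PCA model."""
    n, p = len(data), len(data[0])
    missing = [(i, j) for i in range(n) for j in range(p) if data[i][j] is None]

    # Initialization: replace each missing cell by the mean of its column.
    x = [row[:] for row in data]
    for j in range(p):
        observed = [data[i][j] for i in range(n) if data[i][j] is not None]
        col_mean = sum(observed) / len(observed)
        for i in range(n):
            if x[i][j] is None:
                x[i][j] = col_mean

    scores, loadings = [0.0] * n, [0.0] * p
    for _ in range(max_iter):
        # Estimation step: PCA of the current completed table.
        means = [sum(x[i][j] for i in range(n)) / n for j in range(p)]
        z = [[x[i][j] - means[j] for j in range(p)] for i in range(n)]
        cov = [[sum(z[i][a] * z[i][b] for i in range(n)) for b in range(p)]
               for a in range(p)]

        # First loading vector by power iteration on the cross-product matrix,
        # started from the column belonging to the highest-variance variable.
        start = max(range(p), key=lambda j: cov[j][j])
        v = [cov[a][start] for a in range(p)]
        for _ in range(200):
            w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:  # degenerate table with no variance left
                break
            v = [c / norm for c in w]
        loadings = v
        scores = [sum(z[i][j] * v[j] for j in range(p)) for i in range(n)]

        # Imputation step: reconstruction formula, applied to missing cells only.
        change = 0.0
        for i, j in missing:
            fitted = means[j] + scores[i] * loadings[j]
            change = max(change, abs(fitted - x[i][j]))
            x[i][j] = fitted
        if change < tol:
            break
    return x, scores, loadings

# Toy table whose centered columns are exactly proportional (rank one).
truth = [[float(t), 2.0 * t + 1.0, 3.0 - t] for t in range(8)]
data = [row[:] for row in truth]
data[2][1] = None  # true value is 5.0
data[5][2] = None  # true value is -2.0

completed, scores, loadings = em_pca_rank1(data)
print(completed[2][1], completed[5][2])
```

 On this noise-free toy table the imputed cells should end up very close to
 the values that were removed. On real data the number of components has to
 be chosen, and a regularized version is usually preferable to limit
 overfitting.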

 Then we have proposed a multiple imputation procedure using the PCA model.
 To visualize the different plausible imputations on the PCA maps, we have
 proposed confidence areas around the positions of the individuals and the
 variables. Details can be found in the following article: Josse, Julie;
 Pagès, Jérôme; Husson, François (2011). Multiple imputation in principal
 component analysis. /Advances in Data Analysis and Classification/: 1-16,
 March 06, 2011.
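
 To give a rough idea of what "different plausible imputations" means in
 practice, here is a small sketch in plain Python that summarizes, for each
 missing cell, the spread of the values across several imputed data sets.
 It is not the procedure of the article and it does not draw confidence
 areas on the PCA maps; the imputed values below are invented toy numbers,
 and the names imputations and summarize are made up for this example.

```python
import statistics

# Hypothetical toy input: M = 5 imputed values for each missing cell,
# keyed by (row, column). These numbers are illustrative only.
imputations = {
    (2, 1): [4.8, 5.1, 5.3, 4.9, 4.9],
    (5, 2): [-2.2, -1.9, -2.0, -1.8, -2.1],
}

def summarize(draws):
    """Return the mean and the sample standard deviation of the draws for one cell."""
    return statistics.mean(draws), statistics.stdev(draws)

summary = {cell: summarize(draws) for cell, draws in imputations.items()}
for cell, (mean, sd) in summary.items():
    print(cell, round(mean, 3), round(sd, 3))
```

 A large spread for a cell indicates that its imputed value is poorly
 determined by the observed data, so conclusions that depend on it should be
 treated with caution.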

 J. Josse


On 07/04/2011 02:03, matthew-c.johnson@ubs.com wrote:

Dear All,

I am using Amelia to fill in some gaps in national accounts data (and similar panels of data); as a result of the structure of my panel, there is no 'cs' parameter -- just 'ts'.

I intend to use the EM algorithm to complete my panel, and then to extract factors via PCA -- as a result I have two related questions.

1/ I would like to use the tscsPlot command (or similar) to plot the observed values and the imputations (mean + 95% confidence bands) -- is this possible?

2/ What's the best way to use the output from the imputations to generate the factors in the PCA? I have considered two methods, but am unsure which is best (most valid) --

1/ fill the gaps in the panel with the mean of the imputations and use the single data set to extract the eigenvalues and eigenvectors of the associated covariance matrix - and then use these weights and the 'mean-filled in' data set to generate the factors.

2/ stack the imputation panels and use the stacked panel to generate the eigenvalues and eigenvectors of the associated covariance matrix - and then use the mean values from the imputation runs to fill the gaps in my original panel, and the weights from the stacked panel?

thanks and best regards

Matt Johnson