Hi Levi,
You may also find some useful ideas in the literature on statistical
matching. This usually (though not always) involves assuming that the
separately measured variables are conditionally independent given the
jointly measured variables. In your context, it would assume that A and
B are independent conditional on C, D and E.
You could regress A on B after imputing for A given C, D and E, and
imputing for B given C, D and E. The advantage of this approach is that
it can be done using your favorite imputation package. It relies on the
conditional independence assumption, however, so it would give biased
results if this assumption fails. You would need to judge how worrisome
this would be in your situation. Imputing the whole dataset using a
weakly informative prior for the conditional dependence structure, or
repeating the analysis under a variety of stronger priors, may help you
see how sensitive the results are to this assumption.
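
To make the conditional independence assumption concrete, here is a
minimal sketch in Python. It is a simplification, not a recipe: it
assumes standardized, jointly normal variables and a single conditioning
variable C standing in for C, D and E. The function names and the
example correlations (0.5 and 0.4) are made up for illustration. Under
those assumptions, conditional independence pins the A-B correlation at
the product of the A-C and B-C correlations, while the data alone only
restrict it to the interval that keeps the correlation matrix valid.

```python
import math


def implied_rho_ab(rho_ac, rho_bc):
    """A-B correlation implied by A and B being independent given C.

    Assumes standardized, jointly normal A, B and C, so that conditional
    independence is the same as a zero partial correlation.
    """
    return rho_ac * rho_bc


def admissible_rho_ab(rho_ac, rho_bc):
    """Interval of rho_ab for which the 3x3 correlation matrix stays
    positive semidefinite, given the two identified correlations."""
    half_width = math.sqrt((1 - rho_ac ** 2) * (1 - rho_bc ** 2))
    centre = rho_ac * rho_bc
    return centre - half_width, centre + half_width


# Hypothetical values for the two correlations the data can identify.
rho_ac, rho_bc = 0.5, 0.4
point = implied_rho_ab(rho_ac, rho_bc)
low, high = admissible_rho_ab(rho_ac, rho_bc)
print(f"conditional independence implies rho_ab = {point:.3f}")
print(f"the data alone allow rho_ab in [{low:.3f}, {high:.3f}]")
```

The width of that interval gives a rough sense of how much is riding on
the assumption: the weaker the links from A and B to the jointly
measured variables, the wider the range of A-B correlations the data
cannot rule out.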
A more informative approach (assuming it's feasible) is to collect some
data jointly on A, B, C, D and E, as Matt suggests.
Hope this helps,
James
On 15/12/08 12:21 PM, Matt Blackwell wrote:
Hi Levi,
Amelia does use a different algorithm for creating imputations, but
the underlying statistical model is very similar to both NORM and proc
MI. The long and short of it is that there will be a "ridge" in the
likelihood because there is no way to determine the correlation
between A and B from the data. This makes the model unidentified, as
infinitely many parameter values maximize the likelihood (we can
arbitrarily change that correlation without affecting the likelihood).
See Schafer (1997), pp. 51-55, for a concise and clear treatment of this
issue.
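
The ridge can be seen in a small simulation. This is only a sketch under
simplifying assumptions: standardized normal variables with known means
and variances, and a single covariate C in place of C-E. The function
names and the correlation values are invented for illustration. Because
sample 1 contributes only through the (A, C) margin and sample 2 only
through the (B, C) margin, the A-B correlation never enters the
observed-data log-likelihood.

```python
import math
import random


def bvn_loglik(pairs, r):
    """Summed log-density of standard bivariate normal pairs with correlation r."""
    const = -math.log(2 * math.pi) - 0.5 * math.log(1 - r * r)
    return sum(
        const - (x * x - 2 * r * x * y + y * y) / (2 * (1 - r * r))
        for x, y in pairs
    )


def observed_loglik(sample1_ac, sample2_bc, rho_ab, rho_ac, rho_bc):
    """Observed-data log-likelihood when A and B are never recorded together.

    rho_ab is accepted as a parameter but never used: no observed case
    carries information about it.
    """
    return bvn_loglik(sample1_ac, rho_ac) + bvn_loglik(sample2_bc, rho_bc)


def draw_pair(rng, rho):
    """Draw (X, C) with unit variances and correlation rho."""
    c = rng.gauss(0, 1)
    x = rho * c + math.sqrt(1 - rho * rho) * rng.gauss(0, 1)
    return x, c


rng = random.Random(0)
sample1_ac = [draw_pair(rng, 0.5) for _ in range(500)]  # A and C observed
sample2_bc = [draw_pair(rng, 0.4) for _ in range(500)]  # B and C observed

# Evaluate the log-likelihood at several candidate A-B correlations.
logliks = {
    rho_ab: observed_loglik(sample1_ac, sample2_bc, rho_ab, 0.5, 0.4)
    for rho_ab in (-0.3, 0.0, 0.2, 0.7)
}
for rho_ab, value in logliks.items():
    print(f"rho_ab = {rho_ab:+.1f}  log-likelihood = {value:.4f}")
```

Every candidate value of the A-B correlation gives the same
log-likelihood, whereas changing the A-C or B-C correlation would move
it.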
Your best bets are to try to collect *some* cases in which A and B are
both observed (1), or to begin making strong identifying assumptions.
Your situation is similar to the fundamental problem of causal
inference, where we have data on either the potential outcome for
treatment or the potential outcome for control, but not both on any
observation. In the causal inference literature, they make strong
assumptions (exogeneity, ignorability, etc) on how these two variables
relate in order to make inferences.
Finally, there is always the option to run a Bayesian model with
informative priors on that correlation. This will change the
likelihood ridge into a posterior "wide hill" that will have a unique
mode.
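
As a rough sketch of that last point (the prior, its centre and scale,
and the grid limits are all hypothetical choices for illustration): if
the likelihood is constant in the A-B correlation, a grid approximation
to the posterior simply reproduces the prior, so the mode comes entirely
from the prior.

```python
import math


def grid_posterior(grid, log_prior, log_lik):
    """Normalized posterior weights over a grid of candidate rho_ab values."""
    log_post = [log_prior(r) + log_lik(r) for r in grid]
    top = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def flat_log_lik(rho_ab):
    # Assumption: A and B are never observed together, so the
    # log-likelihood does not vary with rho_ab (the "ridge").
    return 0.0


def log_prior(rho_ab, centre=0.2, scale=0.3):
    # Hypothetical informative prior: bell-shaped around `centre`.
    return -0.5 * ((rho_ab - centre) / scale) ** 2


# Arbitrary illustrative grid of candidate correlations.
grid = [i / 100 for i in range(-50, 91)]
posterior = grid_posterior(grid, log_prior, flat_log_lik)
mode = grid[posterior.index(max(posterior))]
print(f"posterior mode for rho_ab: {mode:.2f}")
```

The posterior has a unique mode, but it sits wherever the prior puts it,
which is why the choice of prior deserves scrutiny here.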
(1) It should be noted that collecting information on both is not
necessarily going to solve the problem, as these observations could be
uninformative. This can happen when every case observed on both A and B
has the same values of A and B as every other such case. No variation =
no information.
Hope that helps,
matt.
On Mon, Dec 15, 2008 at 5:47 AM, Levi Littvay (UNL) <levi(a)bigred.unl.edu> wrote:
Dear Amelia Developers
I have the following situation. Let's take two independently collected
samples from the same population (drawn with the same sampling technique,
with low chance of overlap and data collected around the same time). Sample
1 has variables A and C-E. Sample 2 has variables B-E. Both samples have
500 observations. I merge the datasets so I get a dataset with 1000
observations with variables A through E (where A-E are continuous normal).
A and C-E are observed for the first 500 cases and B-E for cases 501-1000. I
can assume that data is missing completely at random. My imputation model
would include all variables (from A-E) and in my analysis model I want to
regress A on B for example (correlate A with B). I just talked to Craig
Enders and he verified that this imputation will not work with Proc MI or
NORM as the maximum likelihood model is not identified due to no cases where
A and B are observed at the same time. But I know Amelia uses a different
procedure. Is it possible to run this in Amelia and get unbiased results?
Thanks
Levi
-
Amelia mailing list served by Harvard-MIT Data Center
[Un]Subscribe/View Archive:
http://lists.gking.harvard.edu/?info=amelia
More info about Amelia:
http://gking.harvard.edu/amelia