Thanks!  Just to make sure: I'm talking about imputing all the dependent variables for the conditions to which a subject was *not* assigned, so with 2, 3, or 4 conditions in all, 50%, about 67%, or 75% of the DV data, respectively, will be missing and imputed.  It would be almost as if a between-subjects experiment were turned into a within-subjects experiment, since we'd then have responses for every subject in every condition.  Does that make sense?  -Don
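
To make the layout above concrete, here is a minimal sketch in plain Python.  It is not Amelia code (Amelia is an R package); the function names, the "cond" key, and the toy records are all hypothetical and only illustrate how each DV is spread into one column per condition, with the unassigned conditions left empty for imputation.

```python
from fractions import Fraction


def missing_fraction(n_conditions):
    """Share of condition-specific DV cells left unobserved when each
    subject is assigned to exactly one of n_conditions."""
    return Fraction(n_conditions - 1, n_conditions)


def widen(records, conditions, dvs):
    """Spread each DV into one column per condition.

    Cells for conditions the subject was NOT assigned to are set to None,
    i.e. they are the values that would be handed to the imputation step.
    """
    wide = []
    for rec in records:
        # Keep everything except the original DV columns (id, cond, IVs).
        row = {k: v for k, v in rec.items() if k not in dvs}
        for dv in dvs:
            for cond in conditions:
                row[f"{dv}_{cond}"] = rec[dv] if rec["cond"] == cond else None
        wide.append(row)
    return wide


# Hypothetical toy data: one subject per condition, one IV, one DV.
records = [
    {"id": 1, "cond": "CONDa", "IV1": 0.3, "DV1": 4.0},
    {"id": 2, "cond": "CONDb", "IV1": 1.1, "DV1": 2.5},
    {"id": 3, "cond": "CONDc", "IV1": -0.7, "DV1": 3.2},
]
conditions = ["CONDa", "CONDb", "CONDc"]
wide = widen(records, conditions, ["DV1"])

for k in (2, 3, 4):
    print(k, "conditions ->", float(missing_fraction(k)), "missing")
```

With k conditions the missing share is (k - 1)/k, which is where the 1/2, 2/3, and 3/4 figures come from.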

On Tue, Jul 27, 2010 at 2:55 PM, Gary King <king@harvard.edu> wrote:
Both of these would 'work', and I think both have been done before lots of times.  The only issue is that Amelia isn't 'aware' of the structural zeros in the second case or the interactions in the first.  For example, with the interactions, if you include A, B, and A*B as variables, it doesn't know that the product of the first two variables is the third, so the imputations won't necessarily respect that known equality.  The way we usually deal with that is to do the imputation unconstrained with all three variables and then fix it after the fact, such as by discarding the imputed A*B and recomputing it by multiplying the first two columns together.  This will usually give pretty reasonable results, although there may be a new method someone could devise (and program!) that could improve on this approach.
Gary
---
http://gking.harvard.edu
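
The after-the-fact fix described above can be sketched as follows.  This is only an illustration in plain Python operating on data that has already been imputed; it does not call Amelia, and the function name, the column name "AxB", and the numbers are made up for the example.

```python
def restore_interaction(imputed_datasets, a="A", b="B", ab="AxB"):
    """Discard the imputed interaction column and recompute it as a * b.

    imputed_datasets is a list of completed datasets, each a list of row
    dicts.  New rows are returned; the input is left unmodified.
    """
    fixed = []
    for dataset in imputed_datasets:
        fixed.append([{**row, ab: row[a] * row[b]} for row in dataset])
    return fixed


# Hypothetical output of an unconstrained imputation (m = 2 datasets).
# In the first row B and AxB were imputed, so AxB need not equal A * B.
imputed = [
    [{"A": 1.0, "B": 2.0, "AxB": 2.3}, {"A": 0.5, "B": 4.0, "AxB": 2.0}],
    [{"A": 1.0, "B": 1.8, "AxB": 1.6}, {"A": 0.5, "B": 4.0, "AxB": 2.0}],
]

fixed = restore_interaction(imputed)
for dataset in fixed:
    print([row["AxB"] for row in dataset])
```

The recomputation is applied to every completed dataset separately, before any analysis model is fit, so that each one respects the known equality.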



On Tue, Jul 27, 2010 at 12:14 PM, Donald Braman <donald.braman@gmail.com> wrote:
I have a question for the MI experts about imputations and experiments:

We often run experiments in which we hypothesize that responses will vary across conditions.  We expose each subject to one condition -- CONDa, CONDb, or CONDc, say -- and then measure responses on the same dependent variables in every condition, say DV1, DV2, etc.  We also collect data on various independent variables, say IV1, IV2, etc.

But because we expect the relationship between the IVs and DVs to vary across the conditions, it seems like we ought to do one of two things when imputing missing data:

(1) Interact every DV with the condition dummies so that we have, in effect, DV1_CONDa, DV1_CONDb, DV1_CONDc, DV2_CONDa, etc.  But each of those products is simply zero whenever the dummy for that condition is zero rather than one.  This seems wasteful of information to me, which leads me to my alternative...

(2) Rather than setting DV1_CONDa to zero whenever CONDa is zero, I'm tempted to treat every condition-specific DV (DV1_CONDa, DV1_CONDb, DV1_CONDc, ...) as missing for each case not assigned to that condition, and impute it.  Missingness will be high, of course (only 1/k of the condition-specific DV values will be observed, where k is the number of conditions), but at least I won't be throwing away lots of valuable data.  I hesitate to do so because I can't find anyone else who has done this, and that makes me think I am probably misguided.

P.S. I realize that this is a question about imputation generally, but I thought I'd post it here since I use Amelia for my imputation needs -- let me know if I should not post something like this here & I'll look elsewhere.


Donald Braman
phone: 413-628-1221
-
Amelia mailing list served by Harvard-MIT Data Center
[Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
More info about Amelia: http://gking.harvard.edu/amelia




--
Donald Braman
phone: 971-645-0607
http://www.culturalcognition.net/braman/
http://ssrn.com/author=286206
http://www.law.gwu.edu/Faculty/profile.aspx?id=10123