In principle you could include all the dummy variables, but that would make Amelia run very slowly because the model would have too many parameters. You can't code the stratum indicator as a continuous variable, presumably because the numbering doesn't mean anything useful. What you can do is either include variables that pick up the main differences among the census tracts, which are perhaps already among your demographic indicators, or include group-level dummies, such as state indicators, urban-rural, etc.
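
To give a rough sense of the parameter problem, here is a minimal sketch in plain Python. It assumes the imputation model is a multivariate normal over every included variable, so that p variables need p means plus p(p+1)/2 unique covariance terms. The figure of 10 substantive variables and the urban-rural grouping are made up for illustration; only the 127 tracts come from the question below.

```python
def mvn_parameter_count(p):
    """Means plus unique covariance entries for a p-variable multivariate normal."""
    return p + p * (p + 1) // 2


def dummy_code(labels):
    """Build one 0/1 column per category, dropping the first (sorted) level as reference."""
    levels = sorted(set(labels))
    return {
        f"is_{level}": [1 if value == level else 0 for value in labels]
        for level in levels[1:]
    }


n_substantive = 10  # hypothetical number of substantive survey variables
n_tracts = 127      # census tracts (PSUs) mentioned in the question

# No design information in the imputation model.
params_without = mvn_parameter_count(n_substantive)

# One dummy per tract, minus one reference tract.
params_tract_dummies = mvn_parameter_count(n_substantive + n_tracts - 1)

# A single coarser group-level dummy (hypothetical urban/rural indicator).
params_group_dummy = mvn_parameter_count(n_substantive + 1)

print(params_without, params_tract_dummies, params_group_dummy)

# Dummy coding a small made-up grouping variable.
print(dummy_code(["urban", "rural", "urban"]))
```

Under these assumptions the tract dummies push the model from 65 parameters to 9452, while a single group-level dummy only brings it to 77. With 127 tracts of around a dozen cases each, the data would have far fewer rows than the tract-dummy model has parameters.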

Gary
---
http://gking.harvard.edu

On 10/10/2009 01:59 PM, Fabrício Mendes Fialho wrote:

Hi Amelia II developers and users,

According to King et al. (APSR 95(1), 2001, p. 57, footnote 18):

"If the data are generated using a complex or multistage survey design, then information about the design should be included in the imputation model. For example, to account for stratified sampling, the imputation model should include the strata coded as dummy variables."

How should I proceed if my data come from a survey design using clusters? Almost all the data I analyze use census tracts as PSUs: first, n census tracts are randomly selected (in the dataset I'm currently working with, n = 127); then households are randomly selected from each census tract (each tract containing around a dozen cases). The dataset includes one variable indicating which census tract/PSU each case comes from. Should I just include this variable in the MI process as it is (numbering census tracts from 1 to 127), or should I create dummy variables (one dummy variable for each of the 127 census tracts in my sample)?

Thanks for all the help (again).

Sincerely,

Fabricio Fialho

- Amelia mailing list served by Harvard-MIT Data Center [Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia More info about Amelia: http://gking.harvard.edu/amelia