I am conducting a meta-analysis of national cross-sectional cluster-sample surveys (Demographic and Health Surveys) to examine the effect of diagnostic testing on the drugs used to treat pediatric fevers in multiple sub-Saharan African countries. We are currently using mixed models adjusted for confounding covariates and data clustering (random effects: PSU and country identifiers).

I would like to use CEM (coarsened exact matching) to pre-process the data to balance a set of confounders (e.g. maternal education, child’s age, etc.) across treatment groups (tested and untested children in our study), and then run a logistic regression on the matched dataset to quantify the influence of testing on treatments.
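
For concreteness, the pre-processing step I have in mind could be sketched roughly as below. This is only an illustration in plain Python, not our actual analysis code: the field names (`country`, `educ`, `age_months`, `tested`), the age cutpoints, and the toy records are all made up. Children are placed in strata defined by the coarsened covariates, strata lacking either tested or untested children are dropped, and the usual CEM weights are computed (1 for tested children, and a stratum-specific rescaling for untested children).

```python
from collections import defaultdict

def coarsen_age(months):
    # Hypothetical cutpoints, chosen only for illustration.
    if months < 12:
        return "0-11"
    if months < 24:
        return "12-23"
    if months < 36:
        return "24-35"
    return "36-59"

def cem_weights(rows, keys):
    """Return one CEM weight per row (0.0 for unmatched rows).

    rows: list of dicts holding 'tested' (0 or 1) and the coarsened covariates.
    keys: covariate names that jointly define a matching stratum.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_untested, n_tested]
    for r in rows:
        counts[tuple(r[k] for k in keys)][r["tested"]] += 1

    # Keep only strata that contain both tested and untested children.
    matched = {s: c for s, c in counts.items() if c[0] > 0 and c[1] > 0}
    m_c = sum(c[0] for c in matched.values())  # matched untested, total
    m_t = sum(c[1] for c in matched.values())  # matched tested, total

    weights = []
    for r in rows:
        s = tuple(r[k] for k in keys)
        if s not in matched:
            weights.append(0.0)
        elif r["tested"] == 1:
            weights.append(1.0)
        else:
            n_c, n_t = matched[s]
            weights.append((m_c / m_t) * (n_t / n_c))
    return weights

# Toy records (entirely made up).
rows = [
    {"country": "A", "educ": "none", "age_months": 8, "tested": 1},
    {"country": "A", "educ": "none", "age_months": 10, "tested": 0},
    {"country": "A", "educ": "none", "age_months": 5, "tested": 0},
    {"country": "A", "educ": "primary", "age_months": 30, "tested": 1},
    {"country": "B", "educ": "none", "age_months": 14, "tested": 1},
    {"country": "B", "educ": "none", "age_months": 20, "tested": 0},
    {"country": "B", "educ": "primary", "age_months": 40, "tested": 0},
]
for r in rows:
    r["age_group"] = coarsen_age(r["age_months"])

# Including 'country' in the stratum key matches children only within country.
weights = cem_weights(rows, keys=("country", "educ", "age_group"))
```

The resulting weights would then be supplied to the logistic regression on the matched children. Note that nothing in this step uses the PSU identifier.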

My question is how to account for data clustering in the matching and in the subsequent regression adjustment. If we matched children using CEM within country, could we then relax the model specification so that country does not need to be included as a random effect? We would still need to account for data clustering at the PSU level. Do we still need a mixed-model approach for the matched dataset, or is a simple multivariable logistic regression adequate even though observations are not independent?

Alternatively, is matching even advisable in this scenario, or would it be best to continue using our mixed-model approach?

Emily White Johansson

PhD student

Uppsala University

Dept Women's and Children's Health

International Maternal and Child Health