I'm not sure whether the regressions per se are the right thing to do, but it seems reasonable. Using CEM to balance the two cross-sections is a creative application of the technology.
Gary
--
Gary King - Albert J. Weatherhead III University Professor - Director, IQSS - Harvard University
GKing.Harvard.edu - King@Harvard.edu - @kinggary - 617-500-7570 - Asst 495-9271 - Fax 812-8581



On Tue, Nov 2, 2010 at 8:10 PM, Reynaldo T. Rojo Mendoza <rtr11@pitt.edu> wrote:

Hello,

 

I am currently working on a paper where I use one-shot survey data from 2006 and 2008 (repeated cross-sections, exact same variables) and, although they are supposed to be random samples drawn from the same population, there is considerable multivariate imbalance on demographic variables. Thus, I used CEM on the pooled sample (with year as “treatment”) to make the samples comparable:

 

UNWEIGHTED MEANS
year    education    male     age       urban
2006    8.572        0.493    37.611    0.792
2008    8.269        0.495    40.841    0.692
Multivariate L1 distance: .34852375

 

CEM WEIGHTED MEANS
year    education    male     age       urban
2006    8.204        0.492    40.278    0.700
2008    8.256        0.492    40.696    0.700
Multivariate L1 distance: 2.355e-15
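
For concreteness, here is a minimal Python sketch of how CEM weights and the multivariate L1 distance can be computed on a pooled sample. It is not the software used for the tables above. The toy rows, the cutpoints in CUTPOINTS, and the choice of 2008 as the "treated" year are all assumptions made for illustration; the thread does not state the actual coarsening. The weights follow the usual CEM convention: 1 for matched treated units, a stratum-size ratio for matched controls, and 0 for unmatched units.

```python
from collections import Counter, defaultdict

# Hypothetical cutpoints; the coarsening actually used is not stated in this thread.
CUTPOINTS = {"age": [30, 60], "urban": [0.5]}


def coarsen(row, cutpoints):
    """Return the stratum of a row: one bin index per coarsened covariate."""
    return tuple(sum(row[name] >= c for c in cuts) for name, cuts in cutpoints.items())


def cem_weights(rows, cutpoints, treated_year=2008):
    """Compute CEM weights, treating `treated_year` as the treated group (an assumption)."""
    strata = [coarsen(r, cutpoints) for r in rows]
    treated = [r["year"] == treated_year for r in rows]
    n_t = Counter(s for s, t in zip(strata, treated) if t)
    n_c = Counter(s for s, t in zip(strata, treated) if not t)
    matched = set(n_t) & set(n_c)  # strata containing both years
    m_t = sum(n_t[s] for s in matched)
    m_c = sum(n_c[s] for s in matched)
    weights = []
    for s, t in zip(strata, treated):
        if s not in matched:
            weights.append(0.0)  # unmatched units are dropped
        elif t:
            weights.append(1.0)
        else:
            weights.append((n_t[s] / n_c[s]) * (m_c / m_t))
    return weights, strata, treated


def l1_distance(strata, treated, weights):
    """Half the summed absolute difference of (weighted) stratum frequencies."""
    f, g = defaultdict(float), defaultdict(float)
    for s, t, w in zip(strata, treated, weights):
        (f if t else g)[s] += w
    total_f, total_g = sum(f.values()), sum(g.values())
    return 0.5 * sum(abs(f[s] / total_f - g[s] / total_g) for s in set(f) | set(g))


# Toy pooled sample (made-up values, not the survey data).
rows = [
    {"year": 2006, "age": 25, "urban": 1},
    {"year": 2006, "age": 25, "urban": 1},
    {"year": 2006, "age": 45, "urban": 0},
    {"year": 2006, "age": 70, "urban": 1},
    {"year": 2008, "age": 28, "urban": 1},
    {"year": 2008, "age": 50, "urban": 0},
    {"year": 2008, "age": 52, "urban": 0},
]

weights, strata, treated = cem_weights(rows, CUTPOINTS)
l1_before = l1_distance(strata, treated, [1.0] * len(rows))
l1_after = l1_distance(strata, treated, weights)
print(weights, l1_before, l1_after)
```

When L1 is measured on the same bins that were used for matching, the weighted distance is zero up to floating-point error, which is consistent with a value such as 2.355e-15.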

 

My question is whether it makes sense to run separate regressions for 2006 and 2008 using the respective CEM weights obtained from the pooled sample.
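
Mechanically, the procedure in question would look roughly like the sketch below: split the pooled sample by year and fit a weighted least-squares regression within each year, using the CEM weight of each respondent. This only illustrates the mechanics and says nothing about whether the approach is statistically appropriate. The outcome values, the single predictor (education), and the weights are made up for the example.

```python
def weighted_ols(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


# Hypothetical records: (year, education, outcome, cem_weight).
records = [
    (2006, 6, 20.0, 0.5),
    (2006, 9, 29.0, 0.5),
    (2006, 12, 38.0, 2.0),
    (2006, 16, 50.0, 0.0),  # unmatched respondent, weight 0
    (2008, 6, 23.0, 1.0),
    (2008, 10, 31.0, 1.0),
    (2008, 14, 39.0, 1.0),
]

# One regression per year, each using that year's CEM weights.
fits = {}
for year in (2006, 2008):
    sub = [r for r in records if r[0] == year and r[3] > 0]
    x = [r[1] for r in sub]
    y = [r[2] for r in sub]
    w = [r[3] for r in sub]
    fits[year] = weighted_ols(x, y, w)

for year, (intercept, slope) in fits.items():
    print(year, intercept, slope)
```

Respondents with weight 0 contribute nothing to the fit, so dropping them beforehand, as done here, does not change the estimates.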

 

Hope you can help. Thanks!  

  

Reynaldo T. Rojo Mendoza

Ph.D. Student

Department of Political Science

University of Pittsburgh