Just a quick answer so that you can experiment with coarsening: use the option

eval.imb=FALSE

inside the cem() command [or the equivalent in Stata].

What slows down the computation is the L1 measure, not the coarsening or cem itself.
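
For readers unfamiliar with it, the L1 measure compares the multidimensional histograms of the treated and control groups: it is half the sum, over all cells, of the absolute differences in relative frequencies. The sketch below is only an illustration in Python, not the cem package's implementation; the function name `l1_imbalance` and the one-tuple-per-unit input format are assumptions made for the example.

```python
from collections import Counter


def l1_imbalance(treated_cells, control_cells):
    """Half the summed absolute difference between the relative cell
    frequencies of the two groups (0 = identical, 1 = no overlap).

    Each argument is a list with one hashable cell label per unit,
    e.g. a tuple of coarsened covariate values (hypothetical format).
    """
    f = Counter(treated_cells)
    g = Counter(control_cells)
    n_t = len(treated_cells)
    n_c = len(control_cells)
    cells = set(f) | set(g)  # every cell occupied by either group
    # Counter returns 0 for cells that a group does not occupy.
    return 0.5 * sum(abs(f[c] / n_t - g[c] / n_c) for c in cells)


# Toy data: cells are (income quintile, gender) pairs.
treated = [(1, "F"), (1, "F"), (2, "M"), (3, "M")]
controls = [(1, "F"), (2, "M"), (2, "M"), (5, "F")]
print(l1_imbalance(treated, controls))
```

The number of cells in such a histogram grows multiplicatively with the bins per variable, which may be part of why the measure is expensive on large samples; that reading is a guess, not something stated in this thread.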

We are working on this issue for the next release.
Stefano

Sent from iPhone

On 7 Apr 2011, at 16:25, "Scherer, Ethan" <escherer@prgs.edu> wrote:

Dear CEM list,

 

I am using CEM to match individuals receiving workers' compensation for an injury to non-injured workers.  We have several continuous variables (e.g., income, firm size, age), a categorical variable (e.g., industry), as well as some dichotomous variables (e.g., gender, born in state).

 

The sample is quite large, with many more potential controls (1.2 million) than injured workers (4 thousand).  Prior to using CEM I coarsened the data myself by putting income into quintiles and using 4 firm size categories, 4 age groups, and 10 industry categories.  I then ran CEM with automatic cuts.  However, based upon the sample size, Sturges' rule creates 22 bins for each variable, many of which correspond to values that don’t exist (e.g., 1/2 a woman).  The resulting bins are not very “coarse,” and I end up with approximately 2,000 strata.
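
As a rough check of the figure of 22 bins: Sturges' rule is usually stated as k = ceil(log2(n)) + 1. The following is a minimal Python sketch, assuming the rule is applied to the full sample of roughly 1.2 million controls plus 4 thousand treated units. The helper name `sturges_bins` is hypothetical and not part of CEM.

```python
import math


def sturges_bins(n):
    """Number of histogram bins suggested by Sturges' rule for n observations."""
    return math.ceil(math.log2(n)) + 1


n_controls = 1_200_000  # approximate figure from the message
n_treated = 4_000       # approximate figure from the message

# The rule depends only on n, so a dichotomous variable such as gender
# would receive the same number of bins as a continuous one.
print(sturges_bins(n_controls + n_treated))  # 22
```

Because the bin count depends on the sample size alone, it ignores how many distinct values a variable actually takes, which is consistent with the empty bins described above.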

 

To try to improve this, I specified my own cut points (similar to, but coarser than, those mentioned above), and then the program never seemed to finish running (I killed it 2 days later).

 

Thus, I am thinking of using a different set of automatic cuts, but I think the Freedman-Diaconis rule would yield even more cut points, and I wasn’t sure what other algorithms were available (none are listed in the Stata Journal article).
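
For comparison, the Freedman-Diaconis rule sets the bin width to 2 * IQR / n^(1/3), so the number of bins grows with the cube root of n rather than with its logarithm. The following is a small Python sketch, assuming a hypothetical variable whose range is twice its interquartile range (as for evenly spread data). `freedman_diaconis_bins` is an illustrative helper, not a CEM function.

```python
import math


def freedman_diaconis_bins(n, iqr, data_range):
    """Bin count implied by the Freedman-Diaconis width 2 * IQR / n**(1/3)."""
    if iqr <= 0:
        raise ValueError("the rule is undefined when the IQR is zero")
    width = 2.0 * iqr / n ** (1.0 / 3.0)
    return math.ceil(data_range / width)


# Hypothetical variable: range equal to twice the IQR, n = 1.2 million.
print(freedman_diaconis_bins(1_200_000, iqr=1.0, data_range=2.0))  # 107
```

Under these assumptions the rule gives on the order of a hundred bins at this sample size; the exact count depends on the spread of each variable.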

 

Do you have any suggestions on how to coarsen the data further so that I can get the most out of the program?

 

Thanks in advance for your help!

  

Ethan Scherer MPP, CPA

Doctoral Fellow, Pardee RAND Graduate School

1776 Main St., Mailstop M1N

Santa Monica, CA 90401

W: 310-393-0411 x6056

E: escherer@rand.org

