Dear CEM list,
After running CEM, the cem_matched variable indicates that 100,552 of my
observations are matched (97,161 controls and 3,391 treated). However,
86,261 of these (about 86%) are missing the cem_strata variable. A quick
look through the data shows that cem_weights has been attached as well.
Any suggestions as to what is going on?
Many Thanks,
Ethan
__________________________________________________________________________
This email message is for the sole use of the intended recipient(s) and
may contain confidential information. Any unauthorized review, use,
disclosure or distribution is prohibited. If you are not the intended
recipient, please contact the sender by reply email and destroy all copies
of the original message.
-
cem Mailing List, served by Harvard-MIT Data Center
Send messages: cem(a)lists.gking.harvard.edu
[un]subscribe Options: http://lists.gking.harvard.edu/?info=cem
More information on cem: http://gking.harvard.edu/cem
Dear CEM list,
I am using CEM to match individuals receiving workers' compensation for
an injury to non-injured workers. We have several continuous variables
(e.g., income, firm size, age), a categorical variable (e.g., industry),
as well as some dichotomous variables (e.g., gender, born in state).
The sample is quite large, with many more potential controls (1.2
million) than injured workers (4 thousand). Prior to using CEM I
coarsened the data myself by putting income into quintiles and creating
four firm size categories, four age groups, and 10 industry categories.
I then ran CEM with automatic cuts. However, based on the sample size,
Sturges' rule creates 22 bins for each variable, many of which cannot
exist in the data (e.g., a bin at "1/2 a woman" for gender). The
resulting bins are not very "coarse": there are approximately 2,000
strata.
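
For readers following the thread, here is a minimal sketch of where the 22 bins come from. It assumes the textbook form of Sturges' rule, ceil(log2(n)) + 1; whether cem computes it in exactly this way is an assumption, although the result matches the figure reported above. The function name and the rounded sample sizes are illustrative only.

```python
import math


def sturges_bins(n_obs: int) -> int:
    """Bin count under the textbook Sturges' rule: ceil(log2(n)) + 1."""
    if n_obs < 1:
        raise ValueError("n_obs must be a positive integer")
    return math.ceil(math.log2(n_obs)) + 1


# Rounded sample sizes taken from the message above (assumed, not exact).
n_controls = 1_200_000
n_treated = 4_000

print(sturges_bins(n_controls + n_treated))  # -> 22
```

Note that the rule depends only on the number of observations, not on how many distinct values a variable takes, so a variable already reduced to five levels is still assigned 22 bins.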
To try to improve this, I specified my own cutpoints (similar to, or
coarser than, the categories mentioned above), but then the program never
seemed to finish running (I killed it after 2 days).
Thus, I am thinking of using a different automatic cutpoint rule, but I
think the Freedman-Diaconis rule would yield even more cutpoints, and I
wasn't sure what other algorithms were available (none are listed in the
Stata Journal article).
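
As a rough check on that concern, the sketch below compares the two rules on a variable that has already been coded into quintiles. It assumes the textbook Freedman-Diaconis bin width, 2 * IQR / n^(1/3), and uses a small synthetic sample (1,000 observations per quintile) in place of the real data; cem's own implementation may differ. The function names are illustrative.

```python
import math
import statistics


def freedman_diaconis_bins(values):
    """Bin count using the textbook Freedman-Diaconis width 2*IQR/n^(1/3)."""
    n = len(values)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    data_range = max(values) - min(values)
    if iqr == 0 or data_range == 0:
        return 1  # degenerate case: the rule gives no usable width
    width = 2 * iqr / n ** (1 / 3)
    return math.ceil(data_range / width)


def sturges_bins(n_obs):
    """Bin count under the textbook Sturges' rule: ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n_obs)) + 1


# Synthetic stand-in for an income variable coded 1..5 (assumed data).
income_quintile = [q for q in range(1, 6) for _ in range(1000)]

fd = freedman_diaconis_bins(income_quintile)
st = sturges_bins(len(income_quintile))
print(fd, st)
```

Under these assumptions the Freedman-Diaconis count grows with the cube root of the sample size while the Sturges count grows with its logarithm, so at around 1.2 million observations the former would be on the order of 100 bins for a variable like this.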
Do you have any suggestions on how to coarsen the data further so that I
can get the most out of the program?
Thanks in advance for your help!
Ethan Scherer MPP, CPA
Doctoral Fellow, Pardee RAND Graduate School
1776 Main St., Mailstop M1N
Santa Monica, CA 90401
W: 310-393-0411 x6056
E: escherer(a)rand.org
Hi, cemers,
I am currently using cem to examine whether domestic firms' geographical proximity to foreign firms affects their exports (geographical proximity is the treatment). After running cem on a set of firm characteristics, such as firm size, I obtained the following imbalance measure:
Multivariate L1 distance: 1.9969533
However, the manual says that L1 should be between 0 and 1. Can anybody help?
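
For context, a minimal sketch of why the measure is bounded by 1 under its usual definition: L1 is half the sum, over the cells of the cross-tabulated coarsened covariates, of the absolute difference between the treated and control relative frequencies. The function and the toy data are illustrative and are not taken from cem's code. The `scale` argument is a hypothetical knob added only to show that the un-halved sum ranges up to 2; whether that has anything to do with the value reported above is not known.

```python
from collections import Counter


def l1_distance(treated_cells, control_cells, scale=0.5):
    """Scaled sum of absolute differences in relative cell frequencies.

    Each argument is a list of hashable cell labels, e.g. tuples of
    coarsened covariate values, one per observation.
    """
    f = Counter(treated_cells)
    g = Counter(control_cells)
    n_t, n_c = len(treated_cells), len(control_cells)
    cells = set(f) | set(g)
    # Counter returns 0 for cells that are absent from one group.
    return scale * sum(abs(f[c] / n_t - g[c] / n_c) for c in cells)


# Toy data: firm-size cells only (assumed for illustration).
treated = [("small",), ("small",)]
controls_disjoint = [("large",)]
controls_same = [("small",)]

print(l1_distance(treated, controls_disjoint))             # complete separation
print(l1_distance(treated, controls_same))                 # identical distributions
print(l1_distance(treated, controls_disjoint, scale=1.0))  # without the 1/2 factor
```

With the 1/2 factor, complete separation gives 1 and identical distributions give 0; the same sum without the factor reaches 2.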
Cheers
Sun
Dr Sizhong Sun
Lecturer in Economics
School of Business, James Cook University, QLD 4814
P (07) 4781 4710
I +61 7 4781 4710
F (07) 4781 4019
E sizhong.sun(a)jcu.edu.au
www.jcu.edu.au
Location: DA27.219
JCU CRICOS Provider Code: 00117J
Note: The contents of this email transmission, including any attachments, are intended solely for the named addressee and are confidential; any unauthorised use, reproduction or storage of the contents and any attachments is expressly prohibited. If you have received this transmission in error please delete it and any attachments from your system immediately and advise the sender by return email or telephone. James Cook University does not warrant that this email and any attachments are error or virus free.