Hi there,
This was an issue in an earlier version of CEM, but the latest version on
SSC should have this fixed. Perhaps you can try to reinstall from the repo
and see if it is still an issue? Note that the matching still works and
cem_matched is correct in these situations. Hope that helps!
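For what it's worth, the 32,740 cutoff reported in the quoted message below matches the largest nonmissing value that Stata's int storage type can hold. That is consistent with the strata counter overflowing in the older version, though this is a guess and has not been checked against the cem source. A minimal Python sketch of that assumed mechanism (store_as_int is a hypothetical helper, not part of cem):

```python
from math import prod

# Largest nonmissing value of Stata's int storage type.
STATA_INT_MAX = 32_740


def theoretical_strata(bins_per_variable):
    """Upper bound on the number of strata: the product of the bin counts."""
    return prod(bins_per_variable)


def store_as_int(stratum_id):
    """Assumed truncation model: ids above the int limit become missing (None)."""
    return stratum_id if stratum_id <= STATA_INT_MAX else None


# Bin counts taken from the quoted message: 28 x 10 x 5 x 5 x 5 x 5.
bins = [28, 10, 5, 5, 5, 5]
n_strata = theoretical_strata(bins)

# Number every theoretical stratum and count how many ids would not fit.
stored = [store_as_int(i) for i in range(1, n_strata + 1)]
n_missing = sum(s is None for s in stored)

print(n_strata, n_missing)  # 175000 142260
```

With these bin counts there are 175,000 theoretical strata, of which 142,260 would fall above the limit. Presumably only populated strata are numbered in practice, so the real number affected would be smaller.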
Cheers,
Matt
On Thu, Jan 11, 2018 at 3:43 AM LEE Matthew <matthew.lee(a)insead.edu> wrote:
> Dear all,
>
> I am writing to follow up on a past thread with the same subject line
> (original message below). I have encountered the same issue in which after
> calling cem in Stata, I have some observations for which cem_matched == 1
> but cem_strata is missing. I have not been able to solve it, but do have
> some additional clues and would be interested to know if the community has
> any ideas here.
>
> It seems that the cem command is truncating the assignment of cem_strata
> at a fixed limit of 32,740 strata (I don’t know if this value is general or
> specific to my data). When executing my original match (which has many
> theoretical strata based on the coarsened variables: 28 buckets X 10 X 5 X
> 5 X 5 X 5 = 175,000), the assignment of strata stops at 32,740. If I
> coarsen further so that the number of theoretical buckets < 32,740 and
> re-run CEM, there are no longer missing observations for cem_strata
> (unfortunately this further coarsening does not work for my study).
>
> One more clue: the truncation appears to operate according to ordered
> values of the first matching variable called by CEM. In my original
> matching attempt described above, the first variable was a year variable,
> which ranges from 2009-2013. In the results, the cem_strata values are
> defined for 2009 and 2010 and stop somewhere in 2011 — subsequent years
> have cem_strata missing.
>
> It would be great to know if anyone has further ideas about what might be
> going wrong here. Does cem in Stata have a theoretical maximum number of
> strata? Could it be a working memory issue?
>
> Many thanks,
>
> Matthew
>
> *Matthew Lee*
> Assistant Professor of Strategy
> INSEAD | 1 Ayer Rajah Avenue, Singapore 138676
> *matthew.lee(a)insead.edu | matthewscottlee.com*
>
> *--*
>
> https://lists.gking.harvard.edu/pipermail/cem/2014-September/000154.html
>
> Ben Hoen bhoen at lbl.gov
>
> Tue Sep 9 15:42:53 EDT 2014
>
> Hi all,
>
>
>
> I had been using a cem matching output to run regressions and have just now
> found that a large set of the output has the variable "cem_matched" ==1
> while the "cem_strata" ==. (a.k.a. missing). For those cases, there is also
> a weight stored in "cem_weights".
>
>
>
> Is this a common occurrence? If so, would you be able to explain when/why
> this occurs?
>
>
>
> Ben
>
>
>
> Ben Hoen
>
> Staff Research Associate
>
> Lawrence Berkeley National Laboratory
>
> Office: 845-758-1896
>
> Cell: 718-812-7589
>
> bhoen at lbl.gov
>
> http://emp.lbl.gov/staff/ben-hoen
>
> --
----- Matthew Blackwell http://www.mattblackwell.org
Hi Scott, if the coarsening is reasonable, then large weights just indicate
lots of matches. It may be, however, that this is an opportunity to use more
fine-grained coarsening, since the larger of the two treatment regimes isn't
helping you that much. The reason is that the variance of a difference in
means is mostly a function of the size of the smaller of the two groups.
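To make the variance point concrete, here is a rough Python sketch using the 500-to-25 split from the quoted question. It assumes two independent groups, simple unweighted means, and equal unit variance in each group, none of which is guaranteed for CEM-weighted estimates. The function name is made up for illustration.

```python
def var_diff_in_means(n_treated, n_control, var_treated=1.0, var_control=1.0):
    """Variance of (treated mean - control mean) for two independent samples."""
    return var_treated / n_treated + var_control / n_control


# Baseline: 500 treated units matched to 25 controls.
base = var_diff_in_means(500, 25)

# Doubling the larger group barely changes the variance.
more_treated = var_diff_in_means(1000, 25)

# Doubling the smaller group roughly halves it.
more_control = var_diff_in_means(500, 50)

# Share of the baseline variance contributed by the smaller group.
share_from_smaller = (1.0 / 25) / base

print(round(base, 3), round(more_treated, 3), round(more_control, 3))
print(round(share_from_smaller, 3))
```

Under these assumptions the smaller group accounts for about 95% of the variance, which is why adding more units to the larger group buys so little.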
---
GaryKing.org
617-500-7570
On Dec 29, 2017 9:16 AM, "Scott Smith" <scott.al.smith(a)gmail.com> wrote:
In n:n coarsened exact matching, is there ever a situation where it makes
more sense to exclude study members and their respective control members
that have extremely large weights? For example, 500 study members match to
just 25 control members. Might it be better to exclude both study and
control members rather than introduce such large weights and variance to
the rest of the study population?
Thanks,
--
Scott Smith