Hi Haakon, thanks for your note. If you match (i.e., prune further) within
CEM's strata, and so respect them, then you keep all the bias- and
model-dependence-reducing properties of CEM. I think the weights would have
to be recomputed using the same formulas we give, applied to whatever
observations are left after your second-stage procedure. See
j.mp/CEMweights on weights in general.
Gary
--
*Gary King* - Albert J. Weatherhead III University Professor - Director,
IQSS <http://iq.harvard.edu/> - Harvard University
GaryKing.org - King(a)Harvard.edu - @KingGary <https://twitter.com/kinggary> -
617-500-7570 - Assistant <king-assist(a)iq.harvard.edu>: 617-495-9271
On Thu, Feb 8, 2018 at 4:46 PM, Haakon Gjerløw <haakon.gjerlow(a)stv.uio.no>
wrote:
> Dear all,
>
> I have a question concerning the correct way to combine CEM with other
> matching procedures. Specifically, I am trying to match a data set with
> CEM, and then apply entropy balancing to the remaining sample
> (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1904869). It seems
> such two-step balancing is hinted at in both Iacus, King, and Porro (2011)
> and Hainmueller (2012).
>
> My question concerns the correct way to use the weights in regressions
> after both procedures are done.
>
> My intuition says that this is basically a two-step sampling procedure,
> and that the correct way to use the weights is to multiply the weights
> from CEM by the weights from entropy balancing (Solution A).
>
> However, it might also be that entropy balancing overrides the weights
> from CEM, and the observations should only be weighted by the weights
> from entropy balancing (Solution B).
>
> Have any of you investigated this in a more systematic/formal fashion?
>
> All the best,
>
> Haakon Gjerløw | PhD fellow
> Department of Political Science | University of Oslo
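To make the suggestion in the reply above concrete, here is a minimal sketch of recomputing CEM-style weights on whatever observations survive a second-stage pruning. It assumes the usual CEM weighting rule (treated units get weight 1; a control in stratum s gets (treated in s / controls in s) * (all matched controls / all matched treated); unmatched units get 0). The function name `cem_weights` and the toy data are illustrative only and are not part of any CEM package.

```python
from collections import Counter


def cem_weights(strata, treated):
    """Return one CEM-style weight per unit.

    strata  : stratum label per unit (None means the unit has been pruned)
    treated : True for treated units, False for controls
    """
    n_t = Counter(s for s, t in zip(strata, treated) if t and s is not None)
    n_c = Counter(s for s, t in zip(strata, treated) if not t and s is not None)
    # A stratum counts as matched only if it holds both treated and controls.
    matched = {s for s in n_t if n_c[s] > 0}
    m_t = sum(n_t[s] for s in matched)  # all matched treated units
    m_c = sum(n_c[s] for s in matched)  # all matched control units
    weights = []
    for s, t in zip(strata, treated):
        if s not in matched:
            weights.append(0.0)
        elif t:
            weights.append(1.0)
        else:
            weights.append((n_t[s] / n_c[s]) * (m_c / m_t))
    return weights


# Toy data: stratum "a" has 1 treated and 2 controls, "b" has 2 treated and
# 1 control, "c" has a lone control and is therefore unmatched.
strata = ["a", "a", "a", "b", "b", "b", "c"]
treated = [True, False, False, True, True, False, False]
w_before = cem_weights(strata, treated)

# Hypothetical second stage: prune one control from stratum "a", then
# recompute the weights from scratch on what is left.
pruned = list(strata)
pruned[2] = None
w_after = cem_weights(pruned, treated)
```

Recomputing from scratch, rather than reusing the first-stage weights, keeps the control weights summing to the number of controls that actually remain.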
My feeling is that if you apply entropy balancing to CEM-matched observations, you should combine the CEM and entropy balancing weights (your Solution A).
At least, I did so when combining CEM weights with complex survey weights in "Beyond the question 'Does it pay to be green?': How much green? and when?" (ScienceDirect).
Please note that there is a typo on page 631: the formula for weighted controls after CEM omits the factor "*swi".
I am also really interested in the topic and would appreciate a more formal reference.
In any case, entropy balancing balances covariates with respect to the first, second, and possibly higher moments, while CEM bounds all centered absolute moments. In this sense, I expect that applying the two methods separately will provide consistent results.
Hope this helps.
Cesare
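The two options discussed in this thread can be compared on toy numbers. The sketch below is illustrative only: the weights and outcomes are made up, the helper names are hypothetical, and it does not settle which option is correct. Solution A multiplies the CEM weight by the second-stage (entropy balancing or survey) weight; Solution B uses the second-stage weight alone.

```python
def combine_weights(first_stage, second_stage):
    """Solution A: element-wise product of the two sets of weights."""
    if len(first_stage) != len(second_stage):
        raise ValueError("weight vectors must have the same length")
    return [a * b for a, b in zip(first_stage, second_stage)]


def weighted_mean(values, weights):
    """Weighted mean of values; fails loudly if all weights are zero."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Made-up control-group outcomes and weights for three matched units.
y = [1.0, 2.0, 4.0]
cem_w = [0.5, 2.0, 1.0]  # hypothetical CEM weights
eb_w = [1.2, 0.8, 1.0]   # hypothetical second-stage weights

mean_a = weighted_mean(y, combine_weights(cem_w, eb_w))  # Solution A
mean_b = weighted_mean(y, eb_w)                          # Solution B
```

On these numbers the two solutions give different weighted means, which is why the choice matters for the regression step.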
Hi there,
This was an issue in an earlier version of CEM, but the latest version on
SSC should have this fixed. Perhaps you can try to reinstall from the repo
and see if it is still an issue? Note that the matching still works and
cem_matched is correct in these situations. Hope that helps!
Cheers,
Matt
On Thu, Jan 11, 2018 at 3:43 AM LEE Matthew <matthew.lee(a)insead.edu> wrote:
> Dear all,
>
> I am writing to follow up on a past thread with the same subject line
> (original message below). I have encountered the same issue in which after
> calling cem in STATA, I have some observations for which cem_matched == 1
> but cem_strata is missing. I have not been able to solve it, but do have
> some additional clues and would be interested to know if the community has
> any ideas here.
>
> It seems that the cem command is truncating the assignment of cem_strata
> at a fixed limit of 32,740 strata (I don’t know if this value is general or
> specific to my data). When executing my original match (which has many
> theoretical strata based on the coarsened variables: 28 buckets X 10 X 5 X
> 5 X 5 X 5 = 175,000), the assignment of strata stops at 32,740. If I
> coarsen further so that the number of theoretical buckets < 32,740 and
> re-run CEM, there are no longer missing observations for cem_strata
> (unfortunately this further coarsening does not work for my study).
>
> One more clue: the truncation appears to operate according to ordered
> values of the first matching variable called by CEM. In my original
> matching attempt described above, the first variable was a year variable,
> which ranges from 2009-2013. In the results, the cem_strata values are
> defined for 2009 and 2010 and stop somewhere in 2011 — subsequent years
> have cem_strata missing.
>
> It would be great to know if anyone has further ideas about what might be
> going wrong here. Does cem in STATA have a theoretical maximum number of
> strata? Could it be a working memory issue?
>
> Many thanks,
>
> Matthew
>
> *Matthew Lee*
> Assistant Professor of Strategy
> INSEAD | 1 Ayer Rajah Avenue, Singapore 138676
> matthew.lee(a)insead.edu | matthewscottlee.com
>
> --
>
> https://lists.gking.harvard.edu/pipermail/cem/2014-September/000154.html
>
> Ben Hoen bhoen at lbl.gov
>
> Tue Sep 9 15:42:53 EDT 2014
>
> Hi all,
>
> I had been using a cem matching output to run regressions and have just now
> found that a large set of the output has the variable "cem_matched" == 1
> while "cem_strata" == . (i.e., missing). For those cases, there is also
> a weight stored in "cem_weights".
>
> Is this a common occurrence? If so, would you be able to explain when/why
> this occurs?
>
> Ben
>
> Ben Hoen
> Staff Research Associate
> Lawrence Berkeley National Laboratory
> Office: 845-758-1896
> Cell: 718-812-7589
> bhoen at lbl.gov
> http://emp.lbl.gov/staff/ben-hoen
>
> --
----- Matthew Blackwell http://www.mattblackwell.org
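As a quick check of the arithmetic in the quoted message, the sketch below multiplies out the bucket counts and compares the result with the observed cutoff. One observation, offered only as a possible lead: 32,740 is also the largest value that Stata's `int` storage type can hold, so a stratum identifier stored as `int` would run out at exactly that point. The thread does not confirm that this was the cause, and the variable names here are illustrative.

```python
from math import prod

# Bucket counts per coarsened variable, as given in the quoted message.
bins_per_variable = [28, 10, 5, 5, 5, 5]
theoretical_strata = prod(bins_per_variable)

# Cutoff observed in the message; it equals the maximum of Stata's `int`
# storage type. Treating that as the cause is an assumption.
OBSERVED_CUTOFF = 32_740

exceeds_cutoff = theoretical_strata > OBSERVED_CUTOFF
print(f"theoretical strata: {theoretical_strata:,}")
print(f"exceeds cutoff of {OBSERVED_CUTOFF:,}: {exceeds_cutoff}")
```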
Hi Scott, if the coarsening is reasonable, then large weights just indicate
lots of matches. It may be, however, that this is an opportunity to use
finer-grained coarsening, since the larger of the two treatment regimes isn't
helping you that much. The reason is that the variance of a difference in
means is mostly a function of the size of the smaller of the two groups.
---
GaryKing.org
617-500-7570
On Dec 29, 2017 9:16 AM, "Scott Smith" <scott.al.smith(a)gmail.com> wrote:
In n:n coarsened exact matching, is there ever a situation where it makes
more sense to exclude study members and their respective control members
that have extremely large weights? For example, 500 study members match to
just 25 control members. Might it be better to exclude both study and
control members rather than introduce such large weights and variance to
the rest of the study population?
Thanks,
--
Scott Smith
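The point about the smaller group can be illustrated with the textbook variance of a difference in means for two independent groups. This is a minimal sketch that assumes a common outcome variance `sigma2` in both groups (an assumption made for illustration, not something stated in the thread) and uses the 500-to-25 example from the question.

```python
def diff_in_means_variance(n_a, n_b, sigma2=1.0):
    """Variance of (mean_a - mean_b) for independent groups with a common
    outcome variance sigma2: sigma2 / n_a + sigma2 / n_b."""
    return sigma2 / n_a + sigma2 / n_b


# 500 study members matched to 25 control members.
v = diff_in_means_variance(500, 25)

# Share of the total variance contributed by the smaller group.
share_from_small_group = (1.0 / 25) / v

# Multiplying the larger group by ten barely changes the variance ...
v_larger_big_group = diff_in_means_variance(5000, 25)
# ... while doubling the smaller group roughly halves it.
v_larger_small_group = diff_in_means_variance(500, 50)
```

Under these assumptions the smaller group accounts for about 95% of the variance, so adding units to the larger group yields little extra precision.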
Hi Carrie, The contact we have at the FDA is Richard A. Forshee, who is the
Associate Director for Research at the Office of Biostatistics &
Epidemiology, in the Center for Biologics Evaluation and Research. He
arranged for the FDA to do this evaluation of CEM. They then evaluated the
software, quizzed us about various features and processes, and they made
their decision.
Best of luck with your research,
Gary
--
*Gary King* - Albert J. Weatherhead III University Professor - Director,
IQSS <http://iq.harvard.edu/> - Harvard University
GaryKing.org - King(a)Harvard.edu - @KingGary <https://twitter.com/kinggary> -
617-500-7570 - Assistant <king-assist(a)iq.harvard.edu>: 617-495-9271
On Mon, Aug 28, 2017 at 1:18 PM, Carrie Bennette <cb11(a)uw.edu> wrote:
> Hello,
>
> I'm very interested in using CEM for some projects evaluating oncology
> therapies, some of which may be submitted to the FDA. Could someone provide
> more context regarding the note on the CEM website that CEM has officially
> been "Qualified for Scientific Use" by the U.S. Food and Drug
> Administration as I've not seen this designation used before? I'm
> particularly interested in understanding what criteria the FDA used in
> granting this designation and, if applicable, which group within the FDA
> (e.g. CDER, CBER, CDRH) granted it.
>
> Thanks!
> Carrie
> --
> *Carrie Bennette, PhD, MPH*
> *Senior Methodologist, Quantitative Sciences, Flatiron Health*
> *Affiliate Assistant Professor, Department of Pharmacy, University of
> Washington*
>
>
Hi All,
Could anybody provide some insight into using CEM with low-frequency events? We observe that the results can be sensitive to how we match (on quarterly or annual events), and the event rate is almost always lower for both groups (intervention and control) after matching.
Thanks,
Fang
Hi All,
I am really excited about CEM and want to use it for a lot of future studies. I have two questions:
1. Which implementation is fastest: R, Stata, or the SAS macro?
2. I want to get a matched ID so that I know which case and control are paired together (1-to-1 match). Is it possible to obtain a data set with that column from the SAS macro or Stata?
Thanks a lot!
Fang
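For the second question, one generic way to build a pair identifier from a stratum identifier is sketched below. It is not a feature of any CEM implementation mentioned in this thread: the function name is hypothetical, units are paired within each stratum simply in the order they appear (an arbitrary choice), and leftover units in unbalanced strata are left unpaired, which changes the matched sample.

```python
from collections import defaultdict


def assign_pair_ids(strata, treated):
    """Pair treated and control units one-to-one within each stratum.

    Returns a list with a pair id per unit, or None for units left unpaired
    (pruned units, or extras in a stratum with unequal group sizes).
    """
    by_stratum = defaultdict(lambda: {"t": [], "c": []})
    for i, (s, t) in enumerate(zip(strata, treated)):
        if s is not None:
            by_stratum[s]["t" if t else "c"].append(i)

    pair_ids = [None] * len(strata)
    next_id = 1
    for groups in by_stratum.values():
        # zip stops at the shorter list, so extras stay unpaired.
        for i_t, i_c in zip(groups["t"], groups["c"]):
            pair_ids[i_t] = next_id
            pair_ids[i_c] = next_id
            next_id += 1
    return pair_ids


# Toy data: stratum "a" has 1 case and 2 controls, "b" has 1 of each,
# and the last unit has no stratum.
strata = ["a", "a", "a", "b", "b", None]
treated = [True, False, False, False, True, True]
pair_ids = assign_pair_ids(strata, treated)
```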