Just realized that blockgroup and county are both strings. See below:
That is likely NOT what cem is looking for, is it? Could that be the source of the problem?
(And yes, block group variable, which is the census number, is unique across counties)
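A minimal sketch of one way to check the storage types and build numeric stand-ins in Stata, assuming the variable names used in the commands quoted below (blockgroup, county). The names bg_id and county_id are hypothetical, introduced only for illustration, and whether string variables are actually what trips up cem here is an assumption to be tested, not a confirmed diagnosis.

```stata
* Show storage types; a type of str# means the variable is stored as a string
describe blockgroup county

* egen's group() accepts string variables and returns consecutive integer codes
* (bg_id and county_id are hypothetical names, not part of the original data)
egen long bg_id = group(blockgroup)
egen long county_id = group(county)

* Confirm the new identifiers are numeric before passing them to cem
describe bg_id county_id
```

The numeric identifiers can then be substituted for blockgroup and county in the cem calls to see whether the match counts change.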
Ben
Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589
From: Matt Blackwell [mailto:m.blackwell@rochester.edu]
Sent: Monday, July 07, 2014 10:10 PM
To: Ben Hoen
Cc: cem(a)lists.gking.harvard.edu
Subject: Re: [cem] Understanding CEM's use of a categorical variable and #0
Hi Ben,
Hm, it definitely should produce more matches when you use county. One possible issue that
I can think of off the top of my head is this: is the block group variable unique across
counties/states? Or do the values of the block group variable repeat? One thing to check
is what happens if you exact match on both the county and the block group in a single
match.
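As a sketch, that combined check could look like the following, reusing the cut points and variable names from the original message quoted below. It assumes cem leaves behind its usual cem_matched indicator after it runs; the `///` line continuation works in a do-file but not at the interactive prompt.

```stata
* Exact match on sale year, county, and block group in a single call
cem sfla(0 1000 2000 3000 5000) age(0 1 10 20 100) acres(0.05 0.15 0.5 1 10) ///
    saleyear(#0) county(#0) blockgroup(#0), treatment(pv)

* Cross-tabulate matched status against treatment to count matched pv and non-pv homes
tabulate cem_matched pv
```

If every block group sits inside exactly one county, adding county should not change the strata, so the counts should equal those from the blockgroup-only run. A difference would point to repeated block group values or to a problem with how the variables are being read.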
Hope that helps! If it doesn't, definitely let us know.
Cheers,
Matt
~~~~~~~~~~~
Matthew Blackwell
Assistant Professor of Government
Harvard University
url: http://www.mattblackwell.org
On Mon, Jul 7, 2014 at 9:36 PM, Ben Hoen <bhoen(a)lbl.gov> wrote:
Hi all,
I have been using the program cem in Stata (Version 13 MP, with Windows 7 Pro 64 bit), and
thought I understood what it was doing well enough but today something occurred which
surprised (read worried) me, in that it acted as I would NOT have expected it to.
I am trying to match target (i.e., treated) homes to similar (i.e., "comparable") homes
that do not have the treatment. In this case, the "treatment" is whether the home does or
does not have a photovoltaic energy system (pv). I have 100 pv homes (treated), and ~ 5,000
non-pv homes (comparable).
To match these homes I am using some basic characteristics of the home - e.g., square feet
of living space (sfla), size of the parcel (acres), age of the home (age), as well as the
year in which it sold (saleyear) to ensure the comparable home sold in the same year as
the target home and, finally, a geographic variable (such as the block group) to ensure
the comparable home is located in the same geography. For sale year and the geography,
they must match perfectly; i.e., the comparable homes must have sold in the same year as
the target (pv) home and also be located in the same geography. For the purposes of this
discussion those geographies could be either the census block group (blockgroup) or the
county (county). All of the block groups fall within the counties, and there are many more
block groups than counties delineated in the data. For example, I have approximately 30
block groups (each with at least one treated and one comparable case) and 10 counties
(each with at least one treated and one comparable). In practice, though, in most
geographies the pool of available comparables is ~ 20-50 times the number of pv homes.
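The claim that every block group falls within a single county can be verified directly. A minimal sketch, assuming the variable names blockgroup and county from the commands in this message:

```stata
* Within each block group, sort observations by county. If the first and last
* county values differ, that block group spans more than one county and the
* assert stops with an error.
bysort blockgroup (county): assert county[1] == county[_N]
```

If the assert passes silently, the block groups nest cleanly inside counties and repeated block group codes can be ruled out as the cause.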
Using the sample data and talking to local experts, I have established appropriate cut
points for my various characteristics and run a command similar to the following, when
blockgroup is used as the geography:
cem sfla(0 1000 2000 3000 5000) age(0 1 10 20 100) acres(0.05 0.15 0.5 1 10) saleyear(#0) blockgroup(#0), treatment(pv)
And the following, when county is used as the geography:
cem sfla(0 1000 2000 3000 5000) age(0 1 10 20 100) acres(0.05 0.15 0.5 1 10) saleyear(#0) county(#0), treatment(pv)
So, here's the confusing part:
I get ~ 70 matched pv homes and 300 comparable homes if blockgroup is used, but only 20
matched pv homes and 100 comparable homes if county is used. In other words, when I allow
comparables to be drawn from a broader geography, I get fewer matched cases. I would think
the exact opposite would be the case; if I cast a broader geographic net, I should get
more matches, not fewer.
Any ideas why this would occur?
Thanks, in advance, for any insight you could offer.
Ben
Berkeley Lab
Ben Hoen
Staff Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
bhoen(a)lbl.gov
http://emp.lbl.gov/staff/ben-hoen
Visit our publications at:
http://emp.lbl.gov/reports/re
Sign up for our email list to receive publication notifications at:
https://spreadsheets.google.com/a/lbl.gov/spreadsheet/viewform?formkey=dGlF…
--
cem Mailing List, served by HUIT
Send messages: cem(a)lists.gking.harvard.edu
[un]subscribe Options:
http://lists.gking.harvard.edu/?info=cem
More information on cem:
http://gking.harvard.edu/cem