Yes, but even better are simulated marginal effects (i.e., first differences)
or predicted probabilities of Y=1 for each category. You can get these using
Zelig.
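The simulation idea can be sketched as follows. This is only an illustration in Python, not Zelig's API. It uses the estimates and standard errors from the five-dummy table in the original post, and it makes three simplifying assumptions that the thread does not confirm: that the model has no intercept, that there are no other covariates, and that each coefficient can be drawn independently from a normal distribution. A real analysis would draw from a multivariate normal with the full covariance matrix and hold any other covariates at chosen values.

```python
import math
import random

# Estimates and standard errors copied from the five-dummy table in this thread.
# ASSUMPTION: no intercept and no other covariates, so each coefficient is the
# log-odds of Y=1 in its own category.
COEFS = {
    "AccuracyMean0": (-0.06717, 0.12826),
    "AccuracyMean1": (0.21580, 0.11777),
    "AccuracyMean2": (0.27100, 0.11473),
    "AccuracyMean3": (0.29183, 0.12434),
    "AccuracyMean4": (0.06906, 0.12720),
}


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(coefs, n_draws=5000, seed=0):
    """Draw each coefficient from N(estimate, se^2) and convert to P(Y=1).

    ASSUMPTION: draws are independent across coefficients; the full
    covariance matrix is not available in the thread.
    """
    rng = random.Random(seed)
    return {
        name: [inv_logit(rng.gauss(b, se)) for _ in range(n_draws)]
        for name, (b, se) in coefs.items()
    }


def summarize(values):
    """Return (mean, 2.5th percentile, 97.5th percentile) of the draws."""
    s = sorted(values)
    n = len(s)
    return sum(s) / n, s[int(0.025 * n)], s[int(0.975 * n)]


draws = simulate(COEFS)

# First differences: change in P(Y=1) when moving from category 0 to category k.
base = draws["AccuracyMean0"]
first_diffs = {
    name: [p - p0 for p, p0 in zip(draws[name], base)]
    for name in COEFS
    if name != "AccuracyMean0"
}

for name in COEFS:
    mean, lo, hi = summarize(draws[name])
    print(f"P(Y=1 | {name}): {mean:.3f} [{lo:.3f}, {hi:.3f}]")
for name, diffs in first_diffs.items():
    mean, lo, hi = summarize(diffs)
    print(f"{name} - AccuracyMean0: {mean:.3f} [{lo:.3f}, {hi:.3f}]")
```

The intervals on the first differences are what make this more informative than the individual p-values: they show directly whether two categories differ in predicted probability.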
From: gov2001-l-bounces at lists.fas.harvard.edu
[mailto:gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of Alexei Colin
Sent: Thursday, May 01, 2008 10:52 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: Re: [gov2001-l] Testing for a non-linear relationship
OK, the model was a logit, so the previous interpretation of the coeffs does
not apply.
I read that the coeffs for dummy variables in a logit can be used to calculate
an odds ratio (the odds that Y=1 in that category relative to the reference
category) by computing odds_ratio = exp(dummy_coeff). If I do that, I get that
the odds that Y=1 first increase (at a slowing rate) and then decrease. If I
can apply the same logic, I would conclude that the highest AccuracyMean
category has a weaker relationship with Y than one would expect. Would that
make sense?
Thanks!
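As a quick check of the odds-ratio reading above, here is a minimal Python sketch that exponentiates the four estimates from the table quoted further down in this thread (AccuracyMean0 is the reference category). The variable names are illustrative only.

```python
import math

# Logit estimates from the reference-category table quoted in this thread.
# Each one is a log odds ratio relative to the reference group AccuracyMean0.
COEFS = {
    "AccuracyMean1": 0.2346,
    "AccuracyMean2": 0.3194,
    "AccuracyMean3": 0.3617,
    "AccuracyMean4": 0.1477,
}

# exp(coefficient) gives the odds of Y=1 in that category divided by the
# odds of Y=1 in the reference category, other covariates held fixed.
odds_ratios = {name: math.exp(b) for name, b in COEFS.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio vs. AccuracyMean0 = {ratio:.2f}")
# Roughly 1.26, 1.38, 1.44 and 1.16: rising through category 3, then falling.
```

Note that these are ratios of odds, not of probabilities, so they do not say how much P(Y=1) itself changes between categories.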
On 05/01/2008 10:14 PM, Jens Hainmueller wrote:
Is this a linear model? I.e., what kind of coefficients are these? If they
are regression coefficients, then this is evidence of non-linearity, yes. The
lower significance for the last dummy does not matter. If it were really
insignificant, it would be even more evidence for non-linearity. You can also
use a joint significance test for all coeffs.
The results indicate that the 0 cat has the lowest predicted average Y, then
it rises to the highest level in cats 2-3, then drops off again for the last
cat. This suggests that a simple linear specification is not well warranted.
Since you have so much data, I would maybe break X out into even more
categories. And definitely read up on dummy variables. It's not that hard,
but essential to understand.
jens
From: gov2001-l-bounces at lists.fas.harvard.edu
[mailto:gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of Alexei Colin
Sent: Thursday, May 01, 2008 10:10 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: Re: [gov2001-l] Testing for a non-linear relationship
Jon, cr.plots() gave a slightly curved lowess curve (hint at nonlinearity, I
assume).
Jens, sorry about being illiterate; resource read. AccuracyMean0 chosen as
reference.
The differences in Y are increasing as category changes at a slowing rate
until the difference in Y starts to decrease. This is evidence of
nonlinearity, correct? If yes, then is this evidence valid even though the
last coefficient is insignificant?
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
AccuracyMean1   0.2346     0.0827    2.84   0.0046 **
AccuracyMean2   0.3194     0.0778    4.11  4.0e-05 ***
AccuracyMean3   0.3617     0.0929    3.89  9.8e-05 ***
AccuracyMean4   0.1477     0.0969    1.52   0.1273
Thank you!!
On 05/01/2008 09:31 PM, Jens Hainmueller wrote:
Alexei,
You need to exclude one of the dummy categories as a reference group. The
coefficients identify the difference in Y between each group and the
reference group. So let's say cat is a dummy for whether you are in group 1
and group 0 is the reference group. Then in a linear additive model
beta1*cat gets at the difference between E[Y|X,cat=1] and E[Y|X,cat=0].
Grab any stats textbook (preferably a book on regression) and read up on
"dummy variables".
This gets you started:
http://www.socialresearchmethods.net/kb/dummyvar.php
Jens
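The reference-group logic described above can be illustrated with a small Python sketch for the linear additive case. The cut points and data below are hypothetical, made up only for this example, and the tiny least-squares solver is a stand-in for whatever regression routine is actually used. It bins X into intervals, builds one dummy per interval except the reference, and shows that each fitted coefficient equals the difference between that group's mean of Y and the reference group's mean.

```python
from bisect import bisect_right

# HYPOTHETICAL cut points and data, for illustration only.
CUTS = [0.25, 0.5, 0.75]  # four intervals: category 0, 1, 2, 3
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [1.0, 2.0, 2.5, 3.5, 4.0, 5.0, 3.0, 4.0]


def category(x):
    """Index of the interval that x falls into."""
    return bisect_right(CUTS, x)


def dummy_row(x, reference=0):
    """Dummies D_k = 1 iff x is in interval k, omitting the reference group."""
    cat = category(x)
    return [1.0 if cat == k else 0.0
            for k in range(len(CUTS) + 1) if k != reference]


def ols(rows, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(rows[0])
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def group_means(x, y):
    """Mean of y within each interval of x."""
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(category(xi), []).append(yi)
    return {k: sum(v) / len(v) for k, v in groups.items()}


# Design matrix: intercept plus dummies for categories 1..3 (0 is reference).
design = [[1.0] + dummy_row(x) for x in xs]
beta = ols(design, ys)
means = group_means(xs, ys)

print("intercept (reference-group mean):", beta[0])
for k in range(1, len(CUTS) + 1):
    print(f"beta{k} = {beta[k]:.3f}, mean{k} - mean0 = {means[k] - means[0]:.3f}")
```

Including a dummy for every category together with an intercept would make the columns of the design matrix perfectly collinear, which is why one category has to be left out as the reference.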
-----Original Message-----
From: gov2001-l-bounces at lists.fas.harvard.edu
[mailto:gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of Alexei Colin
Sent: Thursday, May 01, 2008 9:02 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: [gov2001-l] Testing for a non-linear relationship
Hi all,
A basic question: we are trying to test for a nonlinear relationship
between a covariate X and a dependent variable Y by breaking up X
into categories and associating a dummy variable with each category.
Dummy variable Di will be 1 iff X falls into corresponding interval i.
We get insignificant results and are not sure if we can still make
conclusions based on them:
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
AccuracyMean0 -0.06717    0.12826   -0.52  0.60049
AccuracyMean1  0.21580    0.11777    1.83  0.06690 .
AccuracyMean2  0.27100    0.11473    2.36  0.01817 *
AccuracyMean3  0.29183    0.12434    2.35  0.01892 *
AccuracyMean4  0.06906    0.12720    0.54  0.58721
The count of observations in each interval looks reasonable:
Interval #:       0     1     2     3     4
Observations:   816  1686  4226  4012  2107
Is a conclusion such as: "when AccuracyMean becomes high, its relationship
with Y becomes weaker since the coefficient is much smaller (0.06 << 0.2)"
valid? Or does the insignificance of D0 and D4 not allow for such a
conclusion?
Thank you for your time!!
-Alexei
_______________________________________________
gov2001-l mailing list
gov2001-l at lists.fas.harvard.edu
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l