[gov2001-l] Need help understanding simulation process - Gov2001

yohai＠fas.harvard.edu

27 Mar

1:13 p.m.

Hi Bilal,

If you look at the section 7 handout (in the Sections folder on the course website), there are a few examples that do what I think you would like. See subsection 4, called "R code", and also subsection 5, which shows how the Zelig syntax works for this type of thing.

In your particular example, I don't see a 'race' term in your logit equation. You can estimate a logit with income, race, and education on the right-hand side. Then, when you simulate a first difference, you can hold income and education at their means while changing race, presumably from 0 to 1 if you have race coded as a dummy variable. In Zelig this is achieved by using two setx commands; again, see subsection 5 of the section 7 handout.

On your last point about changing from 45% to 50% turnout, I'm not sure I follow entirely. But you can play around by changing the levels of education in your first difference setup, while holding the other variables at their means (or at some other values that you deem substantively important), and computing the effect on turnout. You need to take draws from a multivariate normal distribution to make this work, so I would definitely suggest moving to R.

Best,
Ian

On Tue, 28 Mar 2006, Bilal Khan wrote:

...

Hi All,

Can somebody help me understand the two types of simulation that Gary lectured on? I am still a bit confused. I use SPSS for my logit work, but I strongly believe that we have to move beyond calculating simple betas and odds and give quantities of interest along with uncertainty.

Suppose Beta = .02501 for education and Beta = .06531 for income in a logistic regression equation: Logit(turnout) = .02501 education + .06531 income. I would like to know, through an example, how you would simulate the impact of race on turnout

1. while holding income and education constant at their means.
2. with an income bracket of 30,000 to 45,000 dollars and less than a high school education.

Can somebody give an example by drawing three to four samples?

Also, suppose you have used a logistic regression model to predict the probability of voting in an election for each case in a sample from a state or an area, counting cases with a probability below .50 as not voting and cases above .50 as voting. How can you show the impact of changing the value of a variable (e.g., moving everyone with less than a high school education to at least a high school education) on the predicted turnout of, say, 45 percent for the sample? That is, I would like to say that by changing a certain variable (a kind of first difference), total turnout would improve from 45 percent to 50 percent, or whatever.

I know I can do that in SPSS, but it won't give me uncertainty or confidence intervals, which most analysts don't give for this type of "what if" analysis. I am going through the work of Wolfinger and Rosenstone, "Who Votes?"; excellent work, but there are no confidence intervals or uncertainty estimates for their quantities of interest, which are calculated through probit.

How can you use Zelig to produce such quantities of interest?

Bilal
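A minimal sketch of the first-difference simulation discussed in this exchange, in base R plus MASS rather than Zelig. It is not the section 7 handout code. The data frame `dat`, its variable codings, and the true coefficients used to generate it are all made-up stand-ins, so the printed numbers illustrate the mechanics only.

```r
library(MASS)  # for mvrnorm(); MASS ships with standard R installations

set.seed(2001)

# Synthetic stand-in data (hypothetical; not the course data set).
n <- 2000
dat <- data.frame(
  race    = rbinom(n, 1, 0.8),        # dummy: 1 = white, 0 = other
  educate = round(runif(n, 0, 19)),   # years of education
  income  = runif(n, 0, 15)           # income in $10,000s
)
eta <- -1 + 0.3 * dat$race + 0.1 * dat$educate + 0.15 * dat$income
dat$vote <- rbinom(n, 1, plogis(eta))

# Step 1: fit the logit with race, education, and income.
fit <- glm(vote ~ race + educate + income, family = binomial, data = dat)

# Step 2: draw M coefficient vectors from the estimated sampling distribution.
M <- 1000
draws <- mvrnorm(M, mu = coef(fit), Sigma = vcov(fit))

# Step 3: two covariate profiles that differ only in race
# (intercept, race, educate, income), with the others held at their means.
x0 <- c(1, 0, mean(dat$educate), mean(dat$income))
x1 <- c(1, 1, mean(dat$educate), mean(dat$income))

# Step 4: expected probability of voting under each profile, one per draw.
p0 <- plogis(draws %*% x0)
p1 <- plogis(draws %*% x1)

# Step 5: the first difference and a 95% interval from the simulations.
fd <- as.vector(p1 - p0)
fd_summary <- c(mean = mean(fd), quantile(fd, c(0.025, 0.975)))
print(round(fd_summary, 3))

# The first three simulated coefficient vectors, one row per draw.
print(round(draws[1:3, ], 3))
```

For the second scenario in the question, the same steps apply with the means in `x0` and `x1` replaced by chosen values, for example an income inside the 30,000 to 45,000 bracket and an education value below high school, under whatever coding the real data use.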


ghumphr＠fas.harvard.edu

2:56 p.m.

The stochastic component of a logit model may be written as a Bernoulli distribution with some parameter $\pi$. If you have a large sample that reflects the population, you may draw from it as if it were the population for the purpose of producing estimates. If you fit the model to samples drawn (with replacement) from your sample, you can look at the variation in the model parameters across those samples to estimate confidence intervals for the parameters. Normally you will want to draw samples of the same size as the original sample.

Quoting Bilal Khan <sbhk597(a)gmail.com>:

...

Hi Ian,

So sorry to bother you again. This material is so important that I would like to make sure I understand it properly. I don't know why I have a problem understanding it, perhaps due to unfamiliarity with the notation or with matrix algebra, but I really want to understand it properly, and that is why I am troubling you. I really appreciate your help on this.

Okay! I have gone through the notes and the article and everything, but I am still not clear. What do you mean when you say "you can use the draws of the coefficient to simulate uncertainty about these fitted values", or when Gary says in his article, on page three, point 2, "Draw one value of the vector .... ...from the multivariate normal distribution in Equation 4. Denote the .......", and points 3 and 4 on the same page: "3. Taking the simulated effect ......." and "4. Simulate the outcome variable Y hat........"?

Can you give me an example of two or three random draws using the logit model from Gary's article, the one he simulated from the NES study by Rosenstone and Hansen? What I really did not understand was *how he repeated for each case the expected value algorithm M = 1000 times to approximate a 99 percent confidence interval around the probability of voting.*

I would really appreciate your help.

Thanks again,
Bilal

On 3/28/06, Ian Brett Yohai <yohai(a)fas.harvard.edu> wrote:

...

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l
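The resampling idea in the reply above (refit the model on samples drawn from the sample, then look at how the coefficients vary) is usually called the nonparametric bootstrap. It is a different route to uncertainty than drawing coefficients from a multivariate normal distribution. A minimal sketch, assuming a made-up data frame `dat` and an arbitrary choice of 200 resamples:

```r
set.seed(2001)

# Synthetic stand-in data (hypothetical; not the course data set).
n <- 1000
dat <- data.frame(educate = round(runif(n, 0, 19)), income = runif(n, 0, 15))
dat$vote <- rbinom(n, 1, plogis(-1.5 + 0.1 * dat$educate + 0.15 * dat$income))

# Refit the logit on B resamples, each of size n and drawn with replacement.
B <- 200
boot_coefs <- t(replicate(B, {
  idx <- sample(n, n, replace = TRUE)
  coef(glm(vote ~ educate + income, family = binomial, data = dat[idx, ]))
}))

# Percentile intervals: the 2.5% and 97.5% quantiles of each coefficient
# across the B refits.
boot_ci <- apply(boot_coefs, 2, quantile, probs = c(0.025, 0.975))
print(round(boot_ci, 3))
```

Each row of `boot_coefs` is one refit, so the spread down a column shows how much that coefficient moves from resample to resample.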


nall＠fas.harvard.edu

4:59 p.m.

New subject: [gov2001-l] Re: forward loop question

Suzanna,

There's no need to use a for loop in this situation. R's arithmetic operators and abs() work element by element on whole columns, so you can apply the operation to the two columns directly and get one result per row. Hence, one line:

    vector <- abs(diversity.clean[,1]-diversity.clean[,2])

On 3/28/06, Suzanna Chapman <schapman(a)fas.harvard.edu> wrote:

...

Hi guys,

quick question on forward loops - I have used cbind to put two columns of data together, and I want to fill a vector with the absolute values of the difference between the two columns for each row. I know this should be easy, but I'm not getting it to work - here's what I have but it doesn't work - can someone tell me what I'm missing?

thanks!
- Suzanna

    vector<-c()
    for (i in 3110) {
      vector[i] <- abs(diversity.clean[i,1]-diversity.clean[i,2])
    }

On Wed, 29 Mar 2006, Bilal Khan wrote:

...
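The one-line answer in this message relies on R applying subtraction and abs() element by element. A minimal sketch comparing it with an explicit loop; the random matrix below is a made-up stand-in for the poster's `diversity.clean`, and `vec_diff` and `loop_diff` are names chosen for illustration:

```r
# Hypothetical stand-in for diversity.clean: two numeric columns joined with cbind().
set.seed(1)
diversity.clean <- cbind(runif(3110), runif(3110))

# Vectorized version: one absolute difference per row, no loop needed.
vec_diff <- abs(diversity.clean[, 1] - diversity.clean[, 2])

# Equivalent loop, iterating over every row index.
loop_diff <- numeric(nrow(diversity.clean))
for (i in seq_len(nrow(diversity.clean))) {
  loop_diff[i] <- abs(diversity.clean[i, 1] - diversity.clean[i, 2])
}

# Both approaches give the same vector.
print(identical(vec_diff, loop_diff))
```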


owasow＠fas.harvard.edu

5:22 p.m.

New subject: [gov2001-l] Re: forward loop question

Hi Suzanna:

Clayton and Delia have answered the substantive part of your question, but I thought it might be helpful to also note that your for loop is probably not working because the "3110" actually needs to be written as "1:3110" (assuming you want to start at 1). As written, the loop body runs only once, with i equal to 3110.

Omar

On Mar 28, 2006, at 9:55 PM, Suzanna Chapman wrote:

Hi guys,

quick question on forward loops - I have used cbind to put two columns of data together, and I want to fill a vector with the absolute values of the difference between the two columns for each row. I know this should be easy, but I'm not getting it to work - here's what I have but it doesn't work - can someone tell me what I'm missing?

thanks!
- Suzanna

    vector<-c()
    for (i in 3110) {
      vector[i] <- abs(diversity.clean[i,1]-diversity.clean[i,2])
    }

On Wed, 29 Mar 2006, Bilal Khan wrote:

...

Omar Wasow
M.A.-Ph.D. Candidate
Department of African and African American Studies
Department of Government
Harvard University
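A small sketch of the loop-range point made in this message, using 5 in place of 3110 so the result is easy to read. The names `vals` and `vals2` are chosen for illustration.

```r
# for (i in 5) iterates over the single value 5, so the body runs once.
vals <- c()
for (i in 5) {
  vals[i] <- i^2
}
print(vals)    # positions 1 to 4 are never assigned and are filled with NA

# for (i in 1:5) iterates over 1, 2, 3, 4, 5, filling every position.
vals2 <- numeric(5)
for (i in 1:5) {
  vals2[i] <- i^2
}
print(vals2)
```

The same thing happens with `for (i in 3110)`: only element 3110 is computed, and elements 1 through 3109 are left as NA.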


yohai＠fas.harvard.edu

9:26 a.m.

New subject: [gov2001-l] Random effects logit

I don't know offhand, but the Pinheiro and Bates book, Mixed-Effects Models in S and S-PLUS, might have something.

Best,
Ian

On Wed, 29 Mar 2006, Holger Lutz Kern wrote:

...

Hi all,

does anyone know what package to use to fit a random-effects logit model? I've checked out nlme but couldn't find it in there...

best,
Holger

Omar Wasow wrote:

...

--
Holger Lutz Kern
Graduate Student
Department of Government
Cornell University

Institute for Quantitative Social Science
Harvard University
1737 Cambridge Street N350
Cambridge, MA 02138
www.people.cornell.edu/pages/hlk23


yohai＠fas.harvard.edu

28 Mar

5:19 p.m.

Hi Bilal,

Here is an example.

#step 1 - fit the model, in this case a logit model
library(Zelig)
library(MASS)   #provides mvrnorm()
data(turnout)
z.out <- zelig(vote ~ race + educate + income, model = "logit", data = turnout)

#step 2 - draw parameters from their sampling distribution
#MLEs are asymptotically normal, so draw from a multivariate normal distribution
#here we are taking 1000 draws
#mu = the parameter estimates (in this case, the coefficients from the logit model)
#Sigma = variance-covariance matrix
draws <- mvrnorm(1000, mu=z.out$coefficients, Sigma=summary(z.out)$cov.unscaled)

#this is what one draw looks like, one column for each of the coefficients:
draws[1,]
#(Intercept)   racewhite     educate      income
#    -0.8687      0.3077      0.1098      0.1400

#now use these draws to get predicted values for each obs in the dataset
#do this by plugging the draws into the systematic component (logit, in this case)
#set X's equal to observed values for each observation, include intercept
#race is a factor, so code it as the 0/1 dummy that matches the racewhite coefficient
X <- cbind(1, as.numeric(turnout$race == "white"), turnout$educate, turnout$income)

#calculate predicted values
#1,000 predicted values for each of the 2,000 observations, as we set M=1000
pred <- 1/(1+exp(-X%*%t(draws)))

The graph that Gary talked about in lecture was a 99% CI showing the effect of age on turnout by level of education. So basically, you set age equal to each value between 18 and 90, set education to high school, and keep the other variables at their means. You can then simulate 1,000 expected values for each age value. To get the 99% CI, just sort the expected values in numerical order and take the 5th and 995th values as your CI (you can do this using the quantile() function in R as well, with probabilities 0.005 and 0.995). Then repeat with education set to college, and plot both along with the 99% CI bars.

Best,
Ian

On Wed, 29 Mar 2006, Bilal Khan wrote:

...

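The age-by-education interval described at the end of Ian's example can be sketched as follows. This is not the code behind the lecture graph. The data are synthetic, the model includes age only linearly, and treating 12 and 16 years as "high school" and "college" is an assumption made for illustration, as is the helper name `sim_ci`.

```r
library(MASS)  # for mvrnorm()
set.seed(2001)

# Synthetic stand-in data (hypothetical; not the turnout data set).
n <- 2000
dat <- data.frame(
  age     = sample(18:90, n, replace = TRUE),
  educate = sample(8:19, n, replace = TRUE),
  income  = runif(n, 0, 15)
)
eta <- -3 + 0.03 * dat$age + 0.12 * dat$educate + 0.1 * dat$income
dat$vote <- rbinom(n, 1, plogis(eta))

# Fit the logit and draw M coefficient vectors.
fit <- glm(vote ~ age + educate + income, family = binomial, data = dat)
M <- 1000
draws <- mvrnorm(M, mu = coef(fit), Sigma = vcov(fit))

ages <- 18:90

# For one education level: M expected values at each age, summarized by
# the mean and the 0.5% and 99.5% quantiles (a 99% interval).
sim_ci <- function(educ_years) {
  X  <- cbind(1, ages, educ_years, mean(dat$income))  # one row per age
  ev <- plogis(X %*% t(draws))                        # length(ages) x M
  data.frame(
    age   = ages,
    lower = apply(ev, 1, quantile, probs = 0.005),
    mean  = rowMeans(ev),
    upper = apply(ev, 1, quantile, probs = 0.995)
  )
}

hs      <- sim_ci(12)  # assumption: 12 years of education = high school
college <- sim_ci(16)  # assumption: 16 years of education = college

print(head(hs, 3))
print(head(college, 3))
```

Plotting `mean` against `age` for both data frames, with vertical bars from `lower` to `upper` (for example via segments()), gives a figure of the kind described.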
