[gov2001-l] Quantities of Interest

List overview All Threads
Download

newer

older

[gov2001-l] Spell Checking LaTeX...

[gov2001-l] replication

ghumphr＠fas.harvard.edu

28 Mar 2006 28 Mar '06

8:59 p.m.

Hi, I was thinking about how to do exponential regression and came up with this quick optimization. I was considering using it to fill in missing parts of a Zipf distribution, but I have decided that the assumptions to do so are not met, particularly considering that political manifestos in German (which has a lot of compound words) have a pretty well-filled Zipf distribution whereas manifestos in English (which has a lot of multi-word terms) do not. I am curious as to how to adapt this estimator to time-series analysis and to make it more robust. Geoff # the function to optimize f <- function(par, X, Y, W=rep(1, nrow(X))) { beta <- par beta[2] <- 1/par[2] sum(t(W) %*% abs(exp(X %*% beta) - Y)) } # read in the data and set up variables table <- read.csv("table.csv", header=T) Y <- as.matrix(cbind(table[7])) X <- as.matrix(cbind(1, rev(c(1:nrow(Y))))) W <- as.matrix(table[6]) # do a linear regression on transformed Y values, taking the # reciprocal of beta_2 for optimization simplicity lm.out <- lm(log(Y)~(X[,2]), weights=as.vector(W)) par <- c(coefficients(lm.out)[1], 1/coefficients(lm.out)[2]) b0 <- par # extract a set of coefficients for initializing the optimization betahat_ <- optim(par, f, method="CG", X=X, Y=Y, W=W)$par betahat <- betahat_ betahat[2] <- 1/betahat[2] # plot plot(X[,2], Y, main="Price of IBM vs Time", xlab="day", ylab="adjusted price") lines(X[,2], exp(X %*% betahat), col="blue") b0[2] <- 1/b0[2] lines(X[,2], exp(X %*% b0), col="red") legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute Least Error Predictor"), fill=c("red", "blue"), bty="n")

Show replies by date

yohai＠fas.harvard.edu

29 Mar 29 Mar

3:07 a.m.

Hi Geoff, Exponential regression models are duration models (time until an event happens) - so I'm not quite sure what you mean when you say you want to adapt it to time-series analysis. Also zelig fits exponential models. Best, Ian On Wed, 29 Mar 2006 ghumphr(a)fas.harvard.edu wrote:

...

ghumphr＠fas.harvard.edu

10:23 a.m.

New subject: [gov2001-l] Time Series

Zelig fits models of the form y=ae^(bx); I just reparameterized and fit the model with another error function; the reparameterization below could be improved in order to take advantage of greater floating point precision (which Zelig presumeably does). Exponential models are also useful for modeling population growth of yeast in vats and sundry other phenomena. Something was mentioned in lecture while on the slide "Quantities of Interest" about drawing from the log historical prices of stocks. YourCast fits time series models. http://gking.harvard.edu/yourcast/ Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>:

...

Hi, I was thinking about how to do exponential regression and came up with this quick optimization. I was considering using it to fill in missing parts of

Zipf distribution, but I have decided that the assumptions to do so are not met, particularly considering that political manifestos in German (which

has a

lot of compound words) have a pretty well-filled Zipf distribution whereas manifestos in English (which has a lot of multi-word terms) do not. I am curious as to how to adapt this estimator to time-series analysis and to

make

it more robust. Geoff # the function to optimize f <- function(par, X, Y, W=rep(1, nrow(X))) { beta <- par beta[2] <- 1/par[2] sum(t(W) %*% abs(exp(X %*% beta) - Y)) } # read in the data and set up variables table <- read.csv("table.csv", header=T) Y <- as.matrix(cbind(table[7])) X <- as.matrix(cbind(1, rev(c(1:nrow(Y))))) W <- as.matrix(table[6]) # do a linear regression on transformed Y values, taking the # reciprocal of beta_2 for optimization simplicity lm.out <- lm(log(Y)~(X[,2]), weights=as.vector(W)) par <- c(coefficients(lm.out)[1], 1/coefficients(lm.out)[2]) b0 <- par # extract a set of coefficients for initializing the optimization betahat_ <- optim(par, f, method="CG", X=X, Y=Y, W=W)$par betahat <- betahat_ betahat[2] <- 1/betahat[2] # plot plot(X[,2], Y, main="Price of IBM vs Time", xlab="day", ylab="adjusted

price")

lines(X[,2], exp(X %*% betahat), col="blue") b0[2] <- 1/b0[2] lines(X[,2], exp(X %*% b0), col="red") legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute

Least

Error Predictor"), fill=c("red", "blue"), bty="n")

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

ghumphr＠fas.harvard.edu

7:32 p.m.

I took a closer look and decided that the whole thing is garbage. Still working on it (spring break and replication pretty much done). Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>:

...

Hi, I was thinking about how to do exponential regression and came up with this quick optimization. I was considering using it to fill in missing parts of

Zipf distribution, but I have decided that the assumptions to do so are not met, particularly considering that political manifestos in German (which

has a

make

price")

lines(X[,2], exp(X %*% betahat), col="blue") b0[2] <- 1/b0[2] lines(X[,2], exp(X %*% b0), col="red") legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute

Least

Error Predictor"), fill=c("red", "blue"), bty="n")

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

ghumphr＠fas.harvard.edu

30 Mar 30 Mar

2:17 a.m.

I would be interested in learning how to adapt this sort of estimator to handle serially correlated errors. table <- read.csv("table.csv") table <- table[ rev( c(1:nrow(table))),] money <- 1000 commission = 8 ndays <- 250 starts <- 1:(nrow(table) - ndays) ends <- starts + ndays interest <- (table[ends, 7] - table[starts, 7]) / table[starts, 7] contribution <- money * interest - 2 * commission expected_contribution <- mean(contribution) hist(contribution) abline(v=expected_contribution, col="red") print("expected_contribution:") print(expected_contribution) Quoting ghumphr(a)fas.harvard.edu:

...

I took a closer look and decided that the whole thing is garbage. Still working on it (spring break and replication pretty much done). Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>:

this

> quick optimization. I was considering using it to fill in missing parts

a > Zipf distribution, but I have decided that the assumptions to do so are

not

met, particularly considering that political manifestos in German (which

has a > lot of compound words) have a pretty well-filled Zipf distribution

whereas

manifestos in English (which has a lot of multi-word terms) do not. I am curious as to how to adapt this estimator to time-series analysis and to

make

price")

lines(X[,2], exp(X %*% betahat), col="blue") b0[2] <- 1/b0[2] lines(X[,2], exp(X %*% b0), col="red") legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute

Least

Error Predictor"), fill=c("red", "blue"), bty="n")

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

ghumphr＠fas.harvard.edu

11:44 a.m.

...

I took a closer look and decided that the whole thing is garbage. Still working on it (spring break and replication pretty much done). Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>:

this

> quick optimization. I was considering using it to fill in missing parts

a > Zipf distribution, but I have decided that the assumptions to do so are

not

met, particularly considering that political manifestos in German (which

has a > lot of compound words) have a pretty well-filled Zipf distribution

whereas

manifestos in English (which has a lot of multi-word terms) do not. I am curious as to how to adapt this estimator to time-series analysis and to

make

price")

lines(X[,2], exp(X %*% betahat), col="blue") b0[2] <- 1/b0[2] lines(X[,2], exp(X %*% b0), col="red") legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute

Least

Error Predictor"), fill=c("red", "blue"), bty="n")

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

yohai＠fas.harvard.edu

11:54 a.m.

Hi Geoff, Yes, exponential models can fit multiple independent variables - like any other sysetmatic component. Best, Ian On Thu, 30 Mar 2006 ghumphr(a)fas.harvard.edu wrote:

...

I took a closer look and decided that the whole thing is garbage. Still working on it (spring break and replication pretty much done). Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>:

this

> quick optimization. I was considering using it to fill in missing parts

a > Zipf distribution, but I have decided that the assumptions to do so are

not

met, particularly considering that political manifestos in German (which

has a > lot of compound words) have a pretty well-filled Zipf distribution

whereas

manifestos in English (which has a lot of multi-word terms) do not. I am curious as to how to adapt this estimator to time-series analysis and to

make

price")

lines(X[,2], exp(X %*% betahat), col="blue") b0[2] <- 1/b0[2] lines(X[,2], exp(X %*% b0), col="red") legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute

Least

Error Predictor"), fill=c("red", "blue"), bty="n")

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

abby.williamson＠gmail.com

1:31 p.m.

New subject: [gov2001-l] Alpha (Heterogeneity Coef.)

Hello All, After extensive sleuthing, we still have a mystery coefficient in one of the less crucial tables in our paper. The table includes three models of a negative binomial regression with the size of discussion networks as the independent variable. The explanatory variables are (1) whether the respondent came from a 1985 or 2004 wave of the survey, (2) the interviewers' perceptions of the respondent's cooperativeness (cooperative, restless, hostile, with friendly/interested as the omitted base case), (3) the number of the previous 10 questions the respondent had refused to answer, and (4) demographic characteristics of the respondents (education, sex, age, marriage status, race). Finally, each model includes the following coefficient, labeled only as "Alpha (Heterogeneity Coef.)," followed by an F-test. We can't find anything in the article on what this might be. It doesn't seem like it would be appropriate to use a Cronbach's alpha here (and the coefficients are really low - 0.059 to 0.159, with the values getting lower as additional explanatory variables of the above added to the model). And if they are using Blau's heterogeneity index, we can't figure out how. Does this jog anyone's memory? If not, we'll contact the authors, but I wanted to try this last ditch effort. Many thanks, Abby -----Original Message----- From: gov2001-l-bounces(a)lists.fas.harvard.edu [mailto:gov2001-l-bounces@lists.fas.harvard.edu]On Behalf Of Ian Brett Yohai Sent: Thursday, March 30, 2006 4:54 PM To: gov2001-l(a)lists.fas.harvard.edu Subject: Re: [gov2001-l] Quantities of Interest Hi Geoff, Yes, exponential models can fit multiple independent variables - like any other sysetmatic component. Best, Ian On Thu, 30 Mar 2006 ghumphr(a)fas.harvard.edu wrote:

...

One last question on this subject (and a few words of praise for Zelig, an excellent package)- can exponential models be fit reliably to multiple independent variables? I hope that all is well with you. Quoting ghumphr(a)fas.harvard.edu: > I took a closer look and decided that the whole thing is garbage. Still > working > on it (spring break and replication pretty much done). > > Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>: > > > Hi Geoff, > > > > Exponential regression models are duration models (time until an event > > happens) - so I'm not quite sure what you mean when you say you want

...

> > adapt it to time-series analysis. Also zelig fits exponential models. > > > > Best, > > Ian > > > > On Wed, 29 Mar 2006 ghumphr(a)fas.harvard.edu wrote: > > > > > Hi, > > > > > > I was thinking about how to do exponential regression and came up

with

...

> this > > > quick optimization. I was considering using it to fill in missing

parts

...

> of > > a > > > Zipf distribution, but I have decided that the assumptions to do so

are

...

> not > > > met, particularly considering that political manifestos in German

(which

...

> > has a > > > lot of compound words) have a pretty well-filled Zipf distribution > whereas > > > manifestos in English (which has a lot of multi-word terms) do not.

I am

...

> > > curious as to how to adapt this estimator to time-series analysis

and to

...

> > make > > > it more robust. > > > > > > Geoff > > > > > > > > > # the function to optimize > > > f <- function(par, X, Y, W=rep(1, nrow(X))) > > > { > > > beta <- par > > > beta[2] <- 1/par[2] > > > sum(t(W) %*% abs(exp(X %*% beta) - Y)) > > > } > > > > > > # read in the data and set up variables > > > table <- read.csv("table.csv", header=T) > > > Y <- as.matrix(cbind(table[7])) > > > X <- as.matrix(cbind(1, rev(c(1:nrow(Y))))) > > > W <- as.matrix(table[6]) > > > > > > # do a linear regression on transformed Y values, taking the > > > # reciprocal of beta_2 for optimization simplicity > > > lm.out <- lm(log(Y)~(X[,2]), weights=as.vector(W)) > > > par <- c(coefficients(lm.out)[1], 1/coefficients(lm.out)[2]) > > > b0 <- par > > > > > > # extract a set of coefficients for initializing the optimization > > > betahat_ <- optim(par, f, method="CG", X=X, Y=Y, W=W)$par > > > betahat <- betahat_ > > > betahat[2] <- 1/betahat[2] > > > > > > # plot > > > plot(X[,2], Y, main="Price of IBM vs Time", xlab="day",

ylab="adjusted

...

> > price") > > > lines(X[,2], exp(X %*% betahat), col="blue") > > > b0[2] <- 1/b0[2] > > > lines(X[,2], exp(X %*% b0), col="red") > > > legend(x=0, y=120, legend=c("Naive Transformed Least Squares",

"Absolute

...

Least

Error Predictor"), fill=c("red", "blue"), bty="n")

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

dennis_feehan＠harvard.edu

1:51 p.m.

New subject: [gov2001-l] Alpha (Heterogeneity Coef.)

Hi, If I remember correctly, when you run a negative binomial model in stata, it includes some sort of hypothesis test to determine whether or not there is enough overdispersion to conclude that the negative binomial model is an appropriate alternative to the poisson. Is it possible that this is what you're seeing? If you have access to stata, you might try running the regression (the command is "nbreg") to see if it looks like that's what the author is reporting. Good luck, Dennis -----Original Message----- From: gov2001-l-bounces(a)lists.fas.harvard.edu [mailto:gov2001-l-bounces@lists.fas.harvard.edu] On Behalf Of Abby Williamson Sent: Thursday, March 30, 2006 6:31 PM To: gov2001-l(a)lists.fas.harvard.edu Subject: [gov2001-l] Alpha (Heterogeneity Coef.) Hello All, After extensive sleuthing, we still have a mystery coefficient in one of the less crucial tables in our paper. The table includes three models of a negative binomial regression with the size of discussion networks as the independent variable. The explanatory variables are (1) whether the respondent came from a 1985 or 2004 wave of the survey, (2) the interviewers' perceptions of the respondent's cooperativeness (cooperative, restless, hostile, with friendly/interested as the omitted base case), (3) the number of the previous 10 questions the respondent had refused to answer, and (4) demographic characteristics of the respondents (education, sex, age, marriage status, race). Finally, each model includes the following coefficient, labeled only as "Alpha (Heterogeneity Coef.)," followed by an F-test. We can't find anything in the article on what this might be. It doesn't seem like it would be appropriate to use a Cronbach's alpha here (and the coefficients are really low - 0.059 to 0.159, with the values getting lower as additional explanatory variables of the above added to the model). And if they are using Blau's heterogeneity index, we can't figure out how. Does this jog anyone's memory? If not, we'll contact the authors, but I wanted to try this last ditch effort. Many thanks, Abby -----Original Message----- From: gov2001-l-bounces(a)lists.fas.harvard.edu [mailto:gov2001-l-bounces@lists.fas.harvard.edu]On Behalf Of Ian Brett Yohai Sent: Thursday, March 30, 2006 4:54 PM To: gov2001-l(a)lists.fas.harvard.edu Subject: Re: [gov2001-l] Quantities of Interest Hi Geoff, Yes, exponential models can fit multiple independent variables - like any other sysetmatic component. Best, Ian On Thu, 30 Mar 2006 ghumphr(a)fas.harvard.edu wrote:

...

One last question on this subject (and a few words of praise for Zelig, an excellent package)- can exponential models be fit reliably to multiple independent variables? I hope that all is well with you. Quoting ghumphr(a)fas.harvard.edu: > I took a closer look and decided that the whole thing is garbage. > Still working on it (spring break and replication pretty much done). > > Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>: > > > Hi Geoff, > > > > Exponential regression models are duration models (time until an > > event > > happens) - so I'm not quite sure what you mean when you say you > > want

...

with

...

> this > > > quick optimization. I was considering using it to fill in > > > missing

parts

...

> of > > a > > > Zipf distribution, but I have decided that the assumptions to do > > > so

are

...

> not > > > met, particularly considering that political manifestos in > > > German

(which

...

> > has a > > > lot of compound words) have a pretty well-filled Zipf > > > distribution > whereas > > > manifestos in English (which has a lot of multi-word terms) do not.

I am

...

> > > curious as to how to adapt this estimator to time-series > > > analysis

and to

...

> > make > > > it more robust. > > > > > > Geoff > > > > > > > > > # the function to optimize > > > f <- function(par, X, Y, W=rep(1, nrow(X))) { > > > beta <- par > > > beta[2] <- 1/par[2] > > > sum(t(W) %*% abs(exp(X %*% beta) - Y)) } > > > > > > # read in the data and set up variables table <- > > > read.csv("table.csv", header=T) Y <- as.matrix(cbind(table[7])) > > > X <- as.matrix(cbind(1, rev(c(1:nrow(Y))))) W <- > > > as.matrix(table[6]) > > > > > > # do a linear regression on transformed Y values, taking the # > > > reciprocal of beta_2 for optimization simplicity lm.out <- > > > lm(log(Y)~(X[,2]), weights=as.vector(W)) par <- > > > c(coefficients(lm.out)[1], 1/coefficients(lm.out)[2]) b0 <- par > > > > > > # extract a set of coefficients for initializing the > > > optimization betahat_ <- optim(par, f, method="CG", X=X, Y=Y, > > > W=W)$par betahat <- betahat_ betahat[2] <- 1/betahat[2] > > > > > > # plot > > > plot(X[,2], Y, main="Price of IBM vs Time", xlab="day",

ylab="adjusted

...

> > price") > > > lines(X[,2], exp(X %*% betahat), col="blue") b0[2] <- 1/b0[2] > > > lines(X[,2], exp(X %*% b0), col="red") legend(x=0, y=120, > > > legend=c("Naive Transformed Least Squares",

"Absolute

...

Least

Error Predictor"), fill=c("red", "blue"), bty="n")

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l _______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

hlk23＠cornell.edu

5:30 p.m.

New subject: [gov2001-l] Zelig

Hi all, does anyone know whether it is possible to trick Zelig into computing quantities of interest for models it doesn't officially know about, e.g., a random effects logit model? Holger ghumphr(a)fas.harvard.edu wrote:

...

I took a closer look and decided that the whole thing is garbage. Still working on it (spring break and replication pretty much done). Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>:

this

> quick optimization. I was considering using it to fill in missing parts

a > Zipf distribution, but I have decided that the assumptions to do so are

not

met, particularly considering that political manifestos in German (which

has a > lot of compound words) have a pretty well-filled Zipf distribution

whereas

manifestos in English (which has a lot of multi-word terms) do not. I am curious as to how to adapt this estimator to time-series analysis and to

make

price")

lines(X[,2], exp(X %*% betahat), col="blue") b0[2] <- 1/b0[2] lines(X[,2], exp(X %*% b0), col="red") legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute

Least

Error Predictor"), fill=c("red", "blue"), bty="n")

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

-- Holger Lutz Kern Graduate Student Department of Government Cornell University Institute for Quantitative Social Science Harvard University 1737 Cambridge Street N350 Cambridge, MA 02138 www.people.cornell.edu/pages/hlk23

king＠harvard.edu

5:35 p.m.

New subject: [gov2001-l] Zelig

see the 'advanced' chapter in Zelig. what it requires is 'wrapping' a function that does random effects logit, tho, which is probalby not the answer you wanted! Gary On Thu, 30 Mar 2006, Holger Lutz Kern wrote:

...

I took a closer look and decided that the whole thing is garbage. Still working on it (spring break and replication pretty much done). Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>:

this

> quick optimization. I was considering using it to fill in missing parts

a > Zipf distribution, but I have decided that the assumptions to do so are

not

> met, particularly considering that political manifestos in German (which has a > lot of compound words) have a pretty well-filled Zipf distribution

whereas

> manifestos in English (which has a lot of multi-word terms) do not. I > am > curious as to how to adapt this estimator to time-series analysis and to make > it more robust. > > Geoff > > > # the function to optimize > f <- function(par, X, Y, W=rep(1, nrow(X))) > { > beta <- par > beta[2] <- 1/par[2] > sum(t(W) %*% abs(exp(X %*% beta) - Y)) > } > > # read in the data and set up variables > table <- read.csv("table.csv", header=T) > Y <- as.matrix(cbind(table[7])) > X <- as.matrix(cbind(1, rev(c(1:nrow(Y))))) > W <- as.matrix(table[6]) > > # do a linear regression on transformed Y values, taking the > # reciprocal of beta_2 for optimization simplicity > lm.out <- lm(log(Y)~(X[,2]), weights=as.vector(W)) > par <- c(coefficients(lm.out)[1], 1/coefficients(lm.out)[2]) > b0 <- par > > # extract a set of coefficients for initializing the optimization > betahat_ <- optim(par, f, method="CG", X=X, Y=Y, W=W)$par > betahat <- betahat_ > betahat[2] <- 1/betahat[2] > > # plot > plot(X[,2], Y, main="Price of IBM vs Time", xlab="day", ylab="adjusted price") > lines(X[,2], exp(X %*% betahat), col="blue") > b0[2] <- 1/b0[2] > lines(X[,2], exp(X %*% b0), col="red") > legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute Least > Error Predictor"), fill=c("red", "blue"), bty="n") > > > > _______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

delia＠hss.caltech.edu

31 Mar 31 Mar

12:36 a.m.

New subject: [gov2001-l] Zelig

Hi Holger, I was thinking more about your problem, and I think what you want is the function glmmPQL() in the package MASS. it estimates generalized linear mixed effects models by repeated calls to lme in the nlme package. it suppports all the glm families, i.e., link=logit. Hope this helps, Delia Quoting Gary King <king(a)harvard.edu>du>:

...

I took a closer look and decided that the whole thing is garbage. Still working on it (spring break and replication pretty much done). Quoting Ian Brett Yohai <yohai(a)fas.harvard.edu>du>: > Hi Geoff, > > Exponential regression models are duration models (time until an event > happens) - so I'm not quite sure what you mean when you say you want to > adapt it to time-series analysis. Also zelig fits exponential models. > > Best, > Ian > > On Wed, 29 Mar 2006 ghumphr(a)fas.harvard.edu wrote: > >> Hi, >> >> I was thinking about how to do exponential regression and came up with this >> quick optimization. I was considering using it to fill in missing parts of > a >> Zipf distribution, but I have decided that the assumptions to do so are not >> met, particularly considering that political manifestos in German (which > has a >> lot of compound words) have a pretty well-filled Zipf distribution whereas >> manifestos in English (which has a lot of multi-word terms) do >> not. I am >> curious as to how to adapt this estimator to time-series analysis and to > make >> it more robust. >> >> Geoff >> >> >> # the function to optimize >> f <- function(par, X, Y, W=rep(1, nrow(X))) >> { >> beta <- par >> beta[2] <- 1/par[2] >> sum(t(W) %*% abs(exp(X %*% beta) - Y)) >> } >> >> # read in the data and set up variables >> table <- read.csv("table.csv", header=T) >> Y <- as.matrix(cbind(table[7])) >> X <- as.matrix(cbind(1, rev(c(1:nrow(Y))))) >> W <- as.matrix(table[6]) >> >> # do a linear regression on transformed Y values, taking the >> # reciprocal of beta_2 for optimization simplicity >> lm.out <- lm(log(Y)~(X[,2]), weights=as.vector(W)) >> par <- c(coefficients(lm.out)[1], 1/coefficients(lm.out)[2]) >> b0 <- par >> >> # extract a set of coefficients for initializing the optimization >> betahat_ <- optim(par, f, method="CG", X=X, Y=Y, W=W)$par >> betahat <- betahat_ >> betahat[2] <- 1/betahat[2] >> >> # plot >> plot(X[,2], Y, main="Price of IBM vs Time", xlab="day", ylab="adjusted > price") >> lines(X[,2], exp(X %*% betahat), col="blue") >> b0[2] <- 1/b0[2] >> lines(X[,2], exp(X %*% b0), col="red") >> legend(x=0, y=120, legend=c("Naive Transformed Least Squares", "Absolute > Least >> Error Predictor"), fill=c("red", "blue"), bty="n") >> >> >> >> > _______________________________________________ > gov2001-l mailing list > gov2001-l(a)lists.fas.harvard.edu > http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l > _______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

_______________________________________________ gov2001-l mailing list gov2001-l(a)lists.fas.harvard.edu http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

6608

days inactive

6610

days old

gov2001@lists.gking.harvard.edu

Manage subscription

11 comments

7 participants

tags (0)

participants (7)

abby.williamson＠gmail.com
delia＠hss.caltech.edu
dennis_feehan＠harvard.edu
ghumphr＠fas.harvard.edu
hlk23＠cornell.edu
king＠harvard.edu
yohai＠fas.harvard.edu