[gov2001-l] problems with log(exp().... Sequential processing in R causing issues

11 Mar 2009

Glad to hear it was resolved.  If something like this happens, one quick fix
is to rescale some of your X's.  Also, this is somewhat related to why we
like to work with the log scale (as in log-likelihood).  It's really easy to
hit numerical limits when we multiply stuff rather than adding stuff.

On Wed, Mar 11, 2009 at 9:11 PM, Schmidt, William <wschmidt at hbs.edu> wrote:

...
   Clarence figured out the fix for his code and sent me
and Abigail a note
 since we seemed to be the only ones affected. I made the same error as him.
 Miya was correct, it was a problem with my function. That said, R does hit a
 wall when taking an exp() of any value greater than 710. This could be
 problematic with other data and I am not sure there is a clean solution when
 using the log-likelihood of a logit with a large argument for the exp().
 Thanks for your help.

 *From:* gov2001-l-bounces at lists.fas.harvard.edu [mailto:
 gov2001-l-bounces at lists.fas.harvard.edu] *On Behalf Of *Olena Ageyeva
 *Sent:* Wednesday, March 11, 2009 7:53 PM

 *To:* gov2001-l at lists.fas.harvard.edu
 *Subject:* Re: [gov2001-l] problems with log(exp().... Sequential
 processing in R causing issues

 Just a curiosity, which problem did you get this issue with?

 Sincerely,
 Olena Ageyeva

  ------------------------------

 From: wschmidt at hbs.edu
 To: gov2001-l at lists.fas.harvard.edu
 Date: Wed, 11 Mar 2009 19:03:52 -0400
 Subject: Re: [gov2001-l] problems with log(exp().... Sequential processing
 in R causing issues

 I hate to pile on, but count me as another voice that is seeing the same
 problem. It feels like there should be an elegant work around, but elegance
 evades me. Olena offered a solution that does work but it is disappointing
 to think R would require that kind of trickeration.

 *From:* gov2001-l-bounces at lists.fas.harvard.edu [mailto:
 gov2001-l-bounces at lists.fas.harvard.edu] *On Behalf Of *Lee, Clarence
 *Sent:* Wednesday, March 11, 2009 6:57 PM
 *To:* gov2001-l at lists.fas.harvard.edu
 *Subject:* Re: [gov2001-l] problems with log(exp().... Sequential
 processing in R causing issues

 Hi guys,

 I am running into the same issues as Abigail, and I?m taking the log
 likelihood exactly as is, derived from slide 3 of Gary?s lecture titled
 ?Single Equation Models.?  Are we not suppose to use this likelihood?

 I?ve checked the inputs for dimensionality and errors as well, and I do not
 think that?s causing the problem.  This seems to be a numerical overflow
 issue when dealing with Logit Models.   Do you guys know if there is a
 general workaround for this aside from manually coding in approximations
 ourselves?

 Clarence

 *From:* gov2001-l-bounces at lists.fas.harvard.edu [mailto:
 gov2001-l-bounces at lists.fas.harvard.edu] *On Behalf Of *Miya Woolfalk
 *Sent:* Wednesday, March 11, 2009 1:06 PM
 *To:* gov2001-l at lists.fas.harvard.edu
 *Subject:* Re: [gov2001-l] problems with log(exp().... Sequential
 processing in R causing issues

 Abigail,

 Before trying to implement a work around, you should make sure that your
 log likelihood function is programed correctly.  I have not heard of others
 encountering this issue so my suspicion is that you have an error in your
 function.

 Best,
 Miya

 On Wed, Mar 11, 2009 at 1:02 PM, Olena Ageyeva <eageeva at hotmail.com>
 wrote:

 I would try to check that your variable doesn't get grater then 710.
 This simple modification makes the algorithm work:

  m=605
 n=110
 log(exp(m)*exp(n))  [1] Inf
  log(exp(m))+log(exp(n))  [1] 715

 I would just create a function for log(exp(x)), where x is a large number.

 log_exp_for_large_numbers<-function(x)
 {
 x_array<-c()

 while (x>0)
 {
 ifelse (x>710, x_array<-cbind(x_array,709), x_array<-cbind(x_array,x))
 x<-x-709
 }
 out<-0

 for (i in 1:length(x_array))
 {
  out<-out+log(exp(x_array[i]))
 }
 return(out)
 }

  log_exp_for_large_numbers(31568)  [1]
31568

 Sincerely,
 Olena Ageyeva

  ------------------------------

 From: aallen at hbs.edu

 To: gov2001-l at lists.fas.harvard.edu

 Date: Wed, 11 Mar 2009 00:07:38 -0400
 Subject: [gov2001-l] problems with log(exp().... Sequential processing in R
 causing issues

 In attempting to debug my function I came across the below anomaly which I
 cannot figure out.

 Whenever I try to evaluate log(exp(A)) for any A >= 710  R evaluates the
 result as ?Inf? which is clearly not the case.  See R output below:

  log(exp(708))  [1] 708
  log(exp(709))  [1] 709
 *> log(exp(710))*
 *[1] Inf*
 *> log(exp(832))*
 *[1] Inf*

 I assume that this happens because R evaluates the function inside out:
 i.e. first tries to take the exp(710) which produces more digits than R will
 hold: hence R evaluates exp(710) as Inf, and then applies the log fxn to
 this result? log(Inf) = Inf.

 Ok logical. However, incredibly inconvenient for purposes of this homework
 set, and for optim() in general, as many negative values of B result in
 ln(1+e^(-xiB)) > ln(1+e^710) ?  creating an ?-Inf? evaluation of the log
 likelihood.  This sequential calculation jams my function when I try to
 optimize since R cannot sum (Inf+510+712) for example ? Error in -sum(formula
 in here J) : invalid 'type' (character) of argument.

 Has anyone else encountered this issue?

 Any suggestions?

 Thanks,

 Abigail

 *From:* gov2001-l-bounces at lists.fas.harvard.edu [mailto:
 gov2001-l-bounces at lists.fas.harvard.edu] *On Behalf Of *Patrick Lam
 *Sent:* Tuesday, March 10, 2009 2:50 PM
 *To:* gov2001-l at lists.fas.harvard.edu
 *Subject:* Re: [gov2001-l] Syntax to specify bounds in using L-BFGS-B?

 in the optim help file under "Arguments", there are two arguments that you
 are looking for.

 On Tue, Mar 10, 2009 at 2:26 PM, Allen, Abigail <aallen at hbs.edu> wrote:

 What is the syntax to specify the bounds using L-BFGS-B.  I?ve looked
 through the R-help file but can?t figure out how I tell optim what bounds to
 use?.

 *From:* gov2001-l-bounces at lists.fas.harvard.edu [mailto:
 gov2001-l-bounces at lists.fas.harvard.edu] *On Behalf Of *Lee, Clarence
 *Sent:* Monday, March 09, 2009 10:44 PM
 *To:* gov2001-l at lists.fas.harvard.edu
 *Subject:* Re: [gov2001-l] EXO 1

 I agree with Olena in that you need to use L-BFGS-B to get over that.   My
 question is more of a ?good-standard? practice:

 I notice that the problem to the reparametrization trick is that the SE?s
 gets messed up when we reparamterize ? which seems to be a bit kludgy since
 usually we care about the mean and the SE?s.   If reparamterization is a
 standard technique that most statistician use, then is there a standard
 technique with BFGS that helps us find the SE? (e.g. without specifying the
 bounds using L-BFGS-B)

 Any insights?

 Thanks,

 Clarence

 *From:* gov2001-l-bounces at lists.fas.harvard.edu [mailto:
 gov2001-l-bounces at lists.fas.harvard.edu] *On Behalf Of *Olena Ageyeva
 *Sent:* Monday, March 09, 2009 8:46 PM
 *To:* gov2001-l at lists.fas.harvard.edu
 *Subject:* Re: [gov2001-l] EXO 1

 I've got the same problem with re-parametrization and then I found out that
 instead re-parametrization in this particular case we are advised to use
 L-BFGS-B method in optim(), so we can set a lower and an upper limits for
 out point of interest. So, the point of interest will never get out of this
 interval, which is sort of reparameterization, right?
 The section notes says that log-likelihood (I guess it's in general for any
 LL-function?) is invariant to re-parametrization (any?), but I was not much
 successful applying this technique for exponential log-likelihood function.
 With re-parametrization done exactly the way it's in section note I get
 totally different result from what I derived analytically.

 As for pnorm(opt.1000 - 1.96*se) I think that it could be done separately
 for se and opt.1000 - they both are found for reparameterized function. So,
 we basically looking for opt and se in different coordinate system and then
 re-parameterized to get them back to our original coordinate system. Am I
 right?

 Sincerely,
 Olena Ageyeva

  Date: Mon, 9 Mar 2009 20:00:14 -0400
 From: charlotte.cavaille at gmail.com
 To: gov2001-l at lists.fas.harvard.edu
 Subject: [gov2001-l] EXO 1

 hi!
 Quick (but fundamental question)

 in exercise 1 gamma needs to be positive, so I re-parameterized...In
 order to get the right MLE from optim() i didn't
 forget to "re-parameterize" again...and i get the same result as the
 one i found analytically. However, things change when i look for the
 SE. Indeed, I get different answers analytically and with R (using the
 hessian). I know this comes from the re-parameterization because I
 don't have this issue when I change the whole function and do not
 re-parameterize gamma. So my question is:
 - if I re-parametterize, how do I apply the transformation to the
 hessian to get the right result

 how does that fit with the section notes that follows, why do we take
 "pnorm" (this is the first transformation that is applied in the
 ll.binom function) of "opt.1000 - 1.96*se" and not of "se" for
 instance???

 #binomial log-likelihood (N = # of trials for each observation)
 ll.binom <- function(par, y, N){
 # reparameterize pi; only search over [0,1]
 p <- pnorm(par)
 # log-likelihood
 out <- sum(y*log(p) + (N-y)*log(1-p))
 return(out)
 }

 # compare to wald ci
 se <- sqrt(solve(-optim(par=2, fn=ll.binom, y=samp.1000, N=10, 
method="BFGS",
  control=list(fnscale=-1), hessian=T)$hessian))
#$
 wald.ci <- c( pnorm(opt.1000 - 1.96*se), pnorm(opt.1000 + 1.96*se))
 wald.ci # 0.7364839 0.7535663

 - if i do not re-parameterize in order to be done with it and have
 both my analytical
 and my R result fit, how can i justify i am not re-parameterizing gamma?!

 thanks!

 charlotte
 _______________________________________________
 gov2001-l mailing list
 gov2001-l at lists.fas.harvard.edu
 http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l  
------------------------------

 Windows Live? Groups: Create an online spot for your favorite groups to
 meet. Check it out.

 _______________________________________________
 gov2001-l mailing list
 gov2001-l at lists.fas.harvard.edu
 http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

 --
 Patrick Lam
 Department of Government and Institute for Quantitative Social Science,
 Harvard University
 http://www.people.fas.harvard.edu/~plam

   ------------------------------

 Windows Live? Contacts: Organize your contact list. Check it out.

 _______________________________________________
 gov2001-l mailing list
 gov2001-l at lists.fas.harvard.edu
 http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

 --
 Miya Woolfalk
 Ph.D. Student
 Harvard University
 Government and Social Policy

  ------------------------------

 Windows Live?: Keep your life in sync. Check it
out.<http://windowslive.com/explore?ocid=TXT_TAGLM_WL_allup_1b_explore_0…

 _______________________________________________
 gov2001-l mailing list
 gov2001-l at lists.fas.harvard.edu
 http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l

-- 
Patrick Lam
Department of Government and Institute for Quantitative Social Science,
Harvard University
http://www.people.fas.harvard.edu/~plam

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

[gov2001-l] problems with log(exp().... Sequential processing in R causing issues