Thanks - how do we do a grid search?
Jeremy
On 29 Mar 2008, at 17:08, Jens Hainmueller wrote:
This looks like a rather unorthodox model, and without having seen the math it is difficult to assess what exactly the code is trying to accomplish. Unfortunately, I also don't know anything about twin studies. A few points:
1. Try the likelihood function first on toy data that you set up so that you know the solution. Only take the code to your real data once you have verified it on a small toy example.
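For instance, a minimal toy setup might look like the following (the "true" values of sa, sc and se here are arbitrary, and MASS::mvrnorm is just one convenient way to draw bivariate normals):

```r
library(MASS)  # for mvrnorm()

set.seed(1)
sa <- 1; sc <- 0.5; se <- 0.25   # known "true" variance components to recover
vcvm <- matrix(c(sa+sc+se, sa+sc,   sa+sc,   sa+sc+se), 2, 2)  # MZ varcov
vcvd <- matrix(c(sa+sc+se, sa/2+sc, sa/2+sc, sa+sc+se), 2, 2)  # DZ varcov

# 2 x N matrices of twin pairs, one pair per column, as in your code
M <- t(mvrnorm(500, mu = c(0, 0), Sigma = vcvm))
D <- t(mvrnorm(500, mu = c(0, 0), Sigma = vcvd))
```

Maximizing the likelihood on M and D should then give back estimates close to the true sa, sc and se.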
2. In the toy example, try to simplify the problem first; for example, try maximizing using D only, not M and D jointly. Also, D and M are not parameters but data, so you should pass them as arguments directly:
optim(par.svals, ll.dbnmbn, D = rbind(Yd, Xd), M = rbind(Ym, Xm))
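As a self-contained illustration of passing data through optim's extra arguments (the objective and names here are toy stand-ins, not your model):

```r
# toy example: minimize the sum of squared deviations, so the optimum is
# mean(y); y is data, not a parameter, so it is passed as a named argument
nll <- function(par, y) sum((y - par)^2)
y   <- c(1, 2, 3, 4)
fit <- optim(par = 0, fn = nll, y = y, method = "BFGS")
fit$par   # close to mean(y) = 2.5
```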
3. Triple-check your code. For example:
vcvd <- matrix(c(sa+sc+se,sa/2+sc,sa/2+sc,sa+sc+se),2,2)
vcvm <- matrix(c(sa+sc+se,sa+sc,sa+sc,sa+sc+se),2,2)
Is it correct that these two differ? Also, is the order of operations correct here? Maybe what you want is sa/(2+sc) (just a wild guess)?
4. Focus on accuracy first, then worry about efficiency. But while we are at it, that loop can definitely be eliminated, either by using apply (replace apply(D,1,FUN = muvcv.fun(mu = mu, vcv = vcv)) with apply(D,1,FUN = muvcv.fun, mu = mu, vcv = vcv), as we discussed for passing arguments within apply), or by simply rewriting the whole expression in matrix form (e.g., X - mu is cbind(X[,1]-mean(X[,1]), X[,2]-mean(X[,2])), etc.).
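On the toy objects from your message, the corrected apply call runs like this (margin 1 hands each row of D to the function; the extra arguments follow by name):

```r
mu  <- c(1, 1)
vcv <- matrix(c(1, 0, 0, 1), 2, 2)
D   <- matrix(c(1, 2, 3, 4), 2, 2)
muvcv.fun <- function(x, mu, vcv) (x - mu) %*% vcv %*% (x - mu)

# the function object itself goes in FUN; mu and vcv are passed through
q <- apply(D, 1, FUN = muvcv.fun, mu = mu, vcv = vcv)
q   # c(4, 10): quadratic forms for rows (1,3) and (2,4)
```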
5. I don't see why the search space should be highly irregular here (but you can check this with a little grid search), so optim should work. If not, try genoud() from the rgenoud package; optimize() won't help you at all.
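A little grid search can be sketched generically like this (the quadratic objective below is a toy stand-in for your log-likelihood; swap in the real function and sensible parameter ranges):

```r
# evaluate the objective at every combination of a coarse grid and keep
# the best point; useful for eyeballing the surface or as starting values
f <- function(p) -sum((p - c(1, 2, 3))^2)   # toy objective, maximum at (1, 2, 3)

grid <- expand.grid(p1 = seq(0, 4, by = 0.5),
                    p2 = seq(0, 4, by = 0.5),
                    p3 = seq(0, 4, by = 0.5))
grid$val <- apply(grid, 1, f)
best <- grid[which.max(grid$val), ]
best   # the grid point (1, 2, 3), where the toy objective peaks
```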
6. The error you are getting could be due to ill-conditioned matrices in vcvd. Check the condition number of the matrix: if it is ill-conditioned you cannot invert it using solve, but this should not happen unless your data are very highly collinear.
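In R, kappa() gives an estimate of the 2-norm condition number (the matrices here are toy examples):

```r
good <- matrix(c(2, 1, 1, 2), 2, 2)
bad  <- matrix(c(1, 1, 1, 1 + 1e-14), 2, 2)   # nearly singular

kappa(good)   # small, so solve(good) is numerically safe
kappa(bad)    # enormous, so solve(bad) is numerically meaningless
```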
hth,
jens
From: gov2001-l-bounces at lists.fas.harvard.edu [mailto:gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of Jeremy Hodgen
Sent: Saturday, March 29, 2008 11:51 AM
To: gov2001-l at lists.fas.harvard.edu
Subject: [gov2001-l] Problems with Optim
Dear All,
Apologies for the long message, but we didn't think that our questions would make sense otherwise. We're having some problems getting optim to work with our model & we'd be grateful for advice.
The paper we're trying to replicate involves a twin study and compares identical (MZ) and non-identical (DZ) twins in an attempt to estimate genetic and environmental effects. Sets of twins are modelled using a bivariate normal, with identical and non-identical twins having different but related distributions; basically, the model is that identical and non-identical twins share genetic causes 100% and 50% respectively. Everything else (environment) is assumed to be equivalent. The model is a simple one: it assumes 3 latent causal variables: genetic, shared environment and non-shared environment. We are (at least initially) interested in the effects on individual traits (e.g., Speaking at age 7, as measured by teacher assessment). If we get past this, then we'll try to replicate their longitudinal modelling. We are interested in the proportions of these effects on the variance & covariance.
So we've created a log-likelihood function based on this (see the R code pasted below) and we want to produce an MLE for the three factors. But optim doesn't work; it produces the following error message:

Error in solve.default(vcvm) :
  Lapack routine dgesv: system is exactly singular

We've tried all the different methods that optim uses (except "L-BFGS-B") and all (except "SANN") produce the same error message (sometimes referring to vcvd). We've left SANN going all night and it hasn't produced an answer.
Now, vcvd and vcvm are variance-covariance matrices, and our function does attempt to work out the inverse using solve, but the way in which we have set them up should not allow singular matrices to be produced (see the code). However, we've tried a range of different starting values & we get the same errors.
Our data have some odd features. After excluding some cases, there are roughly 2000 identical or MZ twins and roughly 3500 non-identical or DZ twins, with lots of missing values. But many of the traits that we're interested in have a very small scale: so Speaking at age 7 is just a scale of 3 levels (1, 2 & 3), and of course the trait values for many of the twins (about 75% of them) are exactly the same. We can argue about whether this is a good model, but our initial job is to replicate the results of the paper. There are several other traits that are more straightforward (so 'g', or intelligence, is measured as z-scores).
So the specific questions are:
1. Why is optim doing this, and should we use something different? The help file on optim suggests that optim might be problematic for parameters with more than one dimension (which is really what we've got here, since we use the 3 variance parameters to create a variance-covariance matrix). The help file suggests using optimize, but it's not at all clear how we should use optimize; the arguments don't appear to us to be equivalent to optim's (e.g., no par). Also, we're uninterested in mu: should we actually just work with the actual mean for the trait from our data, which would presumably be faster?
2. Is there a way of avoiding the for loop in the code? It seems to us that this will be really inefficient, and optim certainly takes a while to return the error message (> 2 minutes). We've tried using apply with a function we define ourselves:

mu <- c(1,1)
vcv <- matrix(c(1,0,0,1),2,2)
D <- matrix(c(1,2,3,4),2,2)
muvcv.fun <- function(x,mu,vcv) (x-mu) %*% vcv %*% (x-mu)
apply(D,1,FUN = muvcv.fun(mu = mu, vcv = vcv))

But we get the following error message:

Error in muvcv.fun(mu = mu, vcv = vcv) :
  argument "x" is missing, with no default

So we're unsure what the syntax should be for passing a function's arguments inside apply. We've tried various options that seem intuitive/sensible to us, although doubtless we're missing something obvious.
3. We then have to calculate the root mean square error of approximation (RMSEA). In the original paper, this is calculated by the software the authors use, Mx, which seems to be used pretty much exclusively for twin studies, although RMSEA is fairly commonplace. The formula we've found for RMSEA is sqrt[(chi^2/df - 1)/(N-1)], but it's not very clear how to use this. We think we have 6 parameters (the variances and covariances for the identical and non-identical twins) minus the 3 variance proportions that we're estimating, hence 3 df. Then N is the total sample size. But to get chi^2, do we take (obs-exp)^2/exp?
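For what it's worth, a direct transcription of that formula is easy enough; the open question is exactly what to feed it (SEM software typically uses the model's likelihood-ratio test statistic for chi^2 rather than a Pearson-style sum, but we're not certain, and the numbers below are purely illustrative):

```r
# RMSEA = sqrt((chi2/df - 1)/(N - 1)), truncated at 0 when chi2 < df
rmsea <- function(chi2, df, N) sqrt(max((chi2/df - 1)/(N - 1), 0))

rmsea(chi2 = 12, df = 3, N = 5500)   # illustrative values only
```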
Thanks
Jeremy & Jill (Hohenstein)
Here's the R-code:
# Implement log-likelihood function for MZ & DZ twins: simple comparison
# of a single-trait ACE model & bivariate normal distribution
ll.dbnmbn <- function(par, Yd, Ym, Xd, Xm){
    mu <- par[1]
    sa <- exp(par[2])  # re-parameterize R -> R+ as sigma^2 genetic (hereditary) variance
    sc <- exp(par[3])  # shared environment variance
    se <- exp(par[4])  # non-shared environment variance
    vcvd <- matrix(c(sa+sc+se, sa/2+sc, sa/2+sc, sa+sc+se), 2, 2)  # varcov matrix for DZ twins
    vcvm <- matrix(c(sa+sc+se, sa+sc,   sa+sc,   sa+sc+se), 2, 2)  # varcov matrix for MZ twins
    D <- rbind(Yd, Xd)  # 2 x N(DZ) matrix of bivariate DZ twin trait scores
    M <- rbind(Ym, Xm)  # 2 x N(MZ) matrix of bivariate MZ twin trait scores
    inv.vcvd <- solve(vcvd)  # invert once here rather than inside the loops
    inv.vcvm <- solve(vcvm)
    d <- 0
    for (i in 1:length(Xd)){
        d <- d + (D[,i]-mu) %*% inv.vcvd %*% (D[,i]-mu)  # sum((X-mu) %*% varcov^-1 %*% (X-mu)) for DZ twins
    }
    m <- 0
    for (i in 1:length(Xm)){
        m <- m + (M[,i]-mu) %*% inv.vcvm %*% (M[,i]-mu)  # same quadratic-form sum for MZ twins
    }
    # log-likelihood based on the bivariate normal: related but different for DZ and MZ twins
    out <- -length(Xd)*log(2*pi*sqrt(det(vcvd))) - 0.5*d -
            length(Xm)*log(2*pi*sqrt(det(vcvm))) - 0.5*m
    return(out)
}
fit.1 <- optim(par = c(0, 2.3, 1.5, 13.5), ll.dbnmbn, Yd = Yd, Ym = Ym,
               Xd = Xd, Xm = Xm, method = "BFGS",
               control = list(fnscale = -1), hessian = TRUE)
Dr Jeremy Hodgen
Senior Lecturer in Mathematics Education
King's College London
Department of Education and Professional Studies
Franklin-Wilkins Building
Waterloo Bridge Wing
150 Stamford Street
London SE1 9NH
Tel: 020 7848 3102
Fax: 020 7848 3182
E-mail: jeremy.hodgen at kcl.ac.uk
_______________________________________________
gov2001-l mailing list
gov2001-l at
lists.fas.harvard.edu
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l