Jens,
Thanks. We're trying to fit two highly inter-related but not
identical bivariate normals to our data - see attached. Although
maybe we're thinking about it incorrectly. There's some info on the
site our paper comes from to suggest that there may be several local
maxima - and this process requires patience!
We can't get genoud to work, though, and we get this error message with the following code:
Maximization Problem.
Error in apply(M, 1, FUN = fnxvx, y = mu, S = vcvm) :
argument "M" is missing, with no default
R-code:
D <- Db # Toy data
M <- Mb
mu <- DMmean.b
genoud(ll.dbnmbn, nvars = 3, max =T, hessian = T)
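On reflection, the error may simply be that genoud never receives mu, D and M: like optim, genoud forwards extra named arguments through its ... to the objective function. A sketch of a call supplying them (assuming the toy objects above) would be:

```r
library(rgenoud)
# Extra named arguments after genoud's own are forwarded to the objective,
# so ll.dbnmbn receives mu, D and M (a sketch, assuming the objects above)
fit <- genoud(ll.dbnmbn, nvars = 3, max = TRUE, hessian = TRUE,
              mu = DMmean.b, D = Db, M = Mb)
```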
And the function is:
# Implement log-likelihood function for MZ & DZ twins - simple comparison
# of a single trait based on an ACE model & bivariate normal distribution
fnxvx <- function(X, y, S) (X-y) %*% solve(S) %*% (X-y) # Calculates (X-mu) x varcov^-1 x (X-mu)
# Test fnxvx:
# X <- c(11,7); y <- c(1,2); S <- matrix(c(1,0,0,1),2,2); fnxvx(X,y,S)
ll.dbnmbn <- function(par, mu, D, M){
  sa <- exp(par[1]) # Re-parameterize R -> R+ as genetic (hereditary) variance (sigma^2[A])
  sc <- exp(par[2]) # Shared environment variance (sigma^2[C])
  se <- exp(par[3]) # Non-shared environment variance (sigma^2[E])
  vcvm <- matrix(c(sa+sc+se, sa+sc, sa+sc, sa+sc+se), 2, 2)     # varcov matrix for MZ (identical) twins
  vcvd <- matrix(c(sa+sc+se, sa/2+sc, sa/2+sc, sa+sc+se), 2, 2) # varcov matrix for DZ (non-identical) twins
  m <- sum(apply(M, 1, FUN = fnxvx, y = mu, S = vcvm)) # sum((X-mu) x varcov^-1 x (X-mu)) for MZ twins
  d <- sum(apply(D, 1, FUN = fnxvx, y = mu, S = vcvd)) # sum((X-mu) x varcov^-1 x (X-mu)) for DZ twins
  # Log-likelihood based on bivariate normal - related but different for DZ and MZ twins
  out <- -nrow(M)*log(det(vcvm)) - m - nrow(D)*log(det(vcvd)) - d
  return(out)
}
And our toy data:
# Create some alternative toy data based on mu = 100, sa = 6, sc = 2, se = 2
# Model-implied variance is sa+sc+se = 10; covariance is sa+sc = 8 for MZ
# twins and sa/2+sc = 5 for DZ twins
library(MASS) # mvrnorm comes from MASS
set.seed(12345)
Db <- mvrnorm(1000, c(100,100), matrix(c(10,5,5,10),2,2)); head(Db); nrow(Db)
Mb <- mvrnorm(1000, c(100,100), matrix(c(10,8,8,10),2,2)); head(Mb); nrow(Mb)
DMmean.b <- (apply(Db,2,mean) + apply(Mb,2,mean))/2; DMmean.b
ll.dbnmbn(log(c(6,2,2)), DMmean.b, Db, Mb) # Does the function still work? Yes
true.b <- c(log(6), log(2), log(2)); true.b # Numbers optim SHOULD return
With optim we get different but pretty close results from different
starting values - but some starting values produce the solve/singular
error we got earlier.
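(A possibly useful diagnostic, sketched here as a guess: because the re-parameterization exponentiates par, wild starting values can overflow the variances and make the matrices numerically singular. Base R's kappa() shows this before solve() fails:)

```r
# Sketch: inspect the condition number of the MZ var-cov matrix implied
# by a candidate starting value before handing it to optim/genoud
check.vcvm <- function(par) {
  sa <- exp(par[1]); sc <- exp(par[2]); se <- exp(par[3])
  vcvm <- matrix(c(sa+sc+se, sa+sc, sa+sc, sa+sc+se), 2, 2)
  kappa(vcvm) # very large values signal near-singularity
}
check.vcvm(log(c(6, 2, 2))) # well-conditioned at the true values
check.vcvm(c(700, 0, 0))    # exp(700) is enormous: effectively singular
```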
Jeremy
On 30 Mar 2008, at 23:25, Jens Hainmueller wrote:
Jeremy,
Check if you get the "right" results with optim and starting values reasonably close to the solution in your toy data. If you set up your toy data using reasonably well-behaved data (like multivariate normals or so), it seems a bit curious that you end up with a seemingly irregular search space. Theoretically this should not be the case, since you are fitting some form of a bivariate normal log-likelihood function (if I understand it correctly). So if the data is normal and the model is normal, there should be no issues (unless there is some issue with the code).
If the search space is really irregular (and correctly so), then try genoud (in the rgenoud package). This is another optimizer I briefly discussed in section. It works more or less just like optim but tends to work well if the search space is not well behaved. It's also faster than optim's SANN.
Good luck,
jens
From: gov2001-l-bounces at lists.fas.harvard.edu [mailto:gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of Jeremy Hodgen
Sent: Sunday, March 30, 2008 8:05 AM
To: gov2001-l at lists.fas.harvard.edu
Subject: Re: [gov2001-l] Problems with Optim
Thanks. Using toy data, different starting values for optim produce different results, so it looks like we've got a series of local maxima. None of them are the right results as yet, but we'll work on that.
Jeremy
On 30 Mar 2008, at 12:55, Gary King wrote:
A 'grid search' is when you set up a grid of values for the parameters (say all possible combinations of beta1={.1,.2,.3,.4} and sigma={4,5,6,7,8,9}, etc.) and then 'by hand' put them into your likelihood function to find the max. If the surface is really irregular (little pin holes, sharp cliffs, etc.), then you can only find the max by a grid search with a fine enough grid. This typically takes a very long time to run, and so although you can get to the max this way with any likelihood function, you always want to try the shortcuts that optim (or other algorithms) represent. But in really hard cases, it's worth giving it a try if you have the time and computing power.
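(A sketch of what this might look like for the three log-variance parameters, assuming the toy objects defined earlier in the thread:)

```r
# Hedged sketch of a coarse grid search over the three log-variances
grid <- expand.grid(la = seq(0, 3, by = 0.25),   # log sigma^2[A]
                    lc = seq(-1, 2, by = 0.25),  # log sigma^2[C]
                    le = seq(-1, 2, by = 0.25))  # log sigma^2[E]
ll <- apply(grid, 1, function(p) ll.dbnmbn(p, mu = DMmean.b, D = Db, M = Mb))
grid[which.max(ll), ] # coarse location of the maximum; refine around it
```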
Gary
On Sun, 30 Mar 2008, Jeremy Hodgen wrote:
Thanks - how do we do a grid search?
Jeremy
On 29 Mar 2008, at 17:08, Jens Hainmueller wrote:
This looks like a rather unorthodox model, and without having seen the math it is difficult to assess what exactly the code is trying to accomplish. Unfortunately, I also don't know anything about twin studies. A few points:
1. Try the likelihood function first on toy data that you set up so that you know the solution. Only take the code to your real data once you have verified your code on a small toy example first.
2. In the toy example, try to simplify the problem first; for example, try maximizing using only D rather than M and D jointly. D and M are not parameters but data, so you should pass them as arguments directly:
optim(par.svals, ll.dbnmbn, D = rbind(Yd,Xd), M = rbind(Ym,Xm))
3. Triple-check your code. For example:
vcvd <- matrix(c(sa+sc+se, sa/2+sc, sa/2+sc, sa+sc+se), 2, 2)
vcvm <- matrix(c(sa+sc+se, sa+sc, sa+sc, sa+sc+se), 2, 2)
Is it correct that these two differ? Also, is the order of operations correct here? Maybe what you want is sa/(2+sc) (just a wild guess)?
4. Focus on accuracy first, then worry about efficiency. But while we are at it, that loop can definitely be eliminated, either by using apply (replace apply(D, 1, FUN = muvcv.fun(mu = mu, vcv = vcv)) with apply(D, 1, FUN = muvcv.fun, mu = mu, vcv = vcv), as we discussed for passing arguments within apply) or by simply rewriting the whole expression in matrix form (e.g., X-mu is cbind(X[,1]-mean(X[,1]), X[,2]-mean(X[,2])), etc.).
5. I don't see why the search space should be highly irregular here (but you can check this with a little grid search), so optim should work. If not, try genoud (in the rgenoud package); optimize() won't help you at all.
6. The error you are getting could be due to ill-conditioned matrices in vcvd. Check the condition numbers on the matrix; if it's ill-conditioned you cannot invert it using solve, but this should not happen unless your data is very highly collinear.
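(To make the apply point concrete, a minimal sketch with toy values: extra named arguments after FUN are passed on to the function itself.)

```r
# Named arguments after FUN are forwarded to muvcv.fun by apply
mu  <- c(1, 1)
vcv <- matrix(c(1, 0, 0, 1), 2, 2)
D   <- matrix(c(1, 2, 3, 4), 2, 2)           # rows are (1,3) and (2,4)
muvcv.fun <- function(x, mu, vcv) (x - mu) %*% vcv %*% (x - mu)
apply(D, 1, FUN = muvcv.fun, mu = mu, vcv = vcv) # returns c(4, 10)
```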
hth,
jens
From: gov2001-l-bounces at lists.fas.harvard.edu [mailto:gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of Jeremy Hodgen
Sent: Saturday, March 29, 2008 11:51 AM
To: gov2001-l at lists.fas.harvard.edu
Subject: [gov2001-l] Problems with Optim
Dear All,
Apologies for the long message, but we didn't think that our questions
would
make sense otherwise. We're having some problems getting optim to
work with
our model & we'd be grateful for advice.
The paper we're trying to replicate involves a twin study comparing identical (MZ) and non-identical (DZ) twins to estimate genetic and environmental effects. Sets of twins are modelled using a bivariate normal, with identical and non-identical twins having different but related distributions - basically the model is that identical and non-identical twins share genetic causes 100% and 50% respectively.
Everything else (environment) is assumed to be equivalent. The
model is a
simple one - it assumes 3 latent causal variables: genetic, shared
environment and non-shared environment. We are (at least initially)
interested in the effects of individual traits (e.g., Speaking at
age 7, as
measured by teacher assessment.) If we get past this, then we'll
try and
replicate their longitudinal modeling. We are interested in the
proportions
of these effects on the variance & covariance.
So we've created a log-likelihood function based on this (see the R code pasted below) and we want to produce an MLE for the three factors. But optim doesn't work - it produces the following error message:
Error in solve.default(vcvm) :
Lapack routine dgesv: system is exactly singular
We've tried all the different methods that optim offers (except "L-BFGS-B"), and all of them (except "SANN") produce the same error message (sometimes referring to vcvd). We've left SANN going all night and it hasn't produced an answer.
Now, vcvd and vcvm are variance-covariance matrices, and our function does attempt to work out the inverse using solve, but the way in which we have set them up should not allow singular matrices to be produced (see the code). However, we've tried a range of different starting values and we get the same errors.
Our data has some odd features. After excluding some cases, there are roughly 2000 identical (MZ) twins and roughly 3500 non-identical (DZ) twins, with lots of missing values. But many of the traits that we're interested in have a very small scale - Speaking at age 7, for example, is just a scale of 3 levels (1, 2 & 3) - and of course the trait for many of the twins (about 75% of them) is exactly the same. We can argue about whether this is a good model - but our initial job is to replicate the results of the paper. There are several other traits that are more straightforward (e.g., 'g' or intelligence is measured as z-scores).
So the specific questions are:
1. Why is optim doing this, and should we use something different? The help file on optim suggests that optim might be problematic for parameters with more than one dimension (which is really what we've got here, since we use the 3 variance parameters to create a variance-covariance matrix). The help file suggests using optimize, but it's not at all clear how we should use optimize - the arguments don't appear to us to be equivalent to optim (e.g., no par). Also, we're not really interested in mu - should we just work with the actual mean for the trait from our data, which would presumably be faster?
2. Is there a way of avoiding the for loop in the code? It seems to us that this will be really inefficient, and optim certainly takes a while to return the error message (> 2 minutes). We've tried using apply with a function we define ourselves:
mu <- c(1,1)
vcv <- matrix(c(1,0,0,1),2,2)
D <- matrix(c(1,2,3,4),2,2)
muvcv.fun <- function(x,mu,vcv) (x-mu) %*% vcv %*% (x-mu)
apply(D,1,FUN = muvcv.fun(mu = mu, vcv = vcv))
But we get the following error message:
Error in muvcv.fun(mu = mu, vcv = vcv) :
argument "x" is missing, with no default
So we're unsure what the syntax for passing arguments to a function inside apply should be. We've tried various options that seem intuitive/sensible to us, although doubtless we're missing something obvious.
3. We then have to calculate the root mean square error of approximation (RMSEA). In the original paper, this is calculated by the software the authors use - Mx, which seems to be used pretty exclusively for twin studies, although RMSEA is fairly commonplace. The formula we've found for RMSEA is sqrt((chi^2/df - 1)/(N-1)), but it's not very clear how to use this. We think we have 6 parameters (the variances and covariances for the identical and non-identical twins) - 3 for the three variance proportions that we're estimating, hence 3 df. Then N is the total sample size. But to get chi^2, do we take (obs-exp)^2/exp?
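(One possible convention - our assumption here, not necessarily what Mx does - is to take chi^2 as a likelihood-ratio or Pearson chi-square statistic and plug it straight into the formula, truncating at zero:)

```r
# Hedged sketch of the RMSEA formula quoted above; chisq is assumed to be a
# chi-square fit statistic with df degrees of freedom, N the total sample size
rmsea <- function(chisq, df, N) sqrt(max((chisq / df - 1) / (N - 1), 0))
rmsea(chisq = 9, df = 3, N = 5500) # hypothetical numbers
```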
Thanks
Jeremy & Jill (Hohenstein)
Here's the R code:
# Implement log-likelihood function for MZ & DZ twins - simple comparison
# of a single trait based on an ACE model & bivariate normal distribution
ll.dbnmbn <- function(par, Yd, Ym, Xd, Xm){
  sa <- exp(par[2]) # Re-parameterize R -> R+ as genetic (hereditary) variance
  sc <- exp(par[3]) # Shared environment variance
  se <- exp(par[4]) # Non-shared environment variance
  mu <- par[1]
  vcvd <- matrix(c(sa+sc+se, sa/2+sc, sa/2+sc, sa+sc+se), 2, 2) # varcov matrix for DZ twins
  vcvm <- matrix(c(sa+sc+se, sa+sc, sa+sc, sa+sc+se), 2, 2)     # varcov matrix for MZ twins
  D <- rbind(Yd, Xd) # 2 x N(DZ) matrix of bivariate DZ twin trait scores
  M <- rbind(Ym, Xm) # 2 x N(MZ) matrix of bivariate MZ twin trait scores
  d <- 0
  for (i in 1:length(Xd)){
    d <- d + (D[,i]-mu) %*% solve(vcvd) %*% (D[,i]-mu) # sum((X-mu) x varcov^-1 x (X-mu)) for DZ twins
  }
  m <- 0
  for (i in 1:length(Xm)){
    m <- m + (M[,i]-mu) %*% solve(vcvm) %*% (M[,i]-mu) # sum((X-mu) x varcov^-1 x (X-mu)) for MZ twins
  }
  # Log-likelihood based on bivariate normal - related but different for DZ and MZ twins
  out <- -length(Xd)*log(2*pi*sqrt(det(vcvd))) - 0.5*d -
          length(Xm)*log(2*pi*sqrt(det(vcvm))) - 0.5*m
  return(out)
}
fit.1 <- optim(par = c(0, 2.3, 1.5, 13.5), ll.dbnmbn, Yd = Yd, Ym = Ym, Xd = Xd, Xm = Xm,
               method = "BFGS", control = list(fnscale = -1), hessian = T)
Dr Jeremy Hodgen
Senior Lecturer in Mathematics Education
King's College London
Department of Education and Professional Studies
Franklin-Wilkins Building
Waterloo Bridge Wing
150 Stamford Street
London SE1 9NH
Tel: 020 7848 3102
Fax: 020 7848 3182
E-mail: jeremy.hodgen at kcl.ac.uk
_______________________________________________
gov2001-l mailing list
gov2001-l at lists.fas.harvard.edu
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l