I've been reading through the R code to better understand some challenges I've had working with YourCast.
From documentation in the code, I see that the country correlations (referred to inside yourcast as either "adjacency" or "proximity") must take the values 0, 1, or 2.
Does anyone have more knowledge about what these values mean? It seems that 0 represents no correlation. So, is "2" a higher degree of correlation than "1"?
Thanks for any insights.
Regards,
Ethan
--
Ethan Sharygin
University of Washington
Institute for Health Metrics and Evaluation
An addendum: if it helps, I found the code of the problematic function, digitpull, at
https://github.com/IQSS/YourCast/blob/master/R/cast.funcs.R
and the YourCast code for the Lee Carter model here
https://github.com/IQSS/YourCast/blob/master/R/lc.funcs.R
The error is being thrown at this step (from cast.funcs.R):
if (min(abs(v)) < 10^(stopdig-1)) {
  stop(message="narrowest element in v not as wide as stopdig in proc digitpull()")
}
Here is the code in the Lee Carter model that is calling digitpull, as well as its arguments. In the comment, I explain what the value was, according to the debugger, at the time of the crash (plus a question mark if I can't find it):
who.cntry.digits <- get("who.cntry.digits", envir=ewho) # who.cntry.digits=3
who.digit.first <- get("who.digit.first", envir=ewho) # who.digit.first=0
digit.cntry.begin <- who.digit.first + 1 # =1 (?)
digit.cntry.end <- who.digit.first + who.cntry.digits # =3 (?), i.e. 0 + 3
whoinsampy <- get("whoinsampy", envir= ewho) # see below for example content
cs.vec <- as.numeric(names(whoinsampy));
cs.cntry.vec <- digitpull(cs.vec,digit.cntry.begin,digit.cntry.end)
This is what whoinsampy looks like:
Browse[1]> whoinsampy
$`00400`
depvar
004001970 -1.064245
004001971 -1.094867
004001972 -1.125606
004001973 -1.166485
... [etc, for each age group up to 004110]
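For what it's worth, the guard quoted above can be checked in isolation. The following is only a sketch, not YourCast's code: passes_guard is a hypothetical helper that repeats the quoted condition, the names are assumed to follow the "gggaaa" pattern (e.g. "004000"), and the age call is assumed to use stopdig = 6 (3 country digits plus 3 age digits). Under those assumptions, as.numeric() drops the leading zeros of the country code, so the values are only four digits wide:

```r
# Assumed cross-section names (index.code = "gggaaa"); hypothetical sample.
cs.names <- c("004000", "004001", "004110")
cs.vec <- as.numeric(cs.names)   # leading zeros are lost

# Hypothetical helper: TRUE when the guard quoted from digitpull()
# would NOT call stop().
passes_guard <- function(v, stopdig) {
  min(abs(v)) >= 10^(stopdig - 1)
}

width <- max(nchar(as.character(cs.vec)))       # widest value, in digits
cntry.ok <- passes_guard(cs.vec, stopdig = 3)   # country call (stopdig = 3)
age.ok <- passes_guard(cs.vec, stopdig = 6)     # age call (assumed stopdig = 6)

print(cs.vec)
print(width)
print(cntry.ok)
print(age.ok)
```

If this is what happens inside the package, the country call would pass the guard while the age call would not, which matches the call named in the error message. This has not been verified against the package source.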
I feel close to understanding this error, but I can't quite grasp it.
|-----Original Message-----
|From: yourcast-bounces(a)lists.gking.harvard.edu [mailto:yourcast-bounces(a)lists.gking.harvard.edu] On Behalf Of Ethan Sharygin
|Sent: Wednesday, October 02, 2013 11:42 AM
|Cc: yourcast(a)lists.gking.harvard.edu
|Subject: Re: [yourcast] Error in yourcast step after successful yourprep.
|
|Thanks, Nicholas, for the tip. That did get past that error!
|
|A few new things have happened that are interesting. YourCast seems to
|crash without covariates, which I was able to debug (details below).
|Unfortunately, when I add covariates, I still get a strange error. I'm
|hoping you or someone else might have seen this before too, and got past
|it.
|
|I'll document this in reverse order. First, the problem after adding a
|covariate (an arbitrarily defined "time", equal to year - 30, as in the
|example in the YourCast vignette). Second, for the record, I've
|described the problem I generated when YourCast did not have a covariate
|(it might be a bug in the way the lists containing the data are named
|when the dependent variable is the only column).
|
|____YourCast returns an error in digitpull().____
|
|I can't tell what is generating this error. I've tried changing the
|number of digits in the age group and country codes, but to no avail so
|far.
|
|> dta<-yourprep(dpath=paste(wd,"rawmini",sep=""),# path to text files
|+ year.var=TRUE,# year is a column
|+ tag="mxmale",# file names begin with mxmale
|+ index.code="gggaaa",# 3-digit region and 3-digit age
|+ sample.frame=c(1970,2011,2012,2036),# start,end of model yrs and forecast
|+ )
|Loading cross section files and checking for errors...
|...Finished
|Total number of cross sections: 24
|> ylc<-yourcast(formula=log(allcause)~index,
|+ dataobj=dta,# name of yourprep data
|+ model="LC",# options:bayes,map,ols,poisson,lc
|+ sample.frame=c(1970,2011,2012,2036),# provide sample frame from yourprep step
|+ )
|Creating formulas for all cross-sections...
|Adding time to the covariate list...
|Applying formulas to data in each cross-section...
|Running model frames for data matrices...
|Building the covariates list...
|Standardizing covariates...
|Constructing the list of dependent variables...
|Creating the in-sample and out-sample periods for covariates...
|Creating lists for dependent variable,in-sample and out-sample...
|Saving preprocessing in the
|file...c:/research/gbdcast/_yourcast/data/yourcast.savetmp
|The size of yourcast.savetmp is = 30434
|Using LC model
|Error in digitpull(cs.vec, digit.age.begin, digit.age.end) :
| narrowest element in v not as wide as stopdig in proc digitpull()
|Data is inconsistent.
|Error in get("lst.output", envir = ewho) : object 'lst.output' not found
|
|Some things that might be helpful (?). I can't find cs.vec in any of the
|environments available after debugger().
|
|Browse[1]> digit.first
|digit.first
| 0
|Browse[1]> age.digits
|age.digits
| 3
|Browse[1]> ewho$age.vec
| [1] "000" "001" "005" "010" "015" "020" "025" "030" "035" "040" "045" "050"
|[13] "055" "060" "065" "070" "075" "080" "085" "090" "095" "100" "105" "110"
|
|
|
|____YourCast seems to crash without covariates.____
|
|To my eyes, it appears that yourcast is calling the matrix column with
|the dependent variable, "allcause", but that the matrix is not labeled
|this way when it is the only variable. This happened until I added a
|covariate to the input data (i.e., a new column in the text files read
|by yourprep).
|
|> wd<-"c:/research/gbdcast/_yourcast/data/"# update working directory
|> setwd(wd)
|> dta<-yourprep(dpath=paste(wd,"rawmini",sep=""),# path to text files
|+ year.var=TRUE,# year is a column
|+ tag="mxmale",# file names begin with mxmale
|+ index.code="gggaaa",# 3-digit region and 3-digit age
|+ G.names="isonames.txt",# file containing country abbvs
|+ sample.frame=c(1970,2011,2012,2036),# start,end of model yrs and forecast
|+ # adjacency="adjacency.txt",# file containing WHO adjacency info
|+ verbose=TRUE,# do not suppress output
|+ )
|Loading cross section files and checking for errors...
|mxmale004000.txt
|mxmale004001.txt
|.[etc, until]
|mxmale004110.txt
|Loading auxiliary files...
|isonames.txt
|...Finished
|Total number of cross sections: 24
|
|> ylc <- yourcast(formula=log(allcauses)~index, dataobj=dta, model="LC", sample.frame=c(1970,2011,2012,2036))
|Error in mat[, dth] : subscript out of bounds
|
|> debugger()
|Message: Error in mat[, dth] : subscript out of bounds
|Available environments had calls:
|1: yourcast(formula = log(allcauses) ~ index, dataobj = dta, model = "LC", sam
|2: input.to.model(datamat = dataobj$data, ff = formula, all.pow = low.pow, sam
|3: build.covs.depvar.lst(datamat, ff, all.pow, sample.frame, standard, verbose
|4: data.after.first.obv(datamat, ff)
|5: lapply(datamat, FUN = "first.obvy", dth, pop)
|6: FUN(X[[1]], ...)
|Enter an environment number, or 0 to exit Selection: 6
|Browsing in the environment with call:
| FUN(X[[1]], ...)
|Called from: debugger.look(ind)
|
|Browse[1]> ls()
|[1] "dth" "mat" "pop" "rnm"
|
|Browse[1]> dth
|[1] "allcauses"
|
|Browse[1]> mat
| mat
|0040001970 0.34498817
|0040001971 0.33458412
|0040001972 0.32445580
|0040001973 0.31145978
|0040001974 0.29705149
|. [ and so on, until .]
|0040002010 0.07250808
|0040002011 0.06872911
|0040002012 NA
|0040002013 NA
|0040002014 NA
|. [ etc, etc, until.]
|0040002035 NA
|0040002036 NA
|
|This error was fixed by adding a covariate to the source data, which
|resulted in the appropriate naming of the columns in the matrix "mat",
|e.g.:
|
|> dta$data
|$`004000`
| allcause time
|1970 0.34498817 1940
|1971 0.33458412 1941
|...[etc]
|
|And then mat[,dth] pulls from the correct column.
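The failing subscript can be reproduced outside YourCast. A minimal sketch, with a hypothetical two-row matrix standing in for "mat" as printed in the debugger: indexing a matrix by a column name it does not have raises this error. Note that it fails for any name missing from colnames(mat), so the spelling in the formula ("allcauses" in the failing call, "allcause" in the data shown above) may matter as well.

```r
# Hypothetical stand-in for the one-column matrix shown in the debugger,
# whose only column is named "mat".
mat <- matrix(c(0.34498817, 0.33458412), ncol = 1,
              dimnames = list(c("0040001970", "0040001971"), "mat"))
dth <- "allcauses"

# Capture the error message instead of stopping the script.
msg <- tryCatch({
  mat[, dth]
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Once a column name matches dth, the same subscript works.
colnames(mat) <- dth
vals <- mat[, dth]
print(vals)
```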
|
|
|From: Nicholas Martinez [mailto:martinez.nicholas@gmail.com]
|Sent: Tuesday, October 01, 2013 10:24 PM
|To: Ethan Sharygin
|Cc: yourcast(a)lists.gking.harvard.edu
|Subject: Re: [yourcast] Error in yourcast step after successful
|yourprep.
|
|Ethan,
|I had the same problem a couple of years ago. You have to pass the
|sample frame you set in yourprep to yourcast's sample.frame argument.
|Try this.
|ylc <- yourcast(formula=log(allcauses)~index, dataobj=dta, model="LC", sample.frame=c(1970,2011,2012,2036))
|Hope that works.
|Best,
|Nicholas
|
|On Tue, Oct 1, 2013 at 8:52 PM, Ethan Sharygin <sharygin(a)uw.edu> wrote:
|[original message snipped]
|
|-
|---
|yourcast mailing list served by HUIT
|List Address: yourcast(a)lists.gking.harvard.edu
|Subscribe/Unsubscribe:
|http://lists.gking.harvard.edu/mailman/listinfo/yourcast
|Yourcast mailing list
|Yourcast(a)lists.gking.harvard.edu
|
|To unsubscribe from this list or get other information:
|
|https://lists.gking.harvard.edu/mailman/listinfo/yourcast
Dear yourcast users or moderator,
I am unable to get yourcast to run any model with data from yourprep. I've formatted my data into text files for each cross-sectional unit, following the example in the vignette/documentation for YourCast.
Here is a stripped-down example for one geographic unit, 24 age groups, years 1970-2036 (non-missing data for 1970-2011). No auxiliary files are used in this example.
(1) The R command and output:
> dta<-yourprep(dpath=paste(wd,"rawmini",sep=""),# path to text files
+ year.var=TRUE,# year is a column
+ tag="mxmale",# file names begin with mxmale
+ index.code="gggaaa",# 3-digit region and 3-digit age
+ sample.frame=c(1970,2011,2012,2036),# start,end of model yrs and forecast
+ #G.names="isonames.txt",# file containing country abbvs
+ #adjacency="mxmale.proxim.txt",# file containing adjacency info
+ verbose=TRUE,# do not suppress output
+ )
Loading cross section files and checking for errors...
mxmale004000.txt
mxmale004001.txt
mxmale004005.txt
mxmale004010.txt
mxmale004015.txt
mxmale004020.txt
mxmale004025.txt
mxmale004030.txt
mxmale004035.txt
mxmale004040.txt
mxmale004045.txt
mxmale004050.txt
mxmale004055.txt
mxmale004060.txt
mxmale004065.txt
mxmale004070.txt
mxmale004075.txt
mxmale004080.txt
mxmale004085.txt
mxmale004090.txt
mxmale004095.txt
mxmale004100.txt
mxmale004105.txt
mxmale004110.txt
...Finished
Total number of cross sections: 24
> ylc<-yourcast(formula=log(allcause)~index,
+ dataobj=dta,# name of yourprep data
+ model="LC",# options:bayes,map,ols,poisson,lc
+ debug=TRUE,# add debug info to user space
+ )
Error in seq.default(smpvec[ind], smpvec[ln], by = 1) :
wrong sign in 'by' argument
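This message comes from base R's seq() when the start value is greater than the end value and by is positive. A minimal reproduction, where the years are hypothetical stand-ins for smpvec[ind] and smpvec[ln] (what yourcast actually puts in smpvec when sample.frame is omitted has not been checked here):

```r
# Hypothetical values: a start year that is later than the end year.
from <- 2012
to <- 1970

# Capture the error message instead of stopping the script.
msg <- tryCatch({
  seq(from, to, by = 1)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# With start <= end, the same call returns the expected run of years.
yrs <- seq(1970, 2011, by = 1)
print(length(yrs))
```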
(2) an example text file (mxmale004000.txt) for one age category (age 0) for one country (004)
year allcause
1970 .3449881673
1971 .3345841169
1972 .3244557977
1973 .3114597797
1974 .2970514894
[truncated here to save space]
2010 .0725080818
2011 .0687291101
2012 NA
2013 NA
2014 NA
[and so on, NA values until 2036]
Any insights would be appreciated, as I cannot see how my data, after yourprep, are any different from the example in the documentation.
Thanks sincerely,
--
Ethan Sharygin