Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/build/vignette.rds and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/build/vignette.rds differ diff -Nru r-cran-epi-2.19/CHANGES r-cran-epi-2.30/CHANGES --- r-cran-epi-2.19/CHANGES 2017-08-08 11:43:40.000000000 +0000 +++ r-cran-epi-2.30/CHANGES 2018-05-14 15:03:06.000000000 +0000 @@ -1,6 +1,104 @@ +Changes in 2.30 + +o A function decurve added; it removes not only a linear trend but + also a quadratic term from a design matrix. + +o Small changes in pc.lines, and addition of pc.matshade to make + shaded confidence limits in APC plots + +Changes in 2.29 + +o ci.lin with a list of two prediction frames as argument to ctr.mat + now also honors the vcov and sample arguments. + Also expanded to accept a single data frame to mimic ci.pred and + extending this to honor arguments vcov and sample. This facility + ignores the offset, though. + +o matshade gains an argument plot that starts a new plot. Defaults to + is.null(dev.list()), so if no plot frame is open one will be made. + +Changes in 2.28 + +o Added function matshade() that plots shaded confidence bands. + +o Epi:::ci.dfr (called from ci.lin) redesigned so that it actually + works in most cases. + +Changes in 2.27 + +o Documentation of LCa.fit improved - really disturbing typos rectified. + +o Ns groomed to have knots= have precedence over df=. + +o Code of ci.dfr groomed, gam objects are now also accommodated when + using the list(dfrx,dfrr) version of the ctr.mat argument. Bug + causing a crash with splines fixed, however not elegantly. + +Changes in 2.26 + +o ci.lin did not honour a reference data frame of 1 row as promised. Fixed. + +Changes in 2.25 + +o N2Y now also completes the Lexis triangles in the last age-category + by fitting the sum to the average of the two prevalent numbers. + +o ci.lin(vcov=TRUE) now returns a list whose first element is named + 'coef' (previously 'est') and which is a vector (previously a 1-column + matrix). 
Old behaviour was illogical. + + ci.lin also now accepts a list of two prediction dataframes as the + ctr.mat argument. On the basis of these it computes the row-wise + difference (RR for ci.exp). + +o bootLexis for bootstrapping persons from a Lexis object added. + nid.Lexis (and nid.default) counting the number of persons in a + Lexis object added as utility. + +o Dataset BrCa used for illustration of the Crowther and Lambert paper + added together with a vignette using the data to illustrate a subset + of the analyses as in the paper, plus some extra. + +o Relevel had a minor bug producing warnings where none were needed. + +Changes in 2.24 + +o Bug in Relevel functionality when using a 2-column structure for + grouping levels fixed. Now also transfers ungrouped levels to the + result as new levels. + +o Substantial update of the vignette on follow-up data with Epi. + +o Documentation groomed. + +Changes in 2.22 + +o vcov() use in ci.lin() updated to work with the new definition of vcov. + +o Relevel now also accepts a two-column structure as input --- + basically a list of what each factor level should be mapped to. + +o A few changes and groomings of the vignettes + +Changes in 2.21 + +o minor changes in documentation of LCa.fit + +o the addScales argument to addCov.Lexis now implemented, along with small + changes in the naming of the examination types in the absence of + given ones. + +Changes in 2.20 + +o Vignettes now pre-built in order to keep source + +o Enhanced options for seq.states= argument to mcutLexis() allowing the + last seen event to determine the state. Inspired by a vignette in the + survival package describing the feature. + Changes in 2.19 -o Typos in documentation fixed, LCa.fit +o Typos in documentation of LCa.fit fixed o Bug in knot calculation in LCa.fit fixed. Meaningless models emerged if explicit knots were supplied for cohort effects. 
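The two headline additions described in the CHANGES entries above, matshade() shaded confidence bands (2.28/2.29) and ci.lin()/ci.exp() accepting a list of two prediction frames as ctr.mat (2.25/2.29), can be sketched as follows. This is a hypothetical illustration on simulated data, not code from this package; the variable names are made up, and argument usage follows the changelog entries:

```r
## Hypothetical sketch (not from this package): the ci.exp(ctr.mat=list(...))
## and matshade() features named in CHANGES.
library( Epi )
## simulated data; x, g, y are made-up names for illustration
d <- data.frame( x = rep(1:50, 2), g = rep(0:1, each=50) )
d$y <- rbinom( 100, 1, plogis( -1 + 0.03*d$x + 0.5*d$g ) )
m <- glm( y ~ x + g, family=binomial, data=d )
## per CHANGES 2.25: a list of two prediction frames as ctr.mat gives the
## row-wise difference on the linear predictor scale, i.e. OR with ci.exp
nd1 <- data.frame( x=1:50, g=1 )
nd0 <- data.frame( x=1:50, g=0 )
OR <- ci.exp( m, ctr.mat=list(nd1,nd0) )
## per CHANGES 2.28/2.29: matshade() draws the estimate with a shaded CI;
## plot= starts a new plot frame
matshade( 1:50, OR, plot=TRUE )
```

The estimate/lower/upper matrix returned by ci.exp is exactly the shape matshade() expects, which is why the two features pair naturally.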
Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/data/BrCa.rda and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/data/BrCa.rda differ diff -Nru r-cran-epi-2.19/debian/changelog r-cran-epi-2.30/debian/changelog --- r-cran-epi-2.19/debian/changelog 2017-12-10 08:23:35.000000000 +0000 +++ r-cran-epi-2.30/debian/changelog 2018-06-09 05:59:06.000000000 +0000 @@ -1,3 +1,48 @@ +r-cran-epi (2.30-1ubuntu1) cosmic; urgency=medium + + * Skip flup.R test which requires r-cran-popepi, see #898442 + + -- Graham Inggs Sat, 09 Jun 2018 05:59:06 +0000 + +r-cran-epi (2.30-1build1) cosmic; urgency=medium + + * No-change rebuild against r-api-3.5 + + -- Graham Inggs Fri, 01 Jun 2018 21:12:56 +0000 + +r-cran-epi (2.30-1) unstable; urgency=medium + + * New upstream version + * Maintainer: Debian R Packages Maintainers + * dh-update-R to update Build-Depends + + -- Andreas Tille Fri, 18 May 2018 08:40:34 +0200 + +r-cran-epi (2.28-1) unstable; urgency=medium + + * New upstream version + * dh-update-R to update Build-Depends + * Point Vcs fields to salsa.debian.org + * Standards-Version: 4.1.4 + + -- Andreas Tille Sat, 28 Apr 2018 10:08:25 +0200 + +r-cran-epi (2.26-1) unstable; urgency=medium + + * New upstream version + * d/rules: Fix permissions + + -- Andreas Tille Thu, 15 Mar 2018 07:42:36 +0100 + +r-cran-epi (2.24-1) unstable; urgency=medium + + * New upstream version + * Standards-Version: 4.1.3 + * debhelper 11 + + -- Andreas Tille Wed, 21 Feb 2018 09:12:53 +0100 + r-cran-epi (2.19-3) unstable; urgency=medium * Do not use vignettes that are causing infinite loops in autopkgtest diff -Nru r-cran-epi-2.19/debian/compat r-cran-epi-2.30/debian/compat --- r-cran-epi-2.19/debian/compat 2017-12-10 08:23:35.000000000 +0000 +++ r-cran-epi-2.30/debian/compat 2018-05-18 06:40:34.000000000 +0000 @@ -1 +1 @@ -10 +11 diff -Nru r-cran-epi-2.19/debian/control r-cran-epi-2.30/debian/control --- r-cran-epi-2.19/debian/control 2017-12-10 08:23:35.000000000 +0000 +++ 
r-cran-epi-2.30/debian/control 2018-06-09 05:59:06.000000000 +0000 @@ -1,28 +1,31 @@ Source: r-cran-epi -Maintainer: Debian Med Packaging Team +Maintainer: Ubuntu Developers +XSBC-Original-Maintainer: Debian R Packages Maintainers Uploaders: Andreas Tille Section: gnu-r Priority: optional -Build-Depends: debhelper (>= 10), +Build-Depends: debhelper (>= 11~), dh-r, r-base-dev, - r-cran-mass, r-cran-cmprsk, r-cran-etm, + r-cran-mass, + r-cran-survival, r-cran-plyr, + r-cran-matrix, r-cran-numderiv, r-cran-data.table, r-cran-zoo -Standards-Version: 4.1.2 -Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/r-cran-epi.git -Vcs-Git: https://anonscm.debian.org/git/debian-med/r-cran-epi.git +Standards-Version: 4.1.4 +Vcs-Browser: https://salsa.debian.org/r-pkg-team/r-cran-epi +Vcs-Git: https://salsa.debian.org/r-pkg-team/r-cran-epi.git Homepage: https://cran.r-project.org/package=Epi Package: r-cran-epi Architecture: any -Depends: ${shlibs:Depends}, - ${misc:Depends}, - ${R:Depends} +Depends: ${R:Depends}, + ${shlibs:Depends}, + ${misc:Depends} Recommends: ${R:Recommends} Suggests: ${R:Suggests} Description: GNU R epidemiological analysis diff -Nru r-cran-epi-2.19/debian/rules r-cran-epi-2.30/debian/rules --- r-cran-epi-2.19/debian/rules 2017-12-10 08:23:35.000000000 +0000 +++ r-cran-epi-2.30/debian/rules 2018-05-18 06:40:34.000000000 +0000 @@ -4,3 +4,8 @@ %: dh $@ --buildsystem R + +override_dh_fixperms: + dh_fixperms + find debian -name fixall -exec chmod -x \{\} \; + find debian -name kopi -exec chmod -x \{\} \; diff -Nru r-cran-epi-2.19/debian/tests/run-unit-test r-cran-epi-2.30/debian/tests/run-unit-test --- r-cran-epi-2.19/debian/tests/run-unit-test 2017-12-10 08:23:35.000000000 +0000 +++ r-cran-epi-2.30/debian/tests/run-unit-test 2018-06-09 05:59:06.000000000 +0000 @@ -13,7 +13,7 @@ cd vignettes for rnw in `ls *.[rR]nw` ; do rfile=`echo $rnw | sed 's/\.[rR]nw/.R/'` -if [ "$rfile" = "simLexis.R" -o "$rfile" = "yll.R" ] ; then +if [ "$rfile" = "simLexis.R" -o 
"$rfile" = "yll.R" -o "$rfile" = "flup.R" ] ; then >&2 echo "Vignete $rnw runs infinite loop - ignoring this for the test" continue fi diff -Nru r-cran-epi-2.19/DESCRIPTION r-cran-epi-2.30/DESCRIPTION --- r-cran-epi-2.19/DESCRIPTION 2017-08-09 04:39:25.000000000 +0000 +++ r-cran-epi-2.30/DESCRIPTION 2018-05-15 09:44:58.000000000 +0000 @@ -1,6 +1,6 @@ Package: Epi -Version: 2.19 -Date: 2017-08-01 +Version: 2.30 +Date: 2018-04-29 Title: A Package for Statistical Analysis in Epidemiology Authors@R: c(person("Bendix", "Carstensen", role = c("aut", "cre"), email = "b@bxc.dk"), @@ -11,7 +11,7 @@ Depends: R (>= 3.0.0), utils Imports: cmprsk, etm, splines, MASS, survival, plyr, Matrix, numDeriv, data.table, zoo -Suggests: mstate, nlme, lme4 +Suggests: mstate, nlme, lme4, popEpi Description: Functions for demographic and epidemiological analysis in the Lexis diagram, i.e. register and cohort follow-up data, in particular representation, manipulation and simulation of multistate @@ -24,11 +24,11 @@ License: GPL-2 URL: http://BendixCarstensen.com/Epi/ NeedsCompilation: yes -Packaged: 2017-08-08 20:03:05 UTC; bendix +Packaged: 2018-05-15 07:50:10 UTC; bendix Author: Bendix Carstensen [aut, cre], Martyn Plummer [aut], Esa Laara [ctb], Michael Hills [ctb] Maintainer: Bendix Carstensen Repository: CRAN -Date/Publication: 2017-08-09 04:39:25 UTC +Date/Publication: 2018-05-15 09:44:58 UTC Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/inst/doc/flup.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/inst/doc/flup.pdf differ diff -Nru r-cran-epi-2.19/inst/doc/flup.R r-cran-epi-2.30/inst/doc/flup.R --- r-cran-epi-2.19/inst/doc/flup.R 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/inst/doc/flup.R 2018-05-03 14:34:59.000000000 +0000 @@ -0,0 +1,324 @@ +### R code from vignette source 'flup' +### Encoding: UTF-8 + +################################################### +### code chunk number 1: flup.rnw:5-8 +################################################### +options( 
width=90, + SweaveHooks=list( fig=function() + par(mar=c(3,3,1,1),mgp=c(3,1,0)/1.6,las=1,bty="n") ) ) + + +################################################### +### code chunk number 2: flup.rnw:101-103 +################################################### +library(Epi) +print( sessionInfo(), l=F ) + + +################################################### +### code chunk number 3: flup.rnw:109-116 +################################################### +data( nickel ) +nicL <- Lexis( entry = list( per=agein+dob, + age=agein, + tfh=agein-age1st ), + exit = list( age=ageout ), + exit.status = ( icd %in% c(162,163) )*1, + data = nickel ) + + +################################################### +### code chunk number 4: flup.rnw:126-129 +################################################### +str( nickel ) +str( nicL ) +head( nicL ) + + +################################################### +### code chunk number 5: flup.rnw:138-139 +################################################### +summary( nicL ) + + +################################################### +### code chunk number 6: nicL1 +################################################### +plot( nicL ) + + +################################################### +### code chunk number 7: nicL2 +################################################### +par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 ) +plot( nicL, 1:2, lwd=1, col=c("blue","red")[(nicL$exp>0)+1], + grid=TRUE, lty.grid=1, col.grid=gray(0.7), + xlim=1900+c(0,90), xaxs="i", + ylim= 10+c(0,90), yaxs="i", las=1 ) +points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1], + col="lightgray", lwd=3, cex=1.5 ) +points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1], + col=c("blue","red")[(nicL$exp>0)+1], lwd=1, cex=1.5 ) + + +################################################### +### code chunk number 8: flup.rnw:193-196 +################################################### +nicS1 <- splitLexis( nicL, "age", breaks=seq(0,100,10) ) +summary( nicL ) +summary( nicS1 ) + + 
+################################################### +### code chunk number 9: flup.rnw:204-205 +################################################### +round( subset( nicS1, id %in% 8:10 ), 2 ) + + +################################################### +### code chunk number 10: flup.rnw:211-213 +################################################### +nicS2 <- splitLexis( nicS1, "tfh", breaks=c(0,1,5,10,20,30,100) ) +round( subset( nicS2, id %in% 8:10 ), 2 ) + + +################################################### +### code chunk number 11: flup.rnw:218-223 +################################################### +library( popEpi ) +nicM <- splitMulti( nicL, age = seq(0,100,10), + tfh = c(0,1,5,10,20,30,100) ) +summary( nicS2 ) +summary( nicM ) + + +################################################### +### code chunk number 12: flup.rnw:227-230 +################################################### +identical( nicS2, nicM ) +class( nicS2 ) +class( nicM ) + + +################################################### +### code chunk number 13: flup.rnw:250-258 +################################################### +timeBand( nicM, "age", "middle" )[1:20] +# For nice printing and column labelling use the data.frame() function: +data.frame( nicS2[,c("lex.id","per","age","tfh","lex.dur")], + mid.age=timeBand( nicS2, "age", "middle" ), + mid.t=timeBand( nicS2, "tfh", "middle" ), + left.t=timeBand( nicS2, "tfh", "left" ), + right.t=timeBand( nicS2, "tfh", "right" ), + fact.t=timeBand( nicS2, "tfh", "factor" ) )[1:20,] + + +################################################### +### code chunk number 14: flup.rnw:278-279 +################################################### +summary( (nicS2$age-nicS2$tfh) - nicS2$age1st ) + + +################################################### +### code chunk number 15: flup.rnw:284-286 +################################################### +summary( timeBand( nicS2, "age", "middle" ) - + timeBand( nicS2, "tfh", "middle" ) - nicS2$age1st ) + + 
+################################################### +### code chunk number 16: flup.rnw:308-316 +################################################### +subset( nicL, id %in% 8:10 ) +agehi <- nicL$age1st + 50 / nicL$exposure +nicC <- cutLexis( data = nicL, + cut = agehi, + timescale = "age", + new.state = 2, + precursor.states = 0 ) +subset( nicC, id %in% 8:10 ) + + +################################################### +### code chunk number 17: flup.rnw:323-331 +################################################### +subset( nicS2, id %in% 8:10 ) +agehi <- nicS2$age1st + 50 / nicS2$exposure +nicS2C <- cutLexis( data = nicS2, + cut = agehi, + timescale = "age", + new.state = 2, + precursor.states = 0 ) +subset( nicS2C, id %in% 8:10 ) + + +################################################### +### code chunk number 18: flup.rnw:390-392 +################################################### +( a.kn <- with( subset( nicM, lex.Xst==1 ), quantile( age+lex.dur, (1:5-0.5)/5 ) ) ) +( t.kn <- with( subset( nicM, lex.Xst==1 ), quantile( tfh+lex.dur, (1:5-0.5)/5 ) ) ) + + +################################################### +### code chunk number 19: flup.rnw:405-410 +################################################### +ma <- glm( (lex.Xst==1) ~ Ns(age,knots=a.kn), + family = poisson, + offset = log(lex.dur), + data = nicM ) +summary( ma ) + + +################################################### +### code chunk number 20: pr-a +################################################### +nd <- data.frame( age=40:85, lex.dur=1000 ) +pr.a <- ci.pred( ma, newdata = nd ) +matplot( nd$age, pr.a, + type="l", lty=1, col=1, lwd=c(3,1,1), + log="y", xlab="Age (years)", + ylab="Lung cancer mortality per 1000 PY") + + +################################################### +### code chunk number 21: flup.rnw:445-447 +################################################### +mat <- update( ma, . ~ . 
+ Ns(tfh,knots=t.kn) ) +summary( mat ) + + +################################################### +### code chunk number 22: flup.rnw:457-458 +################################################### +summary( nickel$age1st ) + + +################################################### +### code chunk number 23: flup.rnw:462-468 +################################################### +nd <- data.frame( expand.grid( age=c(20:90,NA), age1st=seq(15,45,10) ) ) +nd <- transform( nd, tfh = ifelse( age > age1st, age-age1st, NA ), + lex.dur = 1000 ) +# makes no sense to have age < age1st +nd <- transform( nd, age = ifelse( age > age1st, age, NA ) ) +head( nd ) + + +################################################### +### code chunk number 24: pr-at +################################################### +pr.at <- ci.pred( mat, newdata = nd ) +matplot( nd$age, pr.at, + type="l", lty=1, col=1, lwd=c(3,1,1), + log="y", xlab="Age (years)", + ylab="Lung cancer mortality per 1000 PY") + + +################################################### +### code chunk number 25: flup.rnw:492-493 +################################################### +anova( ma, mat, test="Chisq" ) + + +################################################### +### code chunk number 26: flup.rnw:504-508 +################################################### +( f.kn <- with( subset( nicM, lex.Xst==1 ), quantile( age1st, (1:5-0.5)/5 ) ) ) +maf <- update( ma, . ~ . 
+ Ns(age1st,knots=f.kn) ) +summary( maf ) +anova( maf, ma, mat, test="Chisq" ) + + +################################################### +### code chunk number 27: pr-at-af +################################################### +pr.af <- ci.pred( maf, newdata = nd ) +matplot( nd$age, pr.at, + type="l", lty=1, col=1, lwd=c(3,1,1), + log="y", xlab="Age (years)", + ylab="Lung cancer mortality per 1000 PY") +matlines( nd$age, pr.af, + type="l", lty=1, col=2, lwd=c(3,0,0) ) + + +################################################### +### code chunk number 28: flup.rnw:536-547 +################################################### +maft <- update( mat, . ~ . + Ns(age1st,knots=f.kn) ) +summary( maft ) +mft <- update( maft, . ~ . - Ns(age,knots=a.kn) ) +mf <- update( maf , . ~ . - Ns(age,knots=a.kn) ) +mt <- update( mat , . ~ . - Ns(age,knots=a.kn) ) +allp <- anova( maft, mat, ma, maf, mf, mft, mt, mat, + maf, maft, mft, + test="Chisq" ) +mall <- as.matrix( allp ) +cbind( mod = c("maft","mat","ma","maf","mf","mft","mt","mat","maf","maft","mft"), + round( allp[,1:5], 3 ) ) + + +################################################### +### code chunk number 29: flup.rnw:562-571 +################################################### +data( nickel ) +nicL <- Lexis( entry = list( per = agein+dob, + age = agein, + tfh = agein-age1st ), + exit = list( age = ageout ), + exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ), + data = nickel ) +summary( nicL ) +subset( nicL, id %in% 8:10 ) + + +################################################### +### code chunk number 30: flup.rnw:576-585 +################################################### +nicL <- Lexis( entry = list( per = agein+dob, + age = agein, + tfh = agein-age1st ), + exit = list( age = ageout ), + exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ), + data = nickel, + states = c("Alive","D.oth","D.lung") ) +summary( nicL ) +str( nicL ) + + +################################################### +### code chunk number 31: flup.rnw:597-605 
+################################################### +nicL$agehi <- nicL$age1st + 50 / nicL$exposure +nicC <- cutLexis( data = nicL, + cut = nicL$agehi, + timescale = "age", + new.state = "HiExp", + precursor.states = "Alive" ) +subset( nicC, id %in% 8:10 ) +summary( nicC, scale=1000 ) + + +################################################### +### code chunk number 32: flup.rnw:624-633 +################################################### +nicC <- cutLexis( data = nicL, + cut = nicL$agehi, + timescale = "age", + new.state = "HiExp", + new.scale = "tfe", + split.states = TRUE, + precursor.states = "Alive" ) +subset( nicC, id %in% 8:10 ) +summary( nicC, scale=1000, timeScales=TRUE ) + + +################################################### +### code chunk number 33: nic-box +################################################### +boxes( nicC, boxpos = list(x=c(10,10,80,80,80,80), + y=c(75,25,87,63,13,37)), + scale.Y = 1000, + show.BE = TRUE ) + + Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/inst/doc/Follow-up.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/inst/doc/Follow-up.pdf differ diff -Nru r-cran-epi-2.19/inst/doc/Follow-up.R r-cran-epi-2.30/inst/doc/Follow-up.R --- r-cran-epi-2.19/inst/doc/Follow-up.R 2017-08-08 20:02:26.000000000 +0000 +++ r-cran-epi-2.30/inst/doc/Follow-up.R 1970-01-01 00:00:00.000000000 +0000 @@ -1,159 +0,0 @@ -### R code from vignette source 'Follow-up.rnw' - -################################################### -### code chunk number 1: Follow-up.rnw:65-67 -################################################### -library(Epi) -print( sessionInfo(), l=F ) - - -################################################### -### code chunk number 2: Follow-up.rnw:146-153 -################################################### -data( nickel ) -nicL <- Lexis( entry = list( per=agein+dob, - age=agein, - tfh=agein-age1st ), - exit = list( age=ageout ), - exit.status = ( icd %in% c(162,163) )*1, - data = nickel ) - - 
-################################################### -### code chunk number 3: Follow-up.rnw:163-166 -################################################### -str( nickel ) -str( nicL ) -head( nicL ) - - -################################################### -### code chunk number 4: Follow-up.rnw:175-176 -################################################### -summary( nicL ) - - -################################################### -### code chunk number 5: nicL1 -################################################### -plot( nicL ) - - -################################################### -### code chunk number 6: nicL2 -################################################### -par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 ) -plot( nicL, 1:2, lwd=1, col=c("blue","red")[(nicL$exp>0)+1], - grid=TRUE, lty.grid=1, col.grid=gray(0.7), - xlim=1900+c(0,90), xaxs="i", - ylim= 10+c(0,90), yaxs="i", las=1 ) -points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1], - col="lightgray", lwd=3, cex=1.5 ) -points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1], - col=c("blue","red")[(nicL$exp>0)+1], lwd=1, cex=1.5 ) - - -################################################### -### code chunk number 7: Follow-up.rnw:229-232 -################################################### -nicS1 <- splitLexis( nicL, "age", breaks=seq(0,100,10) ) -summary( nicL ) -summary( nicS1 ) - - -################################################### -### code chunk number 8: Follow-up.rnw:239-240 -################################################### -round( subset( nicS1, id %in% 8:10 ), 2 ) - - -################################################### -### code chunk number 9: Follow-up.rnw:245-247 -################################################### -nicS2 <- splitLexis( nicS1, "tfh", breaks=c(0,1,5,10,20,30,100) ) -round( subset( nicS2, id %in% 8:10 ), 2 ) - - -################################################### -### code chunk number 10: Follow-up.rnw:253-258 -################################################### -timeBand( nicS2, "age", "middle" )[1:20] -# 
For nice printing and column labelling use the data.frame() function: -data.frame( nicS2[,c("id","lex.id","per","age","tfh","lex.dur")], - mid.age=timeBand( nicS2, "age", "middle" ), - mid.tfh=timeBand( nicS2, "tfh", "middle" ) )[1:20,] - - -################################################### -### code chunk number 11: Follow-up.rnw:276-281 -################################################### -subset( nicL, id %in% 8:10 ) -agehi <- nicL$age1st + 50 / nicL$exposure -nicC <- cutLexis( data=nicL, cut=agehi, timescale="age", - new.state=2, precursor.states=0 ) -subset( nicC, id %in% 8:10 ) - - -################################################### -### code chunk number 12: Follow-up.rnw:287-292 -################################################### -subset( nicS2, id %in% 8:10 ) -agehi <- nicS2$age1st + 50 / nicS2$exposure -nicS2C <- cutLexis( data=nicS2, cut=agehi, timescale="age", - new.state=2, precursor.states=0 ) -subset( nicS2C, id %in% 8:10 ) - - -################################################### -### code chunk number 13: Follow-up.rnw:312-321 -################################################### -data( nickel ) -nicL <- Lexis( entry = list( per=agein+dob, - age=agein, - tfh=agein-age1st ), - exit = list( age=ageout ), - exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ), - data = nickel ) -summary( nicL ) -subset( nicL, id %in% 8:10 ) - - -################################################### -### code chunk number 14: Follow-up.rnw:325-333 -################################################### -nicL <- Lexis( entry = list( per=agein+dob, - age=agein, - tfh=agein-age1st ), - exit = list( age=ageout ), - exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ), - data = nickel, - states = c("Alive","D.oth","D.lung") ) -summary( nicL ) - - -################################################### -### code chunk number 15: Follow-up.rnw:345-353 -################################################### -nicL$agehi <- nicL$age1st + 50 / nicL$exposure -nicC <- cutLexis( data = nicL, - 
cut = nicL$agehi, - timescale = "age", - new.state = "HiExp", - precursor.states = "Alive" ) -subset( nicC, id %in% 8:10 ) -summary( nicC, scale=1000 ) - - -################################################### -### code chunk number 16: Follow-up.rnw:371-379 -################################################### -nicC <- cutLexis( data = nicL, - cut = nicL$agehi, - timescale = "age", - new.state = "Hi", - split.states=TRUE, new.scale=TRUE, - precursor.states = "Alive" ) -subset( nicC, id %in% 8:10 ) -summary( nicC, scale=1000 ) - - diff -Nru r-cran-epi-2.19/inst/doc/Follow-up.rnw r-cran-epi-2.30/inst/doc/Follow-up.rnw --- r-cran-epi-2.19/inst/doc/Follow-up.rnw 2015-05-27 15:26:58.000000000 +0000 +++ r-cran-epi-2.30/inst/doc/Follow-up.rnw 1970-01-01 00:00:00.000000000 +0000 @@ -1,399 +0,0 @@ -\SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE} -%\VignetteIndexEntry{Follow-up data with the Epi package} -\documentclass[a4paper,twoside,12pt]{article} - -\usepackage[english]{babel} -\usepackage{booktabs,rotating,graphicx,amsmath,verbatim,fancyhdr,Sweave} -\usepackage[colorlinks,linkcolor=red,urlcolor=blue]{hyperref} -\newcommand{\R}{\textsf{\bf R}} -\renewcommand{\topfraction}{0.95} -\renewcommand{\bottomfraction}{0.95} -\renewcommand{\textfraction}{0.1} -\renewcommand{\floatpagefraction}{0.9} -\DeclareGraphicsExtensions{.pdf,.jpg} -\setcounter{secnumdepth}{1} -\setcounter{tocdepth}{1} - -\oddsidemargin 1mm -\evensidemargin 1mm -\textwidth 160mm -\textheight 230mm -\topmargin -5mm -\headheight 8mm -\headsep 5mm -\footskip 15mm - -\begin{document} - -\raggedleft -\pagestyle{empty} -\vspace*{0.1\textheight} -\Huge -{\bf Follow-up data with the\\ \texttt{Epi} package} -\noindent\rule[-1ex]{\textwidth}{5pt}\\[2.5ex] -\Large -Summer 2014 -\vfill -\normalsize -\begin{tabular}{rl} - Michael Hills & Retired \\ - & Highgate, London \\[1em] -Martyn Plummer & International Agency for Research on Cancer, Lyon\\ - & \texttt{plummer@iarc.fr} \\[1em] -Bendix Carstensen & Steno 
Diabetes Center, Gentofte, Denmark\\ - & \small \& Department of Biostatistics, - University of Copenhagen\\ - & \normalsize \texttt{bxc@steno.dk} \\ - & \url{www.pubhealth.ku.dk/~bxc} -\end{tabular} -\normalsize -\newpage -\raggedright -\parindent 3ex -\parskip 0ex -\tableofcontents -\cleardoublepage -\setcounter{page}{1} -\pagestyle{fancy} -\renewcommand{\sectionmark}[1]{\markboth{\thesection #1}{\thesection \ #1}} -\fancyhead[OL]{\sl Follow-up data with the \texttt{Epi} package.} -\fancyhead[ER]{\sl \rightmark} -\fancyhead[EL,OR]{\bf \thepage} -\fancyfoot{} -\renewcommand{\headrulewidth}{0.1pt} - -<<>>= -library(Epi) -print( sessionInfo(), l=F ) -@ - -\section{Follow-up data in the \texttt{Epi} package} - -In the \texttt{Epi}-package, follow-up data is represented by adding -some extra variables to a dataframe. Such a dataframe is called a -\texttt{Lexis} object. The tools for handling follow-up data then use -the structure of this for special plots, tabulations etc. - -Follow-up data basically consists of a time of entry, a time of exit -and an indication of the status at exit (normally either ``alive'' or -``dead''). Implicitly is also assumed a status \emph{during} the -follow-up (usually ``alive''). 
- -\begin{figure}[htbp] - \centering -\setlength{\unitlength}{1pt} -\begin{picture}(210,70)(0,75) -%\scriptsize -\thicklines - \put( 0,80){\makebox(0,0)[r]{Age-scale}} - \put( 50,80){\line(1,0){150}} - \put( 50,80){\line(0,1){5}} - \put(100,80){\line(0,1){5}} - \put(150,80){\line(0,1){5}} - \put(200,80){\line(0,1){5}} - \put( 50,77){\makebox(0,0)[t]{35}} - \put(100,77){\makebox(0,0)[t]{40}} - \put(150,77){\makebox(0,0)[t]{45}} - \put(200,77){\makebox(0,0)[t]{50}} - - \put( 0,115){\makebox(0,0)[r]{Follow-up}} - - \put( 80,105){\makebox(0,0)[r]{\small Two}} - \put( 90,105){\line(1,0){87}} - \put( 90,100){\line(0,1){10}} - \put(100,100){\line(0,1){10}} - \put(150,100){\line(0,1){10}} - \put(180,105){\circle{6}} - \put( 95,110){\makebox(0,0)[b]{1}} - \put(125,110){\makebox(0,0)[b]{5}} - \put(165,110){\makebox(0,0)[b]{3}} - - \put( 50,130){\makebox(0,0)[r]{\small One}} - \put( 60,130){\line(1,0){70}} - \put( 60,125){\line(0,1){10}} - \put(100,125){\line(0,1){10}} - \put(130,130){\circle*{6}} - \put( 80,135){\makebox(0,0)[b]{4}} - \put(115,135){\makebox(0,0)[b]{3}} -\end{picture} - \caption{\it Follow-up of two persons} - \label{fig:fu2} -\end{figure} - -\section{Timescales} - -A timescale is a variable that varies deterministically \emph{within} -each person during follow-up, \textit{e.g.}: -\begin{itemize} - \item Age - \item Calendar time - \item Time since treatment - \item Time since relapse -\end{itemize} -All timescales advance at the same pace, so the time followed is the -same on all timescales. Therefore, it suffices to use only the entry -point on each of the time scale, for example: -\begin{itemize} - \item Age at entry. - \item Date of entry. - \item Time since treatment (\emph{at} treatment this is 0). - \item Time since relapse (\emph{at} relapse this is 0).. -\end{itemize} -In the \texttt{Epi} package, follow-up in a cohort is represented in a -\texttt{Lexis} object. 
A \texttt{Lexis} object is a dataframe with a -bit of extra structure representing the follow-up. For the -\texttt{nickel} data we would construct a \texttt{Lexis} object by: -<<>>= -data( nickel ) -nicL <- Lexis( entry = list( per=agein+dob, - age=agein, - tfh=agein-age1st ), - exit = list( age=ageout ), - exit.status = ( icd %in% c(162,163) )*1, - data = nickel ) -@ -The \texttt{entry} argument is a \emph{named} list with the entry -points on each of the timescales we want to use. It defines the names -of the timescales and the entry points. The \texttt{exit} argument -gives the exit time on \emph{one} of the timescales, so the name of -the element in this list must match one of the neames of the -\texttt{entry} list. This is sufficient, because the follow-up time on -all time scales is the same, in this case \texttt{ageout - agein}. Now -take a look at the result: -<<>>= -str( nickel ) -str( nicL ) -head( nicL ) -@ -The \texttt{Lexis} object \texttt{nicL} has a variable for each -timescale which is the entry point on this timescale. The follow-up -time is in the variable \texttt{lex.dur} (\textbf{dur}ation). - -There is a \texttt{summary} function for \texttt{Lexis} objects that -list the numer of transitions and records as well as the total -follow-up time: -<<>>= -summary( nicL ) -@ -We defined the exit status to be death from lung cancer (ICD7 -162,163), i.e. this variable is 1 if follow-up ended with a death from -this cause. If follow-up ended alive or by death from another cause, -the exit status is coded 0, i.e. as a censoring. - -Note that the exit status is in the variable \texttt{lex.Xst} -(e\textbf{X}it \textbf{st}atus. The variable \texttt{lex.Cst} is the -state where the follow-up takes place (\textbf{C}urrent -\textbf{st}atus), in this case 0 (alive). - -It is possible to get a visualization of the follow-up along the -timescales chosen by using the \texttt{plot} method for \texttt{Lexis} -objects. 
\texttt{nicL} is an object of \emph{class} \texttt{Lexis}, so
-using the function \texttt{plot()} on it means that \R\ will look for
-the function \texttt{plot.Lexis} and use this function.
-<>=
-plot( nicL )
-@
-The function allows a lot of control over the output, and a
-\texttt{points.Lexis} function allows plotting of the endpoints of
-follow-up:
-<>=
-par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 )
-plot( nicL, 1:2, lwd=1, col=c("blue","red")[(nicL$exp>0)+1],
- grid=TRUE, lty.grid=1, col.grid=gray(0.7),
- xlim=1900+c(0,90), xaxs="i",
- ylim= 10+c(0,90), yaxs="i", las=1 )
-points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1],
- col="lightgray", lwd=3, cex=1.5 )
-points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1],
- col=c("blue","red")[(nicL$exp>0)+1], lwd=1, cex=1.5 )
-@
-The results of these two plotting commands are in figure \ref{fig:Lexis-diagram}.
-\begin{figure}[tb]
-\centering
-\includegraphics[width=0.39\textwidth]{Follow-up-nicL1}
-\includegraphics[width=0.59\textwidth]{Follow-up-nicL2}
-\caption{\it Lexis diagram of the \texttt{nickel} dataset, left panel
- the default version, the right one with bells
- and whistles. The red lines are for persons with exposure$>0$, so it
- is pretty evident that the oldest ones are the exposed part of the
- cohort.}
-\label{fig:Lexis-diagram}
-\end{figure}
-
-\section{Splitting the follow-up time along a timescale}
-
-The follow-up time in a cohort can be subdivided by, for example,
-current age. This is achieved by the function \texttt{splitLexis} (note that it
-is \emph{not} called \texttt{split.Lexis}). This requires that the
-timescale and the breakpoints on this timescale are supplied. Try:
-<<>>=
-nicS1 <- splitLexis( nicL, "age", breaks=seq(0,100,10) )
-summary( nicL )
-summary( nicS1 )
-@
-So we see that the number of events and the amount of follow-up are the
-same in the two datasets; only the number of records differs.
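This invariance can also be checked programmatically; a minimal sketch in plain R (outside the vignette's own chunks), assuming `nicL` and `nicS1` as constructed above:

```r
# splitLexis() preserves total follow-up time and number of events;
# only the number of records increases.
stopifnot( isTRUE( all.equal( sum(nicL$lex.dur), sum(nicS1$lex.dur) ) ) )
stopifnot( sum( nicL$lex.Xst == 1 ) == sum( nicS1$lex.Xst == 1 ) )
stopifnot( nrow( nicS1 ) >= nrow( nicL ) )
```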
-
-To see how records are split for each individual, it is useful to list
-the results for a few individuals:
-<<>>=
-round( subset( nicS1, id %in% 8:10 ), 2 )
-@
-The resulting object, \texttt{nicS1}, is again a \texttt{Lexis}
-object, and so follow-up may be split further along another
-timescale. Try this and list the results for individuals 8, 9 and 10 again:
-<<>>=
-nicS2 <- splitLexis( nicS1, "tfh", breaks=c(0,1,5,10,20,30,100) )
-round( subset( nicS2, id %in% 8:10 ), 2 )
-@
-If we want to model the effect of these timescales we will for each
-interval use either the value of the left endpoint or the middle of
-the interval. There is a function \texttt{timeBand} which returns these.
-Try:
-<<>>=
-timeBand( nicS2, "age", "middle" )[1:20]
-# For nice printing and column labelling use the data.frame() function:
-data.frame( nicS2[,c("id","lex.id","per","age","tfh","lex.dur")],
- mid.age=timeBand( nicS2, "age", "middle" ),
- mid.tfh=timeBand( nicS2, "tfh", "middle" ) )[1:20,]
-@
-Note that these are the midpoints of the intervals defined by
-\texttt{breaks=}, \emph{not} the midpoints of the actual follow-up
-intervals. This is because the variable to be used in modelling must
-be independent of the censoring and mortality pattern --- it should
-only depend on the chosen grouping of the timescale.
-
-\section{Splitting time at a specific date}
-
-If we have a recording of the date of a specific event as for example
-recovery or relapse, we may classify follow-up time as being before or
-after this intermediate event. This is achieved with the function
-\texttt{cutLexis}, which takes three arguments: the time point, the
-timescale, and the value of the (new) state following the date.
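Schematically, the call just described looks like this (a sketch, not from the vignette; `event.age` is a hypothetical vector with one event time per person, `NA` for persons without the event):

```r
# cutLexis() splits each record at the event time given in 'cut',
# measured on the timescale named in 'timescale'.
nicCut <- cutLexis( data = nicL,            # a Lexis object
                    cut = event.age,        # event time on the 'age' scale
                    timescale = "age",
                    new.state = 2,          # state occupied after the event
                    precursor.states = 0 )  # states 'new.state' overwrites
```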
-
-Now we define the age for the nickel workers where the cumulative
-exposure exceeds 50 exposure years:
-<<>>=
-subset( nicL, id %in% 8:10 )
-agehi <- nicL$age1st + 50 / nicL$exposure
-nicC <- cutLexis( data=nicL, cut=agehi, timescale="age",
- new.state=2, precursor.states=0 )
-subset( nicC, id %in% 8:10 )
-@
-(The \texttt{precursor.states=} argument is explained below).
-Note that individual 6 has had his follow-up split at age 25 where 50
-exposure-years were attained. This could also have been achieved in
-the split dataset \texttt{nicS2} instead of \texttt{nicL}, try:
-<<>>=
-subset( nicS2, id %in% 8:10 )
-agehi <- nicS2$age1st + 50 / nicS2$exposure
-nicS2C <- cutLexis( data=nicS2, cut=agehi, timescale="age",
- new.state=2, precursor.states=0 )
-subset( nicS2C, id %in% 8:10 )
-@
-Note that follow-up subsequent to the event is classified as being
-in state 2, but that the final transition to state 1 (death from lung
-cancer) is preserved. This is the point of the \texttt{precursor.states=}
-argument. It names the states (in this case 0, ``Alive'') that will be
-overwritten by \texttt{new.state} (in this case state 2, ``High
-exposure''). Clearly, state 1 (``Dead'') should not be updated even if
-it is after the time where the person moves to state 2. In other
-words, only state 0 is a precursor to state 2; state 1 is always
-subsequent to state 2.
-
-Note that if the intermediate event is to be used as a time-dependent
-variable in a Cox-model, then \texttt{lex.Cst} should be used as the
-time-dependent variable, and \texttt{lex.Xst==1} as the event.
-
-\section{Competing risks --- multiple types of events}
-
-If we want to consider death from lung cancer and death from other
-causes as separate events we can code these as for example 1 and 2.
-<<>>=
-data( nickel )
-nicL <- Lexis( entry = list( per=agein+dob,
- age=agein,
- tfh=agein-age1st ),
- exit = list( age=ageout ),
- exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ),
- data = nickel )
-summary( nicL )
-subset( nicL, id %in% 8:10 )
-@
-If we want to label the states, we can enter the names of these in the
-\texttt{states} parameter, try for example:
-<<>>=
-nicL <- Lexis( entry = list( per=agein+dob,
- age=agein,
- tfh=agein-age1st ),
- exit = list( age=ageout ),
- exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ),
- data = nickel,
- states = c("Alive","D.oth","D.lung") )
-summary( nicL )
-@
-
-Note that the \texttt{Lexis} function automatically assumes that all
-persons enter in the first level (given in the \texttt{states=}
-argument).
-
-When we cut at a date as in this case, the date where cumulative
-exposure exceeds 50 exposure-years, we get the follow-up \emph{after}
-the date classified as being in the new state if the exit
-(\texttt{lex.Xst}) was to a state we defined as one of the
-\texttt{precursor.states}:
-<<>>=
-nicL$agehi <- nicL$age1st + 50 / nicL$exposure
-nicC <- cutLexis( data = nicL,
- cut = nicL$agehi,
- timescale = "age",
- new.state = "HiExp",
- precursor.states = "Alive" )
-subset( nicC, id %in% 8:10 )
-summary( nicC, scale=1000 )
-@
-Note that the person-years are the same, but the number of events
-has changed. This is because events are now defined as any transition
-from alive, including the transitions to \texttt{HiExp}.
-
-Also note that (so far) it is necessary to specify the variable with
-the cutpoints in full; using only \texttt{cut=agehi} would give an error.
-
-\subsection{Subdivision of existing states}
-It may be of interest to subdivide the states following the
-intermediate event according to whether the event has occurred or
-not. That is done by the argument \texttt{split.states=TRUE}.
-
-Moreover, it will also often be of interest to introduce a new
-timescale indicating the time since the intermediate event. This can be
-done by the argument \texttt{new.scale=TRUE}, alternatively
-\texttt{new.scale="tfevent"}, as illustrated here:
-<<>>=
-nicC <- cutLexis( data = nicL,
- cut = nicL$agehi,
- timescale = "age",
- new.state = "Hi",
- split.states=TRUE, new.scale=TRUE,
- precursor.states = "Alive" )
-subset( nicC, id %in% 8:10 )
-summary( nicC, scale=1000 )
-@
-
-\section{Multiple events of the same type (recurrent events)}
-Sometimes several events of the same type are recorded for each person,
-and one would then like to count these and assign follow-up time to
-states accordingly.
-Essentially, each set of cutpoints represents progressions from one
-state to the next. Therefore the states should be numbered, and the
-numbering of the states subsequently occupied increased accordingly.
-
-This behaviour differs from the one outlined above, and it is
-achieved by the argument \texttt{count=TRUE} to
-\texttt{cutLexis}. When \texttt{count} is set to \texttt{TRUE}, the
-values of the arguments \texttt{new.state} and
-\texttt{precursor.states} are ignored. Actually, when using the
-argument \texttt{count=TRUE}, the function \texttt{countLexis} is
-called, so an alternative is to use this directly.
-
-\end{document}
-
-
diff -Nru r-cran-epi-2.19/inst/doc/index.html r-cran-epi-2.30/inst/doc/index.html
--- r-cran-epi-2.19/inst/doc/index.html	2017-02-23 20:52:44.000000000 +0000
+++ r-cran-epi-2.30/inst/doc/index.html	2018-03-11 15:34:29.000000000 +0000
@@ -4,15 +4,20 @@

Vignettes for the Epi package

-Here is the website for the Epi package. +Here is the website for the +Epi package. +
+A list of published papers using
+the Lexis machinery is
+here.
Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/inst/doc/simLexis.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/inst/doc/simLexis.pdf differ
diff -Nru r-cran-epi-2.19/inst/doc/simLexis.R r-cran-epi-2.30/inst/doc/simLexis.R
--- r-cran-epi-2.19/inst/doc/simLexis.R	2017-08-08 20:02:49.000000000 +0000
+++ r-cran-epi-2.30/inst/doc/simLexis.R	2018-05-03 14:36:26.000000000 +0000
@@ -1,8 +1,16 @@
-### R code from vignette source 'simLexis.rnw'
+### R code from vignette source 'simLexis'
 ### Encoding: UTF-8
 ###################################################
-### code chunk number 1: start
+### code chunk number 1: simLexis.rnw:24-27
+###################################################
+options( width=90,
+ SweaveHooks=list( fig=function()
+ par(mar=c(3,3,1,1),mgp=c(3,1,0)/1.6,las=1,bty="n") ) )
+
+
+###################################################
+### code chunk number 2: start
 ###################################################
 options( width=90 )
 library( Epi )
@@ -10,7 +18,7 @@
 ###################################################
-### code chunk number 2: Lexis
+### code chunk number 3: Lexis
 ###################################################
 data(DMlate)
 dml <- Lexis( entry = list(Per=dodm, Age=dodm-dobth, DMdur=0 ),
@@ -20,7 +28,7 @@
 ###################################################
-### code chunk number 3: cut
+### code chunk number 4: cut
 ###################################################
 dmi <- cutLexis( dml, cut = dml$doins,
 pre = "DM",
@@ -32,7 +40,7 @@
 ###################################################
-### code chunk number 4: boxes
+### code chunk number 5: boxes
 ###################################################
 boxes( dmi, boxpos = list(x=c(20,20,80,80),
 y=c(80,20,80,20)),
@@ -40,7 +48,7 @@
 ###################################################
-### code chunk number 5: split
+### code chunk number 6: split
 ###################################################
 Si <- splitLexis(
dmi, 0:30/2, "DMdur" ) dim( Si ) @@ -48,7 +56,7 @@ ################################################### -### code chunk number 6: knots +### code chunk number 7: knots ################################################### nk <- 5 ( ai.kn <- with( subset(Si,lex.Xst=="Ins" & lex.Cst!=lex.Xst ), @@ -64,7 +72,7 @@ ################################################### -### code chunk number 7: Poisson +### code chunk number 8: Poisson ################################################### library( splines ) DM.Ins <- glm( (lex.Xst=="Ins") ~ Ns( Age , knots=ai.kn ) + @@ -86,7 +94,7 @@ ################################################### -### code chunk number 8: prop-haz +### code chunk number 9: prop-haz ################################################### with( Si, table(lex.Cst) ) All.Dead <- glm( (lex.Xst %in% c("Dead(Ins)","Dead")) ~ @@ -100,7 +108,7 @@ ################################################### -### code chunk number 9: get-dev +### code chunk number 10: get-dev ################################################### what <- c("null.deviance","df.null","deviance","df.residual") ( rD <- unlist( DM.Dead[what] ) ) @@ -110,7 +118,7 @@ ################################################### -### code chunk number 10: pr-array +### code chunk number 11: pr-array ################################################### pr.rates <- NArray( list( DMdur = seq(0,12,0.1), DMage = 4:7*10, @@ -121,13 +129,13 @@ ################################################### -### code chunk number 11: simLexis.rnw:515-516 +### code chunk number 12: simLexis.rnw:382-383 ################################################### ci.pred ################################################### -### code chunk number 12: make-pred +### code chunk number 13: make-pred ################################################### nd <- data.frame( DMdur = as.numeric( dimnames(pr.rates)[[1]] ), lex.Cst = factor( 1, levels=1:4, @@ -153,7 +161,7 @@ ################################################### -### code chunk number 13: mort-int 
+### code chunk number 14: mort-int ################################################### par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6, las=1 ) plot( NA, xlim=c(40,82), ylim=c(5,300), bty="n", @@ -168,7 +176,7 @@ ################################################### -### code chunk number 14: Tr +### code chunk number 15: Tr ################################################### Tr <- list( "DM" = list( "Ins" = DM.Ins, "Dead" = DM.Dead ), @@ -176,7 +184,7 @@ ################################################### -### code chunk number 15: make-ini +### code chunk number 16: make-ini ################################################### str( Si[NULL,1:9] ) ini <- subset(Si,FALSE,select=1:9) @@ -186,7 +194,7 @@ ################################################### -### code chunk number 16: ini-fill +### code chunk number 17: ini-fill ################################################### ini[1:2,"lex.id"] <- 1:2 ini[1:2,"lex.Cst"] <- "DM" @@ -198,10 +206,10 @@ ################################################### -### code chunk number 17: simL +### code chunk number 18: simL ################################################### set.seed( 52381764 ) -Nsim <- 1000 +Nsim <- 5000 system.time( simL <- simLexis( Tr, ini, t.range = 12, @@ -209,13 +217,13 @@ ################################################### -### code chunk number 18: sum-simL +### code chunk number 19: sum-simL ################################################### summary( simL, by="sex" ) ################################################### -### code chunk number 19: Tr.p-simP +### code chunk number 20: Tr.p-simP ################################################### Tr.p <- list( "DM" = list( "Ins" = DM.Ins, "Dead" = All.Dead ), @@ -228,7 +236,7 @@ ################################################### -### code chunk number 20: Cox-dur +### code chunk number 21: Cox-dur ################################################### library( survival ) Cox.Dead <- coxph( Surv( DMdur, DMdur+lex.dur, @@ -242,7 +250,7 @@ 
################################################### -### code chunk number 21: TR.c +### code chunk number 22: TR.c ################################################### Tr.c <- list( "DM" = list( "Ins" = Tr$DM$Ins, "Dead" = Cox.Dead ), @@ -255,7 +263,7 @@ ################################################### -### code chunk number 22: nState +### code chunk number 23: nState ################################################### system.time( nSt <- nState( subset(simL,sex=="M"), @@ -264,7 +272,7 @@ ################################################### -### code chunk number 23: pstate0 +### code chunk number 24: pstate0 ################################################### pM <- pState( nSt, perm=c(1,2,4,3) ) head( pM ) @@ -279,7 +287,7 @@ ################################################### -### code chunk number 24: pstatex +### code chunk number 25: pstatex ################################################### clr <- c("limegreen","orange") # expand with a lighter version of the two chosen colors @@ -290,8 +298,8 @@ lines( as.numeric(rownames(pM)), pM[,2], lwd=3 ) mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) mtext( "Survival curve", side=3, line=1.5, adj=0 ) -mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] ) -mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[1] ) axis( side=4 ) axis( side=4, at=1:19/20, labels=FALSE ) axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) @@ -313,7 +321,7 @@ ################################################### -### code chunk number 25: pstatey +### code chunk number 26: pstatey ################################################### par( mfrow=c(1,2), las=1, mar=c(3,3,4,2), mgp=c(3,1,0)/1.6 ) # Men @@ -326,8 +334,8 @@ lines( as.numeric(rownames(pM)), pM[,2], lwd=3 ) mtext( "60 year old male, diagnosed 1990, aged 55", side=3, 
line=2.5, adj=0, col=gray(0.6) ) mtext( "Survival curve", side=3, line=1.5, adj=0 ) -mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] ) -mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[1] ) axis( side=4 ) axis( side=4, at=1:19/20, labels=FALSE ) axis( side=4, at=1:19/20, labels=FALSE, tcl=-0.4 ) @@ -351,7 +359,7 @@ ################################################### -### code chunk number 26: comp-0 +### code chunk number 27: comp-0 ################################################### PrM <- pState( nState( subset(simP,sex=="M"), at=seq(0,11,0.2), @@ -389,13 +397,23 @@ ################################################### -### code chunk number 27: simLexis.rnw:1033-1034 +### code chunk number 28: CHANGE1 (eval = FALSE) +################################################### +## source( "../R/simLexis.R", keep.source=TRUE ) + + +################################################### +### code chunk number 29: CHANGE2 ################################################### -source( "../R/simLexis.R", keep.source=TRUE ) +simX <- Epi:::simX +sim1 <- Epi:::sim1 +lint <- Epi:::lint +get.next <- Epi:::get.next +chop.lex <- Epi:::chop.lex ################################################### -### code chunk number 28: simLexis.rnw:1050-1053 +### code chunk number 30: simLexis.rnw:934-937 ################################################### cbind( attr( ini, "time.scale" ), @@ -403,55 +421,55 @@ ################################################### -### code chunk number 29: simLexis.rnw:1078-1079 +### code chunk number 31: simLexis.rnw:962-963 ################################################### simLexis ################################################### -### code chunk number 30: simLexis.rnw:1096-1097 +### code chunk number 32: simLexis.rnw:980-981 ################################################### simX 
################################################### -### code chunk number 31: simLexis.rnw:1109-1110 +### code chunk number 33: simLexis.rnw:993-994 ################################################### sim1 ################################################### -### code chunk number 32: simLexis.rnw:1122-1123 +### code chunk number 34: simLexis.rnw:1006-1007 ################################################### lint ################################################### -### code chunk number 33: simLexis.rnw:1133-1134 +### code chunk number 35: simLexis.rnw:1017-1018 ################################################### get.next ################################################### -### code chunk number 34: simLexis.rnw:1143-1144 +### code chunk number 36: simLexis.rnw:1027-1028 ################################################### chop.lex ################################################### -### code chunk number 35: simLexis.rnw:1161-1162 +### code chunk number 37: simLexis.rnw:1045-1046 ################################################### nState ################################################### -### code chunk number 36: simLexis.rnw:1171-1172 +### code chunk number 38: simLexis.rnw:1055-1056 ################################################### pState ################################################### -### code chunk number 37: simLexis.rnw:1176-1178 +### code chunk number 39: simLexis.rnw:1060-1062 ################################################### plot.pState lines.pState diff -Nru r-cran-epi-2.19/inst/doc/simLexis.rnw r-cran-epi-2.30/inst/doc/simLexis.rnw --- r-cran-epi-2.19/inst/doc/simLexis.rnw 2017-04-14 06:58:08.000000000 +0000 +++ r-cran-epi-2.30/inst/doc/simLexis.rnw 1970-01-01 00:00:00.000000000 +0000 @@ -1,1204 +0,0 @@ -\SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE,eps=FALSE,prefix.string=sL} -%\VignetteIndexEntry{Simulation of multistate models with multiple timescales: simLexis} -\documentclass[a4paper,twoside,12pt]{report} - 
-%---------------------------------------------------------------------- -% General information -\newcommand{\Title}{Simulation of\\ multistate models with\\ multiple - timescales:\\ \texttt{simLexis} in the \texttt{Epi} package} -\newcommand{\Tit}{Multistate models with multiple timescales} -\newcommand{\Version}{Version 2.3} -\newcommand{\Dates}{\today} -\newcommand{\Where}{SDC} -\newcommand{\Homepage}{\url{http://BendixCarstensen.com/Epi/simLexis.pdf}} -\newcommand{\Faculty}{\begin{tabular}{rl} -Bendix Carstensen - & Steno Diabetes Center, Gentofte, Denmark\\ - & {\small \& Department of Biostatistics, - University of Copenhagen} \\ - & \texttt{bxc@steno.dk}\\ - & \url{http://BendixCarstensen.com} \\[1em] - \end{tabular}} - -%---------------------------------------------------------------------- -% Packages -%\usepackage[inline]{showlabels} -\usepackage[utf8]{inputenc} -\usepackage[T1]{fontenc} -\usepackage[english]{babel} -\usepackage[font=it,labelfont=normalfont]{caption} -\usepackage[colorlinks,urlcolor=blue,linkcolor=red]{hyperref} -\usepackage[ae,hyper]{Rd} -\usepackage[dvipsnames]{xcolor} -\usepackage[super]{nth} -\usepackage{makeidx,Sweave,floatflt,amsmath,amsfonts,amsbsy,enumitem,dcolumn,needspace} -\usepackage{ifthen,calc,eso-pic,everyshi} -\usepackage{booktabs,longtable,rotating,graphicx} -\usepackage{pdfpages,verbatim,fancyhdr,datetime,% -afterpage} -\usepackage[abspath]{currfile} -% \usepackage{times} -\renewcommand{\textfraction}{0.0} -\renewcommand{\topfraction}{1.0} -\renewcommand{\bottomfraction}{1.0} -\renewcommand{\floatpagefraction}{0.9} -\DeclareMathOperator{\Pp}{P} -\providecommand{\pmat}[1]{\Pp\left\{#1\right\}} -\providecommand{\ptxt}[1]{\Pp\left\{\text{#1}\right\}} -\providecommand{\dif}{{\,\mathrm d}} -% \usepackage{mslapa} -\newenvironment{exercise}[0]{\refstepcounter{exno} - \begin{quote} - {\bf Exercise \theexno.}} - {\end{quote}} -\definecolor{blaa}{RGB}{99,99,255} -\DeclareGraphicsExtensions{.pdf,.jpg} -% Make the Sweave output 
nicer -\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\footnotesize,fontshape=sl,formatcom=\color{Blue}} -\DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\footnotesize,formatcom=\color{Maroon},xleftmargin=0em} -\DefineVerbatimEnvironment{Scode}{Verbatim}{fontsize=\footnotesize} -\fvset{listparameters={\setlength{\topsep}{0pt}}} -\renewenvironment{Schunk}% -{\renewcommand{\baselinestretch}{0.85} \vspace{\topsep}}% -{\renewcommand{\baselinestretch}{1.00} \vspace{\topsep}} -% \renewenvironment{knitrout} -% {\renewcommand{\baselinestretch}{0.85}} -% {\renewcommand{\baselinestretch}{1.00}} % redefined in topreport.tex - -%---------------------------------------------------------------------- -% The usual usefuls -% \input{/home/bendix/util/tex/useful.tex} - -%---------------------------------------------------------------------- -% Set up layout of pages -\oddsidemargin 1mm -\evensidemargin 1mm -\topmargin -5mm -\headheight 8mm -\headsep 5mm -\textheight 240mm -\textwidth 165mm -%\footheight 5mm -\footskip 15mm -\renewcommand{\topfraction}{0.9} -\renewcommand{\bottomfraction}{0.9} -\renewcommand{\textfraction}{0.1} -\renewcommand{\floatpagefraction}{0.9} -\renewcommand{\headrulewidth}{0.1pt} -\setcounter{secnumdepth}{4} -\setcounter{tocdepth}{2} - -%---------------------------------------------------------------------- -% How to insert a figure in a .rnw file -\newcommand{\rwpre}{sL} -\newcommand{\insfig}[3]{ -\begin{figure}[h] - \centering - \includegraphics[width=#2\textwidth]{\rwpre-#1} - \caption{#3} - \label{fig:#1} -% \afterpage{\clearpage} -\end{figure}} - -%---------------------------------------------------------------------- -% Here is the document starting with the titlepage -\begin{document} - -%---------------------------------------------------------------------- -% The title page -\setcounter{page}{1} -\pagenumbering{roman} -\pagestyle{plain} -\thispagestyle{empty} -% \vspace*{0.05\textheight} -\flushright -% The blank below here is 
necessary in order not to muck up the -% linespacing in title if it has more than 2 lines -{\Huge \bfseries \Title - -}\ \\[-1.5ex] -\noindent\textcolor{blaa}{\rule[-1ex]{\textwidth}{5pt}}\\[2.5ex] -\large -\Where \\ -\Dates \\ -\Homepage \\ -\Version \\[1em] -\normalsize -Compiled \today,\ \currenttime\\ -% from: \texttt{\currfileabspath}\\[1em] -% \input{wordcount} -\normalsize -\vfill -\Faculty -% End of titlepage -% \newpage - -%---------------------------------------------------------------------- -% Table of contents -\tableofcontents - -%---------------------------------------------------------------------- -% General text layout -\raggedright -\parindent 1em -\parskip 0ex -\cleardoublepage - -%---------------------------------------------------------------------- -% General page style -\pagenumbering{arabic} -\setcounter{page}{1} -\pagestyle{fancy} -\renewcommand{\chaptermark}[1]{\markboth{\textsl{#1}}{}} -\renewcommand{\sectionmark}[1]{\markright{\thesection\ \textsl{#1}}{}} -\fancyhead[EL]{\bf \thepage \quad \rm \leftmark} -\fancyhead[ER]{\sl \Tit} -\fancyhead[OR]{\rm \rightmark \quad \bf \thepage} -\fancyfoot{} - - -%---------------------------------------------------------------------- -% Here comes the substance - -\chapter{Using \texttt{simLexis}} - -\section{Introduction} - -This vignette explains the machinery behind simulation of life -histories through multistate models implemented in -\texttt{simLexis}. In \texttt{simLexis} transition rates are allowed -to depend on multiple time scales, including timescales defined as -time since entry to a particular state (duration). This therefore also -covers the case where time \emph{at} entry into a state is an -explanatory variable for the rates, since time at entry is merely time -minus duration. Thus, the set-up here goes beyond Markov- and -semi-Markov-models, and brings simulation based estimation of -state-occupancy probabilities into the realm of realistic multistate -models. 
-
-The basic idea is to simulate a new \texttt{Lexis} object
-\cite{Plummer.2011,Carstensen.2011a} as defined in the \texttt{Epi}
-package for \R, based on 1) a multistate model defined by its states
-and the transition rates between them and 2) an initial population of
-individuals.
-
-Thus the output will be a \texttt{Lexis} object describing the
-transitions of a predefined set of persons through a multistate
-model. Therefore, if persons are defined to be identical at start,
-then calculation of the probability of being in a particular state at
-a given time boils down to a simple enumeration of the fraction of the
-persons in the particular state at the given time. Bar of course the
-(binomial) simulation error, but this can be brought down by
-simulating a sufficiently large number of persons.
-
-An observed \texttt{Lexis} object with follow-up of persons through a
-number of states will normally be the basis for estimation of
-transition rates between states, and thus will contain all information
-about covariates determining the occurrence rates, in particular the
-\emph{timescales} \cite{Iacobelli.2013}. Hence, the natural input to
-simulation from an estimated multistate model will typically be an
-object of the same structure as the one originally observed. Since
-transitions and times are what is simulated, any values of
-\texttt{lex.Xst} and \texttt{lex.dur} in the input object will of
-course be ignored.
-
-The first chapter of this vignette shows by example how to use the
-function \texttt{simLexis} and display the results. The subsequent
-chapter discusses in more detail how the simulation machinery is
-implemented and is not needed for the practical use of \texttt{simLexis}.
-
-\section{\texttt{simLexis} in practice}
-
-This section is largely a commented walk-through of the example from
-the help-page of \texttt{simLexis}, with a larger number of simulated
-persons in order to minimize the pure simulation variation.
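The enumeration mentioned above is what the functions `nState` and `pState` (used later in this vignette) provide; a rough sketch, assuming a simulated Lexis object `simL` and argument values as in the vignette's own code:

```r
# Count simulated persons by state at a grid of times from 1995 on
# the 'Per' timescale, then turn the counts into state probabilities.
nSt <- nState( simL, at = seq(0, 11, 0.2), from = 1995, time.scale = "Per" )
pM  <- pState( nSt, perm = c(1, 2, 4, 3) )  # cumulated, reordered states
```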
-
-When we want to simulate transition times through a multistate model
-where transition rates may depend on time since entry to the current
-or a previous state, it is essential that we have machinery to keep
-track of the transition time on \emph{all} time scales, as well as a
-mechanism that can initialize a new timescale at 0 when a transition
-occurs to a state where we shall use time since entry as a determinant
-of exit rates from that state. This is provided by \texttt{simLexis}.
-
-\subsection{Input for the simulation}
-
-Input for simulation of a single trajectory through a multistate model
-requires a representation of the \emph{current status} of a person;
-the starting conditions. The object that we supply to the simulation
-function must contain information about all covariates and all
-timescales upon which transitions depend, and in particular which
-one(s) of the timescales are defined as time since entry into a
-particular state. Hence, starting conditions should be represented as
-a \texttt{Lexis} object (where \texttt{lex.dur} and \texttt{lex.Xst}
-are ignored, since there is no follow-up yet), where the time scale
-information is in the attributes \texttt{time.scale} and
-\texttt{time.since} respectively.
-
-Thus there are two main arguments to a function to simulate from a
-multistate model:
-\begin{enumerate}
-\item A \texttt{Lexis} object representing the initial states and
- covariates of the population to be simulated. This has to have the
- same structure as the original \texttt{Lexis} object representing
- the multistate model from which transition rates in the model were
- estimated. As noted above, the values for \texttt{lex.Xst} and
- \texttt{lex.dur} are not required (since these are the quantities
- that will be simulated).
-\item A transition object, representing the transition intensities
- between states, which should be a list of lists of intensity
- representations.
By an intensity representation we mean a function
- that, given a \texttt{Lexis} object, can be used to produce
- estimates of the transition intensities at a set of supplied time
- points for the state represented in the \texttt{Lexis} object.
-
- The names of the elements of the transition object (which are lists)
- will be the names of the \emph{transient} states, that is the states
- \emph{from} which a transition can occur. The names of the elements
- of each of these lists are the names of states \emph{to} which
- transitions can occur (which may be either transient or absorbing
- states).
-
- Hence, if the transition object is called \texttt{Tr} then
- \verb+Tr$A$B+ (or \verb+Tr[["A"]][["B"]]+) will represent the
- transition intensity from state \texttt{A} to state \texttt{B}.
-
- The entries in the transition object can be either \texttt{glm}
- objects, representing Poisson models for the transitions,
- \texttt{coxph} objects representing an intensity model along one
- time scale, or simply a function that takes a \texttt{Lexis}
- object as input and returns an estimated intensity for each row.
-
-\end{enumerate}
-
-In addition to these two input items, there will be a couple of tuning
-parameters.
-
-The output of the function will simply be a \texttt{Lexis} object with
-simulated transitions between states. This will be the basis for
-deriving sensible statistics from the \texttt{Lexis} object --- see
-the next section.
-
-\section{Setting up a \texttt{Lexis} object}
-
-As an example we will use the \texttt{DMlate} dataset from the
-\texttt{Epi} package; it is a dataset simulated to resemble a random
-sample of 10,000 patients from the Danish National Diabetes Register.
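The nesting of the transition object can be sketched as follows (the model objects `DM.Ins`, `DM.Dead` and `Ins.Dead` are fitted later in the vignette; the inner state names for the "Ins" element are an assumption based on the states of the `dmi` object):

```r
# Outer names: transient ('from') states; inner names: 'to' states;
# values: fitted intensity models (glm, coxph, or a function).
Tr <- list( "DM"  = list( "Ins"       = DM.Ins,
                          "Dead"      = DM.Dead ),
            "Ins" = list( "Dead(Ins)" = Ins.Dead ) )
Tr$DM$Ins   # the model for the DM -> Ins transition intensity
```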
-
-We start by loading the \texttt{Epi} package:
-<>=
-options( width=90 )
-library( Epi )
-print( sessionInfo(), l=F )
-@ %
-First we load the diabetes data and set up a simple illness-death
-model:
-<>=
-data(DMlate)
-dml <- Lexis( entry = list(Per=dodm, Age=dodm-dobth, DMdur=0 ),
-               exit = list(Per=dox),
-        exit.status = factor(!is.na(dodth),labels=c("DM","Dead")),
-               data = DMlate )
-@ %
-This is just data for a simple survival model with states ``DM'' and
-``Dead''. Now we cut the follow-up at insulin start, which for the
-majority of patients (T2D) is a clinical indicator of deterioration of
-disease regulation. We therefore also introduce a new timescale, and
-split the non-precursor states, so that we can address the question of
-ever having been on insulin:
-<>=
-dmi <- cutLexis( dml, cut = dml$doins,
-                      pre = "DM",
-                new.state = "Ins",
-                new.scale = "t.Ins",
-             split.states = TRUE )
-summary( dmi )
-str(dmi)
-@ % $
-We can then use the \texttt{boxes.Lexis} function to display the
-states with person-years, as well as the number of transitions and the
-transition rates (per 1000) between them:
-<>=
-boxes( dmi, boxpos = list(x=c(20,20,80,80),
-                          y=c(80,20,80,20)),
-            scale.R = 1000, show.BE = TRUE )
-@ %
-\insfig{boxes}{0.8}{Data overview for the \textrm{\tt dmi}
-  dataset. Numbers in the boxes are person-years and the number of
-  persons who begin, resp. end their follow-up in each state, and
-  numbers on the arrows are no. of transitions and rates (transition
-  intensities) per 1000 PY.}
-
-\section{Analysis of rates}
-
-In the \texttt{Lexis} object (which is just a data frame) each person
-is represented by one record for each transient state he occupies,
-thus in this case either 1 or 2 records; those who have a recorded
-time both without and with insulin have two records.
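The 1-or-2 record structure can be verified by counting how many records each
value of \texttt{lex.id} contributes. A minimal base-R sketch (on a toy id
vector, so it runs without the data; with the objects above the same
construction would be \texttt{with( dmi, table(table(lex.id)) )}):

```r
## toy lex.id vector: persons 1 and 3 occupy two states (two records),
## persons 2 and 4 only one
lex.id <- c(1, 1, 2, 3, 3, 4)
## distribution of the number of records per person
table( table( lex.id ) )
```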
-
-In order to be able to fit Poisson models with occurrence rates varying
-by the different time-scales, we split the follow-up into 6-month intervals
-for modeling:
-<>=
-Si <- splitLexis( dmi, 0:30/2, "DMdur" )
-dim( Si )
-print( subset( Si, lex.id==97 )[,1:10], digits=6 )
-@ %
-Note that when we split the follow-up, each person's follow-up now
-consists of many records, each with the \emph{current} values of the
-timescales at the start of the interval represented by the record. In
-the modelling we must necessarily assume that the rates are constant
-within each 6-month interval, but we model the \emph{size} of these
-rates as smooth functions of the time scales (that is, of their values
-at the beginning of each interval).
-
-The approach often used in epidemiology, where one parameter is
-attached to each interval of time (or age), is not feasible when more
-than one time scale is used, because intervals are not classified the
-same way on all timescales.
-
-We shall use natural splines (restricted cubic splines) for the
-analysis of rates, and hence we must allocate knots for the
-splines. This is done for each of the time-scales, and separately for
-the transitions out of the states ``DM'' and ``Ins''. For age, we place the
-knots so that the number of events is the same between each pair of
-knots, but only half of this beyond each of the boundary knots,
-whereas for the timescales \texttt{DMdur} and \texttt{t.Ins}, where we
-have observations from a well-defined 0, we put a knot at 0 and place the
-remaining knots so that the number of events is the same between each
-pair of knots as well as outside the boundary knots.
-<>=
-nk <- 5
-( ai.kn <- with( subset(Si,lex.Xst=="Ins" & lex.Cst!=lex.Xst ),
-                 quantile( Age+lex.dur , probs=(1:nk-0.5)/nk ) ) )
-( ad.kn <- with( subset(Si,lex.Xst=="Dead"),
-                 quantile( Age+lex.dur , probs=(1:nk-0.5)/nk ) ) )
-( di.kn <- with( subset(Si,lex.Xst=="Ins" & lex.Cst!=lex.Xst ),
-                 c(0,quantile( DMdur+lex.dur, probs=(1:(nk-1))/nk ) )) )
-( dd.kn <- with( subset(Si,lex.Xst=="Dead"),
-                 c(0,quantile( DMdur+lex.dur, probs=(1:(nk-1))/nk ) )) )
-( ti.kn <- with( subset(Si,lex.Xst=="Dead(Ins)"),
-                 c(0,quantile( t.Ins+lex.dur, probs=(1:(nk-1))/nk ) )) )
-@ %
-Note that when we tease out the event records for transitions to
-\emph{transient} states (in this case ``Ins'', that is
-\verb|lex.Xst=="Ins"|), we should add \verb|lex.Cst!=lex.Xst| to
-include only transition records and avoid including records of
-sojourn time in the transient state.
-
-We then fit Poisson models to the transition rates, using the wrapper
-\texttt{Ns} from the \texttt{Epi} package to simplify the
-specification of the rates:
-<>=
-library( splines )
-DM.Ins <- glm( (lex.Xst=="Ins") ~ Ns( Age  , knots=ai.kn ) +
-                                  Ns( DMdur, knots=di.kn ) +
-                                  I(Per-2000) + sex,
-               family=poisson, offset=log(lex.dur),
-               data = subset(Si,lex.Cst=="DM") )
-DM.Dead <- glm( (lex.Xst=="Dead") ~ Ns( Age  , knots=ad.kn ) +
-                                    Ns( DMdur, knots=dd.kn ) +
-                                    I(Per-2000) + sex,
-                family=poisson, offset=log(lex.dur),
-                data = subset(Si,lex.Cst=="DM") )
-Ins.Dead <- glm( (lex.Xst=="Dead(Ins)") ~ Ns( Age  , knots=ad.kn ) +
-                                          Ns( DMdur, knots=dd.kn ) +
-                                          Ns( t.Ins, knots=ti.kn ) +
-                                          I(Per-2000) + sex,
-                 family=poisson, offset=log(lex.dur),
-                 data = subset(Si,lex.Cst=="Ins") )
-@ %
-Note the similarity of the code used to fit the three models; it is
-mainly the response variable (the ``to'' state) and the subset of the
-data used (the ``from'' state) that differ.
-
-\section{The mortality rates}
-
-This section discusses in some detail how to extract and display the
-mortality rates from the models fitted.
But it is not necessary for
-understanding how to use \texttt{simLexis} in practice.
-
-\subsection{Proportionality of mortality rates}
-
-Since we have fitted separate models for the three transitions,
-there is no assumption of proportionality between the mortality rates
-from \texttt{DM} and \texttt{Ins}.
-
-However, there is nothing that prevents us from testing this
-assumption; we can just fit a model for the mortality rates in the
-entire data frame \texttt{Si}, and compare the deviance from this with
-the sum of the deviances from the separate models:
-<>=
-with( Si, table(lex.Cst) )
-All.Dead <- glm( (lex.Xst %in% c("Dead(Ins)","Dead")) ~
-                 Ns( Age  , knots=ad.kn ) +
-                 Ns( DMdur, knots=dd.kn ) +
-                 lex.Cst +
-                 I(Per-2000) + sex,
-                 family=poisson, offset=log(lex.dur),
-                 data = Si )
-round( ci.exp( All.Dead ), 3 )
-@
-From the parameter values we would in a simple setting just claim that
-start of insulin treatment was associated with slightly more than a
-doubling of mortality.
-
-The model \texttt{All.Dead} assumes that the age- and DM-duration
-effects on mortality in the ``DM'' and ``Ins'' states are the same,
-and moreover that there is no effect of insulin duration, but merely a
-mortality that is larger by a multiplicative constant not depending on
-insulin duration. The model \texttt{DM.Dead} has 8 parameters to
-describe the dependency on age and DM duration; the model
-\texttt{Ins.Dead} has 12 for the same plus the insulin duration (a
-natural spline with $k$ knots gives $k-1$ parameters, and we chose
-$k=5$ above).
-
-We can compare the fit of the proportional hazards model with the fit
-of the separate models for the two mortality rates, by adding up the
-deviances and d.f.
from these:
-<>=
-what <- c("null.deviance","df.null","deviance","df.residual")
-( rD <- unlist( DM.Dead[what] ) )
-( rI <- unlist( Ins.Dead[what] ) )
-( rA <- unlist( All.Dead[what] ) )
-round( c( dd <- rA-(rI+rD), "pVal"=1-pchisq(dd[3],dd[4]+1) ), 3 )
-@ %
-Thus we see there is a substantial non-proportionality of mortality
-rates from the two states; but a test provides no clue whatsoever to
-the particular \emph{shape} of the non-proportionality.
-
-To this end, we shall explore the predicted mortalities under the two
-models quantitatively in more detail. Note that the reason that there
-is a difference in the null deviances (and a difference of 1 in the
-null d.f.) is that the null deviance of \texttt{All.Dead} refers to a
-model with a single intercept, that is a model with constant and
-\emph{identical} mortality rates from the states ``DM'' and ``Ins'',
-whereas the null models for \texttt{DM.Dead} and \texttt{Ins.Dead}
-have constant but \emph{different} mortality rates from the states
-``DM'' and ``Ins''. This is however irrelevant for the comparison of
-the \emph{residual} deviances.
-
-\subsection{How the mortality rates look}
-
-If we want to see how the mortality rates are modelled in
-\texttt{DM.Dead} and \texttt{Ins.Dead} in relation to
-\texttt{All.Dead}, we make a prediction of rates for, say, men
-diagnosed at different ages and going on insulin at different times
-after this. So we consider men diagnosed in ages 40, 50, 60 and 70,
-who either never enter insulin treatment or start insulin 0, 2 or 5
-years after diagnosis of DM.
-
-To this end we create a prediction data frame with
-observation times from diagnosis and 12 years on (longer would not
-make sense as this is the extent of the data).
-
-But we start by setting up an array to hold the predicted mortality
-rates, classified by diabetes duration, age at diabetes onset, time of
-insulin onset, and of course type of model.
What we want to do is to
-plot the age-specific mortality rates for persons not on insulin, and
-for persons starting insulin at different times after DM. The
-mortality curves start at the age where the person gets diabetes and
-continue for 12 years; for persons on insulin they start at the age when
-they initiate insulin.
-<>=
-pr.rates <- NArray( list( DMdur = seq(0,12,0.1),
-                          DMage = 4:7*10,
-                          r.Ins = c(NA,0,2,5),
-                          model = c("DM/Ins","All"),
-                           what = c("rate","lo","hi") ) )
-str( pr.rates )
-@ %
-For convenience the \texttt{Epi} package contains a function that computes
-predicted (log-)rates with c.i. --- it is merely a wrapper for
-\texttt{predict.glm}:
-<<>>=
-ci.pred
-@
-So we set up the prediction data frame and modify it in loops over
-ages at onset and insulin onset. Note that we set \texttt{lex.dur} to
-1000 in the prediction frame, so that we obtain rates in units of
-events per 1000 PY.
-<>=
-nd <- data.frame( DMdur = as.numeric( dimnames(pr.rates)[[1]] ),
-                  lex.Cst = factor( 1, levels=1:4,
-                                    labels=levels(Si$lex.Cst) ),
-                  sex = factor( 1, levels=1:2, labels=c("M","F")),
-                  lex.dur = 1000 )
-for( ia in dimnames(pr.rates)[[2]] )
-   {
-dnew <- transform( nd, Age = as.numeric(ia)+DMdur,
-                       Per = 1998+DMdur )
-pr.rates[,ia,1,"DM/Ins",] <- ci.pred( DM.Dead, newdata = dnew )
-pr.rates[,ia,1,"All"   ,] <- ci.pred( All.Dead, newdata = dnew )
-for( ii in dimnames(pr.rates)[[3]][-1] )
-   {
-dnew = transform( dnew, lex.Cst = factor( 2, levels=1:4,
-                                          labels=levels(Si$lex.Cst) ),
-                        t.Ins = ifelse( (DMdur-as.numeric(ii)) >= 0,
-                                        DMdur-as.numeric(ii), NA ) )
-pr.rates[,ia, ii ,"DM/Ins",] <- ci.pred( Ins.Dead, newdata = dnew )
-pr.rates[,ia, ii ,"All"   ,] <- ci.pred( All.Dead, newdata = dnew )
-   }
-   }
-@ % $
-So for each age at DM onset we make a plot of the mortality as function
-of current age, both for those with no insulin treatment and those that
-start insulin 0, 2 and 5 years after diagnosis, thus 4 curves (with c.i.).
These curves
-are replicated with a different color for the simplified model.
-<>=
-par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6, las=1 )
-plot( NA, xlim=c(40,82), ylim=c(5,300), bty="n",
-      log="y", xlab="Age", ylab="Mortality rate per 1000 PY" )
-abline( v=seq(40,80,5), h=outer(1:9,10^(0:2),"*"), col=gray(0.8) )
-for( aa in 4:7*10 ) for( ii in 1:4 )
-   matlines( aa+as.numeric(dimnames(pr.rates)[[1]]),
-             cbind( pr.rates[,paste(aa),ii,"DM/Ins",],
-                    pr.rates[,paste(aa),ii,"All"   ,] ),
-             type="l", lty=1, lwd=c(3,1,1),
-             col=rep(c("red","limegreen"),each=3) )
-@ %
-\insfig{mort-int}{0.9}{Estimated mortality rates for male diabetes
-  patients with no insulin (lower sets of curves) and insulin (upper
-  curves), with DM onset in age 40, 50, 60 and 70. The red curves are
-  from the models with separate age effects for persons with and
-  without insulin, and a separate effect of insulin duration. The
-  green curves are from the model with common age-effects and only a
-  time-dependent effect of insulin, assuming no effect of insulin
-  duration (the classical time-dependent variable approach). Hence the
-  upper green curve is common for any time of insulin inception.}
-
-From figure \ref{fig:mort-int} we see that there is a substantial
-insulin-duration effect which is not accommodated by the simple model
-with only one time-dependent variable to describe the insulin
-effect. Note that the simple model (green curves) for those on insulin
-does not depend on insulin duration, and hence the mortality curves
-for those on insulin are just parallel to the mortality curves for
-those not on insulin, regardless of diabetes duration (or age) at the
-time of insulin initiation. This is the proportional hazards
-assumption. Thus the effect of insulin initiation is under-estimated
-for short duration of insulin and over-estimated for long duration of
-insulin.
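The parallelism of the green curves can be illustrated in miniature with
made-up numbers (a sketch only --- these are not the fitted rates): under
proportional hazards the rate ratio is constant, so on the log scale the
insulin curve is just a vertical shift of the no-insulin curve, leaving no
room for a duration effect:

```r
## made-up mortality rates per 1000 PY --- not the fitted values
age   <- 60:70
m.DM  <- 10  * exp( 0.08*(age-60) )   # hypothetical rates without insulin
m.Ins <- 2.3 * m.DM                   # proportional hazards: constant RR of 2.3
## the log-rate difference is the same at every age, i.e. the curves are parallel
unique( round( log(m.Ins) - log(m.DM), 12 ) )
```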
-
-This is the major discrepancy between the two models, and it
-illustrates the importance of being able to accommodate different time
-scales, but there is also a declining overall insulin effect by age which
-is not accommodated by the proportional hazards approach.
-
-Finally, this plot illustrates an important feature in reporting
-models with multiple timescales: all timescales must be represented in
-the predicted rates; reporting the effect of only one timescale,
-conditional on fixed values of the other timescales, is misleading, since
-all timescales by definition advance at the same pace. For example,
-the age-effect for a fixed value of insulin duration really is a
-misnomer, since it does not correspond to any real person's follow-up,
-but to the mortality of persons in different ages but with the same
-duration of insulin use.
-
-\section{Input to the \texttt{simLexis} function}
-
-In order to simulate from the multistate model with the estimated
-transition rates, and get the follow-up of a hypothetical cohort, we
-must supply \emph{both} the transition rates and the structure of the
-model \emph{and} the initial cohort status to
-\texttt{simLexis}.
-
-\subsection{The transition object}
-
-We first put the models into an object representing the transitions;
-note this is a list of lists, the latter having \texttt{glm} objects
-as elements:
-<>=
-Tr <- list( "DM" = list( "Ins"  = DM.Ins,
-                         "Dead" = DM.Dead ),
-           "Ins" = list( "Dead(Ins)" = Ins.Dead ) )
-@ %
-Now we have the description of the rates and of the structure of the
-model. The \texttt{Tr} object defines the states and models for all
-transitions between them; the object \verb|Tr$A$B| is the model
-for the transition intensity from state \texttt{A} to state
-\texttt{B}.
-
-\subsection{The initial cohort}
-
-We now define an initial \texttt{Lexis} object of persons with all
-relevant covariates defined.
Note that we use \texttt{subset} to get a
-\texttt{Lexis} object; this preserves the \texttt{time.scale} and
-\texttt{time.since} attributes which are needed for the simulation (the
-usual ``\texttt{[}'' operator does not preserve these attributes when
-you select columns):
-<>=
-str( Si[NULL,1:9] )
-ini <- subset(Si,FALSE,select=1:9)
-str( ini )
-ini <- subset(Si,select=1:9)[NULL,]
-str( ini )
-@ %
-We now have an empty \texttt{Lexis} object with attributes reflecting
-the timescales in the multistate model we want to simulate, so we must now
-enter some data to represent the persons whose follow-up we want to
-simulate through the model; we set up an initial dataset with one man
-and one woman:
-<>=
-ini[1:2,"lex.id"] <- 1:2
-ini[1:2,"lex.Cst"] <- "DM"
-ini[1:2,"Per"] <- 1995
-ini[1:2,"Age"] <- 60
-ini[1:2,"DMdur"] <- 5
-ini[1:2,"sex"] <- c("M","F")
-ini
-@ %
-
-\section{Simulation of the follow-up}
-
-Now we simulate 1000 of each of these persons using the estimated
-model. The \texttt{t.range} argument gives the time range where the
-integrated intensities (cumulative rates) are evaluated and where
-linear interpolation is used when simulating transition times. Note
-that this must be given in the same units as \texttt{lex.dur} in the
-\texttt{Lexis} object used for fitting the models for the transitions.
-<>=
-set.seed( 52381764 )
-Nsim <- 1000
-system.time( simL <- simLexis( Tr,
-                               ini,
-                               t.range = 12,
-                               N = Nsim ) )
-@ %
-The result is a \texttt{Lexis} object --- a data frame representing
-the simulated follow-up of \Sexpr{2*Nsim} persons (\Sexpr{Nsim}
-identical men and \Sexpr{Nsim} identical women) according to the rates
-we estimated from the original dataset.
-<>=
-summary( simL, by="sex" )
-@ %
-
-\subsection{Using other models for simulation}
-
-\subsubsection{Proportional hazards Poisson model}
-
-We fitted a proportional mortality model \texttt{All.Dead} (which fitted
-worse than the other two); it is a model for \emph{both} the
-transition from ``DM'' to ``Dead'' \emph{and} from ``Ins'' to
-``Dead(Ins)'', assuming that they are proportional. But it can easily
-be used in the simulation set-up, because the state is embedded in the
-model via the term \texttt{lex.Cst}, which is updated during the simulation.
-
-Simulation using this instead just requires that we supply a different
-transition object:
-<>=
-Tr.p <- list( "DM" = list( "Ins"  = DM.Ins,
-                           "Dead" = All.Dead ),
-             "Ins" = list( "Dead(Ins)" = All.Dead ) )
-system.time( simP <- simLexis( Tr.p,
-                               ini,
-                               t.range = 12,
-                               N = Nsim ) )
-summary( simP, by="sex" )
-@ %
-
-\subsubsection{Proportional hazards Cox model}
-
-A third possibility would be to replace the two-time-scale
-proportional mortality model by a one-time-scale Cox model, using
-diabetes duration as time scale, and age at diagnosis of DM as (fixed)
-covariate:
-<>=
-library( survival )
-Cox.Dead <- coxph( Surv( DMdur, DMdur+lex.dur,
-                         lex.Xst %in% c("Dead(Ins)","Dead")) ~
-                   Ns( Age-DMdur, knots=ad.kn ) +
-                   I(lex.Cst=="Ins") +
-                   I(Per-2000) + sex,
-                   data = Si )
-round( ci.exp( Cox.Dead ), 3 )
-round( ci.exp( All.Dead ), 3 )
-@ %
-Note that in order for this model to be usable for simulation, it is
-necessary that we use the components of the \texttt{Lexis} object to
-specify the survival. Each record in the data frame \texttt{Si}
-represents follow-up from \texttt{DMdur} to \texttt{DMdur+lex.dur}, so
-the model is a Cox model with diabetes duration as underlying timescale
-and age at diagnosis, \texttt{Age-DMdur}, as covariate.
-
-Also note that we used \texttt{I(lex.Cst=="Ins")} instead of just
-\texttt{lex.Cst}, because \texttt{coxph} assigns design matrix columns
-to all levels of \texttt{lex.Cst}, including those not present in the
-data, which would give \texttt{NA}s among the parameter estimates and
-\texttt{NA}s as mortality outcomes.
-
-We see that the effects of insulin and the other covariates are pretty
-much the same as in the two-time-scale model. We can simulate from this
-model too; there are no restrictions on what type of model can be used
-for the different transitions:
-<>=
-Tr.c <- list( "DM" = list( "Ins"  = Tr$DM$Ins,
-                           "Dead" = Cox.Dead ),
-             "Ins" = list( "Dead(Ins)" = Cox.Dead ) )
-system.time( simC <- simLexis( Tr.c,
-                               ini,
-                               t.range = 12,
-                               N = Nsim ) )
-summary( simC, by="sex" )
-@
-
-\section{Reporting the simulation results}
-
-We can now tabulate the number of persons in each state at a
-predefined set of times on a given time scale. Note that in order for
-this to be sensible, the \texttt{from} argument would normally be
-equal to the starting time for the simulated individuals.
-<>=
-system.time(
-nSt <- nState( subset(simL,sex=="M"),
-               at=seq(0,11,0.2), from=1995, time.scale="Per" ) )
-nSt[1:10,]
-@ %
-We see that as time goes by, the 1000 men slowly move away from the
-starting state (\texttt{DM}).
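The computation behind such state-occupancy summaries can be sketched on a
made-up count table (an illustration of the principle only --- the
\texttt{Epi} functions used below do this properly): cumulating the counts
across the states and dividing by the row totals gives stacked state
probabilities:

```r
## made-up state counts for 1000 simulated persons at three time points
nSt.toy <- rbind( "1995" = c( DM=1000, Ins=  0, Dead=  0 ),
                  "2000" = c( DM= 700, Ins=200, Dead=100 ),
                  "2005" = c( DM= 450, Ins=300, Dead=250 ) )
## cumulative fractions across the states, row by row ---
## the quantity shown as stacked areas in the plots
pSt.toy <- t( apply( nSt.toy, 1, cumsum ) ) / rowSums( nSt.toy )
round( pSt.toy, 2 )
```

The last column is identically 1, since the states are exhaustive at every time.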
-
-Based on this table (\texttt{nSt} is a table) we can now compute the
-fractions in each state, or, more relevantly, the cumulative fractions
-across the states in some specified order, so that a plot of the
-stacked probabilities can be made, using either the default rather
-colorful layout, or a more minimalistic version (both in figure \ref{fig:pstate0}):
-<>=
-pM <- pState( nSt, perm=c(1,2,4,3) )
-head( pM )
-par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 )
-plot( pM )
-plot( pM, border="black", col="transparent", lwd=3 )
-text( rep(as.numeric(rownames(pM)[nrow(pM)-1]),ncol(pM)),
-      pM[nrow(pM),]-diff(c(0,pM[nrow(pM),]))/5,
-      colnames( pM ), adj=1 )
-box( col="white", lwd=3 )
-box()
-@ %
-\insfig{pstate0}{1.0}{Default layout of the \textrm{\tt plot.pState}
-  graph (left), and a version with the state probabilities as lines and
-  annotation of states.}
-
-A more useful set-up of the graph would include a more thorough
-annotation and a sensible choice of colors, as seen in figure \ref{fig:pstatex}:
-<>=
-clr <- c("limegreen","orange")
-# expand with a lighter version of the two chosen colors
-clx <- c( clr, rgb( t( col2rgb( clr[2:1] )*2 + rep(255,3) ) / 3, max=255 ) )
-par( mfrow=c(1,2), las=1, mar=c(3,3,4,2), mgp=c(3,1,0)/1.6 )
-# Men
-plot( pM, col=clx )
-lines( as.numeric(rownames(pM)), pM[,2], lwd=3 )
-mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) )
-mtext( "Survival curve", side=3, line=1.5, adj=0 )
-mtext( "DM, no insulin   DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] )
-mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] )
-axis( side=4 )
-axis( side=4, at=1:19/20, labels=FALSE )
-axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 )
-# Women
-pF <- pState( nState( subset(simL,sex=="F"),
-                      at=seq(0,11,0.2),
-                      from=1995,
-                      time.scale="Per" ),
-              perm=c(1,2,4,3) )
-plot( pF, col=clx )
-lines( as.numeric(rownames(pF)), pF[,2], lwd=3 )
-mtext( "60 year old female, diagnosed 1990, aged 55", side=3,
line=2.5, adj=0, col=gray(0.6) )
-mtext( "Survival curve", side=3, line=1.5, adj=0 )
-mtext( "DM, no insulin   DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] )
-mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] )
-axis( side=4 )
-axis( side=4, at=1:19/20, labels=FALSE )
-axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 )
-@ %
-\insfig{pstatex}{1.0}{\textrm{\tt plot.pState} graphs where persons
-  ever on insulin are given in orange and persons never on insulin in
-  green, and the overall survival (dead over the line) as a black line.}
-
-If we instead wanted to show the results on the age-scale, we would
-use age as timescale when constructing the probabilities; otherwise the
-code is pretty much the same as before (figure \ref{fig:pstatey}):
-<>=
-par( mfrow=c(1,2), las=1, mar=c(3,3,4,2), mgp=c(3,1,0)/1.6 )
-# Men
-pM <- pState( nState( subset(simL,sex=="M"),
-                      at=seq(0,11,0.2),
-                      from=60,
-                      time.scale="Age" ),
-              perm=c(1,2,4,3) )
-plot( pM, col=clx, xlab="Age" )
-lines( as.numeric(rownames(pM)), pM[,2], lwd=3 )
-mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) )
-mtext( "Survival curve", side=3, line=1.5, adj=0 )
-mtext( "DM, no insulin   DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] )
-mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] )
-axis( side=4 )
-axis( side=4, at=1:9/10, labels=FALSE )
-axis( side=4, at=1:19/20, labels=FALSE, tcl=-0.4 )
-axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 )
-# Women
-pF <- pState( nState( subset(simL,sex=="F"),
-                      at=seq(0,11,0.2),
-                      from=60,
-                      time.scale="Age" ),
-              perm=c(1,2,4,3) )
-plot( pF, col=clx, xlab="Age" )
-lines( as.numeric(rownames(pF)), pF[,2], lwd=3 )
-mtext( "60 year old female, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) )
-mtext( "Survival curve", side=3, line=1.5, adj=0 )
-mtext( "DM, no insulin   DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] )
-mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] )
-axis(
side=4 )
-axis( side=4, at=1:9/10, labels=FALSE )
-axis( side=4, at=1:19/20, labels=FALSE, tcl=-0.4 )
-axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 )
-@ %
-Note the several statements with \texttt{axis(side=4,...)}; they are
-necessary to get the fine tick-marks on the right-hand side of the
-plots that you will need in order to read off the probabilities at
-2006 (or 71 years).
-
-\insfig{pstatey}{1.0}{\textrm{\tt plot.pState} graphs where persons
-  ever on insulin are given in orange and persons never on insulin in
-  green, and the overall survival (dead over the line) as a black line.}
-
-\subsection{Comparing predictions from different models}
-
-We have actually fitted different models for the transitions, and we
-have simulated \texttt{Lexis} objects from all three approaches, so we
-can plot these predictions on top of each other:
-<>=
-PrM <- pState( nState( subset(simP,sex=="M"),
-                       at=seq(0,11,0.2),
-                       from=60,
-                       time.scale="Age" ),
-               perm=c(1,2,4,3) )
-PrF <- pState( nState( subset(simP,sex=="F"),
-                       at=seq(0,11,0.2),
-                       from=60,
-                       time.scale="Age" ),
-               perm=c(1,2,4,3) )
-CoxM <- pState( nState( subset(simC,sex=="M"),
-                        at=seq(0,11,0.2),
-                        from=60,
-                        time.scale="Age" ),
-                perm=c(1,2,4,3) )
-CoxF <- pState( nState( subset(simC,sex=="F"),
-                        at=seq(0,11,0.2),
-                        from=60,
-                        time.scale="Age" ),
-                perm=c(1,2,4,3) )
-
-par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 )
- plot( pM, border="black", col="transparent", lwd=3 )
-lines( PrM, border="blue" , col="transparent", lwd=3 )
-lines( CoxM, border="red" , col="transparent", lwd=3 )
-text( 60.5, 0.05, "M" )
-box( lwd=3 )
-
- plot( pF, border="black", col="transparent", lwd=3 )
-lines( PrF, border="blue" , col="transparent", lwd=3 )
-lines( CoxF, border="red" , col="transparent", lwd=3 )
-text( 60.5, 0.05, "F" )
-box( lwd=3 )
-@ %
-\insfig{comp-0}{1.0}{Comparison of the simulated state occupancy
-  probabilities using separate Poisson models for the mortality rates
-  with and without insulin (black) and using
proportional hazards
-  Poisson models (blue) and Cox-models with diabetes duration as
-  timescale and age at diabetes diagnosis as covariate (red).}
-
-From figure \ref{fig:comp-0} it is clear that the two proportional
-hazards models (blue and red curves) produce pretty much the same
-estimates of the state occupancy probabilities over time, but also
-that, relative to the model with separately estimated transition
-intensities, they overestimate the probability of being alive without
-insulin and underestimate the probability of being dead without
-insulin. However, both the overall survival and the fraction of
-persons on insulin are in quite good agreement with the more elaborate
-model. Thus the proportional hazards models overestimate the
-mortality of the insulin-treated diabetes patients relative to the
-non-insulin-treated.
-
-Interestingly, we also see a bump in the estimated probabilities from
-the Cox-based model, but this is entirely an artifact that comes
-from the estimation method for the baseline hazard of the Cox-model,
-which lets the (cumulative) hazard have large jumps at event times
-where the risk set is small. Here, too, we see that an
-assumption of continuous underlying hazards leads to more credible
-estimates.
-
-\chapter{Simulation of transitions in multistate models}
-
-\section{Theory}
-
-Suppose that the rate functions for the transitions out of the current
-state to, say, 3 different states are $\lambda_1$, $\lambda_2$ and
-$\lambda_3$, and the corresponding cumulative rates are $\Lambda_1$,
-$\Lambda_2$ and $\Lambda_3$, and we want to simulate an exit time and
-an exit state (that is either 1, 2 or 3). This can be done in two
-slightly different ways:
-\begin{enumerate}
-\item First time, then state:
-  \begin{enumerate}
-  \item Compute the survival function, $S(t) =
-    \exp\bigl(-\Lambda_1(t)-\Lambda_2(t)-\Lambda_3(t)\bigr)$
-  \item Simulate a random U(0,1) variate, $u$, say.
-  \item The simulated exit time is then the solution $t_u$ to
-    the equation $S(t_u) = u \quad \Leftrightarrow \quad \sum_j\Lambda_j(t_u) = -\log(u)$.
-  \item A simulated transition at $t_u$ is then found by simulating a
-    random draw from the multinomial distribution with probabilities
-    $p_i=\lambda_i(t_u) / \sum_j\lambda_j(t_u)$.
-  \end{enumerate}
-\item Separate cumulative incidences:
-  \begin{enumerate}
-  \item Simulate 3 independent U(0,1) random variates $u_1$, $u_2$ and $u_3$.
-  \item Solve the equations $\Lambda_i(t_i)=-\log(u_i), i=1,2,3$ and get $(t_1,t_2,t_3)$.
-  \item The simulated survival time is then $\min(t_1,t_2,t_3)$, and
-    the simulated transition is to the state corresponding to this,
-    that is $k \in \{1,2,3\}$, where $t_k=\min(t_1,t_2,t_3)$.
-  \end{enumerate}
-\end{enumerate}
-The intuitive argument is that with three possible transitions there are
-3 independent processes running, but only the first transition is observed.
-The latter approach is used in the implementation in \texttt{simLexis}.
-
-The formal argument for the equality of the two approaches goes as
-follows:
-\begin{enumerate}
-\item Equality of the transition times:
-  \begin{enumerate}
-  \item In the first approach we simulate from a distribution with
-    cumulative rate $\Lambda_1(t)+\Lambda_2(t)+\Lambda_3(t)$, hence
-    from a distribution with survival function:
-    \begin{align*}
-      S(t) & = \exp\bigl(-(\Lambda_1(t)+\Lambda_2(t)+\Lambda_3(t))\bigr) \\
-           & = \exp\bigl(-\Lambda_1(t)\bigr)\times
-               \exp\bigl(-\Lambda_2(t)\bigr)\times
-               \exp\bigl(-\Lambda_3(t)\bigr)
-    \end{align*}
-  \item In the second approach we choose the smallest of three
-    independent survival times, with survival functions
-    $\exp(-\Lambda_i), i=1,2,3$.
Now, the survival function for the
-    minimum of three independent survival times is:
-    \begin{align*}
-      S_\text{min}(t) & = \pmat{\min(t_1,t_2,t_3)>t} \\
-                      & = \pmat{t_1>t} \times
-                          \pmat{t_2>t} \times
-                          \pmat{t_3>t} \\
-                      & = \exp\bigl(-\Lambda_1(t)\bigr)\times
-                          \exp\bigl(-\Lambda_2(t)\bigr)\times
-                          \exp\bigl(-\Lambda_3(t)\bigr)
-    \end{align*}
-    which is the same survival function as derived above.
-  \end{enumerate}
-\item Type of transition:
-  \begin{enumerate}
-  \item In the first instance the probability of a transition to state
-    $i$, conditional on the transition time being $t$, is as known
-    from standard probability theory:
-    $\lambda_i(t)/\bigl(\lambda_1(t)+\lambda_2(t)+\lambda_3(t)\bigr)$.
-  \item In the second approach we choose the transition corresponding
-    to the smallest of the transition times. So when we condition
-    on the event that a transition takes place at time $t$, we have to
-    show that the conditional probability that the smallest of the
-    three simulated transition times was actually the $i$th, is as above.
-
-    But conditional on \emph{survival} till $t$, the probabilities
-    that events of type $1,2,3$ take place in the interval $(t,t+\dif
-    t)$ are $\lambda_1(t)\dif t$, $\lambda_2(t)\dif t$ and
-    $\lambda_3(t)\dif t$, respectively (assuming that the probability
-    of more than one event in the interval of length $\dif t$ is
-    0). Hence the conditional probabilities \emph{given a transition
-    time} in $(t,t+\dif t)$ are:
-    \[
-      \frac{\lambda_i(t)\dif t}{\lambda_1(t)\dif t+\lambda_2(t)\dif t+\lambda_3(t)\dif t}=
-      \frac{\lambda_i(t)}{\lambda_1(t)+\lambda_2(t)+\lambda_3(t)}
-    \]
-    --- exactly as above.
-  \end{enumerate}
-\end{enumerate}
-
-\section{Components of \texttt{simLexis}}
-
-This section explains the actual code of
-\texttt{simLexis}, as it is in the current version of \texttt{Epi}.
-<>=
-source( "../R/simLexis.R", keep.source=TRUE )
-@ %
-The function \texttt{simLexis} takes a \texttt{Lexis} object as
-input.
This defines the initial state(s) and starting times for a
-number of persons. Since the purpose is to simulate a history through
-the estimated multistate model, the input values of the outcome
-variables \texttt{lex.Xst} and \texttt{lex.dur} are ignored --- the
-aim is to simulate values for them.
-
-Note however that the attribute \texttt{time.since} must be present in the
-object. This is used for initializing timescales defined as time since
-entry into a particular state; it is a character vector of the same
-length as the \texttt{time.scales} attribute, with value equal to a
-state name if the corresponding time scale is defined as time since
-entry into that state. In this example the 4th timescale is time since
-entry into the ``Ins'' state, and hence:
-<<>>=
-cbind(
-attr( ini, "time.scale" ),
-attr( ini, "time.since" ) )
-@ %
-\texttt{Lexis} objects will have this attribute set for time scales created
-using \texttt{cutLexis}.
-
-The other necessary argument is a transition object \texttt{Tr}, which
-is a list of lists. The elements of the lists should be \texttt{glm}
-objects derived by fitting Poisson models to a \texttt{Lexis} object
-representing follow-up data in a multistate model. It is assumed (but
-not checked) that timescales enter in the model via the timescales of
-the \texttt{Lexis} object. Formally, there are no assumptions about
-how \texttt{lex.dur} enters in the model.
-
-Optional arguments are \texttt{t.range}, \texttt{n.int} or
-\texttt{time.pts}, specifying the times after entry at which the
-cumulative rates will be computed (the maximum of which will be taken
-as the censoring time), and \texttt{N}, a scalar or numerical vector
-giving the number of persons with a given initial state that each
-record of the \texttt{init} object should represent.
-
-The central part of the function uses a \texttt{do.call} /
-\texttt{lapply} / \texttt{split} construction to do simulations for
-different initial states.
This is the construction in the middle that -calls \texttt{simX}. \texttt{simLexis} also calls \texttt{get.next} -which is further detailed below. -<<>>= -simLexis -@ % - -\subsection{\texttt{simX}} - -\texttt{simX} is called by \texttt{simLexis} and simulates -transition-times and -types for a set of patients assumed to be in the -same state. It is called from \texttt{simLexis} with a data frame as -argument, uses the state in \texttt{lex.Cst} to select the relevant -component of \texttt{Tr} and computes predicted cumulative intensities -for all states reachable from this state. - -Note that this is where the switch between \texttt{glm}, \texttt{coxph} -and objects of class \texttt{function} is made. - -The dataset on which this prediction is done has -\texttt{length(time.pts)} rows per person. -<<>>= -simX -@ % -As we see, \texttt{simX} calls \texttt{sim1}, which simulates the -transition for one person. - -\subsection{\texttt{sim1}} - -The predicted cumulative intensities are fed, person by person, to -\texttt{sim1} --- again via a \texttt{do.call} / \texttt{lapply} / -\texttt{split} construction --- and the resulting time and state are -appended to the \texttt{nd} data frame. This way we have simulated -\emph{one} transition (time and state) for each person: -<<>>= -sim1 -@ % -The \texttt{sim1} function uses \texttt{lint} to do linear interpolation. - -\subsection{\texttt{lint}} - -We do not use \texttt{approx} to do the linear interpolation, because -this function does not do the right thing if the cumulative incidences -(\texttt{ci}) are constant across a number of times.
Therefore we have -a custom-made linear interpolator that exploits the fact that -\texttt{ci} is non-decreasing and -\texttt{tt} is strictly increasing: -<<>>= -lint -@ - -\subsection{\texttt{get.next}} - -We must repeat the simulation operation on those that have a simulated -entry to a transient state, and also make sure that any time scales -defined as time since entry to one of these states be initialized to 0 -before a call to \texttt{simX} is made for these persons. This is -accomplished by the function \texttt{get.next}: -<<>>= -get.next -@ - -\subsection{\texttt{chop.lex}} - -The operation so far has censored individuals \texttt{max(time.pts)} -after \emph{each} new entry to a transient state. In order to groom -the output data we use \texttt{chop.lex} to censor all persons at the -same designated time after \emph{initial} entry. -<<>>= -chop.lex -@ - -\section{Probabilities from simulated \texttt{Lexis} objects} - -Once we have simulated a \texttt{Lexis} object we will of course want to use it -for estimating probabilities, so basically we will want to enumerate -the number of persons in each state at a pre-specified set of time -points. - -\subsection{\texttt{nState}} - -Since we are dealing with a multistate model with potentially multiple -time scales, it is necessary to define the timescale -(\texttt{time.scale}), the starting point on this timescale -(\texttt{from}) and the points after this where we compute the number -of occupants in each state (\texttt{at}).
-<<>>= -nState -@ % - -\subsection{\texttt{pState}, \texttt{plot.pState} and \texttt{lines.pState}} - -In order to plot probabilities of state-occupancy it is useful to -compute cumulative probabilities across states in any given order; -this is done by the function \texttt{pState}, which returns a matrix -of class \texttt{pState}: -<<>>= -pState -@ % -There is also a \texttt{plot} and \texttt{lines} method for the -resulting \texttt{pState} objects: -<<>>= -plot.pState -lines.pState -@ % - -\bibliographystyle{plain} - \begin{thebibliography}{1} - -\bibitem{Carstensen.2011a} -B.~Carstensen and M.~Plummer. -\newblock Using {L}exis objects for multi-state models in {R}. -\newblock {\em Journal of Statistical Software}, 38(6):1--18, 1 2011. - -\bibitem{Iacobelli.2013} -S.~Iacobelli and B.~Carstensen. -\newblock {{M}ultiple time scales in multi-state models}. -\newblock {\em Stat Med}, 32(30):5315--5327, Dec 2013. - -\bibitem{Plummer.2011} -M.~Plummer and B.~Carstensen. -\newblock Lexis: An {R} class for epidemiological studies with long-term - follow-up. -\newblock {\em Journal of Statistical Software}, 38(5):1--12, 1 2011. 
- -\end{thebibliography} - -\addcontentsline{toc}{chapter}{References} - -\end{document} Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/inst/doc/yll.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/inst/doc/yll.pdf differ diff -Nru r-cran-epi-2.19/inst/doc/yll.R r-cran-epi-2.30/inst/doc/yll.R --- r-cran-epi-2.19/inst/doc/yll.R 2017-08-08 20:03:04.000000000 +0000 +++ r-cran-epi-2.30/inst/doc/yll.R 2018-05-03 14:36:46.000000000 +0000 @@ -1,8 +1,8 @@ -### R code from vignette source 'yll.rnw' +### R code from vignette source 'yll' ### Encoding: UTF-8 ################################################### -### code chunk number 1: yll.rnw:146-149 +### code chunk number 1: yll.rnw:21-24 ################################################### options( width=90, SweaveHooks=list( fig=function() @@ -12,7 +12,6 @@ ################################################### ### code chunk number 2: states ################################################### -getOption("SweaveHooks")[["fig"]]() library( Epi ) TM <- matrix(NA,4,4) rownames(TM) <- @@ -24,7 +23,6 @@ ################################################### ### code chunk number 3: states ################################################### -getOption("SweaveHooks")[["fig"]]() zz$Arrowtext <- c( expression(lambda), expression(mu[W]), expression(mu[D][M]) ) @@ -32,20 +30,20 @@ ################################################### -### code chunk number 4: yll.rnw:392-393 +### code chunk number 4: yll.rnw:265-266 ################################################### data( DMepi ) ################################################### -### code chunk number 5: yll.rnw:398-400 +### code chunk number 5: yll.rnw:271-273 ################################################### str( DMepi ) head( DMepi ) ################################################### -### code chunk number 6: yll.rnw:420-424 +### code chunk number 6: yll.rnw:293-297 ################################################### DMepi <- transform( subset( DMepi, A>30 ), D.T = D.nD + 
D.DM, @@ -54,7 +52,7 @@ ################################################### -### code chunk number 7: yll.rnw:430-456 +### code chunk number 7: yll.rnw:303-329 ################################################### # Knots used in all models ( a.kn <- seq(40,95,,6) ) @@ -85,7 +83,7 @@ ################################################### -### code chunk number 8: yll.rnw:463-500 +### code chunk number 8: yll.rnw:336-373 ################################################### a.ref <- 30:90 p.ref <- 1996:2016 @@ -129,7 +127,6 @@ ################################################### ### code chunk number 9: imm ################################################### -getOption("SweaveHooks")[["fig"]]() plyll <- function(wh){ par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6, bty="n", las=1 ) @@ -157,49 +154,57 @@ ################################################### ### code chunk number 10: tot ################################################### -getOption("SweaveHooks")[["fig"]]() plyll("Tot") ################################################### ### code chunk number 11: sus ################################################### -getOption("SweaveHooks")[["fig"]]() plyll("Sus") ################################################### -### code chunk number 12: yll.rnw:584-585 +### code chunk number 12: CHANGE1 (eval = FALSE) ################################################### -source( "../R/erl.R", keep.source=TRUE ) +## source( "../R/erl.R", keep.source=TRUE ) ################################################### -### code chunk number 13: yll.rnw:595-596 +### code chunk number 13: CHANGE2 +################################################### +surv1 <- Epi::surv1 +surv2 <- Epi::surv2 +erl1 <- Epi::erl1 +erl <- Epi::erl +yll <- Epi::yll + + +################################################### +### code chunk number 14: yll.rnw:484-485 ################################################### surv1 ################################################### -### code chunk number 14: yll.rnw:600-601 +### 
code chunk number 15: yll.rnw:489-490 ################################################### erl1 ################################################### -### code chunk number 15: yll.rnw:608-609 +### code chunk number 16: yll.rnw:497-498 ################################################### surv2 ################################################### -### code chunk number 16: yll.rnw:613-614 +### code chunk number 17: yll.rnw:502-503 ################################################### erl ################################################### -### code chunk number 17: yll.rnw:618-619 +### code chunk number 18: yll.rnw:507-508 ################################################### yll diff -Nru r-cran-epi-2.19/inst/doc/yll.rnw r-cran-epi-2.30/inst/doc/yll.rnw --- r-cran-epi-2.19/inst/doc/yll.rnw 2017-04-14 14:59:35.000000000 +0000 +++ r-cran-epi-2.30/inst/doc/yll.rnw 1970-01-01 00:00:00.000000000 +0000 @@ -1,641 +0,0 @@ -\SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE,eps=FALSE} -%\VignetteIndexEntry{Years of life lost: simLexis} -\documentclass[a4paper,twoside,12pt]{report} - -% ---------------------------------------------------------------------- -% General information for the title page and the page headings -\newcommand{\Title}{Years of Life Lost (YLL) - to disease\\Diabetes in DK as example} -\newcommand{\Tit}{YLL} -\newcommand{\Version}{Version 1.2} -\newcommand{\Dates}{February 2017} -\newcommand{\Where}{SDC} -\newcommand{\Homepage}{\url{http://bendixcarstensen.com/Epi}} -\newcommand{\Faculty}{\begin{tabular}{rl} -Bendix Carstensen - & Steno Diabetes Center, Gentofte, Denmark\\ - & {\small \& Department of Biostatistics, - University of Copenhagen} \\ - & \texttt{b@bxc.dk}\\ - & \url{http://BendixCarstensen.com} \\[1em] - \end{tabular}} -% Packages -\usepackage[utf8]{inputenc} -\usepackage[T1]{fontenc} -\usepackage[english]{babel} -\usepackage[font=it,labelfont=normalfont]{caption} 
-\usepackage[colorlinks,urlcolor=blue,linkcolor=red,,citecolor=Maroon]{hyperref} -\usepackage[ae,hyper]{Rd} -\usepackage[dvipsnames]{xcolor} -\usepackage[super]{nth} -\usepackage[noae]{Sweave} -\usepackage{makeidx,floatflt,amsmath,amsfonts,amsbsy,enumitem,dcolumn,needspace} -\usepackage{ifthen,calc,eso-pic,everyshi} -\usepackage{booktabs,longtable,rotating,graphicx,subfig} -\usepackage{pdfpages,verbatim,fancyhdr,datetime,% -afterpage} -\usepackage[abspath]{currfile} -% \usepackage{times} -\renewcommand{\textfraction}{0.0} -\renewcommand{\topfraction}{1.0} -\renewcommand{\bottomfraction}{1.0} -\renewcommand{\floatpagefraction}{0.9} -% \usepackage{mslapa} -\definecolor{blaa}{RGB}{99,99,255} -\DeclareGraphicsExtensions{.png,.pdf,.jpg} -% Make the Sweave output nicer -\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\small,fontshape=sl,formatcom=\color{Blue}} -\DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\small,formatcom=\color{Maroon},xleftmargin=0em} -\DefineVerbatimEnvironment{Scode}{Verbatim}{fontsize=\small} -\fvset{listparameters={\setlength{\topsep}{-0.1ex}}} -\renewenvironment{Schunk}% -{\renewcommand{\baselinestretch}{0.85} \vspace{\topsep}}% -{\renewcommand{\baselinestretch}{1.00} \vspace{\topsep}} -\providecommand{\ptxt}[1]{\Pp\left\{\text{#1}\right\}} -\providecommand{\dif}{{\,\mathrm d}} -\DeclareMathOperator{\YLL}{YLL} -\DeclareMathOperator{\Pp}{P} - -%---------------------------------------------------------------------- -% Set up layout of pages -\oddsidemargin 1mm -\evensidemargin 1mm -\topmargin -10mm -\headheight 8mm -\headsep 5mm -\textheight 240mm -\textwidth 165mm -%\footheight 5mm -\footskip 15mm -\renewcommand{\topfraction}{0.9} -\renewcommand{\bottomfraction}{0.9} -\renewcommand{\textfraction}{0.1} -\renewcommand{\floatpagefraction}{0.9} -\renewcommand{\headrulewidth}{0.1pt} -\setcounter{secnumdepth}{4} -% \setcounter{tocdepth}{2} - -%---------------------------------------------------------------------- -% How to insert a figure 
in a .rnw file -\newcommand{\rwpre}{./graph/gr} -\newcommand{\insfig}[3]{ -\begin{figure}[h] - \centering - \includegraphics[width=#2\textwidth]{\rwpre-#1} - \caption{#3} - \label{fig:#1} -% \afterpage{\clearpage} -\end{figure}} - -%---------------------------------------------------------------------- -% Here is the document starting with the titlepage -\begin{document} - -%---------------------------------------------------------------------- -% The title page -\setcounter{page}{1} -\pagenumbering{roman} -\pagestyle{plain} -\thispagestyle{empty} -% \vspace*{0.05\textheight} -\flushright -% The blank below here is necessary in order not to muck up the -% linespacing in title if it has more than 2 lines -{\Huge \bfseries \Title - -}\ \\[-1.5ex] -\noindent\textcolor{blaa}{\rule[-1ex]{\textwidth}{5pt}}\\[2.5ex] -\large -\Where \\ -\Dates \\ -\Homepage \\ -\Version \\[1em] -\normalsize -Compiled \today,\ \currenttime\\ -from: \texttt{\currfileabspath}\\[1em] -% \input{wordcount} -\normalsize -\vfill -\Faculty -% End of titlepage -% \newpage - -%---------------------------------------------------------------------- -% Table of contents -\tableofcontents - -%---------------------------------------------------------------------- -% General text layout -\raggedright -\parindent 1em -\parskip 0ex -\cleardoublepage - -%---------------------------------------------------------------------- -% General page style -\pagenumbering{arabic} -\setcounter{page}{1} -\pagestyle{fancy} -\renewcommand{\chaptermark}[1]{\markboth{\textsl{#1}}{}} -\renewcommand{\sectionmark}[1]{\markright{\thesection\ \textsl{#1}}{}} -\fancyhead[EL]{\bf \thepage \quad \rm \leftmark} -\fancyhead[ER,OL]{\sl \Tit} -\fancyhead[OR]{\rm \rightmark \quad \bf \thepage} -\fancyfoot{} - -<>= -options( width=90, - SweaveHooks=list( fig=function() - par(mar=c(3,3,1,1),mgp=c(3,1,0)/1.6,las=1,bty="n") ) ) -@ % -\renewcommand{\rwpre}{./yll} - -%---------------------------------------------------------------------- -% 
Here comes the substance part -\chapter{Theory and technicalities} - -This vignette for the \texttt{Epi} package describes the -probabilistic/demographic background for and technical implementation -of the \texttt{erl} and \texttt{yll} functions that compute the -expected residual life time and years of life lost in an illness-death -model. - -\section{Years of life lost (YLL)} - -\ldots to diabetes or any other disease for that matter. - -The general concept in calculation of ``years lost to\ldots'' is the -comparison of the expected lifetime between two groups of persons; one -with and one without disease (in this example DM). The expected -lifetime is the area under the survival curve, so basically the -exercise requires that two survival curves that are deemed relevant be -available. - -The years of life lost are therefore just the area between the survival -curves for those ``Well'', $S_W(t)$, and for those ``Diseased'', -$S_D(t)$: -\[ - \YLL = \int_0^\infty S_W(t) - S_D(t) \dif t -\] -The time $t$ could of course be age, but it could also be ``time after -age 50'', and the survival curves compared would then be survival -curves \emph{conditional} on survival till age 50, and the YLL would -be the years of life lost for a 50-year-old person with diabetes. - -If we are referring to the expected lifetime we will more precisely use -the label expected residual lifetime, ERL. - -\section{Constructing the survival curves} - -YLL can be computed in two different ways, depending on the way the -survival curve and hence the expected lifetime of a person -\emph{without} diabetes is computed: -\begin{itemize} -\item Assume that the ``Well'' persons are \emph{immune} to disease - --- using only the non-DM mortality rates throughout for calculation - of expected life time. -\item Assume that the ``Well'' persons \emph{can} acquire the disease and - thereby see an increased mortality, thus involving all three rates - shown in figure \ref{fig:states}.
-\end{itemize} -The former gives a higher YLL because the comparison is to persons -assumed immune to DM (and yet with the same mortality as non-immune -prior to diagnosis); the latter gives a more realistic picture of the -comparison of groups of persons with and without diabetes at a given -age that can be interpreted in the real world. - -The differences can be illustrated by figure \ref{fig:states}; the -immune approach corresponds to an assumption of $\lambda(t)=0$ in the -calculation of the survival curve for a person in the ``Well'' state. - -Calculation of the survival of a diseased person already in the ``DM'' -state is unaffected by assumptions about $\lambda$. - -<>= -library( Epi ) -TM <- matrix(NA,4,4) -rownames(TM) <- -colnames(TM) <- c("Well","DM","Dead","Dead(DM)") -TM[1,2:3] <- TM[2,4] <- 1 -zz <- boxes( TM, boxpos=list(x=c(20,80,20,80),y=c(80,80,20,20)), wm=1.5, hm=4 ) -@ -<>= -zz$Arrowtext <- c( expression(lambda), - expression(mu[W]), - expression(mu[D][M]) ) -boxes( zz ) -@ % -\insfig{states}{0.7}{Illness-death model describing diabetes incidence - and -mortality.} - -\subsection{Total mortality --- a shortcut?} - -A practical crude shortcut could be to compare the ERL in the diabetic -population to the ERL for the \emph{entire} population (that is, use -the total mortality ignoring diabetes status). - -Note however that this approach also counts the mortality of persons -that acquired the disease earlier, thus making the comparison -population on average more ill than the population we aim at, namely -those well at a given time, who only gradually become ill thereafter. - -How large these effects are can however be empirically explored, as we -shall do later. - -\subsection{Disease duration} - -In the exposition above there is no explicit provision for the effect of -disease duration, but if we were able to devise mortality rates for -any combination of age and duration, this could be taken into account.
- -There are however severe limitations in this, as we would in principle -want duration effects spanning the same range as the age effects --- in -principle for all $(a,d)$ where $d\leq A$, where $A$ is the age at -which we condition. So even if we were only to compute ERL from -age, say, 40 we would still need duration effects up to 60 years -(namely to age 100). - -The incorporation of duration effects is in principle trivial from a -computational point of view, but we would be forced to entertain -models predicting duration effects way beyond the disease durations -actually observed in any practical case. - -\subsection{Computing integrals} - -The practical calculations of survival curves, ERL and YLL involve -calculation of (cumulative) integrals of rates and functions of these, -as we shall see below. This is easy if we have a closed-form -expression of the function, so its value may be computed at any time -point --- this will be the case if we model rates by smooth parametric -functions. - -Computing the (cumulative) integral of a function is done as follows: -\begin{itemize} -\item Compute the value of the function (mortality rate for example) - at the midpoints of a sequence of narrow equidistant intervals --- - for example one- or three-month intervals of age, say. -\item Take the cumulative sum of these values multiplied by the - interval length --- this will be a very close approximation to the - cumulative integral evaluated at the end of each interval. -\item If the intervals are really small (like 1/100 year), the - distinction between the value at the middle and at the end of each - interval becomes irrelevant. -\end{itemize} -Note that in the above it is assumed that the rates are given in units -corresponding to the interval length --- or more precisely, as the -cumulative rates over the interval.
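The midpoint scheme above can be sketched in a few lines of R. This is an illustration only, with a hypothetical constant mortality rate (chosen so the cumulative hazard has the closed form $\Lambda(a)=\mu a$ to check against); it is not code from the package:

```r
## Midpoint-rule approximation of a cumulative hazard, as described above.
## Hypothetical constant rate mu = 0.02/year, so Lambda(a) = 0.02 * a exactly.
int  <- 0.01                        # interval length (years)
amid <- seq(int/2, 100, by = int)   # midpoints of the small intervals
mu   <- rep(0.02, length(amid))     # rate evaluated at the midpoints
Lam  <- cumsum(mu * int)            # cumulative hazard at interval ends
Sv   <- exp(-Lam)                   # survival function at interval ends
## check against the closed form at age 50 (interval number 5000):
c(Lam[5000], 0.02 * 50)             # the two should agree closely
```

For non-constant rates one simply evaluates the fitted rate function at the midpoints instead of using a constant vector; the interval length `int` plays the role of the "interval length" units discussed above.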
- -\section{Survival functions in the illness-death model} - -The survival functions for persons in the ``Well'' state can be -computed under two fundamentally different scenarios, depending on -whether persons in the ``Well'' state are assumed to be immune to the -disease ($\lambda(a)=0$) or not. - -\subsection{Immune approach} - -In this case both survival functions for persons in the two states are -the usual simple transformations of the cumulative mortality rates: -\[ - S_W(a) = \exp\left(-\int_0^a\!\!\mu_W(u) \dif u \right), \qquad - S_D(a) = \exp\left(-\int_0^a\!\!\mu_D(u) \dif u \right) -\] - -\subsubsection{Conditional survival functions} - -If we want the \emph{conditional} survival functions given survival to -age $A$, say, they are just: -\[ - S_W(a|A) = S_W(a)/S_W(A), \qquad S_D(a|A) = S_D(a)/S_D(A) -\] - -\subsection{Non-immune approach} - -For a diseased person, the survival function in this state is the same -as above, but the survival function for a person without disease (at -age 0) is (see figure \ref{fig:states}): -\[ -S(a) = \ptxt{Well}\!(a) + \ptxt{DM}\!(a) -\] -In the appendix of the paper \cite{Carstensen.2008c} is an indication -of how to compute the probability of being in any of the four states -shown in figure \ref{fig:states}, which I shall repeat here: - -In terms of the rates, the probability of being in the ``Well'' box is -simply the probability of escaping both death (at a rate of $\mu_W(a)$) -and diabetes (at a rate of $\lambda(a)$): -\[ - \ptxt{Well}(a) = \exp\left(\!-\int_0^a\!\!\mu_W(u)+\lambda(u) \dif u \right) -\] -The probability of being alive with diabetes at age $a$ is computed given that - diabetes occurred at age $s$ ($s>= -data( DMepi ) -@ % -The dataset \texttt{DMepi} contains diabetes events, deaths and -person-years for persons without diabetes and deaths and person-years -for persons with diabetes: -<<>>= -str( DMepi ) -head( DMepi ) -@ % -For each combination of sex, age, period and date of birth in 1 year -age
groups, we have the person-years in the ``Well'' (\texttt{Y.nD}) -and the ``DM'' (\texttt{Y.DM}) states, as well as the number of deaths -from these (\texttt{D.nD}, \texttt{D.DM}) and the number of incident -diabetes cases from the ``Well'' state (\texttt{X}). - -In order to compute the years of life lost to diabetes and how this -has changed over time, we fit models for the mortality and incidence -of both groups (and of course, separately for men and women). The -models we use will be age-period-cohort models \cite{Carstensen.2007a} -providing estimated mortality rates for ages 0--99 and dates -1.1.1996--1.1.2016. - -First we transform the age and period variables to reflect the mean -age and period in each of the Lexis triangles. We also compute the -total number of deaths and amount of risk time, as we are going to -model the total mortality as well. Finally we restrict the dataset to -ages over 30 only: -<<>>= -DMepi <- transform( subset( DMepi, A>30 ), - D.T = D.nD + D.DM, - Y.T = Y.nD + Y.DM ) -head(DMepi) -@ % -With the correct age and period coding in the Lexis triangles, we fit -models for the mortalities and incidences. 
Note that for -comparative purposes we also fit a model for the \emph{total} mortality, -ignoring the diabetes status: -<<>>= -# Knots used in all models -( a.kn <- seq(40,95,,6) ) -( p.kn <- seq(1997,2015,,4) ) -( c.kn <- seq(1910,1976,,6) ) -# Check the number of events between knots -ae <- xtabs( cbind(D.nD,D.DM,X) ~ cut(A,c(30,a.kn,Inf)) + sex, data=DMepi ) -ftable( addmargins(ae,1), col.vars=3:2 ) -pe <- xtabs( cbind(D.nD,D.DM,X) ~ cut(P,c(1990,p.kn,Inf)) + sex, data=DMepi ) -ftable( addmargins(pe,1), col.vars=3:2 ) -ce <- xtabs( cbind(D.nD,D.DM,X) ~ cut(P-A,c(-Inf,c.kn,Inf)) + sex, data=DMepi ) -ftable( addmargins(ce,1), col.vars=3:2 ) -# Fit an APC-model for all transitions, separately for men and women -mW.m <- glm( D.nD ~ -1 + Ns(A ,knots=a.kn,int=TRUE) + - Ns( P,knots=p.kn,ref=2005) + - Ns(P-A,knots=c.kn,ref=1950), - offset = log(Y.nD), - family = poisson, - data = subset( DMepi, sex=="M" ) ) -mD.m <- update( mW.m, D.DM ~ . , offset=log(Y.DM) ) -mT.m <- update( mW.m, D.T ~ . , offset=log(Y.T ) ) -lW.m <- update( mW.m, X ~ .
) -# Model for women -mW.f <- update( mW.m, data = subset( DMepi, sex=="F" ) ) -mD.f <- update( mD.m, data = subset( DMepi, sex=="F" ) ) -mT.f <- update( mT.m, data = subset( DMepi, sex=="F" ) ) -lW.f <- update( lW.m, data = subset( DMepi, sex=="F" ) ) -@ % - -\section{Residual life time and years lost to DM} - -We now collect the estimated years of life lost classified by method -(immune assumption or not), sex, age and calendar time: -<<>>= -a.ref <- 30:90 -p.ref <- 1996:2016 -aYLL <- NArray( list( type = c("Imm","Tot","Sus"), - sex = levels( DMepi$sex ), - age = a.ref, - date = p.ref ) ) -str( aYLL ) -system.time( -for( ip in p.ref ) - { - nd <- data.frame( A = seq(30,90,0.2)+0.1, - P = ip, - Y.nD = 1, - Y.DM = 1, - Y.T = 1 ) - muW.m <- ci.pred( mW.m, nd )[,1] - muD.m <- ci.pred( mD.m, nd )[,1] - muT.m <- ci.pred( mT.m, nd )[,1] - lam.m <- ci.pred( lW.m, nd )[,1] - muW.f <- ci.pred( mW.f, nd )[,1] - muD.f <- ci.pred( mD.f, nd )[,1] - muT.f <- ci.pred( mT.f, nd )[,1] - lam.f <- ci.pred( lW.f, nd )[,1] - aYLL["Imm","M",,paste(ip)] <- yll( int=0.2, muW.m, muD.m, lam=NULL, - A=a.ref, age.in=30, note=FALSE )[-1] - aYLL["Imm","F",,paste(ip)] <- yll( int=0.2, muW.f, muD.f, lam=NULL, - A=a.ref, age.in=30, note=FALSE )[-1] - aYLL["Tot","M",,paste(ip)] <- yll( int=0.2, muT.m, muD.m, lam=NULL, - A=a.ref, age.in=30, note=FALSE )[-1] - aYLL["Tot","F",,paste(ip)] <- yll( int=0.2, muT.f, muD.f, lam=NULL, - A=a.ref, age.in=30, note=FALSE )[-1] - aYLL["Sus","M",,paste(ip)] <- yll( int=0.2, muW.m, muD.m, lam=lam.m, - A=a.ref, age.in=30, note=FALSE )[-1] - aYLL["Sus","F",,paste(ip)] <- yll( int=0.2, muW.f, muD.f, lam=lam.f, - A=a.ref, age.in=30, note=FALSE )[-1] - } ) -round( ftable( aYLL[,,seq(1,61,10),], col.vars=c(3,2) ), 1 ) -@ % -We now have the relevant points for the graph showing YLL to diabetes -for men and women by age, and calendar year, both under the immunity -and susceptibility models for the calculation of YLL. 
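Before turning to the plots, the numbers in \texttt{aYLL} can be related back to the defining integral $\YLL = \int S_W(t) - S_D(t) \dif t$. A toy computation with hypothetical constant rates (not the estimates from \texttt{DMepi}) shows the area-between-curves logic; with constant mortality $\mu$ the ERL is $1/\mu$, so the result can be checked against a closed form:

```r
## Toy computation of YLL as the area between two survival curves.
## Hypothetical constant rates, chosen so the truncation at age 100 is
## negligible; expected result is close to 1/0.1 - 1/0.2 = 5 years.
int <- 0.01
a   <- seq(int/2, 100, by = int)    # fine grid of ages
muW <- rep(0.1, length(a))          # mortality of the "Well" (hypothetical)
muD <- rep(0.2, length(a))          # mortality of the "Diseased" (hypothetical)
SW  <- exp(-cumsum(muW * int))      # survival of the Well
SD  <- exp(-cumsum(muD * int))      # survival of the Diseased
sum(SW - SD) * int                  # area between the curves, close to 5
```

The `yll` function used above does essentially this, with rate vectors taken from the fitted models instead of constants.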
-<>= -plyll <- function(wh){ -par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6, bty="n", las=1 ) - -matplot( a.ref, aYLL[wh,"M",,], - type="l", lty=1, col="blue", lwd=1:2, - ylim=c(0,12), xlab="Age", - ylab="Years lost to DM", yaxs="i" ) -abline(v=50,h=1:10,col=gray(0.7)) -text( 90, 11, "Men", col="blue", adj=1 ) -text( 40, aYLL[wh,"M","40","1996"], "1996", adj=c(0,0), col="blue" ) -text( 43, aYLL[wh,"M","44","2016"], "2016", adj=c(1,1), col="blue" ) - -matplot( a.ref, aYLL[wh,"F",,], - type="l", lty=1, col="red", lwd=1:2, - ylim=c(0,12), xlab="Age", - ylab="Years lost to DM", yaxs="i" ) -abline(v=50,h=1:10,col=gray(0.7)) -text( 90, 11, "Women", col="red", adj=1 ) -text( 40, aYLL[wh,"F","40","1996"], "1996", adj=c(0,0), col="red" ) -text( 43, aYLL[wh,"F","44","2016"], "2016", adj=c(1,1), col="red" ) -} -plyll("Imm") -@ % -<>= -plyll("Tot") -@ % -<>= -plyll("Sus") -@ % -\begin{figure}[h] - \centering - \includegraphics[width=\textwidth]{yll-imm} - \caption{Years of life lost to DM: the difference in expected - residual life time at different ages between persons with and - without diabetes, assuming the persons without diabetes at a given - age remain free from diabetes (immunity assumption --- not - reasonable). The lines refer to date of evaluation; the top lines - refer to 1.1.1996, the bottom ones to 1.1.2016. Blue curves are - men, red women.} - \label{fig:imm} -\end{figure} - -\begin{figure}[h] - \centering - \includegraphics[width=\textwidth]{yll-sus} - \caption{Years of life lost to DM: the difference in expected - residual life time at different ages between persons with and - without diabetes, allowing the persons without diabetes at a given - age to contract diabetes and thus be subject to higher mortality. The - lines refer to date of evaluation; the top lines refer to 1.1.1996, - the bottom ones to 1.1.2016.
Blue curves are men, red women.} - \label{fig:sus} -\end{figure} - -\begin{figure}[h] - \centering - \includegraphics[width=\textwidth]{yll-tot} - \caption{Years of life lost to DM: the difference in expected - residual life time at different ages between persons with and - without diabetes. Allowance for susceptibility is approximated by - using the total population mortality instead of non-DM - mortality. The lines refer to date of evaluation; the top lines - refer to 1.1.1996, the bottom ones to 1.1.2016. Blue curves are - men, red women.} - \label{fig:tot} -\end{figure} - -From figure \ref{fig:sus} we see that for men aged 50 the years lost to -diabetes have decreased from a bit over 8 to a bit less than 6 years, -and for women from 8.5 to 5 years; so a greater improvement for women. - -\chapter{Practical implementation} - -We have devised functions that wrap these formulae up for practical -use. - -\section{Function definitions} - -<>= -source( "../R/erl.R", keep.source=TRUE ) -@ -When using the functions it is assumed that the functions $\mu_W$, -$\mu_D$ and $\lambda$ are given as vectors corresponding to -equidistantly (usually tightly) spaced ages from 0 to $K$, where $K$ is -the age where everyone can safely be assumed dead. - -\texttt{surv1} is a simple function that computes the survival -function from a vector of mortality rates, and optionally the -conditional survival given being alive at prespecified ages: -<<>>= -surv1 -@ % -\texttt{erl1} basically just expands the result of \texttt{surv1} with -a column of expected residual life times: -<<>>= -erl1 -@ % -We also define a function, \texttt{surv2}, that computes the survival -function for a non-diseased person that may become diseased with rate -\texttt{lam} and after that die at a rate of \texttt{muD} -(corresponding to the formulae above).
This is the sane way of -handling years of life lost to a particular illness: -<<>>= -surv2 -@ % -Finally we devised a function using these to compute the expected -residual lifetime at select ages: -<<>>= -erl -@ % -\ldots and a wrapper for this if we only want the years of life lost -returned: -<<>>= -yll -@ % - -\bibliographystyle{plain} - -\begin{thebibliography}{1} - -\bibitem{Carstensen.2007a} -B~Carstensen. -\newblock Age-{P}eriod-{C}ohort models for the {L}exis diagram. -\newblock {\em Statistics in Medicine}, 26(15):3018--3045, July 2007. - -\bibitem{Carstensen.2008c} -B.~Carstensen, J.K. Kristensen, P.~Ottosen, and K.~Borch-Johnsen. -\newblock The {D}anish {N}ational {D}iabetes {R}egister: {T}rends in incidence, - prevalence and mortality. -\newblock {\em Diabetologia}, 51:2187--2196, 2008. - -\end{thebibliography} - -\addcontentsline{toc}{chapter}{References} - -\end{document} diff -Nru r-cran-epi-2.19/man/addCov.Lexis.Rd r-cran-epi-2.30/man/addCov.Lexis.Rd --- r-cran-epi-2.19/man/addCov.Lexis.Rd 2017-07-06 10:11:11.000000000 +0000 +++ r-cran-epi-2.30/man/addCov.Lexis.Rd 2017-10-29 15:12:32.000000000 +0000 @@ -25,7 +25,7 @@ A Lexis object with follow-up of a cohort. } \item{clin}{ - A data frame with the covariates to add (typically clinical + A data frame with covariates to add (typically clinical measurements). Must contain a variable \code{lex.id} identifying the persons represented in \code{Lx}, as well as a variable with the same name as one of the \code{\link{timeScales}} in \code{Lx}, @@ -42,27 +42,25 @@ name indicating the time at which the covariate measurements were taken. } - \item{exnam}{ + \item{exnam}{ Character. Name of the variable in \code{clin} with the examination names (such as \code{wave1}, \code{wave2} etc.). Values may not be repeated within person. Will be carried over to the resulting - \code{Lexis} object. 
If there is no variable of this name in - \code{clin} it will be constructed; if argument omitted, a variable - called \code{exnam} with values \code{ex.1}, \code{ex.2} etc. will - be constructed. + \code{Lexis} object. If there is no such variable in \code{clin} it + will be constructed; if the argument is omitted, a variable called + \code{exnam} with values \code{ex1}, \code{ex2} etc. will be + constructed. } \item{tfc}{ Character (\code{t}ime \code{f}rom \code{c}ovariate). Name of the variable in the result which will contain the time since the most - recent covariate date. This is not a time scale as it is reset to - 0 at each new covariate time. Also note that by this very token, - this variable will be meaningless if you \code{splitLexis} - \emph{after} using \code{addCov.Lexis}. + recent covariate date. If the argument is omitted a variable called + \code{tfc} will be constructed. If \code{addScales} is \code{TRUE} + this is included among the time scales. } \item{addScales}{ - Logical. Should timescales representing time since each covariate - time be added? They will be named \code{paste("tf",exnam)}. Not - implemented, argument currently ignored. + Logical. Should timescales representing time since each examination time + time be added? They will be named \code{paste("tf",exnam)}. 
}
  }
}
\value{
@@ -111,21 +109,35 @@
 clin <- data.frame( lex.id = c(1,1,3,2),
                        per = c(1977.3,1971.7,1996.2,1990.6),
                         bp = c(120,140,160,157),
-                      chol = c(5,7,8,9) )
+                      chol = c(5,7,8,9),
+                      xnam = c("X2","X1","X1","X2") )
 Lx
 clin
-# Works with time split BEFORE adding clinical data:
+# Different behaviours using exnam and addScales
+addCov.Lexis( Lx, clin )
+addCov.Lexis( Lx, clin, exnam="xnam" )
+addCov.Lexis( Lx, clin, addScales=TRUE )
+addCov.Lexis( Lx, clin, addScales=TRUE, exnam="xnam" )
+
+# Works with time split BEFORE
 Lb <- addCov.Lexis( splitLexis( Lx,
                                 time.scale="age",
                                 breaks=seq(0,80,5) ),
                     clin,
                     exnam="clX" )
 Lb
-# With time split AFTER adding clinincal data, variable tfc is largely meaningless:
+# ...or AFTER, but only if addScales is given
+La <- splitLexis( addCov.Lexis( Lx,
+                                clin,
+                                exnam = "xnam" ),
+                  breaks=seq(0,80,5),
+                  time.scale="age" )
+La
 La <- splitLexis( addCov.Lexis( Lx,
-                                clin,
-                                exnam="clX" ),
+                                clin,
+                                exnam = "xnam",
+                                addScales = TRUE ),
                   breaks=seq(0,80,5),
                   time.scale="age" )
 La
diff -Nru r-cran-epi-2.19/man/apc.fit.Rd r-cran-epi-2.30/man/apc.fit.Rd
--- r-cran-epi-2.19/man/apc.fit.Rd	2017-02-18 02:56:16.000000000 +0000
+++ r-cran-epi-2.30/man/apc.fit.Rd	2018-05-03 12:43:43.000000000 +0000
@@ -181,27 +181,29 @@
  time of observation in the subset. This is essential since
  \code{A} and \code{P} are used as quantitative variables in the
  models.
-  This is a different approach relative to the vast majority of the
-  uses of APC-models in the literature where a factor model is used for
-  age, perido and cohort effects. The latter can be obtained by using
+  This approach is different from the vast majority of the uses of
+  APC-models in the literature where a factor model is used for age,
+  period and cohort effects. The latter can be obtained by using
  \code{model="factor"}.
}
\references{
  The considerations behind the parametrizations used in this function
-  are given in details in:
+  are given in detail in:
  B.
Carstensen: Age-Period-Cohort models for the Lexis diagram.
  Statistics in Medicine, 10; 26(15):3018-45, 2007.
-  Various links to course material etc. is available through \url{http://BendixCarstensen.com/APC}
+  Various links to course material etc. are available through
+  \url{http://BendixCarstensen.com/APC}
}
\author{
  Bendix Carstensen, \url{http://BendixCarstensen.com}
}
\seealso{
-  \code{\link{LCa.fit}}, \code{\link{apc.frame}}, \code{\link{apc.lines}},
-  \code{\link{apc.plot}}.
+  \code{\link{apc.plot}},
+  \code{\link{LCa.fit}},
+  \code{\link{apc.LCa}}.
}
\examples{
library( Epi )
diff -Nru r-cran-epi-2.19/man/apc.frame.Rd r-cran-epi-2.30/man/apc.frame.Rd
--- r-cran-epi-2.19/man/apc.frame.Rd	2016-11-10 13:29:18.000000000 +0000
+++ r-cran-epi-2.30/man/apc.frame.Rd	2018-05-13 19:06:10.000000000 +0000
@@ -109,6 +109,8 @@
 points( 50, 10, pch=16, cex=2, col="blue" )
 # How to plot in the cohort-period-part: a point at (1960,0.3)
 points( 1960-res[1], 0.3*res[2], pch=16, cex=2, col="red" )
+# or referring to the period-cohort part of the plot
+pc.points( 1960, 0.3, pch=16, cex=1, col="green" )
 }
 \seealso{
 \code{\link{apc.lines},\link{apc.fit}}
}
diff -Nru r-cran-epi-2.19/man/apc.LCa.Rd r-cran-epi-2.30/man/apc.LCa.Rd
--- r-cran-epi-2.19/man/apc.LCa.Rd	2016-11-10 13:29:18.000000000 +0000
+++ r-cran-epi-2.30/man/apc.LCa.Rd	2018-04-02 11:22:16.000000000 +0000
@@ -7,7 +7,7 @@
 \description{
 \code{apc.LCa} fits an Age-Period-Cohort model and sub-models (using
 \code{\link{apc.fit}}) as well as Lee-Carter models (using
-\code{\link{LCa.fit}}). \code{boxes.apc.LCa} plots the models in little
+\code{\link{LCa.fit}}). \code{show.apc.LCa} plots the models in little
 boxes with their residual deviance with arrows showing their
 relationships.
 }
@@ -26,7 +26,8 @@
  \item{keep.models}{Logical. Should the \code{apc} object and the 5
    \code{LCa} objects be returned too?
  }
-  \item{...}{Further parameters passed on to \code{\link{LCa.fit}} or \code{\link{boxes.matrix}}.
+  \item{...}{Further parameters passed on to \code{\link{LCa.fit}} or
+    \code{\link{boxes.matrix}}.
  }
  \item{x}{The result from a call to \code{apc.LCa}.}
  \item{dev.scale}{Should the vertical position of the boxes with the
@@ -80,10 +81,10 @@
 show.apc.LCa( al, top="AP" )
 
 # Fit a reasonable model to Danish mortality data and plot results
-mACa <- LCa.fit( mdk, model="ACa", npar=c(15,15,20,6,6), c.ref=1930,
+mAPa <- LCa.fit( mdk, model="APa", npar=c(15,15,20,6,6), c.ref=1930,
                  a.ref=70, quiet=FALSE, maxit=250 )
 par( mfrow=c(1,3) )
-plot( mACa )}
+plot( mAPa ) }
}
\keyword{regression}
\keyword{models}
diff -Nru r-cran-epi-2.19/man/bootLexis.Rd r-cran-epi-2.30/man/bootLexis.Rd
--- r-cran-epi-2.19/man/bootLexis.Rd	1970-01-01 00:00:00.000000000 +0000
+++ r-cran-epi-2.30/man/bootLexis.Rd	2018-03-08 14:09:04.000000000 +0000
@@ -0,0 +1,79 @@
+\name{bootLexis}
+\alias{nid}
+\alias{nid.Lexis}
+\alias{bootLexis}
+\title{
+Create a bootstrap sample of persons (\code{lex.id}) from a Lexis object
+}
+\description{
+\code{lex.id} is the identifier of persons in a \code{\link{Lexis}}
+object. This is used to sample persons from a Lexis object. If a person
+is sampled, all records from this person are transported to the
+bootstrap sample.
+}
+\usage{
+nid( Lx, \dots )
+\method{nid}{Lexis}( Lx, by=NULL, \dots )
+bootLexis( Lx, size = NULL, by = NULL )
+}
+\arguments{
+  \item{Lx}{A \code{Lexis} object.}
+  \item{\dots}{Parameters passed on to other methods.}
+  \item{size}{Numeric. How many persons should be sampled from the
+    \code{Lexis} object. Defaults to the number of persons in
+    \code{Lx}, or, if \code{by} is given, to the number of persons in
+    each level of \code{by}. If \code{by} is given, \code{size} can have
+    one element per level of \code{by}, to indicate how many are sampled
+    from each level of \code{by}.}
+  \item{by}{Character. Name of a variable (converted to factor) in the
+    \code{Lexis} object.
+
+    Bootstrap sampling is done within each level of \code{by}.
+ + Calculation of the number of persons (\code{lex.id}) is done within + each level of \code{by}, and a vector returned. + } +} +\value{ + A Lexis object of the same structure as the input, with \emph{persons} + bootstrapped. The variable \code{lex.id} has values + 1,2,...,\code{nid(Lx)}. + + \code{nid} counts the number of persons in a Lexis object, possibly by + \code{by}. If \code{by} is given, a named vector is returned. +} +\author{Bendix Carstensen, \url{BendixCarstensen.com}.} +\seealso{\code{\link{Relevel.Lexis}}} +\examples{ +# A small bogus cohort +xcoh <- data.frame( id = c("A", "B", "C"), + birth = c("14/07/1952", "01/04/1954", "10/06/1987"), + entry = c("04/08/1965", "08/09/1972", "23/12/1991"), + exit = c("27/06/1997", "23/05/1995", "24/07/1998"), + fail = c(1, 0, 1), + sex = c("M","F","M") ) +xcoh <- data.frame( id = c("A", "B", "C"), + birth = as.Date(c("14/07/1952","01/04/1954","10/06/1987"),format="\%d/\%m/\%Y"), + entry = as.Date(c("04/08/1965","08/09/1972","23/12/1991"),format="\%d/\%m/\%Y"), + exit = as.Date(c("27/06/1997","23/05/1995","24/07/1998"),format="\%d/\%m/\%Y"), + fail = c(1, 0, 1), + sex = c("M","F","M") ) +xcoh <- cal.yr( xcoh ) +Lcoh <- Lexis( entry = list( per=entry ), + exit = list( per=exit, age=exit-birth ), + exit.status = fail, + data = xcoh ) +Lx <- Lcoh[sample(1:3,10,replace=TRUE),] +Lx$per <- Lx$per + runif(10,0,10) +Lx$lex.id <- 1:10 +Lx <- splitLexis( Lx, breaks=0:10*10, "age" ) +Lx +nid( Lx ) +nid( Lx, by="sex" ) +bootLexis( Lx ) +nid( bootLexis( Lx, size=7 ) ) +Li <- bootLexis( Lx, by="id" ) # superfluous +L2 <- bootLexis( Lx, by="sex", size=c(2,5) ) +nid( L2, by="sex" ) +} +\keyword{manip} diff -Nru r-cran-epi-2.19/man/BrCa.Rd r-cran-epi-2.30/man/BrCa.Rd --- r-cran-epi-2.19/man/BrCa.Rd 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/man/BrCa.Rd 2018-03-11 15:55:03.000000000 +0000 @@ -0,0 +1,56 @@ +\name{BrCa} +\alias{BrCa} +\docType{data} +\title{Clinical status, +relapse, metastasis and death in 2982 women 
with breast cancer.
+}
+\description{
+This dataset is a transformation of the example dataset used by Crowther
+and Lambert in their multistate paper.}
+\usage{data(BrCa)}
+\format{
+  A data frame with 2982 observations on the following 17 variables:
+  \describe{
+    \item{\code{pid}}{Person-id; numeric}
+    \item{\code{year}}{Calendar year of diagnosis}
+    \item{\code{age}}{Age at diagnosis}
+    \item{\code{meno}}{Menopausal status; a factor with levels \code{pre} \code{post}}
+    \item{\code{size}}{Tumour size; a factor with levels \code{<=20 mm} \code{>20-50 mm} \code{>50 mm}}
+    \item{\code{grade}}{Tumour grade; a factor with levels \code{2} \code{3}}
+    \item{\code{nodes}}{Number of positive lymph nodes, a numeric vector}
+    \item{\code{pr}}{Progesterone receptor level}
+    \item{\code{pr.tr}}{Transformed progesterone level}
+    \item{\code{er}}{Estrogen receptor level}
+    \item{\code{hormon}}{Hormone therapy at diagnosis; a factor with levels \code{no} \code{yes}}
+    \item{\code{chemo}}{Chemotherapy treatment; a factor with levels \code{no} \code{yes}}
+    \item{\code{tor}}{Time of relapse, years since diagnosis}
+    \item{\code{tom}}{Time of metastasis, years since diagnosis}
+    \item{\code{tod}}{Time of death, years since diagnosis}
+    \item{\code{tox}}{Time of exit from study, years since diagnosis}
+    \item{\code{xst}}{Vital status at exit; a factor with levels \code{Alive} \code{Dead}}
+  }
+}
+\details{
+The dataset has been modified to contain the times (since diagnosis) of the events of
+interest, to comply with the usual structure of data.
+}
+\source{
+  The original data were extracted from:
+  \url{http://fmwww.bc.edu/repec/bocode/m/multistate_example.dta}, this
+  is a modified representation of the same amount of information.
+} +\references{ + The data were used as example in the paper by Crowther and + Lambert: Parametric multistate survival models: Flexible + modelling allowing transition-specific distributions with application + to estimating clinically useful measures of effect differences; Stat + Med 36 (29), pp 4719-4742, 2017. (No, it is not the paper, just the + title.) + + A parallel analysis using the \code{\link{Lexis}} machinery is available + as: \url{http://bendixcarstensen.com/AdvCoh/papers/bcMS.pdf} + } +\examples{ +data(BrCa) +} +\keyword{datasets} diff -Nru r-cran-epi-2.19/man/ci.lin.Rd r-cran-epi-2.30/man/ci.lin.Rd --- r-cran-epi-2.19/man/ci.lin.Rd 2017-05-20 17:31:36.000000000 +0000 +++ r-cran-epi-2.30/man/ci.lin.Rd 2018-04-26 17:10:42.000000000 +0000 @@ -19,6 +19,7 @@ ctr.mat = NULL, subset = NULL, subint = NULL, + xvars = NULL, diffs = FALSE, fnam = !diffs, vcov = FALSE, @@ -26,7 +27,7 @@ df = Inf, Exp = FALSE, sample = FALSE ) -ci.exp( ..., Exp = TRUE, pval=FALSE ) +ci.exp( ..., Exp = TRUE, pval = FALSE ) Wald( obj, H0=0, ... ) ci.mat( alpha = 0.05, df = Inf ) ci.pred( obj, newdata, @@ -40,9 +41,9 @@ pval = FALSE ) } \arguments{ - \item{obj}{A model object (of class + \item{obj}{A model object (in general of class \code{glm}, but for + \code{ci.lin} and \code{ci.exp} it may also be of class \code{lm}, - \code{glm}, \code{coxph}, \code{survreg}, \code{clogistic}, @@ -58,16 +59,28 @@ \code{polr}, or \code{rq}). } - \item{ctr.mat}{Contrast matrix to be multiplied to the parameter - vector, i.e. the desired linear function of the parameters.} + \item{ctr.mat}{Matrix, data frame or list (of two data frames). + + If it is a matrix, it should be a contrast matrix to be multiplied + to the parameter vector, i.e. the desired linear function of the + parameters. + + If it is a data.frame it should have columns corresponding to a + prediction frame, see details. 
+
+    If it is a list, it must contain two data frames that are (possibly
+    partial) prediction frames for \code{obj}, see \code{xvars} and details.}
+  \item{xvars}{Character vector. If variables in the model are omitted
+    from data frames supplied in a list to \code{ctr.mat}, they should
+    be listed here. Omitted factors need not be mentioned here.}
  \item{subset}{The subset of the parameters to be used. If given as a
    character vector, the elements are in turn matched against the
    parameter names (using \code{grep}) to find the subset. Repeat
    parameters may result from using a character vector. This is
    considered a facility.}
-  \item{subint}{\code{sub}set selection like for \code{subset}, except that
-    elements of a character vector given as argument will be used to
-    select a number of subsets of parameters and only the \code{int}ersection
+  \item{subint}{Character. \code{sub}set selection, but where each
+    element of the character vector is used to
+    select a subset of parameters and only the \code{int}ersection
    of these is returned.}
  \item{diffs}{If TRUE, all differences between parameters in the subset
    are computed. \code{ctr.mat} is ignored. If \code{obj}
@@ -84,19 +97,22 @@
  \item{alpha}{Significance level for the confidence intervals.}
  \item{df}{Integer. Number of degrees of freedom in the t-distribution used
    to compute the quantiles used to construct the confidence intervals.}
-  \item{Exp}{If \code{TRUE} columns 5:6 are replaced with exp( columns
-    1,5,6 ). For \code{ci.pred} it indicates whether the predictions
-    should be exponentiated - the default is to make a prediction on the
-    scale of the linear predictor and transform it by the inverse link
-    function; if FALSE, the prediction on the link scale is returned.}
+  \item{Exp}{For \code{ci.lin}, if \code{TRUE} columns 5:6 are replaced
+    with exp( columns 1,5,6 ). For \code{ci.exp}, if \code{FALSE}, the
+    untransformed parameters are returned.
For \code{ci.pred} it + indicates whether the predictions should be exponentiated - the + default (\code{Exp=NULL}) is to make a prediction with a Wald CI on + the scale of the linear predictor and back-transform it by the + inverse link function; if \code{FALSE}, the prediction on the link + scale is returned.} \item{sample}{Logical or numerical. If \code{TRUE} or numerical a sample of size \code{as.numeric(sample)} is drawn from the multivariate normal with mean equal to the (\code{subset} defined) coefficients and variance equal to the estimated variance-covariance of these. These are then transformed by \code{ctr.mat} and returned.} - \item{pval}{Logical. Should a column of P-values be included with the estimates - and confidence intervals output by \code{ci.exp}.} + \item{pval}{Logical. Should a column of P-values be included with the + estimates and confidence intervals output by \code{ci.exp}.} \item{H0}{Numeric. The null values for the selected/transformed parameters to be tested by a Wald test. Must have the same length as the selected parameter vector.} @@ -111,20 +127,53 @@ \code{r2} represent log-rates with confidence intervals.} } \value{ + \code{ci.lin} returns a matrix with number of rows and row names as \code{ctr.mat}. The columns are Estimate, Std.Err, z, P, 2.5\% and 97.5\% (or according to the value of \code{alpha}). If - \code{vcov=TRUE} a list with components \code{est}, the - desired functional of the parameters and \code{vcov}, the variance - covariance matrix of this, is returned but not printed. If - \code{Exp==TRUE} the confidence intervals for the parameters are - replaced with three columns: exp(estimate,c.i.). + \code{vcov=TRUE} a list of length 2 with components \code{coef} (a + vector), the desired functional of the parameters and \code{vcov} (a + square matrix), the variance covariance matrix of this, is returned + but not printed. 
If \code{Exp==TRUE} the confidence intervals for the
+  parameters are replaced with three columns: exp(estimate,c.i.).
 
  \code{ci.exp} returns only the exponentiated parameter estimates with
  confidence intervals. It is merely a wrapper for \code{ci.lin},
  fishing out the last 3 columns from \code{ci.lin(...,Exp=TRUE)}. If
  you just want the estimates and confidence limits, but not
  exponentiated, use \code{ci.exp(...,Exp=FALSE)}.
+
+  If \code{ctr.mat} is a data frame, the model matrix corresponding to
+  this is constructed and supplied, so the default behaviour will be to
+  produce the same as \code{ci.pred}, apparently superfluous. The purpose
+  of this is to allow the use of the arguments \code{vcov} that produces
+  the variance-covariance matrix of the predictions, and \code{sample}
+  that produces a sample of predictions using sampling from the
+  multivariate normal with mean equal to the parameters and variance
+  equal to the estimated variance-covariance of these.
+
+  If \code{ctr.mat} is a list of two data frames, the difference of the
+  predictions from using the first versus the last as newdata arguments
+  to predict is computed. Columns that are identical in the two data
+  frames can be omitted (see example), but names of numerical variables
+  omitted must be supplied in a character vector \code{xvars}. Factors
+  omitted need not be named. If the second data frame has only one row,
+  this is replicated to match the number of rows in the first. The
+  facility is primarily aimed at teasing out RRs that are non-linear
+  functions of a quantitative variable without setting up contrast
+  matrices using the same code as in the model.
+
+  Finally, only arguments \code{Exp}, \code{vcov}, \code{alpha} and
+  \code{sample} from \code{ci.lin} are honored when \code{ctr.mat} is a
+  data frame or a list of two data frames.
+
+  You can leave out variables (columns) from the two data frames that
+  would be constant and identical, basically variables not relevant for
+  the calculation of the contrast.
In many cases \code{ci.lin} (really
+  \code{Epi:::ci.dfr}) can figure out the names of the omitted columns,
+  but occasionally you will have to supply the names of the omitted
+  variables in the \code{xvars} argument. Factors omitted need not be
+  listed in \code{xvars}, though no harm is done doing so.
 
  \code{Wald} computes a Wald test for a subset of (possibly linear
  combinations of) parameters being equal to the vector of null
@@ -158,11 +207,13 @@
 }
 \author{
  Bendix Carstensen,
-  \url{BendixCarstensen.com} &
-  Michael Hills
+  \url{BendixCarstensen.com} & Michael Hills
}
\seealso{See also \code{\link{ci.cum}} for a function computing
-  cumulative sums of (functions of) parameter estimates.}
+  cumulative sums of (functions of) parameter estimates. The example
+  code for \code{\link{matshade}} has an application of predicting a
+  rate-ratio using a list of two prediction frames in the \code{ctr.mat}
+  argument.}
\examples{
# Bogus data:
f <- factor( sample( letters[1:5], 200, replace=TRUE ) )
@@ -181,6 +232,26 @@
 # Use character defined subset to get ALL contrasts:
 ci.lin( mm, subset="f", diff=TRUE )
+# Suppose the x-effect differs across levels of g:
+mi <- update( mm, . ~ . + g:x )
+ci.lin( mi )
+# RR a vs. b by x:
+nda <- data.frame( x=-3:3, g="a", f="b" )
+ndb <- data.frame( x=-3:3, g="b", f="b" )
+#
+ci.lin( mi, list(nda,ndb) )
+# Same result if the f column is omitted, because the "f" columns are identical
+ci.lin( mi, list(nda[,-3],ndb[,-3]) )
+# However, this crashes if knots for a spline are supplied and a non-factor is omitted
+xk <- -1:1
+xi <- c(-0.5,0.5)
+ww <- rnorm(200)
+mi <- update( mm, . ~ .
-x +ww + Ns(x,knots=xk) + g:Ns(x,knots=xi) )
+# Will crash
+try( cbind( nda$x, ci.lin( mi, list(nda,ndb) ) ) )
+# Must specify num vars (not factors) omitted from nda, ndb
+cbind( nda$x, ci.lin( mi, list(nda,ndb), xvars="ww" ) )
+
 # A Wald test of whether the g-parameters are 0
 Wald( mm, subset="g" )
 # Wald test of whether the three first f-parameters are equal:
@@ -190,9 +261,58 @@
 ( CM <- rbind( c(1,-1,0,0),
                c(0,1,-1,0)) )
 Wald( mm, subset="f", ctr.mat=CM )
-# Confidnece intervas for ratio of rates
+# Confidence intervals for ratio of rates
+# Rates and ci supplied, but only the range (lower and upper ci) is used
 ci.ratio( cbind(10,8,12.5), cbind(5,4,6.25) )
 ci.ratio( cbind(8,12.5), cbind(4,6.25) )
+
+# Beware of the offset when making predictions with ci.pred and ci.exp
+\dontrun{
+library( mgcv )
+data( mortDK )
+m.arg  <- glm( dt ~ age , offset=log(risk) , family=poisson, data=mortDK )
+m.form <- glm( dt ~ age + offset(log(risk)), family=poisson, data=mortDK )
+a.arg  <- gam( dt ~ age , offset=log(risk) , family=poisson, data=mortDK )
+a.form <- gam( dt ~ age + offset(log(risk)), family=poisson, data=mortDK )
+
+nd <- data.frame( age=60:65, risk=100 )
+round( ci.pred( m.arg , nd ), 4 )
+round( ci.pred( m.form, nd ), 4 )
+round( ci.exp ( m.arg , nd ), 4 )
+round( ci.exp ( m.form, nd ), 4 )
+round( ci.pred( a.arg , nd ), 4 )
+round( ci.pred( a.form, nd ), 4 )
+round( ci.exp ( a.arg , nd ), 4 )
+round( ci.exp ( a.form, nd ), 4 )
+
+nd <- data.frame( age=60:65 )
+try( ci.pred( m.arg , nd ) )
+try( ci.pred( m.form, nd ) )
+try( ci.exp ( m.arg , nd ) )
+try( ci.exp ( m.form, nd ) )
+try( ci.pred( a.arg , nd ) )
+try( ci.pred( a.form, nd ) )
+try( ci.exp ( a.arg , nd ) )
+try( ci.exp ( a.form, nd ) )
+}
+# The offset may be given as an argument (offset=log(risk))
+# or as a term (+offset(log(risk))), and depending on whether we are using a
+# glm or a gam Poisson model and whether we use ci.pred or ci.exp to
+# predict rates the offset is either used or ignored and either
+#
required or not; the state of affairs can be summarized as:
+#
+#            offset
+#            -------------------------------------
+#            usage                      required?
+#            ------------------         ---------------
+# function   model  argument  term      argument  term
+# ---------------------------------------------------------
+# ci.pred    glm     used     used       yes       yes
+#            gam    ignored   used       no        yes
+#
+# ci.exp     glm    ignored  ignored     no        yes
+#            gam    ignored  ignored     no        yes
+# ---------------------------------------------------------
}
\keyword{models}
\keyword{regression}
diff -Nru r-cran-epi-2.19/man/detrend.Rd r-cran-epi-2.30/man/detrend.Rd
--- r-cran-epi-2.19/man/detrend.Rd	2016-11-10 13:29:18.000000000 +0000
+++ r-cran-epi-2.30/man/detrend.Rd	2018-05-13 18:41:13.000000000 +0000
@@ -1,34 +1,40 @@
 \name{detrend}
 \alias{detrend}
+\alias{decurve}
 \title{
  Projection of a model matrix on to the orthogonal
-  complement of a trend. }
+  complement of a trend or curvature.}
\description{
-  The columns of the model matrix \code{M} is projected on the
-  orthogonal complement to the matrix \code{(1,t)}. Orthogonality
-  is defined w.r.t. an inner product defined by the weights \code{weight}.
+  The columns of a model matrix \code{M} are projected on the
+  orthogonal complement to the matrix \code{(1,t)},
+  resp. \code{(1,t,t^2)}.
+
+  Orthogonality is defined w.r.t. an inner product defined by the
+  matrix \code{diag(weight)}.
}
\usage{
detrend( M, t, weight = rep(1, nrow(M)) )
+ decurve( M, t, weight = rep(1, nrow(M)) )
}
\arguments{
-  \item{M}{ A model matrix. }
-  \item{t}{ The trend defining a subspace. A numerical vector of length
+  \item{M}{A model matrix.
}
+  \item{t}{The trend defining a subspace. A numerical vector of length
    \code{nrow(M)}
  }
  \item{weight}{ Weights defining the inner product of vectors
    \code{x} and \code{y} as \code{sum(x*w*y)}.
    A numerical vector of length \code{nrow(M)}, defaults to a vector of
-    \code{1}s.}
+    \code{1}s. Must be all non-negative.}
}
\details{
-  The functions is intended to be used in parametrization of
+  The functions are intended to be used in parametrization of
  age-period-cohort models.
}
\value{
-  A full-rank matrix with columns orthogonal to \code{(1,t)}.
+  A full-rank matrix with columns orthogonal to \code{(1,t)}, for
+  \code{decurve}, \code{(1,t,t^2)}.
}
\author{
  Bendix Carstensen, Steno Diabetes Center,
-  \url{http://BendixCarstensen.com}, with help from Peter Dalgaard.
+  \url{http://BendixCarstensen.com}, with essential help from Peter Dalgaard.
}
\seealso{
  \code{\link{projection.ip}}
}
\keyword{array}
diff -Nru r-cran-epi-2.19/man/DMepi.Rd r-cran-epi-2.30/man/DMepi.Rd
--- r-cran-epi-2.19/man/DMepi.Rd	2017-02-17 01:56:41.000000000 +0000
+++ r-cran-epi-2.30/man/DMepi.Rd	2018-03-05 00:07:37.000000000 +0000
@@ -37,5 +37,41 @@
                    data=DMepi ),
              2 ),
       row.vars = 1 ) )
+
+# Model for age-specific incidence rates:
+inc <- glm( X ~ sex + Ns( A, knots=seq(30,80,10) ) + P,
+            offset = log(Y.nD),
+            family = poisson,
+              data = DMepi )
+
+# Predict for men and women separately in 2010:
+ndm <- data.frame( sex="M", A=20:90, P=2010, Y.nD=1000 )
+ndf <- data.frame( sex="F", A=20:90, P=2010, Y.nD=1000 )
+prM <- ci.pred( inc, ndm )
+prF <- ci.pred( inc, ndf )
+matplot( ndm$A, cbind(prM,prF),
+         type="l", lty=1, lwd=c(3,1,1),
+         col=rep(c("blue","red"),each=3),
+         log="y", xlab="Age", ylab="DM incidence per 1000 PY" )
+
+# This is a proportional hazards model - add sex-age interaction
+int <- update( inc, . ~ .
+ sex:Ns( A, knots=seq(30,80,10) ) )
+prM <- ci.pred( int, ndm )
+prF <- ci.pred( int, ndf )
+matplot( ndm$A, cbind(prM,prF),
+         type="l", lty=1, lwd=c(3,1,1),
+         col=rep(c("blue","red"),each=3),
+         log="y", xlab="Age", ylab="DM incidence per 1000 PY" )
+
+# The rate-ratio is teased out using ci.exp:
+RRp <- ci.exp( inc, list(ndm,ndf) )
+RRi <- ci.exp( int, list(ndm,ndf) )
+
+# and added to the plot
+matlines( ndm$A, cbind(RRi,RRp),
+          type="l", lty=1, lwd=c(3,1,1), col=gray(rep(c(0.3,0.7),each=3)) )
+abline(h=1)
+axis(side=4)
+mtext( "Male/Female IRR", side=4, line=2 )
}
\keyword{datasets}
diff -Nru r-cran-epi-2.19/man/erl.Rd r-cran-epi-2.30/man/erl.Rd
--- r-cran-epi-2.19/man/erl.Rd	2017-04-14 10:14:21.000000000 +0000
+++ r-cran-epi-2.30/man/erl.Rd	2018-03-17 12:51:36.000000000 +0000
@@ -92,9 +92,14 @@
  (with presumably higher mortality rates) are assumed not to
  occur. This is a slightly peculiar assumption (but presumably the most
  used in the epidemiological literature) and the resulting object is
-  therefore given an attribute, \code{NOTE}, that point this out. The
-  default of the \code{surv2} function is to take the possibility of
-  disease into account in order to potentially rectify this.}
+  therefore given an attribute, \code{NOTE}, that points this out.
+
+  If however \code{muW} is the total mortality in the population
+  (including the diseased) the result is a good approximation to the
+  correct YLL.
+
+  The default of the \code{surv2} function is to take the possibility of
+  disease into account.}
\value{\code{surv1} and \code{surv2}
  return a matrix whose first column is the ages at the ends of the
diff -Nru r-cran-epi-2.19/man/foreign.Lexis.Rd r-cran-epi-2.30/man/foreign.Lexis.Rd
--- r-cran-epi-2.19/man/foreign.Lexis.Rd	2016-11-10 13:29:18.000000000 +0000
+++ r-cran-epi-2.30/man/foreign.Lexis.Rd	2018-02-11 06:42:49.000000000 +0000
@@ -4,7 +4,7 @@
 \alias{msdata.Lexis}
 \alias{etm}
 \alias{etm.Lexis}
-\title{Create a data structures suitable for use with packages mstate, etm.
+\title{Create data structures suitable for use with packages \code{mstate} or \code{etm}.
}
\description{
  The \code{mstate} package requires input in the form of a stacked
@@ -21,17 +21,18 @@
\method{msdata}{Lexis}(obj,
        time.scale = timeScales(obj)[1], ... )
-\method{etm}{Lexis}( obj,
-             time.scale = timeScales(obj)[1],
-              cens.name = "cens",
-                      s = 0,
-                      t = "last",
-             covariance = TRUE,
-               delta.na = TRUE,
-                    ... )
+\method{etm}{Lexis}( data,
+             time.scale = timeScales(data)[1],
+              cens.name = "cens",
+                      s = 0,
+                      t = "last",
+             covariance = TRUE,
+               delta.na = TRUE,
+                    ... )
}
\arguments{
  \item{obj}{A \code{\link{Lexis}} object.}
+  \item{data}{A \code{\link{Lexis}} object.}
  \item{time.scale}{Name or number of timescale in the \code{Lexis}
    object.}
  \item{cens.name}{Name of the code for censoring used by \code{etm}. It
@@ -81,6 +82,7 @@
{
dmi <- subset(dmi,lex.id<1000)
etm.D <- etm.Lexis( dmi, time.scale=3 )
+str( etm.D )
plot( etm.D, col=rainbow(5), lwd=2, lty=1, xlab="DM duration" )
}
}
diff -Nru r-cran-epi-2.19/man/LCa.fit.Rd r-cran-epi-2.30/man/LCa.fit.Rd
--- r-cran-epi-2.19/man/LCa.fit.Rd	2017-08-08 12:14:39.000000000 +0000
+++ r-cran-epi-2.30/man/LCa.fit.Rd	2018-05-03 12:42:43.000000000 +0000
@@ -94,7 +94,8 @@
  number of events (\code{D}) between each, and half as many below the
  first and above the last knot. If \code{npar} is a list of scalars the
  behavior is the same.
If \code{npar} is a list of vectors, these are
-  taken as the knots for the natural splines.
+  taken as the knots for the natural splines. See details for naming
+  convention.
  }
  \item{VC}{
    Logical. Should the variance-covariance matrix of the parameters be
@@ -137,10 +138,21 @@
 
  The multiplicative age by period term could equally well have been a
  multiplicative age by cohort or even both. Thus the most extensive
-  model is:
+  model has 5 continuous functions:
 
-  \deqn{\log(\lambda(a,p)) = f(a) + b_p(a)k_p(a) + b_c(a)k_c(a)}{%
           log(lambda(a,p)) = f(a) + b_p(a)k_p(a) + b_c(a)k_c(a)}
+  \deqn{\log(\lambda(a,p)) = f(a) + b_p(a)k_p(p) + b_c(a)k_c(p-a)}{%
+        log( lambda(a,p)) = f(a) + b_p(a)k_p(p) + b_c(a)k_c(p-a)}
+
+  Each of these is fitted by a natural spline, with knots placed at the
+  quantiles of the events along the age (a), calendar time (p) and
+  cohort (p-a) scales, respectively. Alternatively the knots can be
+  specified explicitly in the argument \code{npar} as a named list, where
+  \code{a} refers to \eqn{f(a)}{f(a)},
+  \code{p} refers to \eqn{k_p(p)}{k_p(p)},
+  \code{c} refers to \eqn{k_c(p-a)}{k_c(p-a)},
+  \code{pi} (\code{p}eriod \code{i}nteraction) refers to \eqn{b_p(a)}{b_p(a)}
+  and
+  \code{ci} (\code{c}ohort \code{i}nteraction) refers to \eqn{b_c(a)}{b_c(a)}.
 
  The naming convention for the models is a capital \code{P} and/or
  \code{C} if the effect is in the model followed by a lower case
  models that can be fitted: \code{APa}, \code{ACa}, \code{APaC}
  \code{APCa} and \code{APaCa}.
 
-  The standard errors of the parameters from the two model fits are however
-  wrong; they are conditional on some of terms having a fixed value. And
-  the symbolic calculation of the Hessian is a nightmare, so this is done
-  numerically using the \code{hessian} function from the \code{numDeriv}
-  package if \code{VC=TRUE}.
+  The standard errors of the parameters from the two separate model fits
+  in the iterations are, however, wrong; they are conditional on a subset
+  of the parameters having a fixed value. Analytic calculation
+  of the Hessian is a bit of a nightmare, so this is done numerically
+  using the \code{hessian} function from the \code{numDeriv} package if
+  \code{VC=TRUE}.
 
  The coefficients and the variance-covariance matrix of these are used
-  in \code{predict.LCa} for a parametric bootstrap (that is, a simulation from
-  a multivariate normal with mean equal to the parameter estimates and
-  variance as the estimated variance-covariance) to get confidence
-  intervals for the predictions if \code{sim} is \code{TRUE} --- which
-  it is by default if they are part of the object.
+  in \code{predict.LCa} for a parametric bootstrap (that is, a
+  simulation from a multivariate normal with mean equal to the parameter
+  estimates and variance as the estimated variance-covariance) to get
+  confidence intervals for the predictions if \code{sim} is \code{TRUE}
+  --- which it is by default if they are part of the object.
 
  The \code{plot} for \code{LCa} objects merely produces between 3 and 5
  panels showing each of the terms in the model. These are mainly for
@@ -174,30 +187,31 @@
  \code{APCa} or \code{APaCa}, indicating the variable(s) interacting
  with age.}
\item{ax}{3-column matrix of age-effects, c.i. from the age-time
-    model. Rownames are the unique occurring ages in the
+    model. Row names are the unique occurring ages in the
    dataset. Estimates are rates.}
\item{pi}{3-column matrix of age-period interaction effects, c.i. from
  the age - model.
-    Rownames are the actually occurring ages in the
+    Row names are the actually occurring ages in the
+    dataset.
Estimates are multipliers of the log-RRs in \code{kp},
+    centered at 1 at \code{pi.ref}.}
\item{kp}{3-column matrix of period-effects, with c.i.s from the
-    age-time model. Rownames are the actually occurring times in the
+    age-time model. Row names are the actually occurring times in the
    dataset. Estimates are rate-ratios centered at 1 at \code{p.ref}.}
\item{ci}{3-column matrix of age-cohort interaction effects, c.i. from
  the age
-    model. Rownames are the actually occurring ages in the
-    dataset. Estimates are multipliers of the log-RRs in \code{kt},
+    model. Row names are the actually occurring ages in the
+    dataset. Estimates are multipliers of the log-RRs in \code{kc},
    centered at 1 at \code{ci.ref}.}
-\item{kc}{3-column matrix of period-effects, with c.i.s from the age-time
-    model. Rownames are the actually occurring times in the
-    dataset. Estimates are rate-ratios centered at 1 at \code{p.ref}.}
-\item{mod.at}{\code{glm} object with the final age-time model. Gives
-    the same fit as the \code{mod.b} model.}
-\item{mod.b}{\code{glm} object with the final age model. Gives
-    the same fit as the \code{mod.at} model.}
+\item{kc}{3-column matrix of cohort-effects, with c.i.s from the age-time
+    model. Row names are the actually occurring cohorts in the
+    dataset. Estimates are rate-ratios centered at 1 at \code{c.ref}.}
+\item{mod.at}{\code{glm} object with the final age-time model --- estimates
+    the terms \code{ax}, \code{kp}, \code{kc}. Gives
+    the same fit as the \code{mod.b} model after convergence.}
+\item{mod.b}{\code{glm} object with the final age model --- estimates
+    the terms \code{pi}, \code{ci}. Gives
+    the same fit as the \code{mod.at} model after convergence.}
\item{coef}{All coefficients from both models, in the order \code{ax},
  \code{kp}, \code{kc}, \code{pi}, \code{ci}.
Only present if - \code{LCa.fit} were called with - \code{VC=TRUE} (the default).} + \code{LCa.fit} were called with \code{VC=TRUE} (the default).} \item{vcov}{Variance-covariance matrix of coefficients from both models, in the same order as in the \code{coef}. Only present if \code{LCa.fit} were called with \code{VC=TRUE}.} @@ -214,9 +228,9 @@ (\code{kt}). For the \code{APaCa} model 5 panels are plotted. \code{summary.LCa} returns (invisibly) a matrix with the parameters - from the models and a column of the conditional se.s and of the se.s - derived from the numerically computed Hessian (if \code{LCa.fit} were - called with \code{VC=TRUE}.) + from the models and a column of the conditional se.s and additionally + of the se.s derived from the numerically computed Hessian (if + \code{LCa.fit} were called with \code{VC=TRUE}.) \code{predict.LCa} returns a matrix with one row per row in \code{newdata}. If \code{LCa.fit} were called with \code{VC=TRUE} @@ -224,20 +238,21 @@ based on a simulation of parameters from a multivariate normal with mean \code{coef} and variance \code{vcov} using the median and \code{alpha}/2 quantiles from the \code{sim} simulations. If - \code{LCa.fit} were called with \code{VC=FALSE} there - will be 6 columns, namely estimates and c.i.s from age-time model (\code{mod.at}), and - from the age-interaction model (\code{mod.b}), both using conditional variances. + \code{LCa.fit} were called with \code{VC=FALSE} there will be 6 + columns, namely estimates and c.i.s from age-time model + (\code{mod.at}), and from the age-interaction model (\code{mod.b}), + both using conditional variances, and therefore likely with too narrow + confidence limits. 
}
\author{
  Bendix Carstensen, \url{http://BendixCarstensen.com}

-  This function was conceived during a course on APC models at the Max
-  Planck Institute of Demographic Research (MPIDR,
+  This function was conceived while teaching a course on APC models at
+  the Max Planck Institute of Demographic Research (MPIDR,
  \url{https://www.demogr.mpg.de/en/}) in Rostock in May 2016
  (\url{http://bendixcarstensen.com/APC/MPIDR-2016/}), and finished
-  during a research stay there, kindly sponsored by the MPIDR.
-}
-
+  during a week-long research stay there, kindly sponsored by the MPIDR.
+  }
\seealso{
  \code{\link{apc.fit}},
  \code{\link{apc.LCa}},
@@ -266,18 +281,18 @@
par( mfrow=c(1,3) )
plot( LCa.tc )

-# Prediction data frame for ages 15 to 60 for three time points:
+# Prediction data frame for ages 15 to 60 for two time points:
nd <- data.frame( A=15:60 )
-
-p50 <- predict.LCa( LCa.tc, newdata=cbind(nd,P=1950), sim=10000 )
-p70 <- predict.LCa( LCa.tc, newdata=cbind(nd,P=1970), sim=10000 )
-p90 <- predict.LCa( LCa.tc, newdata=cbind(nd,P=1990), sim=10000 )
+# LCa predictions
+p70 <- predict.LCa( LCa.tc, newdata=cbind(nd,P=1970), sim=1000 )
+p90 <- predict.LCa( LCa.tc, newdata=cbind(nd,P=1990), sim=1000 )

# Inspect the curves from the parametric bootstrap (simulation):
par( mfrow=c(1,1) )
-matplot( nd$A, cbind(p50,p70,p90), type="l", lwd=c(6,3,3), lty=1,
-         col=rep( c("red","green","blue"), each=3 ), log="y",
-         ylab="Testis cancer incidence per 100,000 PY, 1970, 80, 90", xlab="Age" )
+head( cbind(p70,p90) )
+matplot( nd$A, cbind(p70,p90), type="l", lwd=c(6,3,3), lty=c(1,3,3),
+         col=rep( 2:3, each=3 ), log="y",
+         ylab="Testis cancer incidence per 100,000 PY in 1970 and
1990", xlab="Age" )
}
\keyword{models}
\keyword{regression}
diff -Nru r-cran-epi-2.19/man/lines.apc.Rd r-cran-epi-2.30/man/lines.apc.Rd
--- r-cran-epi-2.19/man/lines.apc.Rd	2016-11-10 13:29:18.000000000 +0000
+++ r-cran-epi-2.30/man/lines.apc.Rd	2018-05-13 19:15:11.000000000 +0000
@@ -82,13 +82,17 @@
     for plotting the three sets of curves.}
}
\details{
+  There is no difference between the functions \code{apc.lines} and
+  \code{lines.apc}, except that the latter is the \code{lines} method
+  for \code{apc} objects.
+
  The drawing of three effects in an APC-frame is a rather trivial
  task, and the main purpose of the utility is to provide a function
  that easily adds the functionality of adding a drift so that several
  sets of lines can be easily produced in the same frame.
}
\value{
-  \code{APC.lines} returns (invisibly) a list of three matrices of the
+  \code{apc.lines} returns (invisibly) a list of three matrices of the
  effects plotted.
}
\author{
diff -Nru r-cran-epi-2.19/man/matshade.Rd r-cran-epi-2.30/man/matshade.Rd
--- r-cran-epi-2.19/man/matshade.Rd	1970-01-01 00:00:00.000000000 +0000
+++ r-cran-epi-2.30/man/matshade.Rd	2018-05-15 07:49:55.000000000 +0000
@@ -0,0 +1,100 @@
+\name{matshade}
+\alias{matshade}
+\title{
+  Plot confidence intervals as shaded areas around lines.
+}
+\description{
+  Uses an x-vector and a matrix of 3*N columns with estimates and ci.s
+  to produce the lines of estimates and confidence intervals as shaded
+  areas in transparent colours around the lines of the estimates.
+}
+\usage{
+matshade( x, y, lty = 1,
+          col = 1:(ncol(y)/3), col.shade=col, alpha=0.15,
+          plot = dev.cur()==1,
+          ... )
+}
+\arguments{
+  \item{x}{Numerical vector. Unlike \code{\link{matplot}} this can only be a vector.
+}
+  \item{y}{A matrix with 3*N columns --- representing estimates and
+    confidence bounds for N curves. The order of columns is assumed to be
+    (est,lo,hi,est,lo,hi...) (or (est,hi,lo...)).
+}
+  \item{lty}{Line types for the curves.
+}
+  \item{col}{Color(s) of the estimated curves.
+}
+  \item{col.shade}{Color(s) of the shaded areas. These are the colors
+    that are made transparent by the \code{alpha} factor. Defaults to
+    the same colors as the lines.
+}
+  \item{alpha}{Number in [0,1] indicating the transparency of the colors for
+    the confidence intervals. Larger values make the shades
+    darker. Can be a vector which then applies to the curves in turn.
+}
+  \item{plot}{Logical. Should a new plot frame be started? If no device
+    is active, the default is to start one, and plot all \code{y}s versus
+    x in transparent color. On the rare occasion a device is open but
+    no plot has been called, you will get an error saying that plot.new
+    has not been called yet, in which case you should explicitly set
+    \code{plot} to \code{TRUE}.
+}
+  \item{\dots}{Arguments passed on to \code{\link{matplot}} (if
+    \code{plot=TRUE}) and \code{\link{matlines}} for use when plotting
+    the lines. Note that \code{lwd=0} will cause lines to be omitted and
+    only the shades be plotted.
+}
+}
+
+\details{All shaded areas are plotted first, the curves added
+  afterwards, so that lines are not 'overshadowed'.
+
+  Also note that if you repeat the same command, you will get the curves
+  and the shaded areas overplotted in the same frame, so the effect is to
+  have the shades darker, because the transparent colors are plotted on
+  top of those from the first command.
+}
+\value{NULL. Used for its side effects.
+}
+\author{Bendix Carstensen, \url{BendixCarstensen.com}
+}
+\examples{
+# Follow-up data of Danish DM patients
+data( DMlate )
+mL <- Lexis( entry=list(age=dodm-dobth,per=dodm),
+              exit=list(per=dox),
+       exit.status=factor(!is.na(dodth),labels=c("Alive","Dead")),
+              data=DMlate )
+# Split follow-up and model by splines
+sL <- splitLexis( mL, breaks=0:100, time.scale="age")
+\dontrun{
+# the same thing with popEpi
+sL <- splitMulti( mL, age=0:100 )
+}
+mort <- glm( (lex.Xst=="Dead") ~ sex*Ns(age,knots=c(15,3:8*10)),
+             offset = log(lex.dur),
+             family = poisson,
+               data = sL )
+\dontrun{
+# The counterpart with gam
+library( mgcv )
+mort <- gam( (lex.Xst=="Dead") ~ s(age,by=sex) + sex,
+             offset = log(lex.dur),
+             family = poisson,
+               data = sL )
+ }
+# predict rates (per 1000 PY) for men and women
+ndM <- data.frame( age=10:90, sex="M", lex.dur=1 )
+ndF <- data.frame( age=10:90, sex="F", lex.dur=1 )
+# gam objects ignore the offset in prediction
+prM <- ci.pred( mort, ndM )*1000
+prF <- ci.pred( mort, ndF )*1000
+# predict rate-ratio
+MFr <- ci.exp( mort, ctr.mat=list(ndM,ndF) )
+# plot lines with shaded confidence limits
+matshade( ndM$age, cbind( MFr, prF, prM ), col=c(1,2,4), lwd=3,
+          log="y", xlab="Age", ylab="Mortality per 1000 PY (and RR)" )
+abline( h=1 )
+}
+\keyword{color}
diff -Nru r-cran-epi-2.19/man/mcutLexis.Rd r-cran-epi-2.30/man/mcutLexis.Rd
--- r-cran-epi-2.19/man/mcutLexis.Rd	2017-04-03 22:02:49.000000000 +0000
+++ r-cran-epi-2.30/man/mcutLexis.Rd	2017-10-23 13:04:45.000000000 +0000
@@ -5,7 +5,7 @@
}
\description{
A generalization of \code{\link{cutLexis}} to the case where different
-events may occur in any order.
+events may occur in any order (but at most once for each).
}
\usage{
mcutLexis( L0, timescale = 1, wh,
@@ -26,8 +26,16 @@
  \item{precursor.states}{Which states are precursor states. See
    \code{\link{cutLexis}} for definition of precursor states.}
  \item{seq.states}{Should the sequence of events be kept track of?
That - is, should A-B be considered different from B-A. If \code{FALSE}, the - state with combined preceding events A and B will be called A+B.} + is, should A-B be considered different from B-A. If \code{FALSE}, + the state with combined preceding events A and B will be called + A+B (alphabetically sorted). + + May also be supplied as character: \code{s} - sequence, keep + track of sequence of states occupied (same as \code{TRUE}), \code{u} + - unordered, keep track only of states visited (same as + \code{FALSE}) or \code{l}, \code{c} - last or current state, only + record the latest state visited. If given as character, only the + first letter converted to lower case is used.} \item{new.scales}{Should we construct new time scales indicating the time since each of the event occurrences.} \item{ties.resolve}{Should tied event times be resolved by adding @@ -46,7 +54,10 @@ Bendix Carstensen, \url{http://BendixCarstensen.com} } \seealso{ -\code{\link{cutLexis}}, \code{\link{Lexis}}, \code{\link{splitLexis}} + \code{\link{cutLexis}}, + \code{\link{addCov.Lexis}}, + \code{\link{Lexis}}, + \code{\link{splitLexis}} } \examples{ # A dataframe of times diff -Nru r-cran-epi-2.19/man/merge.data.frame.Rd r-cran-epi-2.30/man/merge.data.frame.Rd --- r-cran-epi-2.19/man/merge.data.frame.Rd 2016-11-10 13:29:18.000000000 +0000 +++ r-cran-epi-2.30/man/merge.data.frame.Rd 1970-01-01 00:00:00.000000000 +0000 @@ -1,24 +0,0 @@ -\name{merge.data.frame} -\alias{merge.data.frame} -\title{Merge data frame with a Lexis object} -\description{ - Merge two data frames, or a data frame with a \code{Lexis} object. -} -\usage{ -\method{merge}{data.frame}(x, y, ...) -} -\arguments{ - \item{x, y}{data frames, or objects to be coerced into one} - \item{...}{optional arguments for the merge method} -} -\details{ - This version of \code{merge.default} masks the one in the \code{base}. - It ensures that, if either \code{x} or \code{y} is a \code{Lexis} - object, then \code{merge.Lexis} is called. 
-} -\value{ - A merged \code{Lexis} object or data frame. -} -\author{Martyn Plummer} -\seealso{\code{\link{Lexis}}} -\keyword{ts} diff -Nru r-cran-epi-2.19/man/N2Y.Rd r-cran-epi-2.30/man/N2Y.Rd --- r-cran-epi-2.19/man/N2Y.Rd 2016-11-10 13:29:18.000000000 +0000 +++ r-cran-epi-2.30/man/N2Y.Rd 2018-03-05 00:42:19.000000000 +0000 @@ -24,10 +24,6 @@ \item{return.dfr}{Logical. Should the results be returned as a data frame (default \code{TRUE}) or as a table.} } -\details{The calculation of the risk time from the population figures is - done as described in: B. Carstensen: Age-Period-Cohort models for the - Lexis diagram. Statistics in Medicine, 26: 3018-3045, 2007. -} \value{A data frame with variables \code{A}, \code{P} and \code{Y}, representing the mean age and period in the Lexis triangles and the person-time in them, respectively. The person-time is in units of the @@ -38,6 +34,23 @@ the values \code{up} and \code{lo} corresponding to upper (early cohort) and lower (late cohort) Lexis triangles. } +\details{The calculation of the risk time from the population figures is + done as described in: B. Carstensen: Age-Period-Cohort models for the + Lexis diagram. Statistics in Medicine, 26: 3018-3045, 2007. + + The number of periods in the result is one less than the number + of dates (\code{nP=length(table(P))}) in the input, so the number of + distinct values is \code{2*(nP-1)}, because the \code{P} in the output + is coded differently for upper and lower Lexis triangles. + + The number of age-classes is the same as in the input. In the paper + "Age-Period-Cohort models for the Lexis diagram" I suggest that the + risk time in the lower triangles in the first age-class and in the + upper triangles in the last age-class are computed so that the total + risk time in the age-class corresponds to the average of the two + population figures for the age-class at either end of the period. This + is the method used. +} \references{ B. 
Carstensen: Age-Period-Cohort models for the Lexis
  diagram. Statistics in Medicine, 26: 3018-3045, 2007.
@@ -70,7 +83,7 @@
# Blue numbers are population size at 1 January
# Red numbers are the computed person-years in Lexis triangles:

-Lexis.diagram(age=c(0,4), date=c(1970,1975), int=1, coh.grid=TRUE )
+Lexis.diagram(age=c(0,5), date=c(1970,1975), int=1, coh.grid=TRUE )
with( Nx, text(P,A+0.5,paste(N),srt=90,col="blue") )
with( Nt, text(P,A,formatC(Y,format="f",digits=1),col="red") )
text( 1970.5, 2, "Population count 1 January", srt=90, col="blue")
diff -Nru r-cran-epi-2.19/man/Ns.Rd r-cran-epi-2.30/man/Ns.Rd
--- r-cran-epi-2.19/man/Ns.Rd	2016-11-10 13:29:18.000000000 +0000
+++ r-cran-epi-2.30/man/Ns.Rd	2018-03-14 13:44:49.000000000 +0000
@@ -40,7 +40,8 @@
    \code{c(FALSE,TRUE)}. Ignored if \code{!(detrend==FALSE)}.}
  \item{detrend}{If \code{TRUE}, the columns of the spline basis will be
    projected to the orthogonal of \code{cbind(1,x)}. Optionally
-    \code{detrend} can be given as a vector of non-negative numbers used
+    \code{detrend} can be given as a vector of non-negative numbers of
+    length \code{length(x)}, used
    to define an inner product as \code{diag(detrend)} for projection on
    the orthogonal to \code{cbind(1,x)}. The default is projection
    w.r.t. the inner product defined by the identity matrix.}
@@ -48,15 +49,17 @@
\value{
  A matrix of dimension c(length(x),df) where either \code{df} was
  supplied or if \code{knots} were supplied, \code{df = length(knots) -
-  intercept}. \code{Ns} returns a spline basis which is centered at
+  1 + intercept}. \code{Ns} returns a spline basis which is centered at
  \code{ref}. \code{Ns} with the argument \code{detrend=TRUE} returns a
  spline basis which is orthogonal to \code{cbind(1,x)} with respect to
  the inner product defined by the positive definite matrix
-  \code{diag(weight)} (an assumption which is checked).
+  \code{diag(detrend)} (an assumption which is checked).
Note that the latter
+    is data-dependent, and therefore making predictions
+    with a \code{newdata} argument will be senseless.
}
\author{
  Bendix Carstensen \email{b@bxc.dk},
-  Lars Jorge D\'iaz, Steno Diabetes Center.
+  Lars Jorge D\'iaz, Steno Diabetes Center Copenhagen.
}
\note{
  The need for this function is primarily from analysis of rates in
diff -Nru r-cran-epi-2.19/man/pc.lines.Rd r-cran-epi-2.30/man/pc.lines.Rd
--- r-cran-epi-2.19/man/pc.lines.Rd	2016-11-10 13:29:18.000000000 +0000
+++ r-cran-epi-2.30/man/pc.lines.Rd	2018-05-13 19:08:38.000000000 +0000
@@ -3,32 +3,36 @@
\alias{pc.lines}
\alias{pc.matpoints}
\alias{pc.matlines}
+\alias{pc.matshade}
\alias{cp.points}
\alias{cp.lines}
\alias{cp.matpoints}
\alias{cp.matlines}
+\alias{cp.matshade}
\title{
  Plot period and cohort effects in an APC-frame.
}
\description{
  When an APC-frame has been produced by \code{\link{apc.frame}}, this
-  function draws curves in the period/cohort part of the frame.
+  function draws curves or points in the period/cohort part of the frame.
}
\usage{
pc.points( x, y, ... )
pc.lines( x, y, ... )
pc.matpoints( x, y, ... )
pc.matlines( x, y, ... )
+ pc.matshade( x, y, ... )
cp.points( x, y, ... )
cp.lines( x, y, ... )
cp.matpoints( x, y, ... )
cp.matlines( x, y, ... )
+ cp.matshade( x, y, ... )
}
\arguments{
  \item{x}{vector of \code{x}-coordinates.}
-  \item{y}{vector of \code{y}-coordinates.}
+  \item{y}{vector or matrix of \code{y}-coordinates.}
  \item{...}{Further parameters to be transmitted to points, lines,
-    matpoints or matlines used for plotting the three sets of curves.}
+    matpoints, matlines or matshade used for plotting the three sets of curves.}
}
\details{Since the Age-part of the frame is referred to by its real
  coordinates plotting in the calendar time part requires translation
@@ -37,7 +41,7 @@

  The functions \code{cp.points} etc. are just synonyms for these, in
  recognition of the fact that you can never remember whether it is "pc"
-  pr "cp".
+  or "cp".
}
\value{
  The functions return nothing.
diff -Nru r-cran-epi-2.19/man/Relevel.Rd r-cran-epi-2.30/man/Relevel.Rd
--- r-cran-epi-2.19/man/Relevel.Rd	2016-11-10 13:29:18.000000000 +0000
+++ r-cran-epi-2.30/man/Relevel.Rd	2018-03-08 13:55:57.000000000 +0000
@@ -4,31 +4,84 @@
\title{Reorder and combine levels of a factor}
\description{
  The levels of a factor are re-ordered so that the levels specified by
-  \code{ref} is first and the others are moved down. This is useful for
-  \code{contr.treatment} contrasts which take the first level as the
-  reference. Levels may also be combined.
+  \code{ref} appear first and remaining levels are moved down. This is
+  useful for \code{contr.treatment} contrasts which take the first level
+  as the reference. Factor levels may also be combined; two possibilities for
+  specifying this are supported: hard coding or table look-up.
}
\usage{
-\method{Relevel}{factor}( x, ref, first = TRUE, collapse="+", \dots )
+\method{Relevel}{factor}( x, ref, first = TRUE, collapse="+",
+                          xlevels=TRUE, nogroup=TRUE, \dots )
}
\arguments{
-  \item{x}{An unordered factor}
-  \item{ref}{The names or numbers of levels to be the first. If \code{ref}
-    is a list, factor levels mentioned in each list element are
-    combined. If the list is named the names are used as new factor levels.}
-  \item{first}{Should the levels mentioned in ref come before those not?}
-  \item{collapse}{String used when collapsing factor levels.}
+  \item{x}{A(n unordered) factor}
+  \item{ref}{Vector, list, matrix or table.
+
+    If \code{ref} is a vector (integer or character), it is assumed it
+    contains the names or numbers of levels to be the first ones; levels
+    not mentioned are kept.
+
+    If \code{ref} is a list, factor levels mentioned in each list
+    element are combined. If the list is named, the names are used as new
+    factor levels, otherwise new level names are constructed from the
+    old.
+
+    If \code{ref} is a matrix or table, the first column is assumed
+    to hold unique levels of \code{x} and the second the corresponding
+    groupings of these.
+
+  }
+  \item{first}{Should the levels mentioned in \code{ref} (if it is a
+    list) come before those not mentioned?}
+  \item{collapse}{String used when constructing names for combined factor levels.}
+  \item{xlevels}{Logical. Should all levels in the 2nd column of
+    \code{ref} be maintained as levels of the result, or (if
+    \code{FALSE}) only the actually occurring ones?}
+  \item{nogroup}{Logical. Should levels present in the input but not in
+    the 1st column of \code{ref} be maintained as levels after the
+    grouping? If \code{FALSE}, NAs will be returned for such elements.}
  \item{\dots}{Arguments passed on to other methods.}
}
\value{
  An unordered factor, where levels of \code{x} have been reordered
  and/or collapsed.
}
+\details{
+  The facility where \code{ref} is a two-column matrix mimics the
+  SAS facility of formats, where a dataset can be used to construct a
+  format --- in SAS, a format is the grouping tool for variable values. Using
+  this facility will preserve the order of levels if \code{ref[,2]} is a
+  factor.
+} \author{Bendix Carstensen, \url{BendixCarstensen.com}.} \seealso{\code{\link{Relevel.Lexis}}} \examples{ +# Grouping using a list (hard coding) +# ff <- factor( sample( letters[1:5], 100, replace=TRUE ) ) table( ff, Relevel( ff, list( AB=1:2, "Dee"=4, c(3,5) ) ) ) -table( ff, rr=Relevel( ff, list( 5:4, Z=c("c","a") ), coll="-und-", first=FALSE ) ) +table( ff, Relevel( ff, list( 5:4, Z=c("c","a") ), coll="-und-",first=FALSE ) ) + +## Grouping using a two-column matrix as input: +## A factor with levels to be grouped together +ff <- factor( c("Bear","Bear","Crocodile","Snake","Crocodile","Bear") ) +ff +## A grouping table +( gg <- data.frame( Animal=c("Bear","Whale","Crocodile","Snake","Eagle"), + Class=c("Mammal","Mammal","Reptile","Reptile","Bird") ) ) +Relevel( ff, gg, xlevels=FALSE ) +Relevel( ff, gg ) +Relevel( ff, gg[c(1:5,5:1),] ) + +## This crashes with an error +## Relevel( ff, rbind( gg, c("Bear","Reptile") ) ) + +ff <- factor(c(as.character(ff),"Jellyfish","Spider")) +Relevel( ff, gg ) +Relevel( ff, gg, xlevels=FALSE ) +# If you do not want unknown animals classified, this returns NAs: +Relevel( ff, gg, nogroup=FALSE ) +# Will give error because re-classification is ambiguous +try( Relevel( ff, rbind(gg,c("Eagle","Mammal"),c("Snake","Mammal")) ) ) } \keyword{manip} diff -Nru r-cran-epi-2.19/man/transform.Lexis.Rd r-cran-epi-2.30/man/transform.Lexis.Rd --- r-cran-epi-2.19/man/transform.Lexis.Rd 2017-06-29 22:14:16.000000000 +0000 +++ r-cran-epi-2.30/man/transform.Lexis.Rd 2017-11-27 15:50:08.000000000 +0000 @@ -12,7 +12,6 @@ Modify a Lexis object. } \usage{ -%transform(`_data`, \dots) \method{transform}{Lexis}( `_data`, \dots) \method{Relevel}{Lexis}( x, states, print = TRUE, \dots ) \method{levels}{Lexis}( x ) @@ -34,25 +33,32 @@ } \details{ The transform method for \code{Lexis} objects works exactly as the - method for data frames. \code{factorize} transforms the variables + method for data frames, but keeps the \code{Lexis} attributes. 
+
+  \code{factorize} transforms the variables
  \code{lex.Cst} and \code{lex.Xst} to factors with identical set of
  levels, optionally with names given in \code{states}, and optionally
-  collapsing states. \code{Relevel} is merely an alias for
+  collapsing states.
+
+  \code{Relevel} is merely an alias for
  \code{factorize}, since the function does the same as
  \code{\link{Relevel}}, but for both the factors \code{lex.Cst} and
  \code{lex.Xst}. A default side effect is to produce a table of old
-  states versus new states if \code{states} is a list.
+  states versus new states if \code{states} is a list. Unlike
+  \code{Relevel} for factors, \code{Relevel.Lexis} does not accept a
+  matrix as a second argument - the number of levels of \code{lex.Cst}
+  is rarely (if ever) large.

  If \code{states} is \code{NULL}, as when for example the argument is
  not passed to the function, the returned object has levels of
  \code{lex.Cst}, \code{lex.Xst} (and for \code{stacked.Lexis} objects
  \code{lex.Tr}) shaved down to the actually occurring values.

-  \code{order} returns the order of the rows in a Lexis object to sort
-  it by (\code{lex.id},\code{timeScales[x]}).
+% \code{order} returns the order of the rows in a Lexis object to sort
+% it by (\code{lex.id},\code{timeScales[x]}).

-  \code{sort} returns the Lexis object sorted by
-  (\code{lex.id},\code{timeScales[x]}).
+% \code{sort} returns the Lexis object sorted by
+% (\code{lex.id},\code{timeScales[x]}).
}
\value{
  A transformed \code{Lexis} object.
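The \code{matshade()} function documented above can be tried in isolation; here is a minimal sketch, assuming Epi 2.28 or later is installed. The two rate curves and their confidence limits are made up for illustration (they are not from any package dataset), arranged in the (est,lo,hi,est,lo,hi) column order that the help page says \code{matshade()} expects:

```r
## Two made-up rate curves with confidence limits, in the
## (est,lo,hi,est,lo,hi) column order matshade() expects
x  <- seq(20, 80, 0.5)
r1 <- exp(-9 + 0.08 * x)     # hypothetical rising rate, curve 1
r2 <- exp(-8 + 0.06 * x)     # hypothetical rising rate, curve 2
Y  <- cbind(r1, r1 * 0.85, r1 / 0.85,
            r2, r2 * 0.85, r2 / 0.85)
## matshade() starts a new plot frame when no device is active
## (plot = dev.cur()==1); extra arguments are passed to matplot/matlines
if (requireNamespace("Epi", quietly = TRUE))
    Epi::matshade(x, Y, col = c("blue", "red"), lwd = 2, log = "y",
                  xlab = "Age", ylab = "Rate per 1000 PY")
```

As the help page notes, passing \code{lwd=0} would suppress the lines and draw only the shaded bands, and repeating the call overplots the transparent shades, making them darker.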
diff -Nru r-cran-epi-2.19/MD5 r-cran-epi-2.30/MD5 --- r-cran-epi-2.19/MD5 2017-08-09 04:39:25.000000000 +0000 +++ r-cran-epi-2.30/MD5 2018-05-15 09:44:58.000000000 +0000 @@ -1,32 +1,33 @@ -9a824270cc5a762fea93b89f7be6af50 *CHANGES -7ec191f7f9198e00e243799ac541a98a *DESCRIPTION -3889ddbc9a5ac1ba4f04ec84d8dfc26e *NAMESPACE +9b0ac862d38946f87a23a59eba645e48 *CHANGES +fd618ea6b7ab07c85f184718f74031ca *DESCRIPTION +1ad5ca1cf8d84695d8de9d7bf330b3c3 *NAMESPACE fac3c7f01ab0bd6930cf3cbea20e6bcc *R/Aplot.R a031997be7388601dd1df6a28fc55328 *R/Cplot.R 6a1c4f4cbbf509d1f3c40b9f0adb6399 *R/Icens.R -6dcc1bf2ed458a8898d366791de02f7d *R/LCa.fit.R +71f08f45b55ea83d4430b3817cc3b737 *R/LCa.fit.R d802fc697f55ab0ef5ccd45e51776e46 *R/Lexis.diagram.R c6b992622881e60fb82ee4b8c3357c9d *R/Lexis.lines.R 05c38381cc48ab25b681e8615ac31ea8 *R/Life.lines.R -a25e4901b0899a6b00285713117e3f21 *R/N2Y.r +330f1cf58e92430145b958a0f9d41571 *R/N2Y.r 8743609003b32036a7a1d02fb4b8e17b *R/NArray.R -876bd6183d0fdb8a180e462715675e0c *R/Ns.r +74aace0e5df5492a0dbf3b94bd25406e *R/Ns.r e9f582b5dc17d07bf2df96de74441963 *R/Pplot.R -b1c0f04443499205645ad578a3dd8573 *R/ROC.R -0a7bbd37e75920b54fd0fbe00a86511d *R/Relevel.R +144bd0af57017a5d92c4a7bde47da4e7 *R/ROC.R +dc5942024c78483c2f1f7a75862e9fcb *R/Relevel.R 0e3e3d23b8ab79dcf3046828190462b2 *R/Termplot.R -28a0ec0c8584e22759b516ff69e5bd78 *R/Wald.R -6f35204baee4552a741c7262ac33a05d *R/addCov.Lexis.R +3f30311f13dff23a36857e49a320536f *R/Wald.R +171324300a75a19f01fab28c1eb4ab44 *R/addCov.Lexis.R b1f0000430d6e15f59fe949ed8f1beb1 *R/apc.LCa.R 847f61901b2c9123367be3ebef30abe6 *R/apc.fit.R 3c53b3d667196c204fab811d2231ccd5 *R/apc.frame.R -40d04e6ce1304d51df6911621fe28b52 *R/apc.plot.R +bd898be838955d2fd5e2b2af764fc273 *R/apc.plot.R 8ce5a4c989947cb9d93becdecfde642c *R/as.Date.cal.yr.R +7d5600ddbfd17861ed882617b2312da8 *R/bootLexis.R f4e2a039dda786e4cad0b2961e3dd599 *R/boxes.MS.R 44e5f888817c94b8a190eae4aa964a2c *R/cal.yr.R 6988e76eb64f004a8c5029c23f8603af *R/ccwc.R 
-d5171eb23f0b431a7cc89b69d03308a6 *R/ci.cum.R -ab5fd1bbdc59ef536f103bf533a5f143 *R/ci.lin.R +47f2dfd260230b96f7e5daafb61e9db5 *R/ci.cum.R +7afe4f4bb0a259f50f585ad532dc2846 *R/ci.lin.R 6f80c7b2b2120068cd16c043fbc29db3 *R/ci.mat.R 886f87cbd08266fa281e2ed7b3b3c253 *R/ci.pd.R 7637da11fdfb3612df5d73a64ee044e4 *R/clear.R @@ -36,10 +37,10 @@ 8064804ee2b1cfa0cea2b44a58106b0a *R/contr.diff.R 8ffc3eabb90b6b9b710109015779d853 *R/contr.orth.R efb8f1a3b97b41244275534855707b8b *R/crr.Lexis.r -281380e913564ab522664f2255b2e8a8 *R/cutLexis.R -14270f90ed00f4f043286d50e2de9f26 *R/detrend.R +ddffd14a0d695b6de25c997b2aeb4440 *R/cutLexis.R +e2d647c79d7aaebd51d55017925291d7 *R/detrend.R a86250cce524f9fa2215c0c0cac4c6e7 *R/effx.match.r -9b46438b1c6fdc926ac1a253a85096aa *R/effx.r +79292fb5601cdd6ead350751f4c7ae59 *R/effx.r abd3fdeb059a214f253e2f3e4658a302 *R/erl.R 822e02862de39625e5ed9ef6f84bdf39 *R/expand.data.r 0466dba42d10766e0b9cd220ad39dcf4 *R/factorize.R @@ -47,12 +48,13 @@ 44eec3e5b106ee3f518beed49b4e0f9b *R/fit.baseline.R 8ffaebb81a92649b137ff8a4911841da *R/fit.mult.r 7c7059bf47c6421d6412a4bc53fc5950 *R/float.R -23a68a19f8f8e70c5664eb5110b3e807 *R/foreign.R +8ff6ab0fcba0c2d962e51e757fbe417c *R/foreign.R da7b5d41fe7876ef6bab01839b613939 *R/ftrend.R f0f84fc26294b1cb3016a9e23974c545 *R/gen.exp.R -06f8a1ac3afc69610b9edfebc638f17b *R/lexis.R +c1e34d4ca4bbbcfeda98cece6f49a1c4 *R/lexis.R f8eba91de17a8207c6303ffb7b34f0a7 *R/lls.R -d593af0ab833c8655469a24891e9116f *R/mcutLexis.R +3ddae22cd389ad9d5d93b7d1c514b8c6 *R/matshade.R +06ebf2179362f9376b6dd3e7520a3bc9 *R/mcutLexis.R 79d6e0b02a2c4e98fbbfcd936b21d377 *R/mh.R 9e48e47769874a34f63a2e2d4e6ffffc *R/ncut.r 2a44a66ac4fe7eccab191f94957265e2 *R/pctab.R @@ -73,8 +75,8 @@ 8ceccd0ffa9694670f38c34b65d5be10 *R/summary.Lexis.r 88983ba8314a37910993d469a982d800 *R/twoby2.R 53904292629435d35ec42f11329b2c60 *R/xgrep.R -1ffde7eb6c78d26bcb1e51a138e63485 *build/vignette.rds b9d3112271b9545896c98ec6406f5f02 *data/B.dk.rda +713014c7934c558deeba7accf8fd8e6c 
*data/BrCa.rda 666e336a1aefb9c4298abb83a81a3300 *data/DMconv.rda ea83ed46a5581ea8489646abd3d1c0ae *data/DMepi.rda e200a06eee9305b2fa2c59af5982d62b *data/DMlate.rda @@ -100,86 +102,85 @@ 404cc2c826ccdfb5b5edbe9834561e02 *data/testisDK.rda 1953ddd7d750051c5682fa08b11cb777 *data/thoro.rda 9bd39178a387f935acc54519109c9f4a *inst/CITATION -e61ded63f726562d6f3172838e4f7337 *inst/doc/Follow-up.R -d8bfa83259c458592bdf1e8cabca4247 *inst/doc/Follow-up.pdf -a6978ca7d45a578a4fb047fb92eeed73 *inst/doc/Follow-up.rnw -364611344a5e791ae93f506e961b5fcc *inst/doc/index.html -8d64b0389082fb54478f8e9667934265 *inst/doc/simLexis.R -441f8e7242cf6b1ef990ffd642ff56fa *inst/doc/simLexis.pdf -37f8cd2365c025fbb1d348052c67f712 *inst/doc/simLexis.rnw -3cf7875e75ee2ef80cf2eda579764d6f *inst/doc/yll.R -b4a034b2c6be5a4e5b63d7f38007a142 *inst/doc/yll.pdf -5b597c102643a5597a70ec91abc1899c *inst/doc/yll.rnw +79f9f6f2e09e51ae2a51369afe860840 *inst/doc/flup.R +a3f34682ae33cb1854a686ba290d1726 *inst/doc/flup.pdf +45c955bf379daab5d2d8b0fea4432249 *inst/doc/index.html +f89cf850bd20072be74838eb4137ee09 *inst/doc/simLexis.R +5dfc49b898f0125b15e3e471a90caf42 *inst/doc/simLexis.pdf +a80a260dbf7d640a43c921d042aae6e2 *inst/doc/yll.R +247c9b4a1b95e00c254520eb85592866 *inst/doc/yll.pdf 317bb6ebeaec5e7376b609d7e382c166 *man/B.dk.Rd +ffe380e7252698c820a448b587714b49 *man/BrCa.Rd 3abc5c20e62c874d70dcb2688ad2cce5 *man/DMconv.Rd -e9a988b029749169403ab0bf749d53da *man/DMepi.Rd +ae0088f5434c66484668b6d085145889 *man/DMepi.Rd 362200b221355a7bcbcd1c7c8df5243f *man/DMlate.Rd c87e6a0d0a5c461ecf4df07c2c96f1c1 *man/Icens.Rd -c20a70a6f6c4f815e9a7c90d26529cab *man/LCa.fit.Rd +c94a1f15f90406b35c24ff9e3e6b9147 *man/LCa.fit.Rd e90e2ce2e2bc9daf1283492261c8ebbd *man/Lexis.Rd c97dabafce248e4d47c5b18b6f04efe0 *man/Lexis.diagram.Rd 2e71ca58213788c8a358708e776e2112 *man/Lexis.lines.Rd b4b2ca144feacece204f86bc6640d51a *man/Life.lines.Rd daf66f950320e836b8ce1d2bd37cdc41 *man/M.dk.Rd 6ae7494a426801b9944c37190c09b3ee *man/N.dk.Rd 
-0983e4031974eea1a30e67db90310c25 *man/N2Y.Rd +5718b371e9062cd7af8efd2ef0f245a5 *man/N2Y.Rd 424aecfdd664b3123ad4815880aa3761 *man/NArray.Rd -03f9885a6c523d7a221cf0a2272628fc *man/Ns.Rd +cddc9fd049b34de27a61fd1bb119d998 *man/Ns.Rd 335b75739b8a25293690da804cd5e522 *man/ROC.Rd -ca827755eae8ee8286860a715cbd1744 *man/Relevel.Rd +95b580abbfcafdc404d28cd194e8c214 *man/Relevel.Rd bb2162d557dfb7685edac64ac3213b6c *man/S.typh.Rd 9351888a37017068f2c9ac8b843c846d *man/Termplot.Rd b4879df831de32e9e298b7b3d90389a9 *man/Y.dk.Rd -b813883bd7caca8d97ee45f456932eb4 *man/addCov.Lexis.Rd -62c36534ab2b6043f68306dd33407891 *man/apc.LCa.Rd -d5976837ba976d5969aaaedb98b215eb *man/apc.fit.Rd -8e42054ad19c34f2bf77210cfafc560e *man/apc.frame.Rd +89a0abc6ca87f561fc48e0f24f268622 *man/addCov.Lexis.Rd +5d347f8d93ad0ca09b8992fa7aa2d241 *man/apc.LCa.Rd +14636e67b2485cba556cfcb832f8c950 *man/apc.fit.Rd +345568f28bad999578b39277080a7283 *man/apc.frame.Rd 92b663829915395cdabf3e5d37cdd2b4 *man/bdendo.Rd 968ba8cef65153e9c9c521dd1541123e *man/bdendo11.Rd d74a4f6e66ce91c73cd2817bb690446a *man/births.Rd 8dc11840302a388e4f26892335d5d888 *man/blcaIT.Rd +46d07584ef468f0ff58338f5379a5b0e *man/bootLexis.Rd 6b9f6bc7ca862808b89f6439914b8a0a *man/boxes.MS.Rd 18d942d375c5d5f6899291bf9cd1fe11 *man/brv.Rd a2bedd44df3606b6585fd1da897de7ec *man/cal.yr.Rd effba2c17ade00f0afd10184a073d34c *man/ccwc.Rd 90337e7772c67e4e140e21165c30f079 *man/ci.cum.Rd -5fca264afe8e1bfd25f93d365d9156fa *man/ci.lin.Rd +a046acb056ce540ce221e78236e2020b *man/ci.lin.Rd e7477402aa4746d653a0c72b2239d185 *man/ci.pd.Rd 128c0b1d00a69e41d131a6f74a1bae23 *man/clogistic.Rd cf5e5a6e18e07b7bb8d70dc0417e6166 *man/contr.cum.Rd 24343adc886e90c75f7af178201bd6cd *man/crr.Lexis.rd -7d42e6a788c81361f03f03af15a2187c *man/cutLexis.Rd -a1ab8c14b721360ba0cb17da598bfa84 *man/detrend.Rd +7257b81a897faca909e48c6e065efd25 *man/cutLexis.Rd +fb67a9ca34c612618927beeab4097ae8 *man/detrend.Rd 386d0a4aa3515cf754ef458e1f32089c *man/diet.Rd 64dd6599bb830e98f4717b1a34ee12ab 
*man/effx.Rd 3c3cc3b1ecabece6b7709c1b147bab4e *man/effx.match.Rd -a4922d13042b2d343868518a072e64d3 *man/erl.Rd +f281dede5c1a091f9dca30199fb41bec *man/erl.Rd 0dc62e726cdff6694774fab66595ee81 *man/ewrates.Rd eb0c50c4a2572fd514c475358165fa57 *man/expand.data.rd 344adb137962cfa8fc7c2cf2858de3a3 *man/fit.add.Rd 3b44fed4ae70541c965c4cfbed966a68 *man/fit.baseline.rd b92589537d0e1c4abebdf1aad6abf88b *man/fit.mult.Rd bca8b8c3452180e63d4d22885b04e3ce *man/float.Rd -3afd2a7cb9c812dfa783cb916828b7cf *man/foreign.Lexis.Rd +56b183ca7e5c4a40db1f4c425b9abf04 *man/foreign.Lexis.Rd e5a57a203cb91e9680dedc9023779e59 *man/ftrend.Rd e7e33d1ae13dec7a5cf3c3f96e1c085e *man/gen.exp.Rd cbbc2a23f902d83ba2527b27b5ed3adb *man/gmortDK.Rd d9f1d31e109d6058a2160ca0aa49bf81 *man/hivDK.Rd 9c3b059921d728298ccff73298b4dd31 *man/lep.Rd 470bf279897b9c9ee2600126127d8ada *man/lgrep.Rd -ec0e27968d4ee35f390161e49abd903c *man/lines.apc.Rd +e14c71b1f693c680db3824ab97b0da49 *man/lines.apc.Rd 78125ba49d54905fb727765dc8997796 *man/lls.Rd 661ec298ffcd0f39cf55c5d033e04fd1 *man/lungDK.Rd -cb1f50d16fe21259603273d2004d14b3 *man/mcutLexis.Rd +f542098c60185b5dbb9dd7dfe95c5490 *man/matshade.Rd +a378ebdcb3944f3844508f04775f59b9 *man/mcutLexis.Rd 3c462c4ce7ca1ce4b8c6fef50b3c7623 *man/merge.Lexis.Rd -4c26c9edf5d64b0af3e63ad791b2a280 *man/merge.data.frame.Rd 34c530c36a3a1a7447ff8eea58f42084 *man/mh.Rd 2ecfec2d86e80b9bbcc2c3a253348edd *man/mortDK.Rd 9a531412399076898e3e39c1c20eb69a *man/ncut.Rd 880466737da2c1037259c3f712a7e9fe *man/nice.Rd 05bd25340716d0805cbc7eb433896206 *man/nickel.Rd 0e0ff2e0894523b3878213d03afcc8cf *man/occup.Rd -fb7ba7ea4cc2a9a67529348a4d971a4f *man/pc.lines.Rd +09ac8ba52984a30b21e3404198b208a7 *man/pc.lines.Rd 86cac5433077a16960bca5d5865586a4 *man/pctab.Rd 50c0516a536004d7f4ce91ef1bbfd8c9 *man/plot.Lexis.Rd d30b27f4c3ef397de514614ba3e6ba3e *man/plot.apc.Rd @@ -202,14 +203,48 @@ e3446824b09ea4ad8e8a0ddfc1987eef *man/thoro.Rd 6ea9cf64475187fa2f402275c40573a7 *man/time.band.Rd 
 2edb93539fa7b9f47a3083b624443f9d *man/time.scales.Rd
-f80e2b3fe7e1d9df9c40dcd577a3f499 *man/transform.Lexis.Rd
+ca5a17eba0753419b75dcbe4e78ab26d *man/transform.Lexis.Rd
 02fc311541fae16a5d3a580a04925050 *man/twoby2.Rd
 bcc0556f8f0a158637592731928cf835 *src/chinv2.c
 3e3a51847b5767274f0945350e947726 *src/cholesky2.c
 09bf10fed80e805e7e58db9794e5edcc *src/chsolve2.c
 872ef5e7ef0c1941c17d7dd612f6b1c3 *src/clogit.c
 d77bd820f6ce7dc7883c9e2813a44c89 *src/init.c
-a6978ca7d45a578a4fb047fb92eeed73 *vignettes/Follow-up.rnw
-364611344a5e791ae93f506e961b5fcc *vignettes/index.html
-37f8cd2365c025fbb1d348052c67f712 *vignettes/simLexis.rnw
-5b597c102643a5597a70ec91abc1899c *vignettes/yll.rnw
+c2720c2f2634c041d8e0060d5e5d28ba *vignettes/fixall
+3979bd89a72b00b468c2729b49ab8966 *vignettes/flup-nic-box.pdf
+dbc159e5816e0edc098649fed8cc2f02 *vignettes/flup-nicL1.pdf
+e59dc88157dfd364041097d769e03123 *vignettes/flup-nicL2.pdf
+f04332278df5c7ba7414d28b0d585284 *vignettes/flup-pr-a.pdf
+15f4cb80dac3cbfd568a247404322a44 *vignettes/flup-pr-at-af.pdf
+dbf3d42b2e53c8eebaed3bb68f816ced *vignettes/flup-pr-at.pdf
+79f9f6f2e09e51ae2a51369afe860840 *vignettes/flup.R
+a3f34682ae33cb1854a686ba290d1726 *vignettes/flup.pdf
+e571e2d2c713944c9e4a2e29572b4404 *vignettes/flup.rnw
+6b18a4a5c5a1b4b921d24ee0ffe9ef7b *vignettes/flup.rwl
+0f5a474691708d55b8c25654f672837d *vignettes/flup.tex
+e92a0bc28e9f4f59a1557ce6d8cfe2e2 *vignettes/pr.Rda
+5dfc49b898f0125b15e3e471a90caf42 *vignettes/sL.pdf
+53297bb8f3d8d5f236a57a74094b99d6 *vignettes/simLexis-boxes.pdf
+c1adf5ea9bfcb5730826815b3d9d603e *vignettes/simLexis-comp-0.pdf
+c302f562af43a525ba4071f70e0832a4 *vignettes/simLexis-mort-int.pdf
+93275a9279f994f5fb4164f5844ee002 *vignettes/simLexis-pstate0.pdf
+29955ab2796c724d25b50685ec212b1f *vignettes/simLexis-pstatex.pdf
+2cf807f77cb0b8e9f08e6cedf3593967 *vignettes/simLexis-pstatey.pdf
+f89cf850bd20072be74838eb4137ee09 *vignettes/simLexis.R
+8f77aa11a639cf50b07e16a71550dd77 *vignettes/simLexis.pdf
+2754bf3a1b46be9334d0f2024a7caeaf *vignettes/simLexis.rnw
+d1a872adecbee48f573d1579de5a709e *vignettes/simLexis.rwl
+ad055b0b30f9a327d41570db28bfac96 *vignettes/simLexis.tex
+e18d102389aa23a92012542bb87debdd *vignettes/toparticle.tex
+b4b55559db3e7a6c0a8eb3632ead0d8e *vignettes/topreport.tex
+0ee2e7b027912753d8b3b92b46ed524a *vignettes/useful.tex
+247c9b4a1b95e00c254520eb85592866 *vignettes/yl.pdf
+c3958eb51451ee4673e9483a22d4f9d9 *vignettes/yll-imm.pdf
+3997496b6fd79696e08282f3b54fa7b1 *vignettes/yll-states.pdf
+6ee3aee63eb25042373adede269dbb7d *vignettes/yll-sus.pdf
+43a542c2046b737a9b37ba7f3ccfaaea *vignettes/yll-tot.pdf
+a80a260dbf7d640a43c921d042aae6e2 *vignettes/yll.R
+4a254f0768f19e3c8288a052896cfd6e *vignettes/yll.pdf
+3fbca640fc202dcc4736ac53a89a32cf *vignettes/yll.rnw
+c6221b49df645e31fa1567fe0e75331f *vignettes/yll.rwl
+bf1bea3b9bebd8bab7155eefb9d515e4 *vignettes/yll.tex
diff -Nru r-cran-epi-2.19/NAMESPACE r-cran-epi-2.30/NAMESPACE
--- r-cran-epi-2.19/NAMESPACE	2017-06-29 22:19:41.000000000 +0000
+++ r-cran-epi-2.30/NAMESPACE	2018-05-14 15:21:28.000000000 +0000
@@ -11,6 +11,12 @@
         pc.lines,
         pc.matpoints,
         pc.matlines,
+        pc.matshade,
+        cp.points,
+        cp.lines,
+        cp.matpoints,
+        cp.matlines,
+        cp.matshade,
         cal.yr,
         as.Date.cal.yr,
         ccwc,
@@ -21,6 +27,7 @@
         ci.pred,
         ci.ratio,
         ci.mat,
+        matshade,
         lls,
         clear,
         contr.orth,
@@ -71,6 +78,8 @@
         splitLexis,
         transform.Lexis,
         levels.Lexis,
+        nid.Lexis,
+        bootLexis,
         Relevel.Lexis,
         factorize.Lexis,
         cutLexis,
@@ -96,7 +105,6 @@
         timeBand,
         timeScales,
         breaks,
-        merge.data.frame,
         tbox,
         dbox,
         fillarr,
@@ -110,7 +118,6 @@
         nState,
         pState,
         msdata,
-        etm,
         mh,
         ncut,
         nice,
@@ -127,7 +134,9 @@
         Aplot,
         Pplot,
         Cplot,
+        nid,
         Relevel,
+        Relevel.factor,
         ROC,
         twoby2,
         Wald,
@@ -146,7 +155,7 @@
 importFrom( numDeriv, hessian )
 importFrom( Matrix, nearPD )
 importFrom( zoo, na.locf )
-importFrom("grDevices", "gray", "rainbow")
+importFrom("grDevices", "gray", "rainbow", "adjustcolor", "dev.cur")
 importFrom("graphics", "abline", "arrows", "axis",
            "box", "layout", "lines", "locator", "matlines",
            "matplot", "matpoints", "mtext", "par", "plot",
            "plot.new", "plot.window", "points",
@@ -154,7 +163,7 @@
            "strwidth", "text")
 importFrom("stats", ".getXlevels", "addmargins", "anova", "approx",
            "ave", "binomial", "coef", "complete.cases", "contr.sum",
-           "fisher.test", "fitted", "gaussian", "glm",
+           "fisher.test", "fitted", "formula", "gaussian", "glm",
            "is.empty.model", "median", "model.extract", "model.matrix",
            "model.offset", "model.response", "nlm", "pchisq", "pnorm",
            "poisson", "predict", "qnorm", "qt", "quantile", "runif",
@@ -183,6 +192,8 @@
 S3method(transform, Lexis)
 S3method(transform, stacked.Lexis)
 S3method( levels, Lexis)
+S3method( nid, Lexis)
+S3method( nid, default)
 S3method( Relevel, Lexis)
 S3method( Relevel, factor)
 S3method( Relevel, default)
@@ -199,7 +210,6 @@
 S3method( summary, LCa)
 S3method( predict, LCa)
 S3method( plot, LCa)
-S3method( merge, data.frame)
 S3method( print, stat.table)
 S3method( print, clogistic)
 S3method( coef, clogistic)
diff -Nru r-cran-epi-2.19/R/addCov.Lexis.R r-cran-epi-2.30/R/addCov.Lexis.R
--- r-cran-epi-2.19/R/addCov.Lexis.R	2017-07-06 12:45:58.000000000 +0000
+++ r-cran-epi-2.30/R/addCov.Lexis.R	2018-05-01 08:53:48.000000000 +0000
@@ -22,12 +22,12 @@
 # Is the timescale argument a timescale in Lx and is it a variable in clin?
 ts <- if( is.numeric(timescale) ) timeScales( Lx )[timescale] else timescale
 if( !( ts %in% timeScales(Lx) ) )
-    stop( "timescale argument (", ts, ") must be one of timescales in in the Lexis object ",
+    stop( "timescale argument (", ts, ") must be among the timescales in the Lexis object ",
          deparse(substitute(Lx)),":", timeScales(Lx), ".\n" )
 clin.nam <- deparse(substitute(clin))
 if( !( ts %in% names(clin) & "lex.id" %in% names(clin) ) )
-    stop( "'lex.id' and timescale '", ts, "' must be a variables in the clin object ",
+    stop( "'lex.id' and timescale '", ts, "' must be variables in the clin object ",
          clin.nam, "\n" )
 
 # variables to merge by
@@ -53,7 +53,7 @@
              ave( clin$lex.id,
                   clin$lex.id,
                   FUN = function(x) cumsum(x/x) ),
-             sep="." )
+             sep="" )
 
 # Add copy of the time of examination to be carried forward
 clin[,tfc] <- clin[,ts]
@@ -73,14 +73,15 @@
 mc <- cutLexis( mc, cut = cfr[cfr$new.state==st,],
                     timescale = ts,
-             precursor.states = NULL )
+             precursor.states = NULL,
+                    new.scale = addScales )
 
-# Merge in the old states
+# Merge in states from the original object mx, but take attributes from mc
 mx <- Lx[,mvar]
 mx$org.Cst <- Lx$lex.Cst
 mx$org.Xst <- Lx$lex.Xst
-mx <- merge( mx, mc, by = mvar, all.y = TRUE, sort = TRUE )
-
+mx <- merge( mc, mx, by = mvar, all.x = TRUE, sort = TRUE )
+
 # Complete the state variables
 ( wh <- which(is.na(mx$org.Cst)) )
 mx$org.Cst[wh] <- na.locf( mx$org.Cst, nx.rm=FALSE )[wh]
@@ -97,17 +98,32 @@
 mx <- merge( mx, clin, by=mvar, all.x=TRUE, sort=TRUE )
 
 # And carry them forward within each lex.id
+# locf within each person (should be easier in data.table)
 locf.df <- function( df ) as.data.frame( lapply( df, na.locf, na.rm=FALSE ) )
+# ave does not like character variables so we convert to factors
 wh <- which( sapply( mx[,cvar], is.character ) )
 for( j in wh ) mx[,cvar[j]] <- factor( mx[,cvar[j]] )
+# then we can carry forward
 mx[,cvar] <- ave( mx[,cvar], mx$lex.id, FUN=locf.df )
 
-# Finally update the time from clinical measurement
+# Finally update the time from last clinical measurement
 mx[,tfc] <- mx[,ts] - mx[,tfc]
+
+# Add as a time-scale
+if( addScales )
+  {
+new.scales <- setdiff( timeScales(mx), timeScales(Lx) )
+op <- options(warn = (-1)) # suppress warnings
+mx[,tfc] <- apply( mx[,new.scales,drop=FALSE], 1, min, na.rm=TRUE )
+options(op) # reset the previous value
+attr( mx, "time.scales") <- c( attr( mx, "time.scales"), tfc )
+attr( mx, "time.since" ) <- c( attr( mx, "time.since"), "" )
+brt <- list( x=NULL ) ; names( brt ) <- tfc
+attr( mx, "breaks") <- c( attr( mx, "breaks"), brt )
+  }
 
-# Done!
-mx
+# Done! - well order first
+mx[order(mx[,"lex.id"],mx[,timeScales(mx)[1]]),]
 }
diff -Nru r-cran-epi-2.19/R/apc.plot.R r-cran-epi-2.30/R/apc.plot.R
--- r-cran-epi-2.19/R/apc.plot.R	2015-05-27 15:26:55.000000000 +0000
+++ r-cran-epi-2.30/R/apc.plot.R	2018-05-01 18:02:24.000000000 +0000
@@ -170,3 +170,11 @@
 fp <- options()[["apc.frame.par"]]
 matlines( x - fp[1], y * fp[2], ... )
 }
+
+cp.matshade <-
+pc.matshade <-
+function( x, y, ... )
+{
+fp <- options()[["apc.frame.par"]]
+matshade( x - fp[1], y * fp[2], ... )
+}
diff -Nru r-cran-epi-2.19/R/bootLexis.R r-cran-epi-2.30/R/bootLexis.R
--- r-cran-epi-2.19/R/bootLexis.R	1970-01-01 00:00:00.000000000 +0000
+++ r-cran-epi-2.30/R/bootLexis.R	2018-03-08 12:02:08.000000000 +0000
@@ -0,0 +1,82 @@
+# The method
+nid <- function ( Lx, ... ) UseMethod("nid")
+
+nid.default <-
+nid.Lexis <-
+function( Lx, by=NULL, ... )
+{
+if( !is.null(by) )
+  {
+  if( !(by %in% names(Lx)) ) stop( "'by' must be the name of a variable in Lx" )
+  if( !is.factor(Lx[,by]) ) Lx[,by] <- factor(Lx[,by])
+  }
+if( is.null(by) )
+   length( unique( Lx$lex.id ) )
+else
+   sapply( split( Lx, Lx[,by] ), nid.Lexis )
+}
+
+# Make a bootstrap sample of a Lexis object:
+# Sample the *persons* with replacement, possibly sampling within levels of by=
+bootLexis <-
+function( Lx,
+        size = NULL,
+          by = NULL )
+{
+if( !inherits( Lx, "Lexis" ) ) stop("Only meaningful for Lexis objects.")
+
+isDT <- inherits( Lx, "data.table" )
+if( isDT ) class( Lx ) <- c("Lexis","data.frame")
+
+# determine size of the bootstrap samples if not given
+if( is.null( size ) ) size <- nid.Lexis( Lx, by = by )
+
+# allowing for a length 1 x-vector
+REsample <- function(x,sz) x[sample.int(length(x),size=sz,replace=TRUE)]
+
+if( is.null(by) ) { # Simple bootstrap
+  bLx <- subid.Lexis( Lx, REsample( unique(Lx$lex.id), size ) )
+} else { # Bootstrap by groups
+  bLx <- NULL
+  spL <- split( Lx, Lx[,by] )
+  for( i in 1:length(spL) )
+     {
+     bLx <- rbind( bLx,
+                   cbind( bootLexis( spL[[i]], size = size[i] ),
+                          bgr = paste(i) ) )
+     }
+  bLx$lex.id <- as.integer( interaction(bLx$lex.id,bLx$bgr) )
+  bLx <- bLx[,-grep("bgr",names(bLx))]
+  }
+# return the result after converting to data.table if needed
+if( isDT ) class( bLx ) <- c("Lexis","data.table","data.frame")
+bLx
+}
+
+# A utility function that returns a Lexis object subsetted to a set of
+# lex.ids, allowing for repeat values of lex.id
+subid.Lexis <-
+function( Lx, ids )
+{
+tt <- table( ids )
+bLx <- NULL
+max.id <- 0
+for( j in 1:max(tt) )
+   {
+   # avoid note about no visible binding
+   wh <- NULL
+   lex.id <- NULL
+   # who appears at least j times in the sample ?
+   wh <- names(tt[tt>=j])
+   sb <- subset( Lx, lex.id %in% wh )
+   # remember their original id
+   sb$old.id <- sb$lex.id
+   # to assure that different samples of the same person have different lex.id
+   sb$lex.id <- sb$lex.id + max.id
+   max.id <- max( sb$lex.id )
+   bLx <- rbind( bLx, sb )
+   }
+# Generate new lex.ids in the order 1:N
+bLx$lex.id <- as.integer( factor(bLx$lex.id) )
+bLx
+}
diff -Nru r-cran-epi-2.19/R/ci.cum.R r-cran-epi-2.30/R/ci.cum.R
--- r-cran-epi-2.19/R/ci.cum.R	2015-07-23 11:00:52.000000000 +0000
+++ r-cran-epi-2.30/R/ci.cum.R	2017-12-08 07:18:22.000000000 +0000
@@ -20,12 +20,12 @@
 # the coefficients vector in case of (extrinsic) aliasing.
 if( any( is.na( cf ) ) )
   {
-vM <- matrix( 0, length( cf ), length( cf ) )
-dimnames( vM ) <- list( names( cf ), names( cf ) )
-vM[!is.na(cf),!is.na(cf)] <- vcv
-cf[is.na(cf)] <- 0
-vcv <- vM
-  }
+  vM <- matrix( 0, length( cf ), length( cf ) )
+  dimnames( vM ) <- list( names( cf ), names( cf ) )
+  vM[!is.na(cf),!is.na(cf)] <- vcv
+  cf[is.na(cf)] <- 0
+  vcv <- vM
+  }
 if( is.character( subset ) )
   {
  sb <- numeric(0)
diff -Nru r-cran-epi-2.19/R/ci.lin.R r-cran-epi-2.30/R/ci.lin.R
--- r-cran-epi-2.19/R/ci.lin.R	2017-04-13 13:42:23.000000000 +0000
+++ r-cran-epi-2.30/R/ci.lin.R	2018-04-26 13:28:51.000000000 +0000
@@ -1,9 +1,10 @@
-# The coef() methods in nlme and lme4 do something different
-# so we make a workaround by specifying our own generic methods
+# The coef() methods in nlme and lme4 do something different,
+# other objects do not even have coef or vcov methods defined,
+# so we make a workaround by specifying our own generic methods:
 COEF <- function( x, ... ) UseMethod("COEF")
 COEF.default <- function( x, ... ) coef( x, ... )
 VCOV <- function( x, ... ) UseMethod("VCOV")
-VCOV.default <- function( x, ... ) vcov( x, ... )
+VCOV.default <- function( x, ... ) vcov( x, complete=FALSE, ... )
 # Then we can get from these methods what we want from lme, mer etc.
 COEF.lme <- function( x, ... ) nlme::fixed.effects( x )
@@ -25,11 +26,69 @@
 VCOV.gnlm <- function( object, ... ) object$cov
 VCOV.rq <- function( object, ... ) summary(object, cov=TRUE)$cov
 
+df2ctr <-
+function( obj, nd )
+  {
+# Factors in the prediction frame must have more than one level, which
+# they typically do not have in the specification, so we find the
+# factors in the prediction frame and expand levels to the complete
+# set of levels which should secure the working of model.matrix()
+dcl <- attr( obj$terms, "dataClasses" )
+whf <- ( dcl == "factor" )
+if( any(whf) ) for( fn in names(dcl)[which(whf)] )
+   nd[,fn] <- factor( nd[,fn], levels=obj$xlevels[[fn]] )
+# The contrast matrix - differ a bit between glm and gam
+if( inherits(obj,"gam") ) model.matrix( obj , newdata=nd )
+                     else model.matrix( formula(obj)[-2], data=nd )
+  }
+
+ci.dfr <-
+function( obj, ndx, ndr,
+          xvars = NULL,
+           vcov = FALSE,
+          alpha = 0.05,
+            Exp = FALSE,
+         sample = FALSE )
+{
+if( nrow(ndr)==1 ) ndr <- ndr[rep(1,nrow(ndx)),,drop=FALSE]
+if( ( ( nrow(ndx) != nrow(ndr)) ) |
+    ( any(names(ndx) != names(ndr)) ) )
+    stop("\nThe two prediction frames must have same dimensions and column names:",
+         "but dimensions are: ", dim(ndx), " and ", dim(ndr), "\n",
+         "and column names are:\n",
+         "exp: ", names(ndx), "\n",
+         "ref: ", names(ndr), "\n")
+# Now supply and fix those variables that are needed in order to get model.matrix working:
+# Supplied variable names:
+ cols <- names( ndx )
+# Factors in model; which are supplied; derive names of omitted factors (ofacs)
+ facs <- names( obj$xlevels )
+ofacs <- setdiff( facs, cols )
+# omitted *variables* must be supplied
+ovars <- setdiff( xvars, facs )
+# Construct the extra columns with bogus data (their contribution will be null)
+xcols <- ndx[,NULL]
+if( length(ofacs) > 0 ) for( fn in ofacs ) xcols[,fn] <- obj$xlevels[[fn]][1]
+if( length(ovars) > 0 ) for( vn in ovars ) xcols[,vn] <- 1
+if( dim(xcols)[2]>0 )
+  {
+  ndx <- cbind( ndx, xcols )
+  ndr <- cbind( ndr, xcols )
+  }
+ci.lin( obj,
+        ctr.mat = df2ctr( obj, ndx ) - df2ctr( obj, ndr ),
+           vcov = vcov,
+          alpha = alpha,
+            Exp = Exp,
+         sample = sample )
+}
+
 ci.lin <-
 function( obj,
   ctr.mat = NULL,
    subset = NULL,
    subint = NULL,
+    xvars = NULL,
     diffs = FALSE,
      fnam = !diffs,
      vcov = FALSE,
@@ -38,6 +97,23 @@
       Exp = FALSE,
    sample = FALSE )
 {
+# If ctr.mat is a data frame, call df2ctr
+if( inherits( ctr.mat, "data.frame" ) ) ctr.mat <- df2ctr( obj, ctr.mat )
+
+# If ctr.mat is a list of two dataframes then call ci.dfr
+if( inherits( ctr.mat, "list" ) )
+  {
+  if( !inherits( ctr.mat[[1]], "data.frame" ) |
+      !inherits( ctr.mat[[2]], "data.frame" ) )
+      stop( "If ctr.mat is a list it must be a list of two data frames" )
+  return( ci.dfr( obj, ctr.mat[[1]], ctr.mat[[2]],
+                  xvars = xvars,
+                   vcov = vcov,
+                  alpha = alpha,
+                    Exp = Exp,
+                 sample = sample ) )
+  }
+
 # First extract all the coefficients and the variance-covariance matrix
 cf <- COEF( obj )
 vcv <- VCOV( obj )
@@ -45,25 +121,28 @@
 # the coefficients vector in case of (extrinsic) aliasing.
 if( any( is.na( cf ) ) )
   {
-if( inherits( obj, c("coxph") ) )
-  { # aliased parameters are only NAs in coef, but omitted from vcov
-  wh <- !is.na(cf)
-  cf <- cf[wh]
-  vcv <- vcv[wh,wh]
-  }
-else
-if( inherits( obj, c("clogistic") ) )
-  {
-  cf[is.na(cf)] <- 0
-  }
-else
-  {
-vM <- matrix( 0, length( cf ), length( cf ) )
-dimnames( vM ) <- list( names( cf ), names( cf ) )
-vM[!is.na(cf),!is.na(cf)] <- vcv
-cf[is.na(cf)] <- 0
-vcv <- vM
-  }
+  if( inherits( obj, c("coxph") ) )
+    { # aliased parameters are only NAs in coef, but omitted from vcov
+    wh <- !is.na(cf)
+    cf <- cf[wh]
+    vcv <- vcv[wh,wh]
+    }
+  else
+    {
+    if( inherits( obj, c("clogistic") ) )
+      {
+      cf[is.na(cf)] <- 0
+      }
+    else
+      {
+      vM <- matrix( 0, length( cf ), length( cf ) )
+      dimnames( vM ) <- list( names( cf ), names( cf ) )
+      vM[!is.na(cf),!is.na(cf)] <- vcv
+      # vM <- vcv
+      cf[is.na(cf)] <- 0
+      vcv <- vM
+      }
+    }
  }
 
 # Function for computing a contrast matrix for all possible
@@ -104,7 +183,7 @@
 cm[cbind(1:nr,ctr[,2])] <- -1
 rownames( cm ) <- rn
 cm
-# end of function for all differences
+# end of the function all.dif for all differences
 }
 
 # Were all differences requested?
@@ -127,7 +206,7 @@
 # Get the relevant subset, and stick in 0s for NAs
 cf <- coef( obj )[wh]
 cf[is.na( cf )] <- 0
-vcv <- vcov( obj )[wh,wh]
+vcv <- vcov( obj, complete=FALSE )[wh,wh]
 vcv[is.na( vcv )] <- 0
 names( cf ) <- rownames( vcv ) <- colnames( vcv ) <-
 paste( subset, ": ", fn, sep="" )
@@ -175,9 +254,9 @@
 # Finally, here is the actual computation
 if( sample )
   {
-  # mvrnorm() returns a vector if sample=1, otherwise a sample x
+  # mvrnorm() returns a vector if sample=1, otherwise a sample by
   # length(cf) matrix - hence the rbind so we always get a row
-  # matrix and res then becomes an nrow(ctr.mat) x sample matrix
+  # matrix and res then becomes an nrow(ctr.mat) by sample matrix
  res <- ctr.mat %*% t( rbind(mvrnorm( sample, cf, vcv )) )
   }
 else
@@ -198,7 +277,7 @@
   }
 # Return the requested structure
 if( sample ) invisible( res ) else
-if( vcov ) invisible( list( est=ct, vcov=vc ) ) else res
+if( vcov ) invisible( list( coef=ct[,1], vcov=vc ) ) else res
 }
 
 # Handy wrapper
@@ -211,7 +290,7 @@
 ci.lin( ..., Exp=FALSE )[,if(pval) c(1,5,6,4) else c(1,5,6),drop=FALSE]
 }
 
-# Wrapper for predict.glm to give estimates and confidnece intervals
+# Wrapper for predict.glm to give estimates and confidence intervals
 ci.pred <-
 function( obj, newdata,
            Exp = NULL,
@@ -229,6 +308,9 @@
   }
 }
 
+# Function to calculate RR with CIs from independent rates with CIs;
+# r1 and r2 are assumed to be vectors or 2 or 3-column matrices with
+# rate, lower and upper confidence limits respectively.
 ci.ratio <-
 function( r1, r2,
          se1 = NULL, # standard error of rt1
@@ -237,14 +319,10 @@
        alpha = 0.05,
         pval = FALSE )
 {
-# Function to calculate RR with CIs from independent rates with CIs;
-# r1 and r2 are assumed to be vectors or 2 or 3-column matrices with
-# rate, lower and upper confidence limits repectively.
-
 if( is.matrix(r1) & !is.null(se1) ) warning("r1 is matrix, se1 is ignored")
 if( is.matrix(r2) & !is.null(se2) ) warning("r2 is matrix, se2 is ignored")
 
-# if supplied as 1-column matrix chnage to vector
+# if supplied as 1-column matrix change to vector
 if( is.matrix(r1) ) if( ncol(r1)==1 ) r1 <- as.vector( r1 )
 if( is.matrix(r2) ) if( ncol(r2)==1 ) r2 <- as.vector( r2 )
diff -Nru r-cran-epi-2.19/R/cutLexis.R r-cran-epi-2.30/R/cutLexis.R
--- r-cran-epi-2.19/R/cutLexis.R	2015-05-27 15:26:55.000000000 +0000
+++ r-cran-epi-2.30/R/cutLexis.R	2017-10-29 13:39:52.000000000 +0000
@@ -168,7 +168,7 @@
     ## If we want to add a new timescale, construct the name
     if( is.logical(new.scale) )
       {
-      if( new.scale ) scale.name <- paste( new.state[1], "dur", sep="." )
+      if( new.scale ) scale.name <- paste( "tf", new.state[1], sep="" )
       }
     else
       {
       scale.name <- new.scale
diff -Nru r-cran-epi-2.19/R/detrend.R r-cran-epi-2.30/R/detrend.R
--- r-cran-epi-2.19/R/detrend.R	2015-05-27 15:26:55.000000000 +0000
+++ r-cran-epi-2.30/R/detrend.R	2018-05-13 18:40:54.000000000 +0000
@@ -11,3 +11,17 @@
 # Now detrend the matrix using the weighted inner product.
 Thin.col( projection.ip( cbind( 1, t ), M , orth = TRUE, weight = weight ) )
 }
+
+decurve <-
+function( M, t, weight=rep(1,nrow(M)) )
+{
+Thin.col <-
+function ( X, tol = 1e-06)
+# Function to remove lin. dep. columns from a matrix
+{
+  QR <- qr(X, tol = tol, LAPACK = FALSE)
+  X[, QR$pivot[seq(length = QR$rank)], drop = FALSE]
+}
+# Now detrend and -curv the matrix using the weighted inner product.
+  Thin.col( projection.ip( cbind( 1, t, t^2 ), M , orth = TRUE, weight = weight ) )
+}
diff -Nru r-cran-epi-2.19/R/effx.r r-cran-epi-2.30/R/effx.r
--- r-cran-epi-2.19/R/effx.r	2015-05-27 15:26:55.000000000 +0000
+++ r-cran-epi-2.30/R/effx.r	2018-02-07 11:20:49.000000000 +0000
@@ -32,12 +32,6 @@
     if(!missing(control)) {
         control.arg <- substitute(control)
-        if (length(control.arg) > 1) {
-            control.names <- sapply(control.arg, deparse)[-1]
-        }
-        else {
-            control.names <- deparse(control.arg)
-        }
     }
 
     ## Match the type argument
@@ -118,12 +112,22 @@
     ## Fix up the control argument as a named list
     if (!missing(control)) {
-        if (is.list(control)) {
-            names(control) <- control.names
+        if (is.list(control)) {
+            control.names <- sapply(control.arg, deparse)
+            if (control.names[1] == "list" &&
+                length(control.names) == length(control) + 1)
+            {
+                control.names <- control.names[-1]
+            }
+            else {
+                control.names <-
+                    paste0(deparse(control.arg), "[", 1:length(control), "]")
+            }
+            names(control) <- control.names
         }
         else {
-            control <- list(control)
-            names(control) <- control.names
+            control <- list(control)
+            names(control) <- deparse(control.arg)
         }
     }
diff -Nru r-cran-epi-2.19/R/foreign.R r-cran-epi-2.30/R/foreign.R
--- r-cran-epi-2.19/R/foreign.R	2015-12-30 13:34:36.000000000 +0000
+++ r-cran-epi-2.30/R/foreign.R	2018-02-11 06:31:05.000000000 +0000
@@ -25,35 +25,29 @@
 }
 
 # The etm method
-etm <- function (obj, ...) UseMethod("etm")
-
-etm.data.frame <-
-function (obj, ...)
-{
-etm::etm( data=obj, ... )
-}
-
+etm <- function (data, ...) UseMethod("etm")
+
 etm.Lexis <-
-function( obj,
-          time.scale = timeScales(obj)[1],
-           cens.name = "cens",
-                   s = 0,
-                   t = "last",
-          covariance = TRUE,
-            delta.na = TRUE,
-                 ...
-          )
+function( data,
+          time.scale = timeScales(data)[1],
+           cens.name = "cens",
+                   s = 0,
+                   t = "last",
+          covariance = TRUE,
+            delta.na = TRUE,
+                 ...
+          )
 {
-dfr <- data.frame(    id = obj$lex.id,
-                    from = as.character(obj$lex.Cst),
-                      to = as.character(obj$lex.Xst),
-                   entry = obj[,time.scale],
-                    exit = obj[,time.scale] + obj$lex.dur,
+dfr <- data.frame(    id = data$lex.id,
+                    from = as.character(data$lex.Cst),
+                      to = as.character(data$lex.Xst),
+                   entry = data[,time.scale],
+                    exit = data[,time.scale] + data$lex.dur,
         stringsAsFactors = FALSE )
 dfr$to <- with( dfr, ifelse( from==to, cens.name, to ) )
 etm::etm( data = dfr,
-          state.names = levels( obj$lex.Cst ),
-                  tra = tmat(obj,mode="logical"),
+          state.names = levels( data$lex.Cst ),
+                  tra = tmat(data,mode="logical"),
            cens.name = cens.name,
                    s = s,
                    t = t,
diff -Nru r-cran-epi-2.19/R/LCa.fit.R r-cran-epi-2.30/R/LCa.fit.R
--- r-cran-epi-2.19/R/LCa.fit.R	2017-08-08 12:46:58.000000000 +0000
+++ r-cran-epi-2.30/R/LCa.fit.R	2017-12-07 07:44:36.000000000 +0000
@@ -94,16 +94,17 @@
 commence <- Sys.time()
 
 # Matrices to extract the age-interaction terms at reference points
-Mp <- Ns( rep(pi.ref,length(A)), knots=pi.kn, intercept=TRUE )
-Mc <- Ns( rep(ci.ref,length(A)), knots=ci.kn, intercept=TRUE )
+Ap <- Ns( rep(pi.ref,length(A)), knots=pi.kn, intercept=TRUE )
+Ac <- Ns( rep(ci.ref,length(A)), knots=ci.kn, intercept=TRUE )
 
 # Current age-effects (in the iteration these will be term predictions)
-ba <- cbind( rep(1,length(A)), 1 )
+ba <- matrix( 1, length(A), 2 ) # cbind( rep(1,length(A)), 1 )
+
 # set to 0 if term is not in model at all
 if( !mainP ) ba[,1] <- 0
 if( !mainC ) ba[,2] <- 0
-
-# Main effects model with (at least one) age-interactions
+# Main effects model with (at least one) age-interaction
+# --- at this stage it is either 0 or 1
 mat <- glm( D ~ -1 + Ns( A  , knots=a.kn, intercept=TRUE ) +
                      Ns( P  , knots=p.kn, ref=p.ref ):ba[,1] +
                      Ns( P-A, knots=c.kn, ref=c.ref ):ba[,2],
@@ -112,7 +113,7 @@
 oldmb <- oldmat <- mat$deviance
 
 # Terms prediction --- three terms here.
-# No need to divide by the ba, it is either 1 or 0
+# No need to divide by the ba at this point, it is either 1 or 0
 pat <- predict( mat, type="terms" )
 
 # iteration counter and continuation indicator
@@ -127,25 +128,29 @@
 {
 nit <- nit+1
 
-# Terms with main effects should be either in interaction or offset,
-# so one of these should always be 0
+# The estimated terms from the modeling of the APC-effects to be used
+# as offsets
+Aoff <- pat[,1]
 Pint <- Poff <- pat[,2]
 Cint <- Coff <- pat[,3]
+# P or C terms with main effects should be either in interaction or
+# offset, so one of these should always be 0
 if( intP ) Poff <- Poff*0 else Pint <- Pint*0
 if( intC ) Coff <- Coff*0 else Cint <- Cint*0
 
-# Iteration of the age-interaction
+# Iteration of the age-components of the interaction
 mb <- glm( D ~ -1 + Ns( A, knots=pi.kn, intercept=TRUE ):Pint +
                     Ns( A, knots=ci.kn, intercept=TRUE ):Cint,
-           offset = pat[,1] + Poff + Coff + log(Y),
+           offset = Aoff + Poff + Coff + log(Y),
           family = poisson )
 
 # Get the age-interaction terms only, and if one is not needed set to 0
-ba <- predict( mb, type="terms" ) / cbind(Pint,Cint) /
-      cbind( ci.lin( mb, subset="pi.kn", ctr.mat=Mp)[,1],
-             ci.lin( mb, subset="ci.kn", ctr.mat=Mc)[,1] )
+ba <- predict( mb, type="terms" ) /
+      cbind(Pint,Cint) /
+      cbind( ci.lin( mb, subset="pi.kn", ctr.mat=Ap)[,1],  # These are the values at the reference
+             ci.lin( mb, subset="ci.kn", ctr.mat=Ac)[,1] ) # point for A; we want the RRs as effects
 ba[is.na(ba)] <- 0
-# If no interaction only main should be fitted; if no main effect, set to 0
+# If no interaction only main should be fitted; if no main effect, set to 0 using mainP/C
 if( !intP ) ba[,1] <- rep(1,length(A)) * mainP
 if( !intC ) ba[,2] <- rep(1,length(A)) * mainC
 
 # apc model with assumed known interactions with age
@@ -155,16 +160,15 @@
           offset = log(Y),
          family = poisson )
 
-# extract age and period terms
+# extract age and period terms - removing the interactions
 pat <- predict( mat, type="terms" ) /
        cbind( 1, ba )
 pat[is.na(pat)] <- 0
 
-# convergence? Check bot that the two models give the same deviance
+# convergence? Check both that the two models give the same deviance
+# and that the change in each is small
 newmat <- mat$deviance
 newmb  <- mb$deviance
-conv <- ( reldif <- max( (abs(newmat-newmb)/
-                             (newmat+newmb)*2),
+conv <- ( reldif <- max( (abs(newmat-newmb)/(newmat+newmb)/2),
                          (oldmat-newmat)/newmat,
                          (oldmb -newmb )/newmb ) ) < eps
 one.more <- ( !conv & ( nit < maxit ) )
@@ -516,10 +520,10 @@
                mu = object$coef,
             Sigma = object$vcov )
  na <- ncol( Ma )
+ np <- ncol( Mp )
+ nc <- ncol( Mc )
 npi <- ncol( Mpi )
- np <- ncol( Mp )
 nci <- ncol( Mci )
- nc <- ncol( Mc )
 # Compute the linear predictor in each of the simulated samples
 # period and cohort effects if not in the model
 kp <- kc <- rep( 0, nrow(newdata) )
@@ -542,5 +546,3 @@
   }
 else return( pr0 )
 }
-
-
diff -Nru r-cran-epi-2.19/R/lexis.R r-cran-epi-2.30/R/lexis.R
--- r-cran-epi-2.19/R/lexis.R	2017-06-29 22:13:57.000000000 +0000
+++ r-cran-epi-2.30/R/lexis.R	2018-03-09 08:57:12.000000000 +0000
@@ -40,9 +40,14 @@
     if( only.exit )
       {
      if( is.logical( exit.status ) )
+        {
        entry.status <- FALSE
+        cat("NOTE: entry.status has been set to FALSE for all.\n" )
+        }
      if( is.character( exit.status ) )
+        {
        exit.status <- factor( exit.status )
+        }
      if( is.factor( exit.status ) )
        {
        entry.status <- factor( rep( levels(exit.status)[1],
@@ -54,7 +59,10 @@
                "for all.\n" )
        }
      if( is.numeric( exit.status ) )
+        {
        entry.status <- rep( 0, length( exit.status ) )
+        cat("NOTE: entry.status has been set to 0 for all.\n" )
+        }
      }
 
     ## Convert character states to factors
@@ -183,7 +191,7 @@
 #        stop("Duration must be non-negative")
 #    }
 
-    ## Make sure id value - if supplied - is valid. Otherwise supply default id
+    ## Make sure id values - if supplied - are valid. Otherwise supply default id
     if (missing(id)) {
        id <- 1:nrow(entry)
@@ -197,10 +205,10 @@
     ## variables Use the prefix "lex." for the names of reserved
     ## variables.
     if( is.data.frame( duration ) ) duration <- duration[,1]
-    lex <- data.frame(entry, "lex.dur" = duration,
-                             "lex.Cst" = entry.status,
-                             "lex.Xst" = exit.status,
-                             "lex.id" = id )
+    lex <- data.frame( entry, "lex.dur" = duration,
+                              "lex.Cst" = entry.status,
+                              "lex.Xst" = exit.status,
+                              "lex.id" = id )
 
 #### Addition by BxC --- support for states as factors
     # Convert states to factors if states are given
@@ -216,19 +224,20 @@
     if (!missing(data) && merge) {
        duplicate.names <- intersect(names(lex), names(data))
        if (length(duplicate.names) > 0) {
-            stop("Cannot merge data with duplicate names")
+            stop( "Cannot merge data with duplicate names:",
+                  paste(duplicate.names,collapse=" ") )
        }
        lex <- cbind(lex, data)
     }
 
-    ## Drop rows with short or negantive duration for consistency with splitLexis
+    ## Drop rows with short or negative duration for consistency with splitLexis
     short.dur <- lex$lex.dur <= tol
     if ( any(short.dur) )
       {
       warning("Dropping ", sum(short.dur),
               " rows with duration of follow up < tol\n",
 if( keep.dropped ) " The dropped rows are in the attribute 'dropped'\n",
 if( keep.dropped ) " To see them type attr(Obj,'dropped'),\n",
-if( keep.dropped ) " to get rid of them type attr(Obj,'dropped') <- NULL\n",
+if( keep.dropped ) " to get rid of them type: attr(Obj,'dropped') <- NULL\n",
 if( keep.dropped ) " - where 'Obj' is the name of your Lexis object" )
       lex <- subset(lex, !short.dur)
       if( keep.dropped ) attr(lex,"dropped") <- subset(data, short.dur)
@@ -542,25 +551,6 @@
 y
 }
 
-# `[.Lexis` <-
-# function( x, ... )
-# {
-# structure( NextMethod(),
-#                breaks = attr(x, "breaks"),
-#           time.scales = attr(x, "time.scales"),
-#            time.since = attr(x, "time.since") )
-# }
-
-merge.data.frame <- function(x, y, ...)
-{
-    if (is.Lexis(x))
-        merge.Lexis(x, y, ...)
-    else if (is.Lexis(y))
-        merge.Lexis(y, x, ...)
-    else
-        base::merge.data.frame(x, y, ...)
-}
-
 merge.Lexis <- function(x, y, id, by, ...)
 {
     if (!missing(id)) {
diff -Nru r-cran-epi-2.19/R/matshade.R r-cran-epi-2.30/R/matshade.R
--- r-cran-epi-2.19/R/matshade.R	1970-01-01 00:00:00.000000000 +0000
+++ r-cran-epi-2.30/R/matshade.R	2018-05-14 07:47:02.000000000 +0000
@@ -0,0 +1,29 @@
+matshade <-
+function( x, y,
+         lty = 1,
+         col = 1:(ncol(y)/3), col.shade=col, alpha=0.15,
+        plot = dev.cur()==1,
+         ... )
+  {
+# check sanity of x and y
+if( !is.vector(x) ) stop( "x must be a vector\n")
+if( ncol(y)%%3 != 0 ) warning( "number of columns in y, ", ncol(y),
+                               " is not divisible by 3\n")
+# create a new plot?
+if( plot ) matplot( x, y, col="transparent", ... )
+# number of curves to draw
+ncrv <- ncol(y) %/% 3
+# The recycling rule for colors and alpha:
+col       <- rep(col      ,ncrv)[1:ncrv]
+col.shade <- rep(col.shade,ncrv)[1:ncrv]
+alpha     <- rep(alpha    ,ncrv)[1:ncrv]
+# First shaded areas
+for( i in 1:ncrv ) polygon( c( x        ,rev(x)        ),
+                            c( y[,i*3-1],rev(y[,i*3-0]) ),
+                            col = adjustcolor( col.shade[i],
+                                               alpha.f=alpha[i] ),
+                            pch = NULL,
+                            border = "transparent" )
+# then curves on top of these
+matlines( x, y[,(1:ncrv)*3-2], col=col, lty=lty, ... )
+  }
diff -Nru r-cran-epi-2.19/R/mcutLexis.R r-cran-epi-2.30/R/mcutLexis.R
--- r-cran-epi-2.19/R/mcutLexis.R	2017-04-03 21:53:00.000000000 +0000
+++ r-cran-epi-2.30/R/mcutLexis.R	2017-10-23 12:50:57.000000000 +0000
@@ -112,16 +112,25 @@
   }
 # end of for loop through sequences (sq)
 
-# Do we want the sequences or just the unordered set of previous events
-if( !seq.states )
+# Do we want the sequences, the unordered set of previous events or
+# just the current one:
+old.seq <- seq.states
+if( is.logical(seq.states) ) seq.states <- ifelse( seq.states, "s", "u" )
+if( is.character(seq.states) ) seq.states <- tolower( substr(seq.states,1,1) )
+if( !(seq.states %in% c("s","o","u","l","c")) )
+   stop( "What do you mean by seq.states=", old.seq,
+         "? - it should abbreviate to one of s, o, u, l or c \n")
+# Unordered or last (current) states
+if( seq.states %in% c("u","l","c") )
   {
-  lvl <- levels( Lcut )
-  ulvl <- sapply( lapply( strsplit(lvl,"-"),
-                          sort ),
-                  paste,
-                  collapse="+" )
+  # Each list element is a vector of states visited
+  slvl <- strsplit( levels( Lcut ), "-" )
+  # merge those that have the same elements or take the last
+  rlvl <- if( seq.states=="u" ) { sapply( lapply( slvl, sort ), paste, collapse="+" )
+          } else sapply( slvl, function(x) x[length(x)] )
+  # Relevel the states
  levels( Lcut$lex.Cst ) <-
-  levels( Lcut$lex.Xst ) <- ulvl
+  levels( Lcut$lex.Xst ) <- rlvl
  }
 
 # Did we ask for timescales as time since events?
diff -Nru r-cran-epi-2.19/R/N2Y.r r-cran-epi-2.30/R/N2Y.r
--- r-cran-epi-2.19/R/N2Y.r	2015-05-27 15:26:55.000000000 +0000
+++ r-cran-epi-2.30/R/N2Y.r	2018-02-28 01:13:22.000000000 +0000
@@ -25,19 +25,19 @@
 Ntab <- xtabs( N ~ A + P )
 # Devise a table for the risk times
 Ydim <- c(dimnames(Ntab),list(wh=c("lo","up")))
-# note one less age and period category
-Ytab <- array( NA, dim=sapply(Ydim,length),
-                   dimnames = Ydim )[-dim(Ntab)[1],-dim(Ntab)[2],]
-# How manu age and period classes
+# Note: one less period category
+Ytab <- NArray( Ydim )[,-dim(Ntab)[2],]
+# How many age and period classes
 na <- nrow(Ytab)
 np <- ncol(Ytab)
 for(a in 1:na) for(p in 1:np)
    {
-              Ytab[a,p,"up"] <- Ntab[a  ,p]/3 + Ntab[a+1,p+1]/6
-   if( a > 1) Ytab[a,p,"lo"] <- Ntab[a-1,p]/6 + Ntab[a  ,p+1]/3
-   else       Ytab[a,p,"lo"] <- Ntab[a  ,p]/2 + Ntab[a  ,p+1]/2 - Ytab[a,p,"up"]
+if( a <  na ) Ytab[a,p,"up"] <- Ntab[a  ,p]/3 + Ntab[a+1,p+1]/6
+if( a >  1  ) Ytab[a,p,"lo"] <- Ntab[a-1,p]/6 + Ntab[a  ,p+1]/3
+if( a == 1  ) Ytab[a,p,"lo"] <- Ntab[a  ,p]/2 + Ntab[a  ,p+1]/2 - Ytab[a,p,"up"]
+if( a == na ) Ytab[a,p,"up"] <- Ntab[a  ,p]/2 + Ntab[a  ,p+1]/2 - Ytab[a,p,"lo"]
    }
-# Remember to check the follow-up time
+# Remember to multiply to get the follow-up time
 Ytab <- Ytab * A.int
 # Convert to a data frame if required (the default)
 if( return.dfr )
diff -Nru r-cran-epi-2.19/R/Ns.r r-cran-epi-2.30/R/Ns.r
--- r-cran-epi-2.19/R/Ns.r	2017-04-19 14:55:13.000000000 +0000
+++ r-cran-epi-2.30/R/Ns.r	2018-03-14 13:33:56.000000000 +0000
@@ -11,6 +11,8 @@
     if(!missing(Boundary.knots))
       {
+      if( length(Boundary.knots)!=2 )
+        stop( "Argument 'Boundary.knots' must, when provided, have length 2" )
      Boundary.knots <- sort(Boundary.knots)
      outside <- (ol <- x < Boundary.knots[1L]) | (or <- x > Boundary.knots[2L])
      }
@@ -62,7 +64,7 @@
 if( !is.logical(fixsl) ) warning( "fixsl elements must be of mode logical" )
 # Only the 4th parameter affected, should be either 1 or 2 in the two positions
 const <- splines::spline.des( Aknots, Boundary.knots, 4, c(2-fixsl[1],2-fixsl[2]) )$design
-
+
     if(!intercept) {
        const <- const[, -1 , drop = FALSE]
        basis <- basis[, -1 , drop = FALSE]
@@ -100,59 +102,64 @@
           fixsl = c(FALSE,FALSE),
         detrend = FALSE )
 {
-  ## Check sensibility of arguments
-  if( !is.null(ref) ) {
-    if( !is.vector(ref) )
-      stop( "Argument 'ref' must be a scalar, but it is a ", class(ref), "." )
-    if( is.vector(ref) & length(ref)>1 )
-      stop( "Argument 'ref' must be a scalar, but has length ", length(ref), "." )
-    if( intercept ) {
-      warning( "ref= specified, hence intercept=TRUE is ignored")
-      intercept <- FALSE
-    }
-  }
-  ## Detrending required?
-  if( any(detrend>0) ) { # covers both logical and vector
-    if( any(detrend<0) )
-      stop( "Some elements of weight are <0, e.g. no",
-            (ww <- which(detrend<0))[1:min(5,length(ww))], "." )
-    if( !(length(detrend) %in% c(1,length(x))) ) {
-      warning( "Weights in inner product diagonal matrix set to 1")
-      weight <- rep(1,length(x))
-    }
-    else weight <- if( is.numeric(detrend) ) detrend else rep(1,length(x))
-    detrend <- TRUE
-  }
-  if( detrend & intercept ) {
-    warning( "detrend= specified, hence intercept=TRUE is ignored")
-    intercept <- FALSE
-  }
-  if( detrend & any(!is.na(fixsl)) ) {
-    warning( "detrend= specified, hence fixsl argument is ignored")
-    fixsl=c(FALSE,FALSE)
-  }
-  ## Here is the specification of the spline basis
-  ## df= specified
-  if( !is.null(df) )
-    MM <- ns.ld( x, df = df, intercept = (intercept & is.null(ref)), fixsl = fixsl )
-  else
+    ## Check sensibility of arguments
+    if( !is.null(ref) ) {
+        if( !is.vector(ref) )
+            stop( "Argument 'ref' must be a scalar, but it is a ", class(ref), "." )
+        if( is.vector(ref) & length(ref)>1 )
+            stop( "Argument 'ref' must be a scalar, but has length ", length(ref), "." )
+        if( intercept ) {
+            warning( "ref= specified, hence intercept=TRUE is ignored")
+            intercept <- FALSE
+        }
+    }
+    ## Detrending required?
+    if( any(detrend>0) ) { # covers both logical and vector
+        if( any(detrend<0) )
+            stop( "Some elements of weight are <0, e.g. no",
+                  (ww <- which(detrend<0))[1:min(5,length(ww))], "."
) + if( !(length(detrend) %in% c(1,length(x))) ) { + warning( "Weights in inner product diagonal matrix set to 1") + weight <- rep(1,length(x)) + } + else weight <- if( is.numeric(detrend) ) detrend else rep(1,length(x)) + detrend <- TRUE + } + if( detrend & intercept ) { + warning( "detrend= specified, hence intercept=TRUE is ignored") + intercept <- FALSE + } + if( detrend & any(fixsl) ) { + warning( "detrend= specified, hence fixsl argument is ignored") + fixsl=c(FALSE,FALSE) + } + ## Here is the specification of the spline basis + ## knots= specified - { - if( is.null( Boundary.knots ) ) - { - if( !is.null( knots ) ) - { - knots <- sort( unique( knots ) ) - ok <- c(1,length(knots)) - Boundary.knots <- knots[ok] - knots <- knots[-ok] - } - } - MM <- ns.ld( x, knots = knots, - Boundary.knots = Boundary.knots, - intercept = (intercept & is.null(ref)), - fixsl = fixsl ) - } + if( !is.null( knots ) ) + { + if( is.null( Boundary.knots ) ) + { + if( !is.null( knots ) ) + { + knots <- sort( unique( knots ) ) + ok <- c(1,length(knots)) + Boundary.knots <- knots[ok] + knots <- knots[-ok] + } + } + MM <- ns.ld( x, knots = knots, + Boundary.knots = Boundary.knots, + intercept = (intercept & is.null(ref)), + fixsl = fixsl ) + if( !is.null(df) ) cat("NOTE: Both knots= and df= specified, df ignored") + } + ## df= specified + if( is.null( knots ) & !is.null( df ) ) + MM <- ns.ld( x, df = df, + intercept = (intercept & is.null(ref)), + fixsl = fixsl ) + ## Reference point specified ? if( !is.null(ref) ) { @@ -161,7 +168,6 @@ Boundary.knots = attr(MM,"Boundary.knots"), fixsl = fixsl ) } - ## Detrending required ? 
if( detrend ) { diff -Nru r-cran-epi-2.19/R/Relevel.R r-cran-epi-2.30/R/Relevel.R --- r-cran-epi-2.19/R/Relevel.R 2015-05-27 15:26:55.000000000 +0000 +++ r-cran-epi-2.30/R/Relevel.R 2018-03-08 13:52:49.000000000 +0000 @@ -1,4 +1,3 @@ - # The levels method is already defined (in the utils package) # and hence imported in the NAMESPACE file levels.Lexis <- @@ -10,13 +9,66 @@ # The Relevel method Relevel <- function (x, ...) UseMethod("Relevel") +# Utility to group a factor from a 2-column table; +# called from Relevel if second argument is a matrix or table +tRelevel <- +function( ff, # factor to be grouped + gg, # 2-column matrix or table with levels resp. grouping + xlevels = TRUE, # include also grouped levels not present in ff + nogroup = TRUE ) # levels of ff not grouped are transferred (if FALSE set to NA) +{ +if( any( wh <- ( apply( table( gg[,1], gg[,2] )>0, 1, sum )>1 ) ) ) + stop( "\nFactor level", if(sum(wh)>1) "s", ": \n", + paste(names(wh)[wh],collapse=" / "), + "\nis grouped to more than one group.\n" ) +if( any( is.na( match( unique(as.character(ff)), + unique(as.character(gg[,1])) ) ) ) ) + cat( "Note: Some values of factor not among levels grouped.\n" ) +if( any( wh <- ( apply( table( gg[,1], gg[,2] ), 1, sum )>1 ) ) ) + warning( "Factor level", if(sum(wh)>1) "s", ": ", + paste(names(wh)[wh],collapse=" / "), + "\nappear more than once in the table of groupings." 
) +# indices and names of the original factor levels +ixff <- as.integer( ff ) +chff <- as.character( ff ) +# where they are in the translation table +ixg1 <- as.integer( factor( gg[,1], levels=levels(ff) ) ) +# indices of the new levels in the translation table +ixg2 <- as.integer( g2 <- factor(gg[,2]) ) +# remove unwanted NAs (levels in g[,1] not present in ff) +ixg2 <- ixg2[!is.na(ixg1)] +ixg1 <- ixg1[!is.na(ixg1)] +# where in ixg2 are the integers ixff - match(), choose those values +grff <- levels(g2)[ixg2[match(ixff,ixg1)]] +# transfer non-grouped levels of ff +if( nogroup ) grff[is.na(grff)] <- chff[is.na(grff)] +# a factor with the correct levels in the correct order +grff <- factor( grff, levels=union(levels(g2),levels(factor(grff))) ) +# keep all levels from second column or not? +if( xlevels ) grff else factor(grff) +} + # The factor method is the default method Relevel.default <- Relevel.factor <- - function( x, ref, first=TRUE, collapse="+", ... ) + function( x, ref, first=TRUE, collapse="+", + xlevels=TRUE, nogroup=TRUE, ... 
 ) { # Function that collapses multiple sets of levels of a factor # + # If ref is a 2-dim structure + if( is.matrix( ref) | + is.table( ref) | + is.array( ref) | + is.data.frame(ref) ) + { + if( length(dim(ref)) !=2 ) stop("ref must be 2-dimensional\n") + if( dim(ref)[2] < 2 ) stop("ref must have at least 2 columns\n") + return( tRelevel( x, ref, xlevels, nogroup ) ) + } else { + + # Otherwise use the old version + # if( !is.factor(x) ) { argnam <- deparse( substitute(x) ) @@ -36,7 +88,7 @@ if ( is.character( ref ) ) ref <- match(ref, lev) if ( any( is.na( ref ) ) ) - stop( "any values in ref must be an existing level" ) + stop( "any values in ref must be an existing level\n" ) nlev <- length( lev ) if ( any( ref < 1 ) || any( ref > nlev ) ) stop( paste( "ref=", paste( ref, collapse="," ), @@ -74,7 +126,7 @@ } else names( ref )[s] newnames[ref[[s]]] <- rep( uninames[s], length( ref[[s]] ) ) - } + } levels( fnew ) <- newnames if( !is.null( first ) ) { @@ -85,5 +137,6 @@ # This is in order to merge levels with identical names # - factor( fnew, levels=levels(fnew) ) + return( factor( fnew, levels=levels(fnew) ) ) + } } diff -Nru r-cran-epi-2.19/R/ROC.R r-cran-epi-2.30/R/ROC.R --- r-cran-epi-2.19/R/ROC.R 2015-05-27 15:26:55.000000000 +0000 +++ r-cran-epi-2.30/R/ROC.R 2017-08-27 21:21:48.000000000 +0000 @@ -98,11 +98,11 @@ resp <- eval( parse(text = deparse(form[[2]])), envir=lr$model ) Model.inf <- paste("Model: ",paste(paste(form)[c(2,1,3)], collapse=" ")) } -# Form the empirical distribution function for test for each of +# From the empirical distribution function for test for each of # the two categories of resp. # First a table of the test (continuous variable) vs. 
the response and -# adding a row of 0s so that we have all points fro the ROC curve +# adding a row of 0s so that we have all points for the ROC curve m <- as.matrix( base::table( switch( PS+1, test, lr$fit ), resp ) ) m <- addmargins( rbind( 0, m ), 2 ) # What values of test/eta do the rows refer to diff -Nru r-cran-epi-2.19/R/Wald.R r-cran-epi-2.30/R/Wald.R --- r-cran-epi-2.19/R/Wald.R 2015-05-27 15:26:55.000000000 +0000 +++ r-cran-epi-2.30/R/Wald.R 2018-03-05 01:43:45.000000000 +0000 @@ -2,11 +2,11 @@ function( obj, H0=0, ... ) { rl <- ci.lin( obj, ..., vcov=TRUE ) -beta <- rl$est +beta <- rl$coef vcov <- rl$vcov if( missing( H0 ) ) H0 <- beta*0 if( length(H0) != length(beta) ) stop( "H0 has length ", length(H0), - " but the set of selected paramteters has length ", + " but the set of selected parameters has length ", length(beta), ":\n", paste(round(beta,options()[["digits"]]),collapse=" ") ) chi <- t( beta-H0 ) %*% solve( vcov, beta-H0 ) diff -Nru r-cran-epi-2.19/vignettes/fixall r-cran-epi-2.30/vignettes/fixall --- r-cran-epi-2.19/vignettes/fixall 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/fixall 2018-05-01 13:15:57.000000000 +0000 @@ -0,0 +1,27 @@ +# Before this is run, do this: +# 1) run rw on simLexis and yll with eval=TRUE in the lines with +# CHANGE1 in order to get a version with properly formatted +# code. +# 2) Move the .pdfs to sL.pdf and yl.pdf, respectively. +# 3) Swap TRUE and FALSE in the CHANGE1/2 lines in the rnw files. 
+# Then we are ready for the following: + +rw flup +rt flup +bl flup +cp flup.R ../inst/doc +cp flup.pdf ../inst/doc + +rt simLexis +rw simLexis +bl simLexis +cp simLexis.R ../inst/doc/simLexis.R +cp sL.pdf ../inst/doc/simLexis.pdf + +rt yll +rw yll +bl yll +cp yll.R ../inst/doc/yll.R +cp yl.pdf ../inst/doc/yll.pdf + +klean \ No newline at end of file Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/flup-nic-box.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/flup-nic-box.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/flup-nicL1.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/flup-nicL1.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/flup-nicL2.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/flup-nicL2.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/flup.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/flup.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/flup-pr-a.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/flup-pr-a.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/flup-pr-at-af.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/flup-pr-at-af.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/flup-pr-at.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/flup-pr-at.pdf differ diff -Nru r-cran-epi-2.19/vignettes/flup.R r-cran-epi-2.30/vignettes/flup.R --- r-cran-epi-2.19/vignettes/flup.R 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/flup.R 2018-05-03 14:34:57.000000000 +0000 @@ -0,0 +1,324 @@ +### R code from vignette source 'flup' +### Encoding: UTF-8 + +################################################### +### code chunk number 1: flup.rnw:5-8 +################################################### +options( width=90, + SweaveHooks=list( fig=function() + 
par(mar=c(3,3,1,1),mgp=c(3,1,0)/1.6,las=1,bty="n") ) ) + + +################################################### +### code chunk number 2: flup.rnw:101-103 +################################################### +library(Epi) +print( sessionInfo(), l=F ) + + +################################################### +### code chunk number 3: flup.rnw:109-116 +################################################### +data( nickel ) +nicL <- Lexis( entry = list( per=agein+dob, + age=agein, + tfh=agein-age1st ), + exit = list( age=ageout ), + exit.status = ( icd %in% c(162,163) )*1, + data = nickel ) + + +################################################### +### code chunk number 4: flup.rnw:126-129 +################################################### +str( nickel ) +str( nicL ) +head( nicL ) + + +################################################### +### code chunk number 5: flup.rnw:138-139 +################################################### +summary( nicL ) + + +################################################### +### code chunk number 6: nicL1 +################################################### +plot( nicL ) + + +################################################### +### code chunk number 7: nicL2 +################################################### +par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 ) +plot( nicL, 1:2, lwd=1, col=c("blue","red")[(nicL$exp>0)+1], + grid=TRUE, lty.grid=1, col.grid=gray(0.7), + xlim=1900+c(0,90), xaxs="i", + ylim= 10+c(0,90), yaxs="i", las=1 ) +points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1], + col="lightgray", lwd=3, cex=1.5 ) +points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1], + col=c("blue","red")[(nicL$exp>0)+1], lwd=1, cex=1.5 ) + + +################################################### +### code chunk number 8: flup.rnw:193-196 +################################################### +nicS1 <- splitLexis( nicL, "age", breaks=seq(0,100,10) ) +summary( nicL ) +summary( nicS1 ) + + +################################################### +### code chunk number 9: 
flup.rnw:204-205 +################################################### +round( subset( nicS1, id %in% 8:10 ), 2 ) + + +################################################### +### code chunk number 10: flup.rnw:211-213 +################################################### +nicS2 <- splitLexis( nicS1, "tfh", breaks=c(0,1,5,10,20,30,100) ) +round( subset( nicS2, id %in% 8:10 ), 2 ) + + +################################################### +### code chunk number 11: flup.rnw:218-223 +################################################### +library( popEpi ) +nicM <- splitMulti( nicL, age = seq(0,100,10), + tfh = c(0,1,5,10,20,30,100) ) +summary( nicS2 ) +summary( nicM ) + + +################################################### +### code chunk number 12: flup.rnw:227-230 +################################################### +identical( nicS2, nicM ) +class( nicS2 ) +class( nicM ) + + +################################################### +### code chunk number 13: flup.rnw:250-258 +################################################### +timeBand( nicM, "age", "middle" )[1:20] +# For nice printing and column labelling use the data.frame() function: +data.frame( nicS2[,c("lex.id","per","age","tfh","lex.dur")], + mid.age=timeBand( nicS2, "age", "middle" ), + mid.t=timeBand( nicS2, "tfh", "middle" ), + left.t=timeBand( nicS2, "tfh", "left" ), + right.t=timeBand( nicS2, "tfh", "right" ), + fact.t=timeBand( nicS2, "tfh", "factor" ) )[1:20,] + + +################################################### +### code chunk number 14: flup.rnw:278-279 +################################################### +summary( (nicS2$age-nicS2$tfh) - nicS2$age1st ) + + +################################################### +### code chunk number 15: flup.rnw:284-286 +################################################### +summary( timeBand( nicS2, "age", "middle" ) - + timeBand( nicS2, "tfh", "middle" ) - nicS2$age1st ) + + +################################################### +### code chunk number 16: flup.rnw:308-316 
+################################################### +subset( nicL, id %in% 8:10 ) +agehi <- nicL$age1st + 50 / nicL$exposure +nicC <- cutLexis( data = nicL, + cut = agehi, + timescale = "age", + new.state = 2, + precursor.states = 0 ) +subset( nicC, id %in% 8:10 ) + + +################################################### +### code chunk number 17: flup.rnw:323-331 +################################################### +subset( nicS2, id %in% 8:10 ) +agehi <- nicS2$age1st + 50 / nicS2$exposure +nicS2C <- cutLexis( data = nicS2, + cut = agehi, + timescale = "age", + new.state = 2, + precursor.states = 0 ) +subset( nicS2C, id %in% 8:10 ) + + +################################################### +### code chunk number 18: flup.rnw:390-392 +################################################### +( a.kn <- with( subset( nicM, lex.Xst==1 ), quantile( age+lex.dur, (1:5-0.5)/5 ) ) ) +( t.kn <- with( subset( nicM, lex.Xst==1 ), quantile( tfh+lex.dur, (1:5-0.5)/5 ) ) ) + + +################################################### +### code chunk number 19: flup.rnw:405-410 +################################################### +ma <- glm( (lex.Xst==1) ~ Ns(age,knots=a.kn), + family = poisson, + offset = log(lex.dur), + data = nicM ) +summary( ma ) + + +################################################### +### code chunk number 20: pr-a +################################################### +nd <- data.frame( age=40:85, lex.dur=1000 ) +pr.a <- ci.pred( ma, newdata = nd ) +matplot( nd$age, pr.a, + type="l", lty=1, col=1, lwd=c(3,1,1), + log="y", xlab="Age (years)", + ylab="Lung cancer mortality per 1000 PY") + + +################################################### +### code chunk number 21: flup.rnw:445-447 +################################################### +mat <- update( ma, . ~ . 
+ Ns(tfh,knots=t.kn) ) +summary( mat ) + + +################################################### +### code chunk number 22: flup.rnw:457-458 +################################################### +summary( nickel$age1st ) + + +################################################### +### code chunk number 23: flup.rnw:462-468 +################################################### +nd <- data.frame( expand.grid( age=c(20:90,NA), age1st=seq(15,45,10) ) ) +nd <- transform( nd, tfh = ifelse( age > age1st, age-age1st, NA ), + lex.dur = 1000 ) +# makes no sense to have age < age1st +nd <- transform( nd, age = ifelse( age > age1st, age, NA ) ) +head( nd ) + + +################################################### +### code chunk number 24: pr-at +################################################### +pr.at <- ci.pred( mat, newdata = nd ) +matplot( nd$age, pr.at, + type="l", lty=1, col=1, lwd=c(3,1,1), + log="y", xlab="Age (years)", + ylab="Lung cancer mortality per 1000 PY") + + +################################################### +### code chunk number 25: flup.rnw:492-493 +################################################### +anova( ma, mat, test="Chisq" ) + + +################################################### +### code chunk number 26: flup.rnw:504-508 +################################################### +( f.kn <- with( subset( nicM, lex.Xst==1 ), quantile( age1st, (1:5-0.5)/5 ) ) ) +maf <- update( ma, . ~ . 
+ Ns(age1st,knots=f.kn) ) +summary( maf ) +anova( maf, ma, mat, test="Chisq" ) + + +################################################### +### code chunk number 27: pr-at-af +################################################### +pr.af <- ci.pred( maf, newdata = nd ) +matplot( nd$age, pr.at, + type="l", lty=1, col=1, lwd=c(3,1,1), + log="y", xlab="Age (years)", + ylab="Lung cancer mortality per 1000 PY") +matlines( nd$age, pr.af, + type="l", lty=1, col=2, lwd=c(3,0,0) ) + + +################################################### +### code chunk number 28: flup.rnw:536-547 +################################################### +maft <- update( mat, . ~ . + Ns(age1st,knots=f.kn) ) +summary( maft ) +mft <- update( maft, . ~ . - Ns(age,knots=a.kn) ) +mf <- update( maf , . ~ . - Ns(age,knots=a.kn) ) +mt <- update( mat , . ~ . - Ns(age,knots=a.kn) ) +allp <- anova( maft, mat, ma, maf, mf, mft, mt, mat, + maf, maft, mft, + test="Chisq" ) +mall <- as.matrix( allp ) +cbind( mod = c("maft","mat","ma","maf","mf","mft","mt","mat","maf","maft","mft"), + round( allp[,1:5], 3 ) ) + + +################################################### +### code chunk number 29: flup.rnw:562-571 +################################################### +data( nickel ) +nicL <- Lexis( entry = list( per = agein+dob, + age = agein, + tfh = agein-age1st ), + exit = list( age = ageout ), + exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ), + data = nickel ) +summary( nicL ) +subset( nicL, id %in% 8:10 ) + + +################################################### +### code chunk number 30: flup.rnw:576-585 +################################################### +nicL <- Lexis( entry = list( per = agein+dob, + age = agein, + tfh = agein-age1st ), + exit = list( age = ageout ), + exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ), + data = nickel, + states = c("Alive","D.oth","D.lung") ) +summary( nicL ) +str( nicL ) + + +################################################### +### code chunk number 31: flup.rnw:597-605 
+################################################### +nicL$agehi <- nicL$age1st + 50 / nicL$exposure +nicC <- cutLexis( data = nicL, + cut = nicL$agehi, + timescale = "age", + new.state = "HiExp", + precursor.states = "Alive" ) +subset( nicC, id %in% 8:10 ) +summary( nicC, scale=1000 ) + + +################################################### +### code chunk number 32: flup.rnw:624-633 +################################################### +nicC <- cutLexis( data = nicL, + cut = nicL$agehi, + timescale = "age", + new.state = "HiExp", + new.scale = "tfe", + split.states = TRUE, + precursor.states = "Alive" ) +subset( nicC, id %in% 8:10 ) +summary( nicC, scale=1000, timeScales=TRUE ) + + +################################################### +### code chunk number 33: nic-box +################################################### +boxes( nicC, boxpos = list(x=c(10,10,80,80,80,80), + y=c(75,25,87,63,13,37)), + scale.Y = 1000, + show.BE = TRUE ) + + diff -Nru r-cran-epi-2.19/vignettes/flup.rnw r-cran-epi-2.30/vignettes/flup.rnw --- r-cran-epi-2.19/vignettes/flup.rnw 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/flup.rnw 2018-03-05 16:59:16.000000000 +0000 @@ -0,0 +1,685 @@ +\SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE,eps=FALSE} +%\VignetteIndexEntry{Follow-up data with R and Epi} +\documentclass[a4paper,dvipsnames,twoside,12pt]{report} + +<>= +options( width=90, + SweaveHooks=list( fig=function() + par(mar=c(3,3,1,1),mgp=c(3,1,0)/1.6,las=1,bty="n") ) ) +@ % +\newcommand{\Title}{Follow-up data with \R\ and \texttt{Epi}} +\newcommand{\Tit}{Follow-up} +\newcommand{\Version}{\ } +\newcommand{\Dates}{December 2017} +\newcommand{\Where}{SDCC} +\newcommand{\Homepage}{\url{http://bendixcarstensen.com/} } +\newcommand{\Faculty}{\begin{tabular}{rl} +Bendix Carstensen + & Steno Diabetes Center, Gentofte, Denmark\\ + & {\small \& Department of Biostatistics, + University of Copenhagen} \\ + & \texttt{b@bxc.dk} \\ + & \url{http://BendixCarstensen.com} 
\\[1em] + \end{tabular}} + +\input{topreport} +\renewcommand{\rwpre}{./flup} + +\chapter{Follow-up data in the \texttt{Epi} package} + +In the \texttt{Epi}-package, follow-up data is represented by adding +some extra variables to a data frame. Such a data frame is called a +\texttt{Lexis} object. The tools for handling follow-up data then use +this structure for special plots, tabulations etc. + +Follow-up data basically consists of a time of entry, a time of exit +and an indication of the status at exit (normally either ``alive'' or +``dead''). A status \emph{during} the follow-up (usually ``alive'') is +also implicitly assumed. + +\begin{figure}[htbp] + \centering +\setlength{\unitlength}{1pt} +\begin{picture}(210,70)(0,75) +%\scriptsize +\thicklines + \put( 0,80){\makebox(0,0)[r]{Age-scale}} + \put( 50,80){\line(1,0){150}} + \put( 50,80){\line(0,1){5}} + \put(100,80){\line(0,1){5}} + \put(150,80){\line(0,1){5}} + \put(200,80){\line(0,1){5}} + \put( 50,77){\makebox(0,0)[t]{35}} + \put(100,77){\makebox(0,0)[t]{40}} + \put(150,77){\makebox(0,0)[t]{45}} + \put(200,77){\makebox(0,0)[t]{50}} + + \put( 0,115){\makebox(0,0)[r]{Follow-up}} + + \put( 80,105){\makebox(0,0)[r]{\small Two}} + \put( 90,105){\line(1,0){87}} + \put( 90,100){\line(0,1){10}} + \put(100,100){\line(0,1){10}} + \put(150,100){\line(0,1){10}} + \put(180,105){\circle{6}} + \put( 95,110){\makebox(0,0)[b]{1}} + \put(125,110){\makebox(0,0)[b]{5}} + \put(165,110){\makebox(0,0)[b]{3}} + + \put( 50,130){\makebox(0,0)[r]{\small One}} + \put( 60,130){\line(1,0){70}} + \put( 60,125){\line(0,1){10}} + \put(100,125){\line(0,1){10}} + \put(130,130){\circle*{6}} + \put( 80,135){\makebox(0,0)[b]{4}} + \put(115,135){\makebox(0,0)[b]{3}} +\end{picture} + \caption{\it Follow-up of two persons} + \label{fig:fu2} +\end{figure} + +\section{Timescales} + +A timescale is a variable that varies deterministically \emph{within} +each person during follow-up, \textit{e.g.}: +\begin{itemize} + \item Age + \item Calendar 
time + \item Time since treatment + \item Time since relapse +\end{itemize} +All timescales advance at the same pace, so the time followed is the +same on all timescales. Therefore, it suffices to use only the entry +point on each of the timescales, for example: +\begin{itemize} + \item Age at entry. + \item Date of entry. + \item Time since treatment (\emph{at} treatment this is 0). + \item Time since relapse (\emph{at} relapse this is 0). +\end{itemize} +For illustration we need to load the \texttt{Epi} package: +<<>>= +library(Epi) +print( sessionInfo(), l=F ) +@ % +In the \texttt{Epi} package, follow-up in a cohort is represented in a +\texttt{Lexis} object. A \texttt{Lexis} object is a data frame with a +bit of extra structure representing the follow-up. For the +\texttt{nickel} data we would construct a \texttt{Lexis} object by: +<<>>= +data( nickel ) +nicL <- Lexis( entry = list( per=agein+dob, + age=agein, + tfh=agein-age1st ), + exit = list( age=ageout ), + exit.status = ( icd %in% c(162,163) )*1, + data = nickel ) +@ % +The \texttt{entry} argument is a \emph{named} list with the entry +points on each of the timescales we want to use. It defines the names +of the timescales and the entry points of the follow-up of each +person. The \texttt{exit} argument gives the exit time on \emph{one} +of the timescales, so the name of the element in this list must match +one of the names of the \texttt{entry} list. This is sufficient, +because the follow-up time on all timescales is the same, in this +case \texttt{ageout - agein}. Now take a look at the result: +<<>>= +str( nickel ) +str( nicL ) +head( nicL ) +@ +The \texttt{Lexis} object \texttt{nicL} has a variable for each +timescale which is the entry point on this timescale. The follow-up +time is in the variable \texttt{lex.dur} (\textbf{dur}ation). 
+ +There is a \texttt{summary} function for \texttt{Lexis} objects that +lists the number of transitions and records as well as the total +amount of follow-up time: +<<>>= +summary( nicL ) +@ +We defined the exit status to be death from lung cancer (ICD7 +162,163), i.e. this variable is 1 if follow-up ended with a death from +this cause. If follow-up ended alive or by death from another cause, +the exit status is coded 0, i.e. as a censoring. + +Note that the exit status is in the variable \texttt{lex.Xst} +(e\texttt{X}it \texttt{st}atus). The variable \texttt{lex.Cst} is the +state where the follow-up takes place (\texttt{C}urrent +\texttt{st}atus), in this case 0 (alive). + +It is possible to get a visualization of the follow-up along the +timescales chosen by using the \texttt{plot} method for \texttt{Lexis} +objects. \texttt{nicL} is an object of \emph{class} \texttt{Lexis}, so +using the function \texttt{plot()} on it means that \R\ will look for +the function \texttt{plot.Lexis} and use this function. +<<nicL1,fig=TRUE>>= +plot( nicL ) +@ % +The function allows quite a bit of control over the output, and a +\texttt{points.Lexis} function allows plotting of the endpoints of +follow-up: +<<nicL2,fig=TRUE>>= +par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 ) +plot( nicL, 1:2, lwd=1, col=c("blue","red")[(nicL$exp>0)+1], + grid=TRUE, lty.grid=1, col.grid=gray(0.7), + xlim=1900+c(0,90), xaxs="i", + ylim= 10+c(0,90), yaxs="i", las=1 ) +points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1], + col="lightgray", lwd=3, cex=1.5 ) +points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1], + col=c("blue","red")[(nicL$exp>0)+1], lwd=1, cex=1.5 ) +@ % +The results of these two plotting commands are in figure \ref{fig:Lexis-diagram}. +\begin{figure}[tb] +\centering +\label{fig:Lexis-diagram} +\includegraphics[width=0.39\textwidth]{flup-nicL1} +\includegraphics[width=0.59\textwidth]{flup-nicL2} +\caption{\it Lexis diagram of the \textrm{\tt nickel} dataset; left + panel the default version, right panel with bells and whistles. 
The + red lines are for persons with exposure$>0$, so it is pretty evident + that the oldest ones are the exposed part of the cohort.} +\end{figure} + +\chapter{Subdividing follow-up for analysis} + +\section{Splitting the follow-up time along a timescale} + +The follow-up time in a cohort can be subdivided by for example +current age. This is achieved by the \texttt{splitLexis} (note that it +is \emph{not} called \texttt{split.Lexis}). This requires that the +timescale and the breakpoints on this timescale are supplied. Try: +<<>>= +nicS1 <- splitLexis( nicL, "age", breaks=seq(0,100,10) ) +summary( nicL ) +summary( nicS1 ) +@ % +So we see that the number of events and the amount of follow-up is the +same in the two data sets; only the number of records differ --- the +extra records all have \texttt{lex.Cst}=0 and \texttt{lex.Xst}=0. + +To see how records are split for each individual, it is useful to list +the results for a few individuals: +<<>>= +round( subset( nicS1, id %in% 8:10 ), 2 ) +@ % +The resulting object, \texttt{nicS1}, is again a \texttt{Lexis} +object, and so follow-up may be split further along another +timescale. 
Subsequently we list the results for individuals 8, 9 and +10 again: +<<>>= +nicS2 <- splitLexis( nicS1, "tfh", breaks=c(0,1,5,10,20,30,100) ) +round( subset( nicS2, id %in% 8:10 ), 2 ) +@ % +A more efficient (and more intuitive) way of making this double split +is to use the \texttt{splitMulti} function from the \texttt{popEpi} +package: +<<>>= +library( popEpi ) +nicM <- splitMulti( nicL, age = seq(0,100,10), + tfh = c(0,1,5,10,20,30,100) ) +summary( nicS2 ) +summary( nicM ) +@ % +So we see that the two ways of splitting data yield the same amount of +follow-up, but the results are not identical: +<<>>= +identical( nicS2, nicM ) +class( nicS2 ) +class( nicM ) +@ % +As we see, this is because the \texttt{nicM} object also is a +\texttt{data.table} object; the \texttt{splitMulti} uses the +\texttt{data.table} machinery which makes the splitting substantially +faster --- this is of particular interest if you operate on large data +sets ($>1,000,000$ records). + +Thus the recommended way of splitting follow-up time is by +\texttt{splitMulti}. But you should be aware that the result is a +\texttt{data.table} object, which in some circumstances behaves +slightly differently from \texttt{data.frame}s. See the manual for +\texttt{data.table}. + +\subsection{Time scales as covariates} + +If we want to model the effect of these timescale variables on +occurrence rates, we will for each interval use the value of either +the left endpoint or the middle of the interval. 
There is a function +\texttt{timeBand} which returns either of these: +<<>>= +timeBand( nicM, "age", "middle" )[1:20] +# For nice printing and column labelling use the data.frame() function: +data.frame( nicS2[,c("lex.id","per","age","tfh","lex.dur")], + mid.age=timeBand( nicS2, "age", "middle" ), + mid.t=timeBand( nicS2, "tfh", "middle" ), + left.t=timeBand( nicS2, "tfh", "left" ), + right.t=timeBand( nicS2, "tfh", "right" ), + fact.t=timeBand( nicS2, "tfh", "factor" ) )[1:20,] +@ % +Note that these are characteristics of the intervals defined by +\texttt{breaks=}, \emph{not} the midpoints nor left or right endpoints +of the actual follow-up intervals (which would be \texttt{tfh} and +\texttt{tfh+lex.dur}, respectively). + +These functions are intended for modeling timescale variables as +factors (categorical variables) in which case the coding must be +independent of the censoring and mortality pattern --- it should only +depend on the chosen grouping of the timescale. Modeling timescales as +\emph{quantitative} should not be based on these codings but directly +on the values of the time-scale variables. + +\subsection{Differences between time scales} + +The midpoint (as well as the left and right interval endpoint) should +be used with caution if the variable \texttt{age1st} is modeled too; +the age at hire is logically equal to the difference between current +age (\texttt{age}) and time since hire (\texttt{tfh}): +<<>>= +summary( (nicS2$age-nicS2$tfh) - nicS2$age1st ) +@ % +This calculation refers to the \emph{start} of each interval --- the +time scale variables in the \texttt{Lexis} object. But when using the +middle of the intervals, this relationship is not preserved: +<<>>= +summary( timeBand( nicS2, "age", "middle" ) - + timeBand( nicS2, "tfh", "middle" ) - nicS2$age1st ) +@ % +If all three variables are to be included in a model, you must make +sure that the \emph{substantial} relationship between the variables is +maintained. 
One way is to recompute age at first hire from the two
+midpoint variables, but more straightforward would be to use the left
+endpoints of the intervals, that is the time scale variables in the
+\texttt{Lexis} object. The latter approach however requires that the
+follow-up is split in fairly small chunks.
+
+\section{Cutting follow up time at a specific date}
+
+If we have a recording of the date of a specific event, as for example
+recovery or relapse, we may classify follow-up time as being before or
+after this intermediate event, but this requires that follow-up records
+that straddle the event be cut into two records. This is achieved with
+the function \texttt{cutLexis}, which takes three arguments: the time
+point, the timescale, and the value of the (new) state following the
+date.
+
+Now we define the age for the nickel workers where the cumulative
+exposure exceeds 50 exposure years:
+<<>>=
+subset( nicL, id %in% 8:10 )
+agehi <- nicL$age1st + 50 / nicL$exposure
+nicC <- cutLexis( data = nicL,
+                   cut = agehi,
+             timescale = "age",
+             new.state = 2,
+      precursor.states = 0 )
+subset( nicC, id %in% 8:10 )
+@ %
+(The \texttt{precursor.states=} argument is explained below.) Note
+that individual 6 has had his follow-up split at 25 years since hire,
+where 50 exposure-years were attained. This could also have been
+achieved in the split dataset \texttt{nicS2} instead of \texttt{nicL};
+try:
+<<>>=
+subset( nicS2, id %in% 8:10 )
+agehi <- nicS2$age1st + 50 / nicS2$exposure
+nicS2C <- cutLexis( data = nicS2,
+                     cut = agehi,
+               timescale = "age",
+               new.state = 2,
+        precursor.states = 0 )
+subset( nicS2C, id %in% 8:10 )
+@ %
+The same results would have emerged if we had used the \texttt{nicM}
+dataset (the \texttt{data.table} object). Mathematicians would say
+that \texttt{splitLexis} and \texttt{cutLexis} are commutative.
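The record-level logic behind such a cut can be sketched outside \R\ as well. The following Python function is a hypothetical illustration (not part of \texttt{Epi}, and all names are invented here) of how one follow-up record is cut at a single time point:

```python
def cut_record(en, ex, cst, xst, cut, new_state):
    """Cut one follow-up record [en, ex) at `cut` on the same timescale.

    `cst` / `xst` play the roles of lex.Cst / lex.Xst: before the cut the
    person is in state `cst`, after it in `new_state`.  A censoring is
    coded as xst == cst, so on the piece after the cut a censoring
    becomes `new_state`, while a real event (xst != cst) is kept.
    """
    if cut >= ex:                       # cut after exit: record unchanged
        return [(en, ex, cst, xst)]
    after_xst = new_state if xst == cst else xst
    if cut <= en:                       # cut before entry: all follow-up after cut
        return [(en, ex, new_state, after_xst)]
    return [(en, cut, cst, new_state),  # precursor state, transition at the cut
            (cut, ex, new_state, after_xst)]

# A record entered at age 45, exited at 70 with an event (state 1),
# cut at age 60 into a "high exposure" state 2:
print(cut_record(45, 70, 0, 1, 60, 2))
# -> [(45, 60, 0, 2), (60, 70, 2, 1)]
```

The three branches mirror the three situations in the nickel data: the cut falls after, before, or inside the record's follow-up.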
+
+Note that follow-up subsequent to the event is classified as being in
+state 2, but that the final transition to state 1 (death from lung
+cancer) is preserved. This is the point of the
+\texttt{precursor.states=} argument. It names the states (in this case
+0, ``Alive'') that will be over-written by \texttt{new.state} (in this
+case state 2, ``High exposure''), while state 1 (``Dead'') should not
+be updated even if it occurs after the time where the person moves to
+state 2. In other words, only state 0 is a precursor to state 2; state
+1 is always subsequent to state 2. Even if you are at a high exposure
+level, death is pretty final.
+
+If the intermediate event is to be used as a time-dependent variable
+in a Cox-model, then \texttt{lex.Cst} should be used as the
+time-dependent variable, and \texttt{lex.Xst==1} as the event.
+
+\chapter{Modeling rates}
+
+\section{Background}
+
+The purpose of subdividing follow-up data is to be able to model the
+effects of the time scale variables as parametric functions.
+
+In a model that assumes a constant occurrence rate in each of the
+intervals, the likelihood contribution from each interval is the same
+as the likelihood contribution from a Poisson variate $D$, say, with
+mean $\lambda\ell$, where $\lambda$ is the rate and $\ell$ is the
+interval length, and where the value of the variate $D$ is 1 or 0
+according to whether an event has occurred or not. Moreover, the
+likelihood contributions from all follow-up intervals from a single
+person are \emph{conditionally} independent (conditional on having
+survived till the start of the interval in question). This implies
+that the total contribution to the likelihood from a single person is
+a product of terms, and hence the same as the likelihood of a number
+of independent Poisson terms, one from each interval.
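Written out, the argument is the standard one; the notation below ($d_j$, $\ell_j$ and $\lambda_j$ for the event indicator, length and rate of interval $j$) is introduced here for illustration only:

```latex
\[
  \log L \;=\; \sum_j \bigl( d_j \log\lambda_j - \lambda_j\ell_j \bigr)
         \;=\; \sum_j \bigl( d_j \log(\lambda_j\ell_j) - \lambda_j\ell_j \bigr)
         \;-\; \sum_j d_j \log\ell_j
\]
```

The last sum does not involve the rates, so as a function of the $\lambda_j$ the log-likelihood equals, up to a constant, that of independent Poisson variates $D_j$ with means $\lambda_j\ell_j$.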
+
+Parametric modeling of the rates is obtained by using the \emph{value}
+of the timescale for each interval as a quantitative explanatory
+variable, using for example splines. Thus the model will be one where
+the rate is assumed constant in each interval, but where a parametric
+form of the \emph{size} of the rate in each interval is imposed by the
+model, using the timescale as a covariate.
+
+\section{Practicalities}
+
+In the nickel worker study we might want to look at the effects of age
+and time since hire. If we want to use splines we must allocate knots
+for anchoring the splines on each of the time scales, either by some
+\textit{ad hoc} method or by using some sort of penalized splines. The
+latter will not be treated here.
+
+Here we shall use the former approach and allocate 5 knots on each of
+the two time-scales. We allocate the knots so that the events are
+evenly distributed between them:
+<<>>=
+( a.kn <- with( subset( nicM, lex.Xst==1 ), quantile( age+lex.dur, (1:5-0.5)/5 ) ) )
+( t.kn <- with( subset( nicM, lex.Xst==1 ), quantile( tfh+lex.dur, (1:5-0.5)/5 ) ) )
+@ %
+In the \texttt{Epi} package there is a convenience wrapper, \texttt{Ns},
+for the \texttt{n}atural \texttt{s}pline generator \texttt{ns},
+that takes the smallest and the largest of a set of supplied knots to
+be the boundary knots.
+
+\section{Models for rates}
+
+\subsection{One time scale}
+
+A model that only models lung cancer mortality
+rates as a function of age would then be:
+<<>>=
+ma <- glm( (lex.Xst==1) ~ Ns(age,knots=a.kn),
+           family = poisson,
+           offset = log(lex.dur),
+             data = nicM )
+summary( ma )
+@ %
+The offset, \texttt{log(lex.dur)}, comes from the fact that the
+likelihood for the follow-up data is the same as that for independent
+Poisson variates with mean $\lambda\ell$, and that the default link
+function for the Poisson family is the log, so that we are using a
+linear model for the log-mean, that is $\log(\lambda) + \log(\ell)$.
+But when we want a model for the log-rate ($\log(\lambda)$), the term
+$\log(\ell)$ must be included as a covariate with regression
+coefficient fixed to 1; a so-called offset.
+
+The parameters from the model do not have any direct interpretation
+\textit{per se}, but we can compute the estimated lung cancer mortality
+rates for a range of ages using \texttt{ci.pred} with a suitably
+defined prediction data frame. Note that we must specify all
+covariates in the model, including the variable in the offset,
+\texttt{lex.dur}. We set the latter to 1000, because we want the lung
+cancer mortality rates per 1000 PY. By default \texttt{ci.pred} yields a
+prediction on the response scale, that is the rate scale:
+<<pr-a,fig=TRUE>>=
+nd <- data.frame( age=40:85, lex.dur=1000 )
+pr.a <- ci.pred( ma, newdata = nd )
+matplot( nd$age, pr.a,
+         type="l", lty=1, col=1, lwd=c(3,1,1),
+         log="y", xlab="Age (years)",
+         ylab="Lung cancer mortality per 1000 PY" )
+@ %
+\insfig{pr-a}{0.8}{Lung cancer mortality among Nickel smelter workers
+  by age. We see that the rates increase till about age 55, and from
+  then on are approximately flat.}
+
+\subsection{More time scales}
+
+There may however be an effect of time since hire too, so we can
+add this term to the model:
+<<>>=
+mat <- update( ma, . ~ . + Ns(tfh,knots=t.kn) )
+summary( mat )
+@ %
+This model has two time scales, age and time since hire, so it makes
+little sense to report the effect of age for a \emph{fixed} value of
+time since hire --- time since hire increases with age.
+Instead we can show the mortality rates for persons hired at different
+ages, and report the \emph{joint} effect of increasing age and time
+since hire.
+
+In order to get a feeling for the values that can be used we look at \texttt{age1st}:
+<<>>=
+summary( nickel$age1st )
+@
+Thus we shall show mortality rates in ages 20--90 for persons hired in
+ages 15, 25, 35 and 45:
+<<>>=
+nd <- data.frame( expand.grid( age=c(20:90,NA), age1st=seq(15,45,10) ) )
+nd <- transform( nd, tfh = ifelse( age > age1st, age-age1st, NA ),
+                     lex.dur = 1000 )
+# makes no sense to have age < age1st
+nd <- transform( nd, age = ifelse( age > age1st, age, NA ) )
+head( nd )
+@ %
+With this in place we can plot the estimated rates as before, only
+now we will get 4 separate lines. The purpose of inserting an
+\texttt{NA} on the age-scale in the \texttt{expand.grid} is that the
+different instances of \texttt{age1st} will be separated by \texttt{NA}s,
+and hence will not be connected when we plot the curves. The downside of
+this trick is that the lines cannot be plotted with different colors or
+symbols.
+<<pr-at,fig=TRUE>>=
+pr.at <- ci.pred( mat, newdata = nd )
+matplot( nd$age, pr.at,
+         type="l", lty=1, col=1, lwd=c(3,1,1),
+         log="y", xlab="Age (years)",
+         ylab="Lung cancer mortality per 1000 PY" )
+@ %
+\insfig{pr-at}{0.8}{Lung cancer mortality among Nickel smelter workers
+  by age and age at hire 15, 25, 35 and 45. Each line (except the
+  first) starts at the age of hire; we see that the later in life you
+  are hired, the smaller the initial risk, but the higher the eventual
+  risk of lung cancer death.}
+
+We can check whether the effect of time since hire actually
+improves the model:
+<<>>=
+anova( ma, mat, test="Chisq" )
+@ %
+We see a pretty strong indication that this is the case.
+
+\subsection{Difference between time scales}
+
+However it might be the case that it really is the age at first hire
+that is the main determinant (recall that
+$\mathtt{age}-\mathtt{tfh}=\mathtt{age1st}$), so we could fit a model
+with this variable instead --- a model with only 1 timescale, namely
+\texttt{age}.
+<<>>=
+( f.kn <- with( subset( nicM, lex.Xst==1 ), quantile( age1st, (1:5-0.5)/5 ) ) )
+maf <- update( ma, . ~ . + Ns(age1st,knots=f.kn) )
+summary( maf )
+anova( maf, ma, mat, test="Chisq" )
+@ %
+We see that there is much less indication that age at first hire has
+an effect.
+
+For the sake of completeness we can draw the predicted values from the
+\texttt{maf} model on top of the ones from the \texttt{mat} model:
+<<pr-at-af,fig=TRUE>>=
+pr.af <- ci.pred( maf, newdata = nd )
+matplot( nd$age, pr.at,
+         type="l", lty=1, col=1, lwd=c(3,1,1),
+         log="y", xlab="Age (years)",
+         ylab="Lung cancer mortality per 1000 PY" )
+matlines( nd$age, pr.af,
+          type="l", lty=1, col=2, lwd=c(3,0,0) )
+@ %
+\insfig{pr-at-af}{0.8}{Lung cancer mortality among Nickel smelter
+  workers by age and age at hire 15, 25, 35 and 45. Each line (except
+  the first) starts at the age of hire; we see that the later in life
+  you are hired, the smaller the initial risk, but the higher the
+  eventual risk of lung cancer death. The red lines are from the model
+  \textrm{\tt maf}, where the lines are constrained to be parallel, and
+  which gives a worse fit to data.}
+
+\subsection{The complete picture --- exercise}
+
+We could fit the remaining models where one or more of the three
+variables are included, and compare all of them:
+<<>>=
+maft <- update( mat, . ~ . + Ns(age1st,knots=f.kn) )
+summary( maft )
+mft <- update( maft, . ~ . - Ns(age,knots=a.kn) )
+mf  <- update( maf , . ~ . - Ns(age,knots=a.kn) )
+mt  <- update( mat , . ~ . - Ns(age,knots=a.kn) )
+allp <- anova( maft, mat, ma, maf, mf, mft, mt, mat,
+                     maf, maft, mft,
+               test="Chisq" )
+mall <- as.matrix( allp )
+cbind( mod = c("maft","mat","ma","maf","mf","mft","mt","mat","maf","maft","mft"),
+       round( allp[,1:5], 3 ) )
+@ %
+\begin{enumerate}
+\item Explain why there are \texttt{NA}s among the parameters in the
+  model \texttt{maft}.
+
+\item Draw a graph (a ``DAG'') with the models as nodes and the tests
+  as edges, put the p-values on the edges and use the result to
+  argue that the model with age and time since hire is actually the
+  most sensible description in this case.
+\end{enumerate}
+
+\chapter{Competing risks --- multiple types of events}
+
+If we want to consider death from lung cancer and death from other
+causes as separate events, we can code these as for example 1 and 2:
+<<>>=
+data( nickel )
+nicL <- Lexis( entry = list( per = agein+dob,
+                             age = agein,
+                             tfh = agein-age1st ),
+                exit = list( age = ageout ),
+         exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ),
+                data = nickel )
+summary( nicL )
+subset( nicL, id %in% 8:10 )
+@ %
+In order to have a more readable output we can label the states by
+entering their names in the \texttt{states} parameter; try for
+example:
+<<>>=
+nicL <- Lexis( entry = list( per = agein+dob,
+                             age = agein,
+                             tfh = agein-age1st ),
+                exit = list( age = ageout ),
+         exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ),
+                data = nickel,
+              states = c("Alive","D.oth","D.lung") )
+summary( nicL )
+str( nicL )
+@ %
+Note that the \texttt{Lexis} function automatically assumes that all
+persons enter in the first level (given in the \texttt{states=}
+argument), corresponding to the numerical values given in
+\texttt{exit.status}.
+
+When we cut at a date, as in this case the date where cumulative
+exposure exceeds 50 exposure-years, we get the follow-up \emph{after}
+that date classified as being in the new state if the exit
+(\texttt{lex.Xst}) was to a state we defined as one of the
+\texttt{precursor.states}:
+<<>>=
+nicL$agehi <- nicL$age1st + 50 / nicL$exposure
+nicC <- cutLexis( data = nicL,
+                   cut = nicL$agehi,
+             timescale = "age",
+             new.state = "HiExp",
+      precursor.states = "Alive" )
+subset( nicC, id %in% 8:10 )
+summary( nicC, scale=1000 )
+@ %
+Note that the person-years are the same, but that the number of
+events has changed.
This is because events are now defined as any
+transition, including the transitions to \texttt{HiExp}.
+
+Also note that (so far) it is necessary to specify the variable with
+the cut points in full; using only \texttt{cut=agehi} would give an error.
+
+\section{Subdividing states}
+
+It may be of interest to subdivide the states following the
+intermediate event according to whether the event has occurred or
+not. That is done by the argument \texttt{split.states=TRUE}.
+
+Moreover, it will also often be of interest to introduce a new
+timescale indicating the time since the intermediate event. This can be
+done by the argument \texttt{new.scale=TRUE}, alternatively
+\texttt{new.scale="tfe"}, as illustrated here:
+<<>>=
+nicC <- cutLexis( data = nicL,
+                   cut = nicL$agehi,
+             timescale = "age",
+             new.state = "HiExp",
+             new.scale = "tfe",
+          split.states = TRUE,
+      precursor.states = "Alive" )
+subset( nicC, id %in% 8:10 )
+summary( nicC, scale=1000, timeScales=TRUE )
+@ % $
+Note that \texttt{timeScales=TRUE} to \texttt{summary} lists the
+timescales available in the object, and also indicates which of them
+are defined as time since entry to a particular state. This
+facility is not used here, but it is needed when simulating follow-up
+data --- see the vignette on \texttt{simLexis}.
+
+With 6 different states it is quite difficult to get an overview of
+the transitions between states from the \texttt{summary()}. Therefore
+there is a function that gives a graphical display of the states and
+the transitions between them:
+<<nic-box,fig=TRUE>>=
+boxes( nicC, boxpos = list(x=c(10,10,80,80,80,80),
+                           y=c(75,25,87,63,13,37)),
+       scale.Y = 1000,
+       show.BE = TRUE )
+@ %
+\insfig{nic-box}{0.9}{Transitions between states; the number in the
+  middle of each box is the person-years (in 1000s --- since
+  \textrm{\tt scale.Y=1000}), the numbers at the bottom of the boxes
+  are the numbers that start, respectively end, their follow-up in each
+  state.
The numbers on the arrows are the number of transitions and + crude transition rates (the latter in events per 1000 PY).\newline + The function \textrm{\tt boxes.Lexis} has a zillion arguments to + fine-tune the appearance of the display in terms of colors etc.} + +%% \section{Multiple events of the same type (recurrent events)} +%% Sometimes more events of the same type are recorded for each person and +%% one would then like to count these and put follow-up time in states accordingly. +%% Essentially, each set of cutpoints represents progressions from one +%% state to the next. Therefore the states should be numbered, and the +%% numbering of states subsequently occupied be increased accordingly. + +%% This is a behaviour different from the one outlined above, and it is +%% achieved by the argument \texttt{count=TRUE} to +%% \texttt{cutLexis}. When \texttt{count} is set to \texttt{TRUE}, the +%% value of the arguments \texttt{new.state} and +%% \texttt{precursor.states} are ignored. Actually, when using the +%% argument \texttt{count=TRUE}, the function \texttt{countLexis} is +%% called, so an alternative is to use this directly. + +%% \renewcommand{\bibname}{References} +%% \bibliographystyle{plain} +%% \bibliography{% +%% /home/bendix/art/bibtex/BxC,% +%% /home/bendix/art/bibtex/Stat,% +%% /home/bendix/art/bibtex/DMCa,% +%% /home/bendix/art/bibtex/Diabetes% +%% } +%% \addcontentsline{toc}{section}{\bibname} + +\end{document} diff -Nru r-cran-epi-2.19/vignettes/flup.rwl r-cran-epi-2.30/vignettes/flup.rwl --- r-cran-epi-2.19/vignettes/flup.rwl 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/flup.rwl 2018-05-03 14:34:57.000000000 +0000 @@ -0,0 +1,48 @@ +R version 3.4.4 (2018-03-15) + --------------------------------------------- + Folder: /home/bendix/stat/R/lib.src/Epi/pkg/vignettes + Started: Thursday 03. May 2018, 16:34:54 + --------------------------------------------- +Writing to file flup.tex +Processing code chunks with options ... 
+ 1 : keep.source term verbatim (flup.rnw:5) + 2 : echo keep.source term verbatim (flup.rnw:101) + 3 : echo keep.source term verbatim (flup.rnw:109) + 4 : echo keep.source term verbatim (flup.rnw:126) + 5 : echo keep.source term verbatim (flup.rnw:138) + 6 : echo keep.source term verbatim pdf (label = nicL1, flup.rnw:156) + 7 : echo keep.source term verbatim pdf (label = nicL2, flup.rnw:162) + 8 : echo keep.source term verbatim (flup.rnw:193) + 9 : echo keep.source term verbatim (flup.rnw:204) +10 : echo keep.source term verbatim (flup.rnw:211) +11 : echo keep.source term verbatim (flup.rnw:218) +12 : echo keep.source term verbatim (flup.rnw:227) +13 : echo keep.source term verbatim (flup.rnw:250) +14 : echo keep.source term verbatim (flup.rnw:278) +15 : echo keep.source term verbatim (flup.rnw:284) +16 : echo keep.source term verbatim (flup.rnw:308) +17 : echo keep.source term verbatim (flup.rnw:323) +18 : echo keep.source term verbatim (flup.rnw:390) +19 : echo keep.source term verbatim (flup.rnw:405) +20 : echo keep.source term verbatim pdf (label = pr-a, flup.rnw:429) +21 : echo keep.source term verbatim (flup.rnw:445) +22 : echo keep.source term verbatim (flup.rnw:457) +23 : echo keep.source term verbatim (flup.rnw:462) +24 : echo keep.source term verbatim pdf (label = pr-at, flup.rnw:477) +25 : echo keep.source term verbatim (flup.rnw:492) +26 : echo keep.source term verbatim (flup.rnw:504) +27 : echo keep.source term verbatim pdf (label = pr-at-af, flup.rnw:515) +28 : echo keep.source term verbatim (flup.rnw:536) +29 : echo keep.source term verbatim (flup.rnw:562) +30 : echo keep.source term verbatim (flup.rnw:576) +31 : echo keep.source term verbatim (flup.rnw:597) +32 : echo keep.source term verbatim (flup.rnw:624) +33 : echo keep.source term verbatim pdf (label = nic-box, flup.rnw:645) + +You can now run (pdf)latex on ‘flup.tex’ + + --------------------------------------------- + Folder: /home/bendix/stat/R/lib.src/Epi/pkg/vignettes + Ended: Thursday 03. 
May 2018, 16:34:57
+ Elapsed: 00:00:03
+ ---------------------------------------------
diff -Nru r-cran-epi-2.19/vignettes/flup.tex r-cran-epi-2.30/vignettes/flup.tex
--- r-cran-epi-2.19/vignettes/flup.tex	1970-01-01 00:00:00.000000000 +0000
+++ r-cran-epi-2.30/vignettes/flup.tex	2018-05-03 14:34:57.000000000 +0000
@@ -0,0 +1,1339 @@
+
+%\VignetteIndexEntry{Follow-up data with R and Epi}
+\documentclass[a4paper,dvipsnames,twoside,12pt]{report}
+
+\newcommand{\Title}{Follow-up data with \R\ and \texttt{Epi}}
+\newcommand{\Tit}{Follow-up}
+\newcommand{\Version}{\ }
+\newcommand{\Dates}{December 2017}
+\newcommand{\Where}{SDCC}
+\newcommand{\Homepage}{\url{http://bendixcarstensen.com/} }
+\newcommand{\Faculty}{\begin{tabular}{rl}
+Bendix Carstensen
+  & Steno Diabetes Center, Gentofte, Denmark\\
+  & {\small \& Department of Biostatistics,
+      University of Copenhagen} \\
+  & \texttt{b@bxc.dk} \\
+  & \url{http://BendixCarstensen.com} \\[1em]
+  \end{tabular}}
+
+\input{topreport}
+\renewcommand{\rwpre}{./flup}
+
+\chapter{Follow-up data in the \texttt{Epi} package}
+
+In the \texttt{Epi}-package, follow-up data is represented by adding
+some extra variables to a data frame. Such a data frame is called a
+\texttt{Lexis} object. The tools for handling follow-up data then use
+the structure of this for special plots, tabulations etc.
+
+Follow-up data basically consists of a time of entry, a time of exit
+and an indication of the status at exit (normally either ``alive'' or
+``dead''). A status \emph{during} the follow-up (usually ``alive'') is
+also implicitly assumed.
+
+\begin{figure}[htbp]
+  \centering
+\setlength{\unitlength}{1pt}
+\begin{picture}(210,70)(0,75)
+%\scriptsize
+\thicklines
+  \put( 0,80){\makebox(0,0)[r]{Age-scale}}
+  \put( 50,80){\line(1,0){150}}
+  \put( 50,80){\line(0,1){5}}
+  \put(100,80){\line(0,1){5}}
+  \put(150,80){\line(0,1){5}}
+  \put(200,80){\line(0,1){5}}
+  \put( 50,77){\makebox(0,0)[t]{35}}
+  \put(100,77){\makebox(0,0)[t]{40}}
+  \put(150,77){\makebox(0,0)[t]{45}}
+  \put(200,77){\makebox(0,0)[t]{50}}
+
+  \put( 0,115){\makebox(0,0)[r]{Follow-up}}
+
+  \put( 80,105){\makebox(0,0)[r]{\small Two}}
+  \put( 90,105){\line(1,0){87}}
+  \put( 90,100){\line(0,1){10}}
+  \put(100,100){\line(0,1){10}}
+  \put(150,100){\line(0,1){10}}
+  \put(180,105){\circle{6}}
+  \put( 95,110){\makebox(0,0)[b]{1}}
+  \put(125,110){\makebox(0,0)[b]{5}}
+  \put(165,110){\makebox(0,0)[b]{3}}
+
+  \put( 50,130){\makebox(0,0)[r]{\small One}}
+  \put( 60,130){\line(1,0){70}}
+  \put( 60,125){\line(0,1){10}}
+  \put(100,125){\line(0,1){10}}
+  \put(130,130){\circle*{6}}
+  \put( 80,135){\makebox(0,0)[b]{4}}
+  \put(115,135){\makebox(0,0)[b]{3}}
+\end{picture}
+  \caption{\it Follow-up of two persons}
+  \label{fig:fu2}
+\end{figure}
+
+\section{Timescales}
+
+A timescale is a variable that varies deterministically \emph{within}
+each person during follow-up, \textit{e.g.}:
+\begin{itemize}
+  \item Age
+  \item Calendar time
+  \item Time since treatment
+  \item Time since relapse
+\end{itemize}
+All timescales advance at the same pace, so the time followed is the
+same on all timescales. Therefore, it suffices to use only the entry
+point on each of the time scales, for example:
+\begin{itemize}
+  \item Age at entry.
+  \item Date of entry.
+  \item Time since treatment (\emph{at} treatment this is 0).
+  \item Time since relapse (\emph{at} relapse this is 0).
+\end{itemize} +For illustration we need to load the \texttt{Epi} package: +\begin{Schunk} +\begin{Sinput} +> library(Epi) +> print( sessionInfo(), l=F ) +\end{Sinput} +\begin{Soutput} +R version 3.4.4 (2018-03-15) +Platform: x86_64-pc-linux-gnu (64-bit) +Running under: Ubuntu 14.04.5 LTS + +Matrix products: default +BLAS: /usr/lib/openblas-base/libopenblas.so.0 +LAPACK: /usr/lib/lapack/liblapack.so.3.0 + +attached base packages: +[1] utils datasets graphics grDevices stats methods base + +other attached packages: +[1] Epi_2.29 + +loaded via a namespace (and not attached): + [1] cmprsk_2.2-7 zoo_1.8-0 MASS_7.3-49 compiler_3.4.4 + [5] Matrix_1.2-14 plyr_1.8.4 parallel_3.4.4 survival_2.42-3 + [9] etm_1.0.1 Rcpp_0.12.12 splines_3.4.4 grid_3.4.4 +[13] data.table_1.10.4 numDeriv_2016.8-1 lattice_0.20-35 +\end{Soutput} +\end{Schunk} +In the \texttt{Epi} package, follow-up in a cohort is represented in a +\texttt{Lexis} object. A \texttt{Lexis} object is a data frame with a +bit of extra structure representing the follow-up. For the +\texttt{nickel} data we would construct a \texttt{Lexis} object by: +\begin{Schunk} +\begin{Sinput} +> data( nickel ) +> nicL <- Lexis( entry = list( per=agein+dob, ++ age=agein, ++ tfh=agein-age1st ), ++ exit = list( age=ageout ), ++ exit.status = ( icd %in% c(162,163) )*1, ++ data = nickel ) +\end{Sinput} +\begin{Soutput} +NOTE: entry.status has been set to 0 for all. +\end{Soutput} +\end{Schunk} +The \texttt{entry} argument is a \emph{named} list with the entry +points on each of the timescales we want to use. It defines the names +of the timescales and the entry points of the follow-up of each +person. The \texttt{exit} argument gives the exit time on \emph{one} +of the timescales, so the name of the element in this list must match +one of the names of the \texttt{entry} list. This is sufficient, +because the follow-up time on all time scales is the same, in this +case \texttt{ageout - agein}. 
Now take a look at the result: +\begin{Schunk} +\begin{Sinput} +> str( nickel ) +\end{Sinput} +\begin{Soutput} +'data.frame': 679 obs. of 7 variables: + $ id : num 3 4 6 8 9 10 15 16 17 18 ... + $ icd : num 0 162 163 527 150 163 334 160 420 12 ... + $ exposure: num 5 5 10 9 0 2 0 0.5 0 0 ... + $ dob : num 1889 1886 1881 1886 1880 ... + $ age1st : num 17.5 23.2 25.2 24.7 30 ... + $ agein : num 45.2 48.3 53 47.9 54.7 ... + $ ageout : num 93 63.3 54.2 69.7 76.8 ... +\end{Soutput} +\begin{Sinput} +> str( nicL ) +\end{Sinput} +\begin{Soutput} +Classes ‘Lexis’ and 'data.frame': 679 obs. of 14 variables: + $ per : num 1934 1934 1934 1934 1934 ... + $ age : num 45.2 48.3 53 47.9 54.7 ... + $ tfh : num 27.7 25.1 27.7 23.2 24.8 ... + $ lex.dur : num 47.75 15 1.17 21.77 22.1 ... + $ lex.Cst : num 0 0 0 0 0 0 0 0 0 0 ... + $ lex.Xst : num 0 1 1 0 0 1 0 0 0 0 ... + $ lex.id : int 1 2 3 4 5 6 7 8 9 10 ... + $ id : num 3 4 6 8 9 10 15 16 17 18 ... + $ icd : num 0 162 163 527 150 163 334 160 420 12 ... + $ exposure: num 5 5 10 9 0 2 0 0.5 0 0 ... + $ dob : num 1889 1886 1881 1886 1880 ... + $ age1st : num 17.5 23.2 25.2 24.7 30 ... + $ agein : num 45.2 48.3 53 47.9 54.7 ... + $ ageout : num 93 63.3 54.2 69.7 76.8 ... 
+ - attr(*, "time.scales")= chr "per" "age" "tfh"
+ - attr(*, "time.since")= chr "" "" ""
+ - attr(*, "breaks")=List of 3
+  ..$ per: NULL
+  ..$ age: NULL
+  ..$ tfh: NULL
+\end{Soutput}
+\begin{Sinput}
+> head( nicL )
+\end{Sinput}
+\begin{Soutput}
+       per     age     tfh lex.dur lex.Cst lex.Xst lex.id id icd exposure      dob
+1 1934.246 45.2273 27.7465 47.7535       0       0      1  3   0        5 1889.019
+2 1934.246 48.2684 25.0820 15.0028       0       1      2  4 162        5 1885.978
+3 1934.246 52.9917 27.7465  1.1727       0       1      3  6 163       10 1881.255
+4 1934.246 47.9067 23.1861 21.7727       0       0      4  8 527        9 1886.340
+5 1934.246 54.7465 24.7890 22.0977       0       0      5  9 150        0 1879.500
+6 1934.246 44.3314 23.0437 18.2099       0       1      6 10 163        2 1889.915
+   age1st   agein  ageout
+1 17.4808 45.2273 92.9808
+2 23.1864 48.2684 63.2712
+3 25.2452 52.9917 54.1644
+4 24.7206 47.9067 69.6794
+5 29.9575 54.7465 76.8442
+6 21.2877 44.3314 62.5413
+\end{Soutput}
+\end{Schunk}
+The \texttt{Lexis} object \texttt{nicL} has a variable for each
+timescale which is the entry point on this timescale. The follow-up
+time is in the variable \texttt{lex.dur} (\textbf{dur}ation).
+
+There is a \texttt{summary} function for \texttt{Lexis} objects that
+lists the number of transitions and records as well as the total
+amount of follow-up time:
+\begin{Schunk}
+\begin{Sinput}
+> summary( nicL )
+\end{Sinput}
+\begin{Soutput}
+Transitions:
+     To
+From    0   1  Records:  Events: Risk time:  Persons:
+   0  542 137       679      137   15348.06       679
+\end{Soutput}
+\end{Schunk}
+We defined the exit status to be death from lung cancer (ICD7
+162,163), i.e. this variable is 1 if follow-up ended with a death from
+this cause. If follow-up ended alive or by death from another cause,
+the exit status is coded 0, i.e. as a censoring.
+
+Note that the exit status is in the variable \texttt{lex.Xst}
+(e\texttt{X}it \texttt{st}atus). The variable \texttt{lex.Cst} is the
+state where the follow-up takes place (\texttt{C}urrent
+\texttt{st}atus), in this case 0 (alive).
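The censoring convention (exit state equal to current state) means that events and risk time can be tallied directly from these two columns. A small Python sketch with made-up records (illustrative values, not the nickel data) shows the encoding:

```python
# Illustrative follow-up records mimicking the Lexis columns
# lex.Cst (state occupied), lex.Xst (state at exit), lex.dur (time in record).
records = [
    {"Cst": 0, "Xst": 0, "dur": 47.75},  # Cst == Xst: a censoring
    {"Cst": 0, "Xst": 1, "dur": 15.00},  # Cst != Xst: an event (0 -> 1)
]
events    = sum(r["Cst"] != r["Xst"] for r in records)
risk_time = sum(r["dur"] for r in records)
print(events, risk_time)   # 1 event, 62.75 person-years
```

This is exactly the tabulation that `summary()` reports as `Events:` and `Risk time:`.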
+
+It is possible to get a visualization of the follow-up along the
+timescales chosen by using the \texttt{plot} method for \texttt{Lexis}
+objects. \texttt{nicL} is an object of \emph{class} \texttt{Lexis}, so
+using the function \texttt{plot()} on it means that \R\ will look for
+the function \texttt{plot.Lexis} and use this function.
+\begin{Schunk}
+\begin{Sinput}
+> plot( nicL )
+\end{Sinput}
+\end{Schunk}
+The function allows quite a bit of control over the output, and a
+\texttt{points.Lexis} function allows plotting of the endpoints of
+follow-up:
+\begin{Schunk}
+\begin{Sinput}
+> par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 )
+> plot( nicL, 1:2, lwd=1, col=c("blue","red")[(nicL$exp>0)+1],
++       grid=TRUE, lty.grid=1, col.grid=gray(0.7),
++       xlim=1900+c(0,90), xaxs="i",
++       ylim=  10+c(0,90), yaxs="i", las=1 )
+> points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1],
++         col="lightgray", lwd=3, cex=1.5 )
+> points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1],
++         col=c("blue","red")[(nicL$exp>0)+1], lwd=1, cex=1.5 )
+\end{Sinput}
+\end{Schunk}
+The results of these two plotting commands are in figure \ref{fig:Lexis-diagram}.
+\begin{figure}[tb]
+\centering
+\includegraphics[width=0.39\textwidth]{flup-nicL1}
+\includegraphics[width=0.59\textwidth]{flup-nicL2}
+\caption{\it Lexis diagram of the \textrm{\tt nickel} dataset; left
+  panel the default version, right panel with bells and whistles. The
+  red lines are for persons with exposure$>0$, so it is pretty evident
+  that the oldest ones are the exposed part of the cohort.}
+\label{fig:Lexis-diagram}
+\end{figure}
+
+\chapter{Subdividing follow-up for analysis}
+
+\section{Splitting the follow-up time along a timescale}
+
+The follow-up time in a cohort can be subdivided by for example
+current age. This is achieved by the function \texttt{splitLexis} (note
+that it is \emph{not} called \texttt{split.Lexis}). This requires that the
+timescale and the breakpoints on this timescale are supplied.
Try: +\begin{Schunk} +\begin{Sinput} +> nicS1 <- splitLexis( nicL, "age", breaks=seq(0,100,10) ) +> summary( nicL ) +\end{Sinput} +\begin{Soutput} +Transitions: + To +From 0 1 Records: Events: Risk time: Persons: + 0 542 137 679 137 15348.06 679 +\end{Soutput} +\begin{Sinput} +> summary( nicS1 ) +\end{Sinput} +\begin{Soutput} +Transitions: + To +From 0 1 Records: Events: Risk time: Persons: + 0 2073 137 2210 137 15348.06 679 +\end{Soutput} +\end{Schunk} +So we see that the number of events and the amount of follow-up is the +same in the two data sets; only the number of records differ --- the +extra records all have \texttt{lex.Cst}=0 and \texttt{lex.Xst}=0. + +To see how records are split for each individual, it is useful to list +the results for a few individuals: +\begin{Schunk} +\begin{Sinput} +> round( subset( nicS1, id %in% 8:10 ), 2 ) +\end{Sinput} +\begin{Soutput} + lex.id per age tfh lex.dur lex.Cst lex.Xst id icd exposure dob age1st +11 4 1934.25 47.91 23.19 2.09 0 0 8 527 9 1886.34 24.72 +12 4 1936.34 50.00 25.28 10.00 0 0 8 527 9 1886.34 24.72 +13 4 1946.34 60.00 35.28 9.68 0 0 8 527 9 1886.34 24.72 +14 5 1934.25 54.75 24.79 5.25 0 0 9 150 0 1879.50 29.96 +15 5 1939.50 60.00 30.04 10.00 0 0 9 150 0 1879.50 29.96 +16 5 1949.50 70.00 40.04 6.84 0 0 9 150 0 1879.50 29.96 +17 6 1934.25 44.33 23.04 5.67 0 0 10 163 2 1889.91 21.29 +18 6 1939.91 50.00 28.71 10.00 0 0 10 163 2 1889.91 21.29 +19 6 1949.91 60.00 38.71 2.54 0 1 10 163 2 1889.91 21.29 + agein ageout +11 47.91 69.68 +12 47.91 69.68 +13 47.91 69.68 +14 54.75 76.84 +15 54.75 76.84 +16 54.75 76.84 +17 44.33 62.54 +18 44.33 62.54 +19 44.33 62.54 +\end{Soutput} +\end{Schunk} +The resulting object, \texttt{nicS1}, is again a \texttt{Lexis} +object, and so follow-up may be split further along another +timescale. 
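The interval arithmetic behind such a split is simple. The following Python function is a hypothetical sketch (not \texttt{Epi} code; the name is invented here) of what splitting one record along one timescale amounts to:

```python
def split_record(en, ex, breaks):
    """Split one follow-up interval [en, ex) at any of the sorted break
    points falling strictly inside it; events and risk time are
    unchanged, only the number of records grows."""
    pts = [en] + [b for b in breaks if en < b < ex] + [ex]
    return list(zip(pts[:-1], pts[1:]))

# Follow-up from age 47.91 to 69.68 split at decades of age,
# as for lex.id 4 (person 8) above:
print(split_record(47.91, 69.68, range(0, 101, 10)))
# -> [(47.91, 50), (50, 60), (60, 69.68)]
```

Note that the pieces tile the original interval exactly, which is why the total risk time in `summary()` is unchanged by splitting.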
Subsequently we list the results for individuals 8, 9 and
+10 again:
+\begin{Schunk}
+\begin{Sinput}
+> nicS2 <- splitLexis( nicS1, "tfh", breaks=c(0,1,5,10,20,30,100) )
+> round( subset( nicS2, id %in% 8:10 ), 2 )
+\end{Sinput}
+\begin{Soutput}
+   lex.id     per   age   tfh lex.dur lex.Cst lex.Xst id icd exposure     dob age1st
+13      4 1934.25 47.91 23.19    2.09       0       0  8 527        9 1886.34  24.72
+14      4 1936.34 50.00 25.28    4.72       0       0  8 527        9 1886.34  24.72
+15      4 1941.06 54.72 30.00    5.28       0       0  8 527        9 1886.34  24.72
+16      4 1946.34 60.00 35.28    9.68       0       0  8 527        9 1886.34  24.72
+17      5 1934.25 54.75 24.79    5.21       0       0  9 150        0 1879.50  29.96
+18      5 1939.46 59.96 30.00    0.04       0       0  9 150        0 1879.50  29.96
+19      5 1939.50 60.00 30.04   10.00       0       0  9 150        0 1879.50  29.96
+20      5 1949.50 70.00 40.04    6.84       0       0  9 150        0 1879.50  29.96
+21      6 1934.25 44.33 23.04    5.67       0       0 10 163        2 1889.91  21.29
+22      6 1939.91 50.00 28.71    1.29       0       0 10 163        2 1889.91  21.29
+23      6 1941.20 51.29 30.00    8.71       0       0 10 163        2 1889.91  21.29
+24      6 1949.91 60.00 38.71    2.54       0       1 10 163        2 1889.91  21.29
+   agein ageout
+13 47.91  69.68
+14 47.91  69.68
+15 47.91  69.68
+16 47.91  69.68
+17 54.75  76.84
+18 54.75  76.84
+19 54.75  76.84
+20 54.75  76.84
+21 44.33  62.54
+22 44.33  62.54
+23 44.33  62.54
+24 44.33  62.54
+\end{Soutput}
+\end{Schunk}
+A more efficient (and more intuitive) way of making this double split
+is to use the \texttt{splitMulti} function from the \texttt{popEpi}
+package:
+\begin{Schunk}
+\begin{Sinput}
+> library( popEpi )
+> nicM <- splitMulti( nicL, age = seq(0,100,10),
++                           tfh = c(0,1,5,10,20,30,100) )
+> summary( nicS2 )
+\end{Sinput}
+\begin{Soutput}
+Transitions:
+     To
+From     0   1  Records:  Events: Risk time:  Persons:
+   0  2992 137      3129      137   15348.06       679
+\end{Soutput}
+\begin{Sinput}
+> summary( nicM )
+\end{Sinput}
+\begin{Soutput}
+Transitions:
+     To
+From     0   1  Records:  Events: Risk time:  Persons:
+   0  2992 137      3129      137   15348.06       679
+\end{Soutput}
+\end{Schunk}
+So we see that the two ways of splitting data yield the same amount of
+follow-up, but the results are
not identical: +\begin{Schunk} +\begin{Sinput} +> identical( nicS2, nicM ) +\end{Sinput} +\begin{Soutput} +[1] FALSE +\end{Soutput} +\begin{Sinput} +> class( nicS2 ) +\end{Sinput} +\begin{Soutput} +[1] "Lexis" "data.frame" +\end{Soutput} +\begin{Sinput} +> class( nicM ) +\end{Sinput} +\begin{Soutput} +[1] "Lexis" "data.table" "data.frame" +\end{Soutput} +\end{Schunk} +As we see, this is because the \texttt{nicM} object is also a +\texttt{data.table} object; \texttt{splitMulti} uses the +\texttt{data.table} machinery, which makes the splitting substantially +faster --- this is of particular interest if you operate on large data +sets ($>1,000,000$ records). + +Thus the recommended way of splitting follow-up time is by +\texttt{splitMulti}. But you should be aware that the result is a +\texttt{data.table} object, which in some circumstances behaves +slightly differently from a \texttt{data.frame}. See the manual for +\texttt{data.table}. + +\subsection{Time scales as covariates} + +If we want to model the effect of these timescale variables on +occurrence rates, we will for each interval use the value of the +timescale at either the left endpoint or the middle of the interval. 
There is a function +\texttt{timeBand} which returns either of these: +\begin{Schunk} +\begin{Sinput} +> timeBand( nicM, "age", "middle" )[1:20] +\end{Sinput} +\begin{Soutput} + [1] 45 45 55 65 75 85 95 45 55 55 65 55 45 55 55 65 55 55 65 75 +\end{Soutput} +\begin{Sinput} +> # For nice printing and column labelling use the data.frame() function: +> data.frame( nicS2[,c("lex.id","per","age","tfh","lex.dur")], ++ mid.age=timeBand( nicS2, "age", "middle" ), ++ mid.t=timeBand( nicS2, "tfh", "middle" ), ++ left.t=timeBand( nicS2, "tfh", "left" ), ++ right.t=timeBand( nicS2, "tfh", "right" ), ++ fact.t=timeBand( nicS2, "tfh", "factor" ) )[1:20,] +\end{Sinput} +\begin{Soutput} + lex.id per age tfh lex.dur mid.age mid.t left.t right.t fact.t +1 1 1934.246 45.2273 27.7465 2.2535 45 25 20 30 (20,30] +2 1 1936.500 47.4808 30.0000 2.5192 45 65 30 100 (30,100] +3 1 1939.019 50.0000 32.5192 10.0000 55 65 30 100 (30,100] +4 1 1949.019 60.0000 42.5192 10.0000 65 65 30 100 (30,100] +5 1 1959.019 70.0000 52.5192 10.0000 75 65 30 100 (30,100] +6 1 1969.019 80.0000 62.5192 10.0000 85 65 30 100 (30,100] +7 1 1979.019 90.0000 72.5192 2.9808 95 65 30 100 (30,100] +8 2 1934.246 48.2684 25.0820 1.7316 45 25 20 30 (20,30] +9 2 1935.978 50.0000 26.8136 3.1864 55 25 20 30 (20,30] +10 2 1939.164 53.1864 30.0000 6.8136 55 65 30 100 (30,100] +11 2 1945.978 60.0000 36.8136 3.2712 65 65 30 100 (30,100] +12 3 1934.246 52.9917 27.7465 1.1727 55 25 20 30 (20,30] +13 4 1934.246 47.9067 23.1861 2.0933 45 25 20 30 (20,30] +14 4 1936.340 50.0000 25.2794 4.7206 55 25 20 30 (20,30] +15 4 1941.060 54.7206 30.0000 5.2794 55 65 30 100 (30,100] +16 4 1946.340 60.0000 35.2794 9.6794 65 65 30 100 (30,100] +17 5 1934.246 54.7465 24.7890 5.2110 55 25 20 30 (20,30] +18 5 1939.457 59.9575 30.0000 0.0425 55 65 30 100 (30,100] +19 5 1939.500 60.0000 30.0425 10.0000 65 65 30 100 (30,100] +20 5 1949.500 70.0000 40.0425 6.8442 75 65 30 100 (30,100] +\end{Soutput} +\end{Schunk} +Note that these are characteristics of the 
intervals defined by +\texttt{breaks=}, \emph{not} the midpoints or the left or right endpoints +of the actual follow-up intervals (which would be \texttt{tfh} and +\texttt{tfh+lex.dur}, respectively). + +These functions are intended for modeling timescale variables as +factors (categorical variables), in which case the coding must be +independent of the censoring and mortality pattern --- it should only +depend on the chosen grouping of the timescale. Modeling timescales as +\emph{quantitative} variables should not be based on these codings but directly +on the values of the time-scale variables. + +\subsection{Differences between time scales} + +The midpoint (as well as the left and right interval endpoint) should +be used with caution if the variable \texttt{age1st} is modeled too; +the age at hire is logically equal to the difference between current +age (\texttt{age}) and time since hire (\texttt{tfh}): +\begin{Schunk} +\begin{Sinput} +> summary( (nicS2$age-nicS2$tfh) - nicS2$age1st ) +\end{Sinput} +\begin{Soutput} + Min. 1st Qu. Median Mean 3rd Qu. Max. +-7.105e-15 0.000e+00 0.000e+00 2.214e-17 0.000e+00 7.105e-15 +\end{Soutput} +\end{Schunk} +This calculation refers to the \emph{start} of each interval --- the +time scale variables in the \texttt{Lexis} object. But when using the +middle of the intervals, this relationship is not preserved: +\begin{Schunk} +\begin{Sinput} +> summary( timeBand( nicS2, "age", "middle" ) - ++ timeBand( nicS2, "tfh", "middle" ) - nicS2$age1st ) +\end{Sinput} +\begin{Soutput} + Min. 1st Qu. Median Mean 3rd Qu. Max. +-39.958 -24.178 -5.103 -10.129 2.575 12.519 +\end{Soutput} +\end{Schunk} +If all three variables are to be included in a model, you must make +sure that the \emph{substantial} relationship between the variables is +maintained. 
One way is to recompute age at first hire from the two +midpoint variables, but more straightforward would be to use the left +endpoint of the intervals, that is the time scale variables in the +\texttt{Lexis} object. The latter approach however requires that the +follow-up is split in fairly small chunks. + +\section{Cutting follow-up time at a specific date} + +If we have a recording of the date of a specific event, as for example +recovery or relapse, we may classify follow-up time as being before or +after this intermediate event, but this requires that follow-up records +that straddle the event be cut into two records. This is achieved with +the function \texttt{cutLexis}, which takes three arguments: the time +point, the timescale, and the value of the (new) state following the +date. + +Now we define the age at which the cumulative exposure for the nickel +workers exceeds 50 exposure-years: +\begin{Schunk} +\begin{Sinput} +> subset( nicL, id %in% 8:10 ) +\end{Sinput} +\begin{Soutput} + per age tfh lex.dur lex.Cst lex.Xst lex.id id icd exposure dob +4 1934.246 47.9067 23.1861 21.7727 0 0 4 8 527 9 1886.340 +5 1934.246 54.7465 24.7890 22.0977 0 0 5 9 150 0 1879.500 +6 1934.246 44.3314 23.0437 18.2099 0 1 6 10 163 2 1889.915 + age1st agein ageout +4 24.7206 47.9067 69.6794 +5 29.9575 54.7465 76.8442 +6 21.2877 44.3314 62.5413 +\end{Soutput} +\begin{Sinput} +> agehi <- nicL$age1st + 50 / nicL$exposure +> nicC <- cutLexis( data = nicL, ++ cut = agehi, ++ timescale = "age", ++ new.state = 2, ++ precursor.states = 0 ) +> subset( nicC, id %in% 8:10 ) +\end{Sinput} +\begin{Soutput} + per age tfh lex.dur lex.Cst lex.Xst lex.id id icd exposure dob +683 1934.246 47.9067 23.1861 21.7727 2 2 4 8 527 9 1886.340 +5 1934.246 54.7465 24.7890 22.0977 0 0 5 9 150 0 1879.500 +6 1934.246 44.3314 23.0437 1.9563 0 2 6 10 163 2 1889.915 +685 1936.203 46.2877 25.0000 16.2536 2 1 6 10 163 2 1889.915 + age1st agein ageout +683 24.7206 47.9067 69.6794 +5 29.9575 54.7465 76.8442 +6 21.2877 
44.3314 62.5413 +685 21.2877 44.3314 62.5413 +\end{Soutput} +\end{Schunk} +(The \texttt{precursor.states=} argument is explained below). Note +that individual 6 has had his follow-up split at 25 years since hire +where 50 exposure-years were attained. This could also have been +achieved in the split dataset \texttt{nicS2} instead of \texttt{nicL}, +try: +\begin{Schunk} +\begin{Sinput} +> subset( nicS2, id %in% 8:10 ) +\end{Sinput} +\begin{Soutput} + lex.id per age tfh lex.dur lex.Cst lex.Xst id icd exposure dob +13 4 1934.246 47.9067 23.1861 2.0933 0 0 8 527 9 1886.340 +14 4 1936.340 50.0000 25.2794 4.7206 0 0 8 527 9 1886.340 +15 4 1941.060 54.7206 30.0000 5.2794 0 0 8 527 9 1886.340 +16 4 1946.340 60.0000 35.2794 9.6794 0 0 8 527 9 1886.340 +17 5 1934.246 54.7465 24.7890 5.2110 0 0 9 150 0 1879.500 +18 5 1939.457 59.9575 30.0000 0.0425 0 0 9 150 0 1879.500 +19 5 1939.500 60.0000 30.0425 10.0000 0 0 9 150 0 1879.500 +20 5 1949.500 70.0000 40.0425 6.8442 0 0 9 150 0 1879.500 +21 6 1934.246 44.3314 23.0437 5.6686 0 0 10 163 2 1889.915 +22 6 1939.915 50.0000 28.7123 1.2877 0 0 10 163 2 1889.915 +23 6 1941.203 51.2877 30.0000 8.7123 0 0 10 163 2 1889.915 +24 6 1949.915 60.0000 38.7123 2.5413 0 1 10 163 2 1889.915 + age1st agein ageout +13 24.7206 47.9067 69.6794 +14 24.7206 47.9067 69.6794 +15 24.7206 47.9067 69.6794 +16 24.7206 47.9067 69.6794 +17 29.9575 54.7465 76.8442 +18 29.9575 54.7465 76.8442 +19 29.9575 54.7465 76.8442 +20 29.9575 54.7465 76.8442 +21 21.2877 44.3314 62.5413 +22 21.2877 44.3314 62.5413 +23 21.2877 44.3314 62.5413 +24 21.2877 44.3314 62.5413 +\end{Soutput} +\begin{Sinput} +> agehi <- nicS2$age1st + 50 / nicS2$exposure +> nicS2C <- cutLexis( data = nicS2, ++ cut = agehi, ++ timescale = "age", ++ new.state = 2, ++ precursor.states = 0 ) +> subset( nicS2C, id %in% 8:10 ) +\end{Sinput} +\begin{Soutput} + lex.id per age tfh lex.dur lex.Cst lex.Xst id icd exposure dob +3142 4 1934.246 47.9067 23.1861 2.0933 2 2 8 527 9 1886.340 +3143 4 1936.340 
50.0000 25.2794 4.7206 2 2 8 527 9 1886.340 +3144 4 1941.060 54.7206 30.0000 5.2794 2 2 8 527 9 1886.340 +3145 4 1946.340 60.0000 35.2794 9.6794 2 2 8 527 9 1886.340 +17 5 1934.246 54.7465 24.7890 5.2110 0 0 9 150 0 1879.500 +18 5 1939.457 59.9575 30.0000 0.0425 0 0 9 150 0 1879.500 +19 5 1939.500 60.0000 30.0425 10.0000 0 0 9 150 0 1879.500 +20 5 1949.500 70.0000 40.0425 6.8442 0 0 9 150 0 1879.500 +21 6 1934.246 44.3314 23.0437 1.9563 0 2 10 163 2 1889.915 +3150 6 1936.203 46.2877 25.0000 3.7123 2 2 10 163 2 1889.915 +3151 6 1939.915 50.0000 28.7123 1.2877 2 2 10 163 2 1889.915 +3152 6 1941.203 51.2877 30.0000 8.7123 2 2 10 163 2 1889.915 +3153 6 1949.915 60.0000 38.7123 2.5413 2 1 10 163 2 1889.915 + age1st agein ageout +3142 24.7206 47.9067 69.6794 +3143 24.7206 47.9067 69.6794 +3144 24.7206 47.9067 69.6794 +3145 24.7206 47.9067 69.6794 +17 29.9575 54.7465 76.8442 +18 29.9575 54.7465 76.8442 +19 29.9575 54.7465 76.8442 +20 29.9575 54.7465 76.8442 +21 21.2877 44.3314 62.5413 +3150 21.2877 44.3314 62.5413 +3151 21.2877 44.3314 62.5413 +3152 21.2877 44.3314 62.5413 +3153 21.2877 44.3314 62.5413 +\end{Soutput} +\end{Schunk} +The same results would have emerged if we had used the \texttt{nicM} +dataset (the \texttt{data.table} object). Mathematicians would say +that \texttt{splitLexis} and \texttt{cutLexis} are commutative. + +Note that follow-up subsequent to the event is classified as being in +state 2, but that the final transition to state 1 (death from lung +cancer) is preserved. This is the point of the +\texttt{precursor.states=} argument. It names the states (in this case +0, ``Alive'') that will be over-written by \texttt{new.state} (in this +case state 2, ``High exposure''), while state 1 (``Dead'') should not +be updated even if it is after the time where the person moves to +state 2. In other words, only state 0 is a precursor to state 2; state +1 is always subsequent to state 2. Even if you are at a high exposure +level, death is pretty final. 
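+This bookkeeping can be verified directly; the following is a sketch (not part of the original vignette output), assuming the objects \texttt{nicL} and \texttt{nicC} from above: +\begin{Schunk} +\begin{Sinput} +> # sketch: cutting preserves deaths and total risk time, it only +> # reallocates person-time between states 0 (Alive) and 2 (High exposure) +> c( sum( nicL$lex.Xst==1 ), sum( nicC$lex.Xst==1 ) ) +> c( sum( nicL$lex.dur ), sum( nicC$lex.dur ) ) +> with( nicC, tapply( lex.dur, lex.Cst, sum ) ) +\end{Sinput} +\end{Schunk}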
+ +If the intermediate event is to be used as a time-dependent variable +in a Cox-model, then \texttt{lex.Cst} should be used as the +time-dependent variable, and \texttt{lex.Xst==1} as the event. + +\chapter{Modeling rates} + +\section{Background} + +The purpose of subdividing follow-up data is to be able to model the +effects of the time scale variables as parametric functions. + +In a model that assumes a constant occurrence rate in each of the +intervals, the likelihood contribution from each interval is the same +as the likelihood contribution from a Poisson variate $D$, say, with +mean $\lambda\ell$ where $\lambda$ is the rate and $\ell$ is the +interval length, and where the value of the variate $D$ is 1 or 0 +according to whether an event has occurred or not. Moreover, the +likelihood contributions from all follow-up intervals from a single +person are \emph{conditionally} independent (conditional on having +survived till the start of the interval in question). This implies +that the total contribution to the likelihood from a single person is +a product of terms, and hence the same as the likelihood of a number +of independent Poisson terms, one from each interval. + +Parametric modeling of the rates is obtained by using the \emph{value} +of the timescale for each interval as quantitative explanatory +variables, using for example splines. Thus the model will be one where +the rate is assumed constant in each interval, but where a parametric +form of the \emph{size} of the rate in each interval is imposed by the +model, using the timescale as a covariate. + +\section{Practicalities} + +In the nickel worker study we might want to look at the effects of age +and time since hire. If we want to use splines we must allocate knots +for anchoring the splines at each of the time scales, either by some +\textit{ad hoc} method or by using some sort of penalized splines. The +latter will not be treated here. 
+ +Here we shall use the former approach and allocate 5 knots on each of +the two time-scales. We allocate knots so that we have the events evenly +distributed between the knots: +\begin{Schunk} +\begin{Sinput} +> ( a.kn <- with( subset( nicM, lex.Xst==1 ), quantile( age+lex.dur, (1:5-0.5)/5 ) ) ) +\end{Sinput} +\begin{Soutput} + 10% 30% 50% 70% 90% +50.11874 55.61674 61.09590 64.88704 73.32220 +\end{Soutput} +\begin{Sinput} +> ( t.kn <- with( subset( nicM, lex.Xst==1 ), quantile( tfh+lex.dur, (1:5-0.5)/5 ) ) ) +\end{Sinput} +\begin{Soutput} + 10% 30% 50% 70% 90% +24.25572 30.02202 34.00440 39.84592 45.95512 +\end{Soutput} +\end{Schunk} +In the \texttt{Epi} package there is a convenience wrapper for the +\texttt{n}atural \texttt{s}pline generator \texttt{ns}, \texttt{Ns}, +that takes the smallest and the largest of a set of supplied knots to +be the boundary knots. + +\section{Models for rates} + +\subsection{One time scale} + +A model that only models lung cancer mortality +rates as a function of age would then be: +\begin{Schunk} +\begin{Sinput} +> ma <- glm( (lex.Xst==1) ~ Ns(age,knots=a.kn), ++ family = poisson, ++ offset = log(lex.dur), ++ data = nicM ) +> summary( ma ) +\end{Sinput} +\begin{Soutput} +Call: +glm(formula = (lex.Xst == 1) ~ Ns(age, knots = a.kn), family = poisson, + data = nicM, offset = log(lex.dur)) + +Deviance Residuals: + Min 1Q Median 3Q Max +-0.5074 -0.3896 -0.2143 -0.1203 3.7904 + +Coefficients: + Estimate Std. 
Error z value Pr(>|z|) +(Intercept) -4.6591 0.1324 -35.199 < 2e-16 +Ns(age, knots = a.kn)1 0.1671 0.2970 0.563 0.57371 +Ns(age, knots = a.kn)2 -0.1315 0.3727 -0.353 0.72411 +Ns(age, knots = a.kn)3 0.7827 0.2885 2.713 0.00667 +Ns(age, knots = a.kn)4 -0.3717 0.2780 -1.337 0.18125 + +(Dispersion parameter for poisson family taken to be 1) + + Null deviance: 1024.38 on 3128 degrees of freedom +Residual deviance: 979.16 on 3124 degrees of freedom +AIC: 1263.2 + +Number of Fisher Scoring iterations: 7 +\end{Soutput} +\end{Schunk} +The offset, \texttt{log(lex.dur)}, comes from the fact that the +likelihood for the follow-up data is the same as that for independent +Poisson variates with mean $\lambda\ell$, and that the default link +function for the Poisson family is the log, so that we are using a +linear model for the log-mean, that is $\log(\lambda) + \log(\ell)$. +But when we want a model for the log-rate ($\log(\lambda)$), the term +$\log(\ell)$ must be included as a covariate with regression +coefficient fixed to 1; a so-called offset. + +The parameters from the model do not have any direct interpretation +\textit{per se}, but we can compute the estimated lung cancer incidence +rates for a range of ages using \texttt{ci.pred} with a suitably +defined prediction data frame. Note that we must specify all +covariates in the model, including the variable in the offset, +\texttt{lex.dur}. We set the latter to 1000, because we want the lung +cancer mortality rates per 1000 PY. By default \texttt{ci.pred} yields a +prediction on the response-scale, that is the rate-scale: +\begin{Schunk} +\begin{Sinput} +> nd <- data.frame( age=40:85, lex.dur=1000 ) +> pr.a <- ci.pred( ma, newdata = nd ) +> matplot( nd$age, pr.a, ++ type="l", lty=1, col=1, lwd=c(3,1,1), ++ log="y", xlab="Age (years)", ++ ylab="Lung cancer mortality per 1000 PY") +\end{Sinput} +\end{Schunk} +\insfig{pr-a}{0.8}{Lung cancer mortality among Nickel smelter workers + by age. 
We see that the rates increase till about 55 years, and from +then on are approximately flat.} + +\subsection{More time scales} + +There may, however, also be an effect of time since hire, so we can +add this term to the model: +\begin{Schunk} +\begin{Sinput} +> mat <- update( ma, . ~ . + Ns(tfh,knots=t.kn) ) +> summary( mat ) +\end{Sinput} +\begin{Soutput} +Call: +glm(formula = (lex.Xst == 1) ~ Ns(age, knots = a.kn) + Ns(tfh, + knots = t.kn), family = poisson, data = nicM, offset = log(lex.dur)) + +Deviance Residuals: + Min 1Q Median 3Q Max +-0.6308 -0.3730 -0.2170 -0.1180 3.8903 + +Coefficients: + Estimate Std. Error z value Pr(>|z|) +(Intercept) -4.65125 0.14844 -31.335 <2e-16 +Ns(age, knots = a.kn)1 0.19029 0.32601 0.584 0.5594 +Ns(age, knots = a.kn)2 0.04239 0.40857 0.104 0.9174 +Ns(age, knots = a.kn)3 0.87848 0.37395 2.349 0.0188 +Ns(age, knots = a.kn)4 0.08124 0.37567 0.216 0.8288 +Ns(tfh, knots = t.kn)1 0.05961 0.45702 0.130 0.8962 +Ns(tfh, knots = t.kn)2 -0.30254 0.39214 -0.771 0.4404 +Ns(tfh, knots = t.kn)3 -0.08144 0.37493 -0.217 0.8281 +Ns(tfh, knots = t.kn)4 -0.63400 0.34055 -1.862 0.0626 + +(Dispersion parameter for poisson family taken to be 1) + + Null deviance: 1024.4 on 3128 degrees of freedom +Residual deviance: 970.7 on 3120 degrees of freedom +AIC: 1262.7 + +Number of Fisher Scoring iterations: 7 +\end{Soutput} +\end{Schunk} +This model has two time scales, age and time since hire, so it makes +little sense to report the effect of age for a \emph{fixed} value of +time since hire --- the time since hire increases with age. +Instead we can show the mortality rates for persons hired at different +ages, and report the \emph{joint} effect of increasing age and time +since hire. + +In order to get a feeling for the values that can be used we look at \texttt{age1st}: +\begin{Schunk} +\begin{Sinput} +> summary( nickel$age1st ) +\end{Sinput} +\begin{Soutput} + Min. 1st Qu. Median Mean 3rd Qu. Max. 
+ 10.78 21.80 26.16 26.74 30.63 52.19 +\end{Soutput} +\end{Schunk} +Thus we shall show mortality rates in ages 20--90 for persons hired in +ages 15, 25, 35 and 45: +\begin{Schunk} +\begin{Sinput} +> nd <- data.frame( expand.grid( age=c(20:90,NA), age1st=seq(15,45,10) ) ) +> nd <- transform( nd, tfh = ifelse( age > age1st, age-age1st, NA ), ++ lex.dur = 1000 ) +> # makes no sense to have age < age1st +> nd <- transform( nd, age = ifelse( age > age1st, age, NA ) ) +> head( nd ) +\end{Sinput} +\begin{Soutput} + age age1st tfh lex.dur +1 20 15 5 1000 +2 21 15 6 1000 +3 22 15 7 1000 +4 23 15 8 1000 +5 24 15 9 1000 +6 25 15 10 1000 +\end{Soutput} +\end{Schunk} +With this in place we can plot the estimated rates as before, only +now we will get 4 separate lines. The purpose of inserting an +\texttt{NA} on the age-scale in the \texttt{expand.grid} is that the +different instances of \texttt{age1st} are separated by \texttt{NA}s, and +hence will not be connected when we plot the curves. The downside of +this trick is that lines cannot be plotted with different colors or +symbols. +\begin{Schunk} +\begin{Sinput} +> pr.at <- ci.pred( mat, newdata = nd ) +> matplot( nd$age, pr.at, ++ type="l", lty=1, col=1, lwd=c(3,1,1), ++ log="y", xlab="Age (years)", ++ ylab="Lung cancer mortality per 1000 PY") +\end{Sinput} +\end{Schunk} +\insfig{pr-at}{0.8}{Lung cancer mortality among Nickel smelter workers + by age and age at hire 15, 25, 35 and 45. Each line (except the + first) starts at the age of hire; we see that the later in life you + are hired, the smaller the initial risk, but the higher the eventual + risk of lung cancer death.} + +We can check whether the effect of time since hire actually +improves the model: +\begin{Schunk} +\begin{Sinput} +> anova( ma, mat, test="Chisq" ) +\end{Sinput} +\begin{Soutput} +Analysis of Deviance Table + +Model 1: (lex.Xst == 1) ~ Ns(age, knots = a.kn) +Model 2: (lex.Xst == 1) ~ Ns(age, knots = a.kn) + Ns(tfh, knots = t.kn) + Resid. Df Resid. 
Dev Df Deviance Pr(>Chi) +1 3124 979.16 +2 3120 970.70 4 8.4626 0.07603 +\end{Soutput} +\end{Schunk} +We see a pretty strong indication that this is the case. + +\subsection{Difference between time scales} + +However, it might be the case that it really is the age at first hire +that is the main determinant (recall that +$\mathtt{age}-\mathtt{tfh}=\mathtt{age1st}$), so we could fit a model +with this variable instead --- a model with only one timescale, namely +\texttt{age}. +\begin{Schunk} +\begin{Sinput} +> ( f.kn <- with( subset( nicM, lex.Xst==1 ), quantile( age1st, (1:5-0.5)/5 ) ) ) +\end{Sinput} +\begin{Soutput} + 10% 30% 50% 70% 90% +20.25860 22.55422 26.00000 28.36578 33.96910 +\end{Soutput} +\begin{Sinput} +> maf <- update( ma, . ~ . + Ns(age1st,knots=f.kn) ) +> summary( maf ) +\end{Sinput} +\begin{Soutput} +Call: +glm(formula = (lex.Xst == 1) ~ Ns(age, knots = a.kn) + Ns(age1st, + knots = f.kn), family = poisson, data = nicM, offset = log(lex.dur)) + +Deviance Residuals: + Min 1Q Median 3Q Max +-0.5696 -0.3671 -0.2257 -0.1197 3.7777 + +Coefficients: + Estimate Std. 
Error z value Pr(>|z|) +(Intercept) -4.62646 0.17564 -26.340 < 2e-16 +Ns(age, knots = a.kn)1 0.21589 0.29742 0.726 0.46792 +Ns(age, knots = a.kn)2 -0.06427 0.37653 -0.171 0.86446 +Ns(age, knots = a.kn)3 0.79456 0.29345 2.708 0.00678 +Ns(age, knots = a.kn)4 -0.31305 0.27976 -1.119 0.26314 +Ns(age1st, knots = f.kn)1 -0.15145 0.38279 -0.396 0.69237 +Ns(age1st, knots = f.kn)2 0.04607 0.27980 0.165 0.86923 +Ns(age1st, knots = f.kn)3 0.26374 0.26156 1.008 0.31331 +Ns(age1st, knots = f.kn)4 -0.22878 0.23117 -0.990 0.32234 + +(Dispersion parameter for poisson family taken to be 1) + + Null deviance: 1024.4 on 3128 degrees of freedom +Residual deviance: 973.2 on 3120 degrees of freedom +AIC: 1265.2 + +Number of Fisher Scoring iterations: 7 +\end{Soutput} +\begin{Sinput} +> anova( maf, ma, mat, test="Chisq" ) +\end{Sinput} +\begin{Soutput} +Analysis of Deviance Table + +Model 1: (lex.Xst == 1) ~ Ns(age, knots = a.kn) + Ns(age1st, knots = f.kn) +Model 2: (lex.Xst == 1) ~ Ns(age, knots = a.kn) +Model 3: (lex.Xst == 1) ~ Ns(age, knots = a.kn) + Ns(tfh, knots = t.kn) + Resid. Df Resid. Dev Df Deviance Pr(>Chi) +1 3120 973.20 +2 3124 979.16 -4 -5.9624 0.20198 +3 3120 970.70 4 8.4626 0.07603 +\end{Soutput} +\end{Schunk} +We see that there is much less indication that the age at first hire has +an effect. + +For the sake of completeness we can draw the predicted values from the +\texttt{maf} model on top of the ones from the \texttt{mat} model: +\begin{Schunk} +\begin{Sinput} +> pr.af <- ci.pred( maf, newdata = nd ) +> matplot( nd$age, pr.at, ++ type="l", lty=1, col=1, lwd=c(3,1,1), ++ log="y", xlab="Age (years)", ++ ylab="Lung cancer mortality per 1000 PY") +> matlines( nd$age, pr.af, ++ type="l", lty=1, col=2, lwd=c(3,0,0) ) +\end{Sinput} +\end{Schunk} +\insfig{pr-at-af}{0.8}{Lung cancer mortality among Nickel smelter + workers by age and age at hire 15, 25, 35 and 45. 
Each line (except + the first) starts at the age of hire; we see that the later in life + you are hired, the smaller the initial risk, but the higher the + eventual risk of lung cancer death. The red lines are from the model + \textrm{\tt maf} where the lines are constrained to be parallel, and + which gives a worse fit to data.} + +\subsection{The complete picture --- exercise} + +We could fit the remaining models where one or more of the three +variables are included, and compare all of them: +\begin{Schunk} +\begin{Sinput} +> maft <- update( mat, . ~ . + Ns(age1st,knots=f.kn) ) +> summary( maft ) +\end{Sinput} +\begin{Soutput} +Call: +glm(formula = (lex.Xst == 1) ~ Ns(age, knots = a.kn) + Ns(tfh, + knots = t.kn) + Ns(age1st, knots = f.kn), family = poisson, + data = nicM, offset = log(lex.dur)) + +Deviance Residuals: + Min 1Q Median 3Q Max +-0.5899 -0.3579 -0.2224 -0.1185 3.8687 + +Coefficients: (1 not defined because of singularities) + Estimate Std. Error z value Pr(>|z|) +(Intercept) -4.71537 0.16481 -28.612 <2e-16 +Ns(age, knots = a.kn)1 0.01671 0.35152 0.048 0.9621 +Ns(age, knots = a.kn)2 -0.11682 0.44638 -0.262 0.7935 +Ns(age, knots = a.kn)3 0.47689 0.50638 0.942 0.3463 +Ns(age, knots = a.kn)4 -0.18241 0.47318 -0.385 0.6999 +Ns(tfh, knots = t.kn)1 0.35272 0.51329 0.687 0.4920 +Ns(tfh, knots = t.kn)2 -0.11034 0.43043 -0.256 0.7977 +Ns(tfh, knots = t.kn)3 0.26874 0.49133 0.547 0.5844 +Ns(tfh, knots = t.kn)4 -0.30302 0.43585 -0.695 0.4869 +Ns(age1st, knots = f.kn)1 -0.10650 0.37476 -0.284 0.7763 +Ns(age1st, knots = f.kn)2 0.17245 0.20063 0.860 0.3900 +Ns(age1st, knots = f.kn)3 0.47357 0.24239 1.954 0.0507 +Ns(age1st, knots = f.kn)4 NA NA NA NA + +(Dispersion parameter for poisson family taken to be 1) + + Null deviance: 1024.38 on 3128 degrees of freedom +Residual deviance: 966.31 on 3117 degrees of freedom +AIC: 1264.3 + +Number of Fisher Scoring iterations: 7 +\end{Soutput} +\begin{Sinput} +> mft <- update( maft, . ~ . 
- Ns(age,knots=a.kn) ) +> mf <- update( maf , . ~ . - Ns(age,knots=a.kn) ) +> mt <- update( mat , . ~ . - Ns(age,knots=a.kn) ) +> allp <- anova( maft, mat, ma, maf, mf, mft, mt, mat, ++ maf, maft, mft, ++ test="Chisq" ) +> mall <- as.matrix( allp ) +> cbind( mod = c("maft","mat","ma","maf","mf","mft","mt","mat","maf","maft","mft"), ++ round( allp[,1:5], 3 ) ) +\end{Sinput} +\begin{Soutput} + mod Resid. Df Resid. Dev Df Deviance Pr(>Chi) +1 maft 3117 966.306 NA NA NA +2 mat 3120 970.697 -3 -4.391 0.222 +3 ma 3124 979.160 -4 -8.463 0.076 +4 maf 3120 973.197 4 5.962 0.202 +5 mf 3124 1011.593 -4 -38.396 0.000 +6 mft 3120 971.120 4 40.473 0.000 +7 mt 3124 985.734 -4 -14.614 0.006 +8 mat 3120 970.697 4 15.037 0.005 +9 maf 3120 973.197 0 -2.500 NA +10 maft 3117 966.306 3 6.892 0.075 +11 mft 3120 971.120 -3 -4.814 0.186 +\end{Soutput} +\end{Schunk} +\begin{enumerate} +\item Explain why there are \texttt{NA}s among the parameters in the + model \texttt{maft}. +\item Draw a graph (a ``DAG'') with the models as nodes and the tests + as edges, put the p-values on the edges and use the result to + argue that the model with age and time since hire is actually the + most sensible description in this case. +\end{enumerate} + +\chapter{Competing risks --- multiple types of events} + +If we want to consider death from lung cancer and death from other +causes as separate events we can code these as for example 2 and 1. +\begin{Schunk} +\begin{Sinput} +> data( nickel ) +> nicL <- Lexis( entry = list( per = agein+dob, ++ age = agein, ++ tfh = agein-age1st ), ++ exit = list( age = ageout ), ++ exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ), ++ data = nickel ) +\end{Sinput} +\begin{Soutput} +NOTE: entry.status has been set to 0 for all. 
+\end{Soutput} +\begin{Sinput} +> summary( nicL ) +\end{Sinput} +\begin{Soutput} +Transitions: + To +From 0 1 2 Records: Events: Risk time: Persons: + 0 47 495 137 679 632 15348.06 679 +\end{Soutput} +\begin{Sinput} +> subset( nicL, id %in% 8:10 ) +\end{Sinput} +\begin{Soutput} + per age tfh lex.dur lex.Cst lex.Xst lex.id id icd exposure dob +4 1934.246 47.9067 23.1861 21.7727 0 1 4 8 527 9 1886.340 +5 1934.246 54.7465 24.7890 22.0977 0 1 5 9 150 0 1879.500 +6 1934.246 44.3314 23.0437 18.2099 0 2 6 10 163 2 1889.915 + age1st agein ageout +4 24.7206 47.9067 69.6794 +5 29.9575 54.7465 76.8442 +6 21.2877 44.3314 62.5413 +\end{Soutput} +\end{Schunk} +In order to have more readable output we can label the states by +entering their names in the \texttt{states} argument; try for +example: +\begin{Schunk} +\begin{Sinput} +> nicL <- Lexis( entry = list( per = agein+dob, ++ age = agein, ++ tfh = agein-age1st ), ++ exit = list( age = ageout ), ++ exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ), ++ data = nickel, ++ states = c("Alive","D.oth","D.lung") ) +\end{Sinput} +\begin{Soutput} +NOTE: entry.status has been set to 0 for all. +\end{Soutput} +\begin{Sinput} +> summary( nicL ) +\end{Sinput} +\begin{Soutput} +Transitions: + To +From Alive D.oth D.lung Records: Events: Risk time: Persons: + Alive 47 495 137 679 632 15348.06 679 +\end{Soutput} +\begin{Sinput} +> str( nicL ) +\end{Sinput} +\begin{Soutput} +Classes ‘Lexis’ and 'data.frame': 679 obs. of 14 variables: + $ per : num 1934 1934 1934 1934 1934 ... + $ age : num 45.2 48.3 53 47.9 54.7 ... + $ tfh : num 27.7 25.1 27.7 23.2 24.8 ... + $ lex.dur : num 47.75 15 1.17 21.77 22.1 ... + $ lex.Cst : Factor w/ 3 levels "Alive","D.oth",..: 1 1 1 1 1 1 1 1 1 1 ... + $ lex.Xst : Factor w/ 3 levels "Alive","D.oth",..: 1 3 3 2 2 3 2 2 2 2 ... + $ lex.id : int 1 2 3 4 5 6 7 8 9 10 ... + $ id : num 3 4 6 8 9 10 15 16 17 18 ... + $ icd : num 0 162 163 527 150 163 334 160 420 12 ... 
+ $ exposure: num 5 5 10 9 0 2 0 0.5 0 0 ... + $ dob : num 1889 1886 1881 1886 1880 ... + $ age1st : num 17.5 23.2 25.2 24.7 30 ... + $ agein : num 45.2 48.3 53 47.9 54.7 ... + $ ageout : num 93 63.3 54.2 69.7 76.8 ... + - attr(*, "time.scales")= chr "per" "age" "tfh" + - attr(*, "time.since")= chr "" "" "" + - attr(*, "breaks")=List of 3 + ..$ per: NULL + ..$ age: NULL + ..$ tfh: NULL +\end{Soutput} +\end{Schunk} +Note that the \texttt{Lexis} function automatically assumes that all +persons enter in the first level (given in the \texttt{states=} +argument), corresponding to the numerical values given in +\texttt{exit.status}. + +When we cut at a date as in this case, the date where cumulative +exposure exceeds 50 exposure-years, we get the follow-up \emph{after} +the date classified as being in the new state if the exit +(\texttt{lex.Xst}) was to a state we defined as one of the +\texttt{precursor.states}: +\begin{Schunk} +\begin{Sinput} +> nicL$agehi <- nicL$age1st + 50 / nicL$exposure +> nicC <- cutLexis( data = nicL, ++ cut = nicL$agehi, ++ timescale = "age", ++ new.state = "HiExp", ++ precursor.states = "Alive" ) +> subset( nicC, id %in% 8:10 ) +\end{Sinput} +\begin{Soutput} + per age tfh lex.dur lex.Cst lex.Xst lex.id id icd exposure dob +683 1934.246 47.9067 23.1861 21.7727 HiExp D.oth 4 8 527 9 1886.340 +5 1934.246 54.7465 24.7890 22.0977 Alive D.oth 5 9 150 0 1879.500 +6 1934.246 44.3314 23.0437 1.9563 Alive HiExp 6 10 163 2 1889.915 +685 1936.203 46.2877 25.0000 16.2536 HiExp D.lung 6 10 163 2 1889.915 + age1st agein ageout agehi +683 24.7206 47.9067 69.6794 30.27616 +5 29.9575 54.7465 76.8442 Inf +6 21.2877 44.3314 62.5413 46.28770 +685 21.2877 44.3314 62.5413 46.28770 +\end{Soutput} +\begin{Sinput} +> summary( nicC, scale=1000 ) +\end{Sinput} +\begin{Soutput} +Transitions: + To +From Alive HiExp D.oth D.lung Records: Events: Risk time: Persons: + Alive 39 83 279 65 466 427 10.77 466 + HiExp 0 8 216 72 296 288 4.58 296 + Sum 39 91 495 137 762 715 15.35 
679 +\end{Soutput} +\end{Schunk} +Note that the person-years are the same, but that the number of +events has changed. This is because events are now defined as any +transition, including the transitions to \texttt{HiExp}. + +Also note that (so far) it is necessary to specify the variable with +the cut points in full; using only \texttt{cut=agehi} would give an error. + +\section{Subdividing states} + +It may be of interest to subdivide the states following the +intermediate event according to whether the event has occurred or +not. That is done by the argument \texttt{split.states=TRUE}. + +Moreover, it will also often be of interest to introduce a new +timescale indicating the time since the intermediate event. This can be +done by the argument \texttt{new.scale=TRUE}, alternatively +\texttt{new.scale="tfe"}, as illustrated here: +\begin{Schunk} +\begin{Sinput} +> nicC <- cutLexis( data = nicL, ++ cut = nicL$agehi, ++ timescale = "age", ++ new.state = "HiExp", ++ new.scale = "tfe", ++ split.states = TRUE, ++ precursor.states = "Alive" ) +> subset( nicC, id %in% 8:10 ) +\end{Sinput} +\begin{Soutput} + per age tfh tfe lex.dur lex.Cst lex.Xst lex.id id icd +683 1934.246 47.9067 23.1861 17.63054 21.7727 HiExp D.oth(HiExp) 4 8 527 +5 1934.246 54.7465 24.7890 NA 22.0977 Alive D.oth 5 9 150 +6 1934.246 44.3314 23.0437 NA 1.9563 Alive HiExp 6 10 163 +685 1936.203 46.2877 25.0000 0.00000 16.2536 HiExp D.lung(HiExp) 6 10 163 + exposure dob age1st agein ageout agehi +683 9 1886.340 24.7206 47.9067 69.6794 30.27616 +5 0 1879.500 29.9575 54.7465 76.8442 Inf +6 2 1889.915 21.2877 44.3314 62.5413 46.28770 +685 2 1889.915 21.2877 44.3314 62.5413 46.28770 +\end{Soutput} +\begin{Sinput} +> summary( nicC, scale=1000, timeScales=TRUE ) +\end{Sinput} +\begin{Soutput} +Transitions: + To +From Alive HiExp D.oth D.lung D.lung(HiExp) D.oth(HiExp) Records: Events: Risk time: + Alive 39 83 279 65 0 0 466 427 10.77 + HiExp 0 8 0 0 72 216 296 288 4.58 + Sum 39 91 279 65 72 216 762 715 15.35 + 
+Transitions:
+ To
+From Persons:
+ Alive 466
+ HiExp 296
+ Sum 679
+
+Timescales:
+ time.scale time.since
+1 per
+2 age
+3 tfh
+4 tfe HiExp
+\end{Soutput}
+\end{Schunk}
+Note that the argument \texttt{timeScales=TRUE} to \texttt{summary} lists the
+timescales available in the object, and also indicates which of them
+are defined as time since entry to a particular state. This
+facility is not used here, but it is needed when simulating follow-up
+data --- see the vignette on \textrm{simLexis}.
+
+With 6 different states it is quite difficult to get an overview of
+the transitions between states from \texttt{summary()}. Therefore
+there is a function that gives a graphical display of the states, showing
+the transitions between them:
+\begin{Schunk}
+\begin{Sinput}
+> boxes( nicC, boxpos = list(x=c(10,10,80,80,80,80),
++ y=c(75,25,87,63,13,37)),
++ scale.Y = 1000,
++ show.BE = TRUE )
+\end{Sinput}
+\end{Schunk}
+\insfig{nic-box}{0.9}{Transitions between states; the number in the
+ middle of each box is the person-years (in 1000s --- since
+ \textrm{\tt scale.Y=1000}), the numbers at the bottom of the boxes
+ are the numbers that start, respectively end, their follow-up in each
+ state. The numbers on the arrows are the number of transitions and
+ crude transition rates (the latter in events per 1000 PY).\newline
+ The function \textrm{\tt boxes.Lexis} has a zillion arguments to
+ fine-tune the appearance of the display in terms of colors etc.}
+
+%% \section{Multiple events of the same type (recurrent events)}
+%% Sometimes more events of the same type are recorded for each person and
+%% one would then like to count these and put follow-up time in states accordingly.
+%% Essentially, each set of cutpoints represents progressions from one
+%% state to the next. Therefore the states should be numbered, and the
+%% numbering of states subsequently occupied be increased accordingly.
+ +%% This is a behaviour different from the one outlined above, and it is +%% achieved by the argument \texttt{count=TRUE} to +%% \texttt{cutLexis}. When \texttt{count} is set to \texttt{TRUE}, the +%% value of the arguments \texttt{new.state} and +%% \texttt{precursor.states} are ignored. Actually, when using the +%% argument \texttt{count=TRUE}, the function \texttt{countLexis} is +%% called, so an alternative is to use this directly. + +%% \renewcommand{\bibname}{References} +%% \bibliographystyle{plain} +%% \bibliography{% +%% /home/bendix/art/bibtex/BxC,% +%% /home/bendix/art/bibtex/Stat,% +%% /home/bendix/art/bibtex/DMCa,% +%% /home/bendix/art/bibtex/Diabetes% +%% } +%% \addcontentsline{toc}{section}{\bibname} + +\end{document} diff -Nru r-cran-epi-2.19/vignettes/Follow-up.rnw r-cran-epi-2.30/vignettes/Follow-up.rnw --- r-cran-epi-2.19/vignettes/Follow-up.rnw 2015-05-27 15:26:58.000000000 +0000 +++ r-cran-epi-2.30/vignettes/Follow-up.rnw 1970-01-01 00:00:00.000000000 +0000 @@ -1,399 +0,0 @@ -\SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE} -%\VignetteIndexEntry{Follow-up data with the Epi package} -\documentclass[a4paper,twoside,12pt]{article} - -\usepackage[english]{babel} -\usepackage{booktabs,rotating,graphicx,amsmath,verbatim,fancyhdr,Sweave} -\usepackage[colorlinks,linkcolor=red,urlcolor=blue]{hyperref} -\newcommand{\R}{\textsf{\bf R}} -\renewcommand{\topfraction}{0.95} -\renewcommand{\bottomfraction}{0.95} -\renewcommand{\textfraction}{0.1} -\renewcommand{\floatpagefraction}{0.9} -\DeclareGraphicsExtensions{.pdf,.jpg} -\setcounter{secnumdepth}{1} -\setcounter{tocdepth}{1} - -\oddsidemargin 1mm -\evensidemargin 1mm -\textwidth 160mm -\textheight 230mm -\topmargin -5mm -\headheight 8mm -\headsep 5mm -\footskip 15mm - -\begin{document} - -\raggedleft -\pagestyle{empty} -\vspace*{0.1\textheight} -\Huge -{\bf Follow-up data with the\\ \texttt{Epi} package} -\noindent\rule[-1ex]{\textwidth}{5pt}\\[2.5ex] -\Large -Summer 2014 -\vfill -\normalsize 
-\begin{tabular}{rl}
- Michael Hills & Retired \\
- & Highgate, London \\[1em]
-Martyn Plummer & International Agency for Research on Cancer, Lyon\\
- & \texttt{plummer@iarc.fr} \\[1em]
-Bendix Carstensen & Steno Diabetes Center, Gentofte, Denmark\\
- & \small \& Department of Biostatistics,
- University of Copenhagen\\
- & \normalsize \texttt{bxc@steno.dk} \\
- & \url{www.pubhealth.ku.dk/~bxc}
-\end{tabular}
-\normalsize
-\newpage
-\raggedright
-\parindent 3ex
-\parskip 0ex
-\tableofcontents
-\cleardoublepage
-\setcounter{page}{1}
-\pagestyle{fancy}
-\renewcommand{\sectionmark}[1]{\markboth{\thesection #1}{\thesection \ #1}}
-\fancyhead[OL]{\sl Follow-up data with the \texttt{Epi} package.}
-\fancyhead[ER]{\sl \rightmark}
-\fancyhead[EL,OR]{\bf \thepage}
-\fancyfoot{}
-\renewcommand{\headrulewidth}{0.1pt}
-
-<<>>=
-library(Epi)
-print( sessionInfo(), l=F )
-@
-
-\section{Follow-up data in the \texttt{Epi} package}
-
-In the \texttt{Epi}-package, follow-up data is represented by adding
-some extra variables to a dataframe. Such a dataframe is called a
-\texttt{Lexis} object. The tools for handling follow-up data then use
-this structure for special plots, tabulations etc.
-
-Follow-up data basically consists of a time of entry, a time of exit
-and an indication of the status at exit (normally either ``alive'' or
-``dead''). A status \emph{during} the follow-up (usually ``alive'')
-is also implicitly assumed.
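As a minimal sketch of this representation, the two persons of the figure below could be held as one row each in an ordinary data frame (all values here are hypothetical, read off the figure for illustration):

```r
## Hypothetical follow-up data: one row per person, with entry and exit
## on the age scale and an indication of status at exit.
fu <- data.frame( id     = c("One","Two"),
                  agein  = c(36, 39),   # age at entry
                  ageout = c(43, 48),   # age at exit
                  dead   = c(1, 0) )    # 1 = dead, 0 = censored (alive)
## The risk time contributed by each person is implicit in entry and exit:
fu$risktime <- fu$ageout - fu$agein
fu
```

The `Lexis` machinery introduced later wraps exactly this kind of information, adding the bookkeeping needed for several timescales at once.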
-
-\begin{figure}[htbp]
-  \centering
-\setlength{\unitlength}{1pt}
-\begin{picture}(210,70)(0,75)
-%\scriptsize
-\thicklines
-  \put( 0,80){\makebox(0,0)[r]{Age-scale}}
-  \put( 50,80){\line(1,0){150}}
-  \put( 50,80){\line(0,1){5}}
-  \put(100,80){\line(0,1){5}}
-  \put(150,80){\line(0,1){5}}
-  \put(200,80){\line(0,1){5}}
-  \put( 50,77){\makebox(0,0)[t]{35}}
-  \put(100,77){\makebox(0,0)[t]{40}}
-  \put(150,77){\makebox(0,0)[t]{45}}
-  \put(200,77){\makebox(0,0)[t]{50}}
-
-  \put( 0,115){\makebox(0,0)[r]{Follow-up}}
-
-  \put( 80,105){\makebox(0,0)[r]{\small Two}}
-  \put( 90,105){\line(1,0){87}}
-  \put( 90,100){\line(0,1){10}}
-  \put(100,100){\line(0,1){10}}
-  \put(150,100){\line(0,1){10}}
-  \put(180,105){\circle{6}}
-  \put( 95,110){\makebox(0,0)[b]{1}}
-  \put(125,110){\makebox(0,0)[b]{5}}
-  \put(165,110){\makebox(0,0)[b]{3}}
-
-  \put( 50,130){\makebox(0,0)[r]{\small One}}
-  \put( 60,130){\line(1,0){70}}
-  \put( 60,125){\line(0,1){10}}
-  \put(100,125){\line(0,1){10}}
-  \put(130,130){\circle*{6}}
-  \put( 80,135){\makebox(0,0)[b]{4}}
-  \put(115,135){\makebox(0,0)[b]{3}}
-\end{picture}
-  \caption{\it Follow-up of two persons}
-  \label{fig:fu2}
-\end{figure}
-
-\section{Timescales}
-
-A timescale is a variable that varies deterministically \emph{within}
-each person during follow-up, \textit{e.g.}:
-\begin{itemize}
-  \item Age
-  \item Calendar time
-  \item Time since treatment
-  \item Time since relapse
-\end{itemize}
-All timescales advance at the same pace, so the time followed is the
-same on all timescales. Therefore, it suffices to use only the entry
-point on each of the timescales, for example:
-\begin{itemize}
-  \item Age at entry.
-  \item Date of entry.
-  \item Time since treatment (\emph{at} treatment this is 0).
-  \item Time since relapse (\emph{at} relapse this is 0).
-\end{itemize}
-In the \texttt{Epi} package, follow-up in a cohort is represented in a
-\texttt{Lexis} object. A \texttt{Lexis} object is a dataframe with a
-bit of extra structure representing the follow-up. For the
-\texttt{nickel} data we would construct a \texttt{Lexis} object by:
-<<>>=
-data( nickel )
-nicL <- Lexis( entry = list( per=agein+dob,
-                             age=agein,
-                             tfh=agein-age1st ),
-               exit = list( age=ageout ),
-               exit.status = ( icd %in% c(162,163) )*1,
-               data = nickel )
-@
-The \texttt{entry} argument is a \emph{named} list with the entry
-points on each of the timescales we want to use. It defines the names
-of the timescales and the entry points. The \texttt{exit} argument
-gives the exit time on \emph{one} of the timescales, so the name of
-the element in this list must match one of the names of the
-\texttt{entry} list. This is sufficient, because the follow-up time on
-all timescales is the same, in this case \texttt{ageout - agein}. Now
-take a look at the result:
-<<>>=
-str( nickel )
-str( nicL )
-head( nicL )
-@
-The \texttt{Lexis} object \texttt{nicL} has a variable for each
-timescale which is the entry point on this timescale. The follow-up
-time is in the variable \texttt{lex.dur} (\textbf{dur}ation).
-
-There is a \texttt{summary} function for \texttt{Lexis} objects that
-lists the number of transitions and records as well as the total
-follow-up time:
-<<>>=
-summary( nicL )
-@
-We defined the exit status to be death from lung cancer (ICD7
-162,163), i.e. this variable is 1 if follow-up ended with a death from
-this cause. If follow-up ended alive or by death from another cause,
-the exit status is coded 0, i.e. as a censoring.
-
-Note that the exit status is in the variable \texttt{lex.Xst}
-(e\textbf{X}it \textbf{st}atus). The variable \texttt{lex.Cst} is the
-state where the follow-up takes place (\textbf{C}urrent
-\textbf{st}atus), in this case 0 (alive).
-
-It is possible to get a visualization of the follow-up along the
-timescales chosen by using the \texttt{plot} method for \texttt{Lexis}
-objects.
\texttt{nicL} is an object of \emph{class} \texttt{Lexis}, so
-using the function \texttt{plot()} on it means that \R\ will look for
-the function \texttt{plot.Lexis} and use this function.
-<<nicL1,fig=TRUE>>=
-plot( nicL )
-@
-The function allows a lot of control over the output, and a
-\texttt{points.Lexis} function allows plotting of the endpoints of
-follow-up:
-<<nicL2,fig=TRUE>>=
-par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 )
-plot( nicL, 1:2, lwd=1, col=c("blue","red")[(nicL$exp>0)+1],
-      grid=TRUE, lty.grid=1, col.grid=gray(0.7),
-      xlim=1900+c(0,90), xaxs="i",
-      ylim=  10+c(0,90), yaxs="i", las=1 )
-points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1],
-        col="lightgray", lwd=3, cex=1.5 )
-points( nicL, 1:2, pch=c(NA,3)[nicL$lex.Xst+1],
-        col=c("blue","red")[(nicL$exp>0)+1], lwd=1, cex=1.5 )
-@
-The results of these two plotting commands are in figure \ref{fig:Lexis-diagram}.
-\begin{figure}[tb]
-\centering
-\label{fig:Lexis-diagram}
-\includegraphics[width=0.39\textwidth]{Follow-up-nicL1}
-\includegraphics[width=0.59\textwidth]{Follow-up-nicL2}
-\caption{\it Lexis diagram of the \texttt{nickel} dataset, left panel
-  the default version, the right one with bells
-  and whistles. The red lines are for persons with exposure$>0$, so it
-  is pretty evident that the oldest ones are the exposed part of the
-  cohort.}
-\end{figure}
-
-\section{Splitting the follow-up time along a timescale}
-
-The follow-up time in a cohort can be subdivided by for example
-current age. This is achieved by the function \texttt{splitLexis}
-(note that it is \emph{not} called \texttt{split.Lexis}). This
-requires that the timescale and the breakpoints on this timescale
-are supplied. Try:
-<<>>=
-nicS1 <- splitLexis( nicL, "age", breaks=seq(0,100,10) )
-summary( nicL )
-summary( nicS1 )
-@
-So we see that the number of events and the amount of follow-up is the
-same in the two datasets; only the number of records differs.
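A quick way to convince yourself that splitting only multiplies records — leaving events and risk time untouched — is to compare the relevant totals directly. A sketch, reusing the objects defined above (`lex.dur` holds the risk time in each interval; an event is a transition where `lex.Xst` differs from `lex.Cst`):

```r
## Risk time is preserved by splitLexis:
sum( nicL$lex.dur )
sum( nicS1$lex.dur )
## ...and so is the number of events (transitions out of the current state),
## because only the last interval of a person's follow-up carries the event:
sum( nicL$lex.Cst  != nicL$lex.Xst )
sum( nicS1$lex.Cst != nicS1$lex.Xst )
## Only the number of records grows:
nrow( nicL )
nrow( nicS1 )
```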
-
-To see how records are split for each individual, it is useful to list
-the results for a few individuals:
-<<>>=
-round( subset( nicS1, id %in% 8:10 ), 2 )
-@
-The resulting object, \texttt{nicS1}, is again a \texttt{Lexis}
-object, and so follow-up may be split further along another
-timescale. Try this and list the results for individuals 8, 9 and 10 again:
-<<>>=
-nicS2 <- splitLexis( nicS1, "tfh", breaks=c(0,1,5,10,20,30,100) )
-round( subset( nicS2, id %in% 8:10 ), 2 )
-@
-If we want to model the effect of these timescales we will for each
-interval use either the value of the left endpoint of the interval or
-the middle. There is a function \texttt{timeBand} which returns these.
-Try:
-<<>>=
-timeBand( nicS2, "age", "middle" )[1:20]
-# For nice printing and column labelling use the data.frame() function:
-data.frame( nicS2[,c("id","lex.id","per","age","tfh","lex.dur")],
-            mid.age=timeBand( nicS2, "age", "middle" ),
-            mid.tfh=timeBand( nicS2, "tfh", "middle" ) )[1:20,]
-@
-Note that these are the midpoints of the intervals defined by
-\texttt{breaks=}, \emph{not} the midpoints of the actual follow-up
-intervals. This is because the variable to be used in modelling must
-be independent of the censoring and mortality pattern --- it should
-only depend on the chosen grouping of the timescale.
-
-\section{Splitting time at a specific date}
-
-If we have a recording of the date of a specific event as for example
-recovery or relapse, we may classify follow-up time as being before or
-after this intermediate event. This is achieved with the function
-\texttt{cutLexis}, which takes three arguments: the time point, the
-timescale, and the value of the (new) state following the date.
-
-Now we define the age for the nickel workers where the cumulative
-exposure exceeds 50 exposure years:
-<<>>=
-subset( nicL, id %in% 8:10 )
-agehi <- nicL$age1st + 50 / nicL$exposure
-nicC <- cutLexis( data=nicL, cut=agehi, timescale="age",
-                  new.state=2, precursor.states=0 )
-subset( nicC, id %in% 8:10 )
-@
-(The \texttt{precursor.states=} argument is explained below).
-Note that individual 6 has had his follow-up split at age 25 where 50
-exposure-years were attained. This could also have been achieved in
-the split dataset \texttt{nicS2} instead of \texttt{nicL}, try:
-<<>>=
-subset( nicS2, id %in% 8:10 )
-agehi <- nicS2$age1st + 50 / nicS2$exposure
-nicS2C <- cutLexis( data=nicS2, cut=agehi, timescale="age",
-                    new.state=2, precursor.states=0 )
-subset( nicS2C, id %in% 8:10 )
-@
-Note that follow-up subsequent to the event is classified as being
-in state 2, but that the final transition to state 1 (death from lung
-cancer) is preserved. This is the point of the \texttt{precursor.states=}
-argument. It names the states (in this case 0, ``Alive'') that will be
-over-written by \texttt{new.state} (in this case state 2, ``High
-exposure''). Clearly, state 1 (``Dead'') should not be updated even if
-it is after the time where the person moves to state 2. In other
-words, only state 0 is a precursor to state 2, state 1 is always
-subsequent to state 2.
-
-Note that if the intermediate event is to be used as a time-dependent
-variable in a Cox-model, then \texttt{lex.Cst} should be used as the
-time-dependent variable, and \texttt{lex.Xst==1} as the event.
-
-\section{Competing risks --- multiple types of events}
-
-If we want to consider death from lung cancer and death from other
-causes as separate events we can code these as for example 1 and 2.
-<<>>=
-data( nickel )
-nicL <- Lexis( entry = list( per=agein+dob,
-                             age=agein,
-                             tfh=agein-age1st ),
-               exit = list( age=ageout ),
-               exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ),
-               data = nickel )
-summary( nicL )
-subset( nicL, id %in% 8:10 )
-@
-If we want to label the states, we can enter the names of these in the
-\texttt{states} parameter, try for example:
-<<>>=
-nicL <- Lexis( entry = list( per=agein+dob,
-                             age=agein,
-                             tfh=agein-age1st ),
-               exit = list( age=ageout ),
-               exit.status = ( icd > 0 ) + ( icd %in% c(162,163) ),
-               data = nickel,
-               states = c("Alive","D.oth","D.lung") )
-summary( nicL )
-@
-
-Note that the \texttt{Lexis} function automatically assumes that all
-persons enter in the first level (given in the \texttt{states=}
-argument).
-
-When we cut at a date as in this case, the date where cumulative
-exposure exceeds 50 exposure-years, we get the follow-up \emph{after}
-the date classified as being in the new state if the exit
-(\texttt{lex.Xst}) was to a state we defined as one of the
-\texttt{precursor.states}:
-<<>>=
-nicL$agehi <- nicL$age1st + 50 / nicL$exposure
-nicC <- cutLexis( data = nicL,
-                  cut = nicL$agehi,
-                  timescale = "age",
-                  new.state = "HiExp",
-                  precursor.states = "Alive" )
-subset( nicC, id %in% 8:10 )
-summary( nicC, scale=1000 )
-@
-Note that the person-years are the same, but that the number of events
-has changed. This is because events are now defined as any transition
-from alive, including the transitions to \texttt{HiExp}.
-
-Also note that (so far) it is necessary to specify the variable with
-the cutpoints in full; using only \texttt{cut=agehi} would give an error.
-
-\subsection{Subdivision of existing states}
-It may be of interest to subdivide the states following the
-intermediate event according to whether the event has occurred or
-not. That is done by the argument \texttt{split.states=TRUE}.
- -Moreover, it will also often be of interest to introduce a new -timescale indicating the time since intermediate event. This can be -done by the argument \texttt{new.scale=TRUE}, alternatively -\texttt{new.scale="tfevent"}, as illustrated here: -<<>>= -nicC <- cutLexis( data = nicL, - cut = nicL$agehi, - timescale = "age", - new.state = "Hi", - split.states=TRUE, new.scale=TRUE, - precursor.states = "Alive" ) -subset( nicC, id %in% 8:10 ) -summary( nicC, scale=1000 ) -@ - -\section{Multiple events of the same type (recurrent events)} -Sometimes more events of the same type are recorded for each person and -one would then like to count these and put follow-up time in states accordingly. -Essentially, each set of cutpoints represents progressions from one -state to the next. Therefore the states should be numbered, and the -numbering of states subsequently occupied be increased accordingly. - -This is a behaviour different from the one outlined above, and it is -achieved by the argument \texttt{count=TRUE} to -\texttt{cutLexis}. When \texttt{count} is set to \texttt{TRUE}, the -value of the arguments \texttt{new.state} and -\texttt{precursor.states} are ignored. Actually, when using the -argument \texttt{count=TRUE}, the function \texttt{countLexis} is -called, so an alternative is to use this directly. - -\end{document} - - diff -Nru r-cran-epi-2.19/vignettes/index.html r-cran-epi-2.30/vignettes/index.html --- r-cran-epi-2.19/vignettes/index.html 2017-02-17 03:31:31.000000000 +0000 +++ r-cran-epi-2.30/vignettes/index.html 1970-01-01 00:00:00.000000000 +0000 @@ -1,18 +0,0 @@ - - -Vignettes for the Epi package</a> - -

Vignettes for the Epi package

- -Here is the website for the Epi package. - Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/pr.Rda and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/pr.Rda differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/simLexis-boxes.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/simLexis-boxes.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/simLexis-comp-0.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/simLexis-comp-0.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/simLexis-mort-int.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/simLexis-mort-int.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/simLexis.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/simLexis.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/simLexis-pstate0.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/simLexis-pstate0.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/simLexis-pstatex.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/simLexis-pstatex.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/simLexis-pstatey.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/simLexis-pstatey.pdf differ diff -Nru r-cran-epi-2.19/vignettes/simLexis.R r-cran-epi-2.30/vignettes/simLexis.R --- r-cran-epi-2.19/vignettes/simLexis.R 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/simLexis.R 2018-05-03 14:35:00.000000000 +0000 @@ -0,0 +1,477 @@ +### R code from vignette source 'simLexis' +### Encoding: UTF-8 + +################################################### +### code chunk number 1: simLexis.rnw:24-27 +################################################### +options( width=90, + SweaveHooks=list( fig=function() + par(mar=c(3,3,1,1),mgp=c(3,1,0)/1.6,las=1,bty="n") ) ) + + 
+################################################### +### code chunk number 2: start +################################################### +options( width=90 ) +library( Epi ) +print( sessionInfo(), l=F ) + + +################################################### +### code chunk number 3: Lexis +################################################### +data(DMlate) +dml <- Lexis( entry = list(Per=dodm, Age=dodm-dobth, DMdur=0 ), + exit = list(Per=dox), + exit.status = factor(!is.na(dodth),labels=c("DM","Dead")), + data = DMlate ) + + +################################################### +### code chunk number 4: cut +################################################### +dmi <- cutLexis( dml, cut = dml$doins, + pre = "DM", + new.state = "Ins", + new.scale = "t.Ins", + split.states = TRUE ) +summary( dmi ) +str(dmi) + + +################################################### +### code chunk number 5: boxes +################################################### +boxes( dmi, boxpos = list(x=c(20,20,80,80), + y=c(80,20,80,20)), + scale.R = 1000, show.BE = TRUE ) + + +################################################### +### code chunk number 6: split +################################################### +Si <- splitLexis( dmi, 0:30/2, "DMdur" ) +dim( Si ) +print( subset( Si, lex.id==97 )[,1:10], digits=6 ) + + +################################################### +### code chunk number 7: knots +################################################### +nk <- 5 +( ai.kn <- with( subset(Si,lex.Xst=="Ins" & lex.Cst!=lex.Xst ), + quantile( Age+lex.dur , probs=(1:nk-0.5)/nk ) ) ) +( ad.kn <- with( subset(Si,lex.Xst=="Dead"), + quantile( Age+lex.dur , probs=(1:nk-0.5)/nk ) ) ) +( di.kn <- with( subset(Si,lex.Xst=="Ins" & lex.Cst!=lex.Xst ), + c(0,quantile( DMdur+lex.dur, probs=(1:(nk-1))/nk ) )) ) +( dd.kn <- with( subset(Si,lex.Xst=="Dead"), + c(0,quantile( DMdur+lex.dur, probs=(1:(nk-1))/nk ) )) ) +( ti.kn <- with( subset(Si,lex.Xst=="Dead(Ins)"), + c(0,quantile( t.Ins+lex.dur, 
probs=(1:(nk-1))/nk ) )) ) + + +################################################### +### code chunk number 8: Poisson +################################################### +library( splines ) +DM.Ins <- glm( (lex.Xst=="Ins") ~ Ns( Age , knots=ai.kn ) + + Ns( DMdur, knots=di.kn ) + + I(Per-2000) + sex, + family=poisson, offset=log(lex.dur), + data = subset(Si,lex.Cst=="DM") ) +DM.Dead <- glm( (lex.Xst=="Dead") ~ Ns( Age , knots=ad.kn ) + + Ns( DMdur, knots=dd.kn ) + + I(Per-2000) + sex, + family=poisson, offset=log(lex.dur), + data = subset(Si,lex.Cst=="DM") ) +Ins.Dead <- glm( (lex.Xst=="Dead(Ins)") ~ Ns( Age , knots=ad.kn ) + + Ns( DMdur, knots=dd.kn ) + + Ns( t.Ins, knots=ti.kn ) + + I(Per-2000) + sex, + family=poisson, offset=log(lex.dur), + data = subset(Si,lex.Cst=="Ins") ) + + +################################################### +### code chunk number 9: prop-haz +################################################### +with( Si, table(lex.Cst) ) +All.Dead <- glm( (lex.Xst %in% c("Dead(Ins)","Dead")) ~ + Ns( Age , knots=ad.kn ) + + Ns( DMdur, knots=dd.kn ) + + lex.Cst + + I(Per-2000) + sex, + family=poisson, offset=log(lex.dur), + data = Si ) +round( ci.exp( All.Dead ), 3 ) + + +################################################### +### code chunk number 10: get-dev +################################################### +what <- c("null.deviance","df.null","deviance","df.residual") +( rD <- unlist( DM.Dead[what] ) ) +( rI <- unlist( Ins.Dead[what] ) ) +( rA <- unlist( All.Dead[what] ) ) +round( c( dd <- rA-(rI+rD), "pVal"=1-pchisq(dd[3],dd[4]+1) ), 3 ) + + +################################################### +### code chunk number 11: pr-array +################################################### +pr.rates <- NArray( list( DMdur = seq(0,12,0.1), + DMage = 4:7*10, + r.Ins = c(NA,0,2,5), + model = c("DM/Ins","All"), + what = c("rate","lo","hi") ) ) +str( pr.rates ) + + +################################################### +### code chunk number 12: simLexis.rnw:382-383 
+################################################### +ci.pred + + +################################################### +### code chunk number 13: make-pred +################################################### +nd <- data.frame( DMdur = as.numeric( dimnames(pr.rates)[[1]] ), + lex.Cst = factor( 1, levels=1:4, + labels=levels(Si$lex.Cst) ), + sex = factor( 1, levels=1:2, labels=c("M","F")), + lex.dur = 1000 ) +for( ia in dimnames(pr.rates)[[2]] ) + { +dnew <- transform( nd, Age = as.numeric(ia)+DMdur, + Per = 1998+DMdur ) +pr.rates[,ia,1,"DM/Ins",] <- ci.pred( DM.Dead, newdata = dnew ) +pr.rates[,ia,1,"All" ,] <- ci.pred( All.Dead, newdata = dnew ) +for( ii in dimnames(pr.rates)[[3]][-1] ) + { +dnew = transform( dnew, lex.Cst = factor( 2, levels=1:4, + labels=levels(Si$lex.Cst) ), + t.Ins = ifelse( (DMdur-as.numeric(ii)) >= 0, + DMdur-as.numeric(ii), NA ) ) +pr.rates[,ia, ii ,"DM/Ins",] <- ci.pred( Ins.Dead, newdata = dnew ) +pr.rates[,ia, ii ,"All" ,] <- ci.pred( All.Dead, newdata = dnew ) + } + } + + +################################################### +### code chunk number 14: mort-int +################################################### +par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6, las=1 ) +plot( NA, xlim=c(40,82), ylim=c(5,300), bty="n", + log="y", xlab="Age", ylab="Mortality rate per 1000 PY" ) +abline( v=seq(40,80,5), h=outer(1:9,10^(0:2),"*"), col=gray(0.8) ) +for( aa in 4:7*10 ) for( ii in 1:4 ) + matlines( aa+as.numeric(dimnames(pr.rates)[[1]]), + cbind( pr.rates[,paste(aa),ii,"DM/Ins",], + pr.rates[,paste(aa),ii,"All" ,] ), + type="l", lty=1, lwd=c(3,1,1), + col=rep(c("red","limegreen"),each=3) ) + + +################################################### +### code chunk number 15: Tr +################################################### +Tr <- list( "DM" = list( "Ins" = DM.Ins, + "Dead" = DM.Dead ), + "Ins" = list( "Dead(Ins)" = Ins.Dead ) ) + + +################################################### +### code chunk number 16: make-ini 
+################################################### +str( Si[NULL,1:9] ) +ini <- subset(Si,FALSE,select=1:9) +str( ini ) +ini <- subset(Si,select=1:9)[NULL,] +str( ini ) + + +################################################### +### code chunk number 17: ini-fill +################################################### +ini[1:2,"lex.id"] <- 1:2 +ini[1:2,"lex.Cst"] <- "DM" +ini[1:2,"Per"] <- 1995 +ini[1:2,"Age"] <- 60 +ini[1:2,"DMdur"] <- 5 +ini[1:2,"sex"] <- c("M","F") +ini + + +################################################### +### code chunk number 18: simL +################################################### +set.seed( 52381764 ) +Nsim <- 5000 +system.time( simL <- simLexis( Tr, + ini, + t.range = 12, + N = Nsim ) ) + + +################################################### +### code chunk number 19: sum-simL +################################################### +summary( simL, by="sex" ) + + +################################################### +### code chunk number 20: Tr.p-simP +################################################### +Tr.p <- list( "DM" = list( "Ins" = DM.Ins, + "Dead" = All.Dead ), + "Ins" = list( "Dead(Ins)" = All.Dead ) ) +system.time( simP <- simLexis( Tr.p, + ini, + t.range = 12, + N = Nsim ) ) +summary( simP, by="sex" ) + + +################################################### +### code chunk number 21: Cox-dur +################################################### +library( survival ) +Cox.Dead <- coxph( Surv( DMdur, DMdur+lex.dur, + lex.Xst %in% c("Dead(Ins)","Dead")) ~ + Ns( Age-DMdur, knots=ad.kn ) + + I(lex.Cst=="Ins") + + I(Per-2000) + sex, + data = Si ) +round( ci.exp( Cox.Dead ), 3 ) +round( ci.exp( All.Dead ), 3 ) + + +################################################### +### code chunk number 22: TR.c +################################################### +Tr.c <- list( "DM" = list( "Ins" = Tr$DM$Ins, + "Dead" = Cox.Dead ), + "Ins" = list( "Dead(Ins)" = Cox.Dead ) ) +system.time( simC <- simLexis( Tr.c, + ini, + t.range = 12, + N = Nsim ) ) 
+summary( simC, by="sex" ) + + +################################################### +### code chunk number 23: nState +################################################### +system.time( +nSt <- nState( subset(simL,sex=="M"), + at=seq(0,11,0.2), from=1995, time.scale="Per" ) ) +nSt[1:10,] + + +################################################### +### code chunk number 24: pstate0 +################################################### +pM <- pState( nSt, perm=c(1,2,4,3) ) +head( pM ) +par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 ) +plot( pM ) +plot( pM, border="black", col="transparent", lwd=3 ) +text( rep(as.numeric(rownames(pM)[nrow(pM)-1]),ncol(pM)), + pM[nrow(pM),]-diff(c(0,pM[nrow(pM),]))/5, + colnames( pM ), adj=1 ) +box( col="white", lwd=3 ) +box() + + +################################################### +### code chunk number 25: pstatex +################################################### +clr <- c("limegreen","orange") +# expand with a lighter version of the two chosen colors +clx <- c( clr, rgb( t( col2rgb( clr[2:1] )*2 + rep(255,3) ) / 3, max=255 ) ) +par( mfrow=c(1,2), las=1, mar=c(3,3,4,2), mgp=c(3,1,0)/1.6 ) +# Men +plot( pM, col=clx ) +lines( as.numeric(rownames(pM)), pM[,2], lwd=3 ) +mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) +mtext( "Survival curve", side=3, line=1.5, adj=0 ) +mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[1] ) +axis( side=4 ) +axis( side=4, at=1:19/20, labels=FALSE ) +axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) +# Women +pF <- pState( nState( subset(simL,sex=="F"), + at=seq(0,11,0.2), + from=1995, + time.scale="Per" ), + perm=c(1,2,4,3) ) +plot( pF, col=clx ) +lines( as.numeric(rownames(pF)), pF[,2], lwd=3 ) +mtext( "60 year old female, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) +mtext( "Survival curve", side=3, line=1.5, adj=0 ) +mtext( "DM, no insulin DM, 
Insulin", side=3, line=0.5, adj=0, col=clr[1] ) +mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] ) +axis( side=4 ) +axis( side=4, at=1:19/20, labels=FALSE ) +axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) + + +################################################### +### code chunk number 26: pstatey +################################################### +par( mfrow=c(1,2), las=1, mar=c(3,3,4,2), mgp=c(3,1,0)/1.6 ) +# Men +pM <- pState( nState( subset(simL,sex=="M"), + at=seq(0,11,0.2), + from=60, + time.scale="Age" ), + perm=c(1,2,4,3) ) +plot( pM, col=clx, xlab="Age" ) +lines( as.numeric(rownames(pM)), pM[,2], lwd=3 ) +mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) +mtext( "Survival curve", side=3, line=1.5, adj=0 ) +mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[1] ) +axis( side=4 ) +axis( side=4, at=1:19/20, labels=FALSE ) +axis( side=4, at=1:19/20, labels=FALSE, tcl=-0.4 ) +axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) +# Women +pF <- pState( nState( subset(simL,sex=="F"), + at=seq(0,11,0.2), + from=60, + time.scale="Age" ), + perm=c(1,2,4,3) ) +plot( pF, col=clx, xlab="Age" ) +lines( as.numeric(rownames(pF)), pF[,2], lwd=3 ) +mtext( "60 year old female, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) +mtext( "Survival curve", side=3, line=1.5, adj=0 ) +mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] ) +mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] ) +axis( side=4 ) +axis( side=4, at=1:9/10, labels=FALSE ) +axis( side=4, at=1:19/20, labels=FALSE, tcl=-0.4 ) +axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) + + +################################################### +### code chunk number 27: comp-0 +################################################### +PrM <- pState( nState( subset(simP,sex=="M"), + at=seq(0,11,0.2), + from=60, + time.scale="Age" ), + 
perm=c(1,2,4,3) ) +PrF <- pState( nState( subset(simP,sex=="F"), + at=seq(0,11,0.2), + from=60, + time.scale="Age" ), + perm=c(1,2,4,3) ) +CoxM <- pState( nState( subset(simC,sex=="M"), + at=seq(0,11,0.2), + from=60, + time.scale="Age" ), + perm=c(1,2,4,3) ) +CoxF <- pState( nState( subset(simC,sex=="F"), + at=seq(0,11,0.2), + from=60, + time.scale="Age" ), + perm=c(1,2,4,3) ) + +par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 ) + plot( pM, border="black", col="transparent", lwd=3 ) +lines( PrM, border="blue" , col="transparent", lwd=3 ) +lines( CoxM, border="red" , col="transparent", lwd=3 ) +text( 60.5, 0.05, "M" ) +box( lwd=3 ) + + plot( pF, border="black", col="transparent", lwd=3 ) +lines( PrF, border="blue" , col="transparent", lwd=3 ) +lines( CoxF, border="red" , col="transparent", lwd=3 ) +text( 60.5, 0.05, "F" ) +box( lwd=3 ) + + +################################################### +### code chunk number 28: CHANGE1 (eval = FALSE) +################################################### +## source( "../R/simLexis.R", keep.source=TRUE ) + + +################################################### +### code chunk number 29: CHANGE2 +################################################### +simX <- Epi:::simX +sim1 <- Epi:::sim1 +lint <- Epi:::lint +get.next <- Epi:::get.next +chop.lex <- Epi:::chop.lex + + +################################################### +### code chunk number 30: simLexis.rnw:934-937 +################################################### +cbind( +attr( ini, "time.scale" ), +attr( ini, "time.since" ) ) + + +################################################### +### code chunk number 31: simLexis.rnw:962-963 +################################################### +simLexis + + +################################################### +### code chunk number 32: simLexis.rnw:980-981 +################################################### +simX + + +################################################### +### code chunk number 33: simLexis.rnw:993-994 
+################################################### +sim1 + + +################################################### +### code chunk number 34: simLexis.rnw:1006-1007 +################################################### +lint + + +################################################### +### code chunk number 35: simLexis.rnw:1017-1018 +################################################### +get.next + + +################################################### +### code chunk number 36: simLexis.rnw:1027-1028 +################################################### +chop.lex + + +################################################### +### code chunk number 37: simLexis.rnw:1045-1046 +################################################### +nState + + +################################################### +### code chunk number 38: simLexis.rnw:1055-1056 +################################################### +pState + + +################################################### +### code chunk number 39: simLexis.rnw:1060-1062 +################################################### +plot.pState +lines.pState + + diff -Nru r-cran-epi-2.19/vignettes/simLexis.rnw r-cran-epi-2.30/vignettes/simLexis.rnw --- r-cran-epi-2.19/vignettes/simLexis.rnw 2017-04-14 06:58:08.000000000 +0000 +++ r-cran-epi-2.30/vignettes/simLexis.rnw 2018-03-08 11:41:17.000000000 +0000 @@ -1,13 +1,12 @@ -\SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE,eps=FALSE,prefix.string=sL} + +\SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE,eps=FALSE} %\VignetteIndexEntry{Simulation of multistate models with multiple timescales: simLexis} \documentclass[a4paper,twoside,12pt]{report} -%---------------------------------------------------------------------- -% General information \newcommand{\Title}{Simulation of\\ multistate models with\\ multiple timescales:\\ \texttt{simLexis} in the \texttt{Epi} package} \newcommand{\Tit}{Multistate models with multiple timescales} -\newcommand{\Version}{Version 2.3} 
+\newcommand{\Version}{Version 2.4} \newcommand{\Dates}{\today} \newcommand{\Where}{SDC} \newcommand{\Homepage}{\url{http://BendixCarstensen.com/Epi/simLexis.pdf}} @@ -16,149 +15,17 @@ & Steno Diabetes Center, Gentofte, Denmark\\ & {\small \& Department of Biostatistics, University of Copenhagen} \\ - & \texttt{bxc@steno.dk}\\ + & \texttt{b@bxc.dk}\\ & \url{http://BendixCarstensen.com} \\[1em] \end{tabular}} -%---------------------------------------------------------------------- -% Packages -%\usepackage[inline]{showlabels} -\usepackage[utf8]{inputenc} -\usepackage[T1]{fontenc} -\usepackage[english]{babel} -\usepackage[font=it,labelfont=normalfont]{caption} -\usepackage[colorlinks,urlcolor=blue,linkcolor=red]{hyperref} -\usepackage[ae,hyper]{Rd} -\usepackage[dvipsnames]{xcolor} -\usepackage[super]{nth} -\usepackage{makeidx,Sweave,floatflt,amsmath,amsfonts,amsbsy,enumitem,dcolumn,needspace} -\usepackage{ifthen,calc,eso-pic,everyshi} -\usepackage{booktabs,longtable,rotating,graphicx} -\usepackage{pdfpages,verbatim,fancyhdr,datetime,% -afterpage} -\usepackage[abspath]{currfile} -% \usepackage{times} -\renewcommand{\textfraction}{0.0} -\renewcommand{\topfraction}{1.0} -\renewcommand{\bottomfraction}{1.0} -\renewcommand{\floatpagefraction}{0.9} -\DeclareMathOperator{\Pp}{P} -\providecommand{\pmat}[1]{\Pp\left\{#1\right\}} -\providecommand{\ptxt}[1]{\Pp\left\{\text{#1}\right\}} -\providecommand{\dif}{{\,\mathrm d}} -% \usepackage{mslapa} -\newenvironment{exercise}[0]{\refstepcounter{exno} - \begin{quote} - {\bf Exercise \theexno.}} - {\end{quote}} -\definecolor{blaa}{RGB}{99,99,255} -\DeclareGraphicsExtensions{.pdf,.jpg} -% Make the Sweave output nicer -\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\footnotesize,fontshape=sl,formatcom=\color{Blue}} -\DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\footnotesize,formatcom=\color{Maroon},xleftmargin=0em} -\DefineVerbatimEnvironment{Scode}{Verbatim}{fontsize=\footnotesize} 
-\fvset{listparameters={\setlength{\topsep}{0pt}}} -\renewenvironment{Schunk}% -{\renewcommand{\baselinestretch}{0.85} \vspace{\topsep}}% -{\renewcommand{\baselinestretch}{1.00} \vspace{\topsep}} -% \renewenvironment{knitrout} -% {\renewcommand{\baselinestretch}{0.85}} -% {\renewcommand{\baselinestretch}{1.00}} % redefined in topreport.tex - -%---------------------------------------------------------------------- -% The usual usefuls -% \input{/home/bendix/util/tex/useful.tex} - -%---------------------------------------------------------------------- -% Set up layout of pages -\oddsidemargin 1mm -\evensidemargin 1mm -\topmargin -5mm -\headheight 8mm -\headsep 5mm -\textheight 240mm -\textwidth 165mm -%\footheight 5mm -\footskip 15mm -\renewcommand{\topfraction}{0.9} -\renewcommand{\bottomfraction}{0.9} -\renewcommand{\textfraction}{0.1} -\renewcommand{\floatpagefraction}{0.9} -\renewcommand{\headrulewidth}{0.1pt} -\setcounter{secnumdepth}{4} -\setcounter{tocdepth}{2} - -%---------------------------------------------------------------------- -% How to insert a figure in a .rnw file -\newcommand{\rwpre}{sL} -\newcommand{\insfig}[3]{ -\begin{figure}[h] - \centering - \includegraphics[width=#2\textwidth]{\rwpre-#1} - \caption{#3} - \label{fig:#1} -% \afterpage{\clearpage} -\end{figure}} - -%---------------------------------------------------------------------- -% Here is the document starting with the titlepage -\begin{document} - -%---------------------------------------------------------------------- -% The title page -\setcounter{page}{1} -\pagenumbering{roman} -\pagestyle{plain} -\thispagestyle{empty} -% \vspace*{0.05\textheight} -\flushright -% The blank below here is necessary in order not to muck up the -% linespacing in title if it has more than 2 lines -{\Huge \bfseries \Title - -}\ \\[-1.5ex] -\noindent\textcolor{blaa}{\rule[-1ex]{\textwidth}{5pt}}\\[2.5ex] -\large -\Where \\ -\Dates \\ -\Homepage \\ -\Version \\[1em] -\normalsize -Compiled \today,\ 
\currenttime\\ -% from: \texttt{\currfileabspath}\\[1em] -% \input{wordcount} -\normalsize -\vfill -\Faculty -% End of titlepage -% \newpage - -%---------------------------------------------------------------------- -% Table of contents -\tableofcontents - -%---------------------------------------------------------------------- -% General text layout -\raggedright -\parindent 1em -\parskip 0ex -\cleardoublepage - -%---------------------------------------------------------------------- -% General page style -\pagenumbering{arabic} -\setcounter{page}{1} -\pagestyle{fancy} -\renewcommand{\chaptermark}[1]{\markboth{\textsl{#1}}{}} -\renewcommand{\sectionmark}[1]{\markright{\thesection\ \textsl{#1}}{}} -\fancyhead[EL]{\bf \thepage \quad \rm \leftmark} -\fancyhead[ER]{\sl \Tit} -\fancyhead[OR]{\rm \rightmark \quad \bf \thepage} -\fancyfoot{} - - -%---------------------------------------------------------------------- -% Here comes the substance +\input{topreport} +\renewcommand{\rwpre}{./simLexis} +<>= +options( width=90, + SweaveHooks=list( fig=function() + par(mar=c(3,3,1,1),mgp=c(3,1,0)/1.6,las=1,bty="n") ) ) +@ % \chapter{Using \texttt{simLexis}} @@ -348,7 +215,7 @@ Note that when we split the follow-up each person's follow up now consists of many records, each with the \emph{current} values of the timescales at the start of the interval represented by the record. In -the modelling we must necessarily assume that the rates are constant +the modeling we must necessarily assume that the rates are constant within each 6-month interval, but the \emph{size} of these rates we model as smooth functions of the time scales (that is the values at the beginning of each interval). @@ -593,7 +460,7 @@ the age-effect for a fixed value of insulin duration really is a misnomer since it does not correspond to any real person's follow-up, but to the mortality of persons in different ages but with the same -duration of incuin use. +duration of insulin use. 
\section{Input to the \texttt{simLexis} function} @@ -651,27 +518,23 @@ \section{Simulation of the follow-up} -Now we simulate 1000 of each of these persons using the estimated -model. The \texttt{t.range} argument gives the times range where the -integrated intensities (cumulative rates) are evaluated and where -linear interpolation is used when simulating transition times. Note -that this must be given in the same units as \texttt{lex.dur} in the -\texttt{Lexis} object used for fitting the models for the transitions. +Now we simulate life-courses of 1000 of each of these persons using +the estimated model. The \texttt{t.range} argument gives the time +range where the integrated intensities (cumulative rates) are +evaluated and where linear interpolation is used when simulating +transition times. Note that this must be given in the same units as +\texttt{lex.dur} in the \texttt{Lexis} object used for fitting the +models for the transitions. It is not a parameter that can be easily +determined from the \texttt{TR} object, hence it must be supplied by +the user. 
<>= set.seed( 52381764 ) -Nsim <- 1000 +Nsim <- 5000 system.time( simL <- simLexis( Tr, ini, t.range = 12, N = Nsim ) ) @ % -%% Temp -%% <<>>= -%% source("../../../tmpstore/rbind.Lexis.R") -%% source("../R/simLexis.R") -%% lls() -%% @ -%% Temp The result is a \texttt{Lexis} object --- a data frame representing the simulated follow-up of \Sexpr{2*Nsim} persons (\Sexpr{Nsim} identical men and \Sexpr{Nsim} identical women) according to the rates @@ -768,7 +631,7 @@ fractions in each state, or, rather more relevant, the cumulative fraction across the states in some specified order, so that a plot of the stacked probabilities can be made, using either the default rather -colorful layout, or a more minimalistic version (both in figure \ref{fig:pstate0}): +colorful layout, or a more minimalist version (both in figure \ref{fig:pstate0}): <>= pM <- pState( nSt, perm=c(1,2,4,3) ) head( pM ) @@ -782,7 +645,7 @@ box() @ % \insfig{pstate0}{1.0}{Default layout of the \textrm{\tt plot.pState} - graph (left), and a version with the state probabilites as lines and + graph (left), and a version with the state probabilities as lines and annotation of states.} A more useful set-up of the graph would include a more through @@ -797,8 +660,8 @@ lines( as.numeric(rownames(pM)), pM[,2], lwd=3 ) mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) mtext( "Survival curve", side=3, line=1.5, adj=0 ) -mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] ) -mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[1] ) axis( side=4 ) axis( side=4, at=1:19/20, labels=FALSE ) axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) @@ -837,8 +700,8 @@ lines( as.numeric(rownames(pM)), pM[,2], lwd=3 ) mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) mtext( 
"Survival curve", side=3, line=1.5, adj=0 ) -mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] ) -mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[2] ) +mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[1] ) axis( side=4 ) axis( side=4, at=1:19/20, labels=FALSE ) axis( side=4, at=1:19/20, labels=FALSE, tcl=-0.4 ) @@ -861,7 +724,7 @@ axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) @ % Note the several statements with \texttt{axis(side=4,...}; they are -nesessary to get the fine tick-marks in the right hand side of the +necessary to get the fine tick-marks in the right hand side of the plots that you will need in order to read off the probabilities at 2006 (or 71 years). @@ -917,7 +780,7 @@ From figure \ref{fig:comp-0} it is clear that the two proportional hazards models (blue and red curves) produce pretty much the same -estimates of the state occupancy probabilites over time, but also that +estimates of the state occupancy probabilities over time, but also that they relative to the model with separately estimated transition intensities overestimates the probability of being alive without insulin and underestimates the probabilities of being dead without @@ -927,7 +790,7 @@ mortality of the insulin treated diabetes patients relative to the non-insulin treated. -Interestingly, we also see a bump in th estimated probabilities from +Interestingly, we also see a bump in the estimated probabilities from the Cox-model based model, but this is entirely an artifact that comes from the estimation method for the baseline hazard of the Cox-model that lets the (cumulative) hazard have large jumps at event times @@ -1030,9 +893,30 @@ This section explains the actually existing code for \texttt{simLexis}, as it is in the current version of \texttt{Epi}. 
-<>= + +% The following code sources the originally formatted code in +% simLexis.R in order to make it printable with all the formatting and +% comments in it. However, the sourcing does not work when compiling +% the vignettes in R CMD build. So when checking the code we comment +% this out by putting eval=FALSE, and moreover, because it is only +% simLexis that is exported, we also need to bring the functions simX, +% sim1, lint, get.next and chop.lex into the workspace. So depending +% on whether we actually construct the pdf for the inst/doc folder or +% test (upload) the package, one of the following two chunks is run +% with eval=FALSE and the other with eval=TRUE + +% When compiling the published vignette +<>= source( "../R/simLexis.R", keep.source=TRUE ) @ % +% When checking the package +<>= +simX <- Epi:::simX +sim1 <- Epi:::sim1 +lint <- Epi:::lint +get.next <- Epi:::get.next +chop.lex <- Epi:::chop.lex +@ % The function \texttt{simLexis} takes a \texttt{Lexis} object as input. This defines the initial state(s) and times of the start for a number of persons. Since the purpose is to simulate a history through @@ -1179,20 +1063,20 @@ @ % \bibliographystyle{plain} - \begin{thebibliography}{1} +\begin{thebibliography}{1} \bibitem{Carstensen.2011a} -B.~Carstensen and M.~Plummer. +B~Carstensen and M~Plummer. \newblock Using {L}exis objects for multi-state models in {R}. \newblock {\em Journal of Statistical Software}, 38(6):1--18, 1 2011. \bibitem{Iacobelli.2013} -S.~Iacobelli and B.~Carstensen. -\newblock {{M}ultiple time scales in multi-state models}. +S~Iacobelli and B~Carstensen. +\newblock {Multiple time scales in multi-state models}. \newblock {\em Stat Med}, 32(30):5315--5327, Dec 2013. \bibitem{Plummer.2011} -M.~Plummer and B.~Carstensen. +M~Plummer and B~Carstensen. \newblock Lexis: An {R} class for epidemiological studies with long-term follow-up. \newblock {\em Journal of Statistical Software}, 38(5):1--12, 1 2011. 
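The `Epi:::` idiom in the chunk above reaches unexported objects inside an installed package's namespace. A minimal, self-contained sketch of the same idiom, using an unexported S3 method from the base stats package so that it runs without Epi installed:

```r
## Same idiom as `simX <- Epi:::simX` above: `:::` retrieves an object
## from a package namespace even though it is not exported.
## stats:::plot.lm is an unexported S3 method, so no extra packages
## are needed for this illustration.
pl <- stats:::plot.lm
is.function(pl)                                      # TRUE
identical(pl, getFromNamespace("plot.lm", "stats"))  # TRUE
```

`getFromNamespace()` from utils performs the same lookup programmatically, which is why the two expressions agree.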
diff -Nru r-cran-epi-2.19/vignettes/simLexis.rwl r-cran-epi-2.30/vignettes/simLexis.rwl --- r-cran-epi-2.19/vignettes/simLexis.rwl 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/simLexis.rwl 2018-05-03 14:36:24.000000000 +0000 @@ -0,0 +1,54 @@ +R version 3.4.4 (2018-03-15) + --------------------------------------------- + Folder: /home/bendix/stat/R/lib.src/Epi/pkg/vignettes + Started: Thursday 03. May 2018, 16:35:00 + --------------------------------------------- +Writing to file simLexis.tex +Processing code chunks with options ... + 1 : keep.source term verbatim (simLexis.rnw:24) + 2 : echo keep.source term verbatim (label = start, simLexis.rnw:156) + 3 : echo keep.source term verbatim (label = Lexis, simLexis.rnw:163) + 4 : echo keep.source term verbatim (label = cut, simLexis.rnw:176) + 5 : echo keep.source term verbatim pdf (label = boxes, simLexis.rnw:189) + 6 : echo keep.source term verbatim (label = split, simLexis.rnw:210) + 7 : echo keep.source term verbatim (label = knots, simLexis.rnw:238) + 8 : echo keep.source term verbatim (label = Poisson, simLexis.rnw:260) + 9 : echo keep.source term verbatim (label = prop-haz, simLexis.rnw:299) +10 : echo keep.source term verbatim (label = get-dev, simLexis.rnw:327) +11 : echo keep.source term verbatim (label = pr-array, simLexis.rnw:371) +12 : echo keep.source term verbatim (simLexis.rnw:382) +13 : echo keep.source term verbatim (label = make-pred, simLexis.rnw:389) +14 : echo keep.source term verbatim pdf (label = mort-int, simLexis.rnw:416) +15 : echo keep.source term verbatim (label = Tr, simLexis.rnw:478) +16 : echo keep.source term verbatim (label = make-ini, simLexis.rnw:497) +17 : echo keep.source term verbatim (label = ini-fill, simLexis.rnw:509) +18 : echo keep.source term verbatim (label = simL, simLexis.rnw:530) +19 : echo keep.source term verbatim (label = sum-simL, simLexis.rnw:542) +20 : echo keep.source term verbatim (label = Tr.p-simP, simLexis.rnw:559) +21 : echo keep.source 
term verbatim (label = Cox-dur, simLexis.rnw:576) +22 : echo keep.source term verbatim (label = TR.c, simLexis.rnw:604) +23 : echo keep.source term verbatim (label = nState, simLexis.rnw:621) +24 : echo keep.source term verbatim pdf (label = pstate0, simLexis.rnw:635) +25 : echo keep.source term verbatim pdf (label = pstatex, simLexis.rnw:653) +26 : echo keep.source term verbatim pdf (label = pstatey, simLexis.rnw:691) +27 : echo keep.source term verbatim pdf (label = comp-0, simLexis.rnw:740) +28 : keep.source (label = CHANGE1, simLexis.rnw:909) +29 : keep.source term hide (label = CHANGE2, simLexis.rnw:913) +30 : echo keep.source term verbatim (simLexis.rnw:934) +31 : echo keep.source term verbatim (simLexis.rnw:962) +32 : echo keep.source term verbatim (simLexis.rnw:980) +33 : echo keep.source term verbatim (simLexis.rnw:993) +34 : echo keep.source term verbatim (simLexis.rnw:1006) +35 : echo keep.source term verbatim (simLexis.rnw:1017) +36 : echo keep.source term verbatim (simLexis.rnw:1027) +37 : echo keep.source term verbatim (simLexis.rnw:1045) +38 : echo keep.source term verbatim (simLexis.rnw:1055) +39 : echo keep.source term verbatim (simLexis.rnw:1060) + +You can now run (pdf)latex on ‘simLexis.tex’ + + --------------------------------------------- + Folder: /home/bendix/stat/R/lib.src/Epi/pkg/vignettes + Ended: Thursday 03. 
May 2018, 16:36:24 + Elapsed: 00:01:24 + --------------------------------------------- diff -Nru r-cran-epi-2.19/vignettes/simLexis.tex r-cran-epi-2.30/vignettes/simLexis.tex --- r-cran-epi-2.19/vignettes/simLexis.tex 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/simLexis.tex 2018-05-03 14:36:24.000000000 +0000 @@ -0,0 +1,1746 @@ + + +%\VignetteIndexEntry{Simulation of multistate models with multiple timescales: simLexis} +\documentclass[a4paper,twoside,12pt]{report} + +\newcommand{\Title}{Simulation of\\ multistate models with\\ multiple + timescales:\\ \texttt{simLexis} in the \texttt{Epi} package} +\newcommand{\Tit}{Multistate models with multiple timescales} +\newcommand{\Version}{Version 2.4} +\newcommand{\Dates}{\today} +\newcommand{\Where}{SDC} +\newcommand{\Homepage}{\url{http://BendixCarstensen.com/Epi/simLexis.pdf}} +\newcommand{\Faculty}{\begin{tabular}{rl} +Bendix Carstensen + & Steno Diabetes Center, Gentofte, Denmark\\ + & {\small \& Department of Biostatistics, + University of Copenhagen} \\ + & \texttt{b@bxc.dk}\\ + & \url{http://BendixCarstensen.com} \\[1em] + \end{tabular}} + +\input{topreport} +\renewcommand{\rwpre}{./simLexis} + +\chapter{Using \texttt{simLexis}} + +\section{Introduction} + +This vignette explains the machinery behind simulation of life +histories through multistate models implemented in +\texttt{simLexis}. In \texttt{simLexis} transition rates are allowed +to depend on multiple time scales, including timescales defined as +time since entry to a particular state (duration). This therefore also +covers the case where time \emph{at} entry into a state is an +explanatory variable for the rates, since time at entry is merely time +minus duration. Thus, the set-up here goes beyond Markov- and +semi-Markov-models, and brings simulation based estimation of +state-occupancy probabilities into the realm of realistic multistate +models. 
+ +The basic idea is to simulate a new \texttt{Lexis} object +\cite{Plummer.2011,Carstensen.2011a} as defined in the \texttt{Epi} +package for \R, based on 1) a multistate model defined by its states +and the transition rates between them and 2) an initial population of +individuals. + +Thus the output will be a \texttt{Lexis} object describing the +transitions of a predefined set of persons through a multistate +model. Therefore, if persons are defined to be identical at start, +then calculation of the probability of being in a particular state at +a given time boils down to a simple enumeration of the fraction of the +persons in the particular state at the given time. Bar of course the +(binomial) simulation error, but this can be brought down by +simulating a sufficiently large number of persons. + +An observed \texttt{Lexis} object with follow-up of persons through a +number of states will normally be the basis for estimation of +transition rates between states, and thus will contain all information +about covariates determining the occurrence rates, in particular the +\emph{timescales} \cite{Iacobelli.2013}. Hence, the natural input to +simulation from an estimated multistate model will typically be an +object of the same structure as the one originally observed. Since +transitions and times are what is simulated, any values of +\texttt{lex.Xst} and \texttt{lex.dur} in the input object will of +course be ignored. + +The first chapter of this vignette shows by an example how to use the +function \texttt{simLexis} and display the results. The subsequent +chapter discusses in more detail how the simulation machinery is +implemented and is not needed for the practical use of \texttt{simLexis}. + +\section{\texttt{simLexis} in practice} + +This section is largely a commented walk-through of the example from +the help-page of \texttt{simLexis}, with a larger number of simulated +persons in order to minimize the pure simulation variation. 
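The enumeration argument above can be sketched in a few lines of base R, with a deliberately trivial one-transition model standing in for simLexis; the constant rate of 0.1 per year and the evaluation time of 5 years are made up for illustration:

```r
## State occupancy by enumeration: simulate N transition times from a
## constant rate and estimate P(still in the initial state at t = 5)
## by the fraction of simulated persons with no transition yet.
## The (binomial) simulation error shrinks as N grows.
set.seed(1)
N  <- 10000
tt <- rexp(N, rate = 0.1)           # simulated transition times
p5 <- mean(tt > 5)                  # estimated state probability at t = 5
se <- sqrt(p5 * (1 - p5) / N)       # binomial simulation error
round(c(est = p5, se = se, true = exp(-0.1 * 5)), 4)
```

Quadrupling N halves the simulation error, which is the sense in which the binomial error "can be brought down" by simulating more persons.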
+ +When we want to simulate transition times through a multistate model +where transition rates may depend on time since entry to the current +or a previous state, it is essential that we have machinery to keep +track of the transition time on \emph{all} time scales, as well as a +mechanism that can initialize a new time scale to 0 when a transition +occurs to a state where we shall use time since entry as determinant +of exit rates from that state. This is provided by \texttt{simLexis}. + +\subsection{Input for the simulation} + +Input for simulation of a single trajectory through a multistate model +requires a representation of the \emph{current status} of a person; +the starting conditions. The object that we supply to the simulation +function must contain information about all covariates and all +timescales upon which transitions depend, and in particular which +one(s) of the timescales are defined as time since entry into a +particular state. Hence, starting conditions should be represented as +a \texttt{Lexis} object (where \texttt{lex.dur} and \texttt{lex.Xst} +are ignored, since there is no follow-up yet), where the time scale +information is in the attributes \texttt{time.scale} and +\texttt{time.since} respectively. + +Thus there are two main arguments to a function to simulate from a +multistate model: +\begin{enumerate} +\item A \texttt{Lexis} object representing the initial states and + covariates of the population to be simulated. This has to have the + same structure as the original \texttt{Lexis} object representing + the multistate model from which transition rates in the model were + estimated. As noted above, the values for \texttt{lex.Xst} and + \texttt{lex.dur} are not required (since these are the quantities + that will be simulated). +\item A transition object, representing the transition intensities + between states, which should be a list of lists of intensity + representations. 
As an intensity representation we mean a function + that, given a \texttt{Lexis} object, can be used to produce + estimates of the transition intensities at a set of supplied time + points for the state represented in the \texttt{Lexis} object. + + The names of the elements of the transition object (which are lists) + will be names of the \emph{transient} states, that is the states + \emph{from} which a transition can occur. The names of the elements + of each of these lists are the names of states \emph{to} which + transitions can occur (which may be either transient or absorbing + states). + + Hence, if the transition object is called \texttt{Tr} then + \verb+Tr$A$B+ (or \verb+Tr[["A"]][["B"]]+) will represent the + transition intensity from state \texttt{A} to the state \texttt{B}. + + The entries in the transition object can be either \texttt{glm} + objects, representing Poisson models for the transitions, + \texttt{coxph} objects representing an intensity model along one + time scale, or simply a function that takes a \texttt{Lexis} + object as input and returns an estimated intensity for each row. + +\end{enumerate} + +In addition to these two input items, there will be a couple of tuning +parameters. + +The output of the function will simply be a \texttt{Lexis} object with +simulated transitions between states. This will be the basis for +deriving sensible statistics from the \texttt{Lexis} object --- see +next section. + +\section{Setting up a \texttt{Lexis} object} + +As an example we will use the \texttt{DMlate} dataset from the +\texttt{Epi} package; it is a dataset simulated to resemble a random +sample of 10,000 patients from the Danish National Diabetes Register. 
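The list-of-lists layout of the transition object described in item 2 above can be sketched as follows; the state names mirror the illness-death example used later, but the constant and log-linear "intensities" are made-up stand-ins for the fitted glm/coxph objects a real analysis would supply:

```r
## A transition object indexed as Tr[[from]][[to]].  Real entries would
## be fitted model objects (or functions of a Lexis object); here they
## are simple functions of age, purely for illustration.
Tr.sketch <- list(
  DM  = list(Ins         = function(age) rep(0.04, length(age)),
             Dead        = function(age) 0.01 * exp(0.08 * (age - 60))),
  Ins = list(`Dead(Ins)` = function(age) 0.02 * exp(0.08 * (age - 60))))
Tr.sketch$DM$Dead(70)                  # intensity DM -> Dead at age 70
Tr.sketch[["Ins"]][["Dead(Ins)"]](70)  # same kind of lookup with [[ ]]
```

The names of the outer list are the transient ("from") states, and the names of each inner list are the reachable ("to") states, exactly as described in the text.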
+ +We start by loading the \texttt{Epi} package: +\begin{Schunk} +\begin{Sinput} +> options( width=90 ) +> library( Epi ) +> print( sessionInfo(), l=F ) +\end{Sinput} +\begin{Soutput} +R version 3.4.4 (2018-03-15) +Platform: x86_64-pc-linux-gnu (64-bit) +Running under: Ubuntu 14.04.5 LTS + +Matrix products: default +BLAS: /usr/lib/openblas-base/libopenblas.so.0 +LAPACK: /usr/lib/lapack/liblapack.so.3.0 + +attached base packages: +[1] utils datasets graphics grDevices stats methods base + +other attached packages: +[1] Epi_2.29 + +loaded via a namespace (and not attached): + [1] cmprsk_2.2-7 zoo_1.8-0 MASS_7.3-49 compiler_3.4.4 + [5] Matrix_1.2-14 plyr_1.8.4 parallel_3.4.4 survival_2.42-3 + [9] etm_1.0.1 Rcpp_0.12.12 splines_3.4.4 grid_3.4.4 +[13] data.table_1.10.4 numDeriv_2016.8-1 lattice_0.20-35 +\end{Soutput} +\end{Schunk} +First we load the diabetes data and set up a simple illness-death +model: +\begin{Schunk} +\begin{Sinput} +> data(DMlate) +> dml <- Lexis( entry = list(Per=dodm, Age=dodm-dobth, DMdur=0 ), ++ exit = list(Per=dox), ++ exit.status = factor(!is.na(dodth),labels=c("DM","Dead")), ++ data = DMlate ) +\end{Sinput} +\begin{Soutput} +NOTE: entry.status has been set to "DM" for all. +\end{Soutput} +\end{Schunk} +This is just data for a simple survival model with states ``DM'' and +``Dead''. Now we cut the follow-up at insulin start, which for the +majority of patients (T2D) is a clinical indicator of deterioration of +disease regulation. 
We therefore also introduce a new timescale, and +split the non-precursor states, so that we can address the question of +ever having been on insulin: +\begin{Schunk} +\begin{Sinput} +> dmi <- cutLexis( dml, cut = dml$doins, ++ pre = "DM", ++ new.state = "Ins", ++ new.scale = "t.Ins", ++ split.states = TRUE ) +> summary( dmi ) +\end{Sinput} +\begin{Soutput} +Transitions: + To +From DM Ins Dead Dead(Ins) Records: Events: Risk time: Persons: + DM 6157 1694 2048 0 9899 3742 45885.49 9899 + Ins 0 1340 0 451 1791 451 8387.77 1791 + Sum 6157 3034 2048 451 11690 4193 54273.27 9996 +\end{Soutput} +\begin{Sinput} +> str(dmi) +\end{Sinput} +\begin{Soutput} +Classes ‘Lexis’ and 'data.frame': 11690 obs. of 15 variables: + $ Per : num 1999 2003 2005 2009 2009 ... + $ Age : num 58.7 64.1 86.3 44 75.8 ... + $ DMdur : num 0 0 0 0 0 0 0 0 0 0 ... + $ t.Ins : num NA NA NA NA NA NA NA NA NA NA ... + $ lex.dur: num 11.08 6.689 5.446 0.736 1.344 ... + $ lex.Cst: Factor w/ 4 levels "DM","Ins","Dead",..: 1 1 1 1 1 1 1 1 1 1 ... + $ lex.Xst: Factor w/ 4 levels "DM","Ins","Dead",..: 1 1 1 1 1 3 1 1 3 1 ... + $ lex.id : int 1 2 3 4 5 6 7 8 9 10 ... + $ sex : Factor w/ 2 levels "M","F": 2 1 2 2 1 2 1 1 2 1 ... + $ dobth : num 1940 1939 1918 1965 1933 ... + $ dodm : num 1999 2003 2005 2009 2009 ... + $ dodth : num NA NA NA NA NA ... + $ dooad : num NA 2007 NA NA NA ... + $ doins : num NA NA NA NA NA NA NA NA NA NA ... + $ dox : num 2010 2010 2010 2010 2010 ... 
+ - attr(*, "time.scales")= chr "Per" "Age" "DMdur" "t.Ins" + - attr(*, "time.since")= chr "" "" "" "Ins" + - attr(*, "breaks")=List of 4 + ..$ Per : NULL + ..$ Age : NULL + ..$ DMdur: NULL + ..$ t.Ins: NULL +\end{Soutput} +\end{Schunk} +We can show how many person-years we have and show the number of +transitions and transition rates (per 1000), using the +\texttt{boxes.Lexis} function to display the states and the number of +transitions between them: +\begin{Schunk} +\begin{Sinput} +> boxes( dmi, boxpos = list(x=c(20,20,80,80), ++ y=c(80,20,80,20)), ++ scale.R = 1000, show.BE = TRUE ) +\end{Sinput} +\end{Schunk} +\insfig{boxes}{0.8}{Data overview for the \textrm{\tt dmi} + dataset. Numbers in the boxes are person-years and the number of + persons who begin, resp. end their follow-up in each state, and + numbers on the arrows are no. of transitions and rates (transition + intensities) per 1000 PY.} + +\section{Analysis of rates} + +In the \texttt{Lexis} object (which is just a data frame) each person +is represented by one record for each transient state he occupies, +thus in this case either 1 or 2 records; those who have a recorded +time both without and with insulin have two records. 
+ +In order to be able to fit Poisson models with occurrence rates varying +by the different time-scales, we split the follow-up in 6-month intervals +for modeling: +\begin{Schunk} +\begin{Sinput} +> Si <- splitLexis( dmi, 0:30/2, "DMdur" ) +> dim( Si ) +\end{Sinput} +\begin{Soutput} +[1] 115370 15 +\end{Soutput} +\begin{Sinput} +> print( subset( Si, lex.id==97 )[,1:10], digits=6 ) +\end{Sinput} +\begin{Soutput} + lex.id Per Age DMdur t.Ins lex.dur lex.Cst lex.Xst sex dobth +1105 97 1997.55 58.9268 0.00000 NA 0.5000000 DM DM F 1938.62 +1106 97 1998.05 59.4268 0.50000 NA 0.5000000 DM DM F 1938.62 +1107 97 1998.55 59.9268 1.00000 NA 0.5000000 DM DM F 1938.62 +1108 97 1999.05 60.4268 1.50000 NA 0.5000000 DM DM F 1938.62 +1109 97 1999.55 60.9268 2.00000 NA 0.1793292 DM Ins F 1938.62 +1110 97 1999.72 61.1061 2.17933 0.000000 0.3206708 Ins Ins F 1938.62 +1111 97 2000.05 61.4268 2.50000 0.320671 0.5000000 Ins Ins F 1938.62 +1112 97 2000.55 61.9268 3.00000 0.820671 0.0116359 Ins Dead(Ins) F 1938.62 +\end{Soutput} +\end{Schunk} +Note that when we split the follow-up each person's follow up now +consists of many records, each with the \emph{current} values of the +timescales at the start of the interval represented by the record. In +the modeling we must necessarily assume that the rates are constant +within each 6-month interval, but the \emph{size} of these rates we +model as smooth functions of the time scales (that is the values at +the beginning of each interval). + +The approach often used in epidemiology where one parameter is +attached to each interval of time (or age) is not feasible when more +than one time scale is used, because intervals are not classified the +same way on all timescales. + +We shall use natural splines (restricted cubic splines) for the +analysis of rates, and hence we must allocate knots for the +splines. This is done for each of the time-scales, and separately for +the transition out of states ``DM'' and ``Ins''. 
For age, we place the
+knots so that the number of events is the same between each pair of
+knots, but only half of this beyond each of the boundary knots,
+whereas for the timescales \texttt{DMdur} and \texttt{t.Ins}, where we
+have observations from a well-defined 0, we put a knot at 0 and place
+the remaining knots so that the number of events is the same between
+each pair of knots as well as outside the boundary knots.
+\begin{Schunk}
+\begin{Sinput}
+> nk <- 5
+> ( ai.kn <- with( subset(Si,lex.Xst=="Ins" & lex.Cst!=lex.Xst ),
++ quantile( Age+lex.dur , probs=(1:nk-0.5)/nk ) ) )
+\end{Sinput}
+\begin{Soutput}
+ 10% 30% 50% 70% 90%
+23.23751 48.82218 58.63244 67.79028 78.88542
+\end{Soutput}
+\begin{Sinput}
+> ( ad.kn <- with( subset(Si,lex.Xst=="Dead"),
++ quantile( Age+lex.dur , probs=(1:nk-0.5)/nk ) ) )
+\end{Sinput}
+\begin{Soutput}
+ 10% 30% 50% 70% 90%
+61.91951 72.52731 78.43121 83.32348 90.15195
+\end{Soutput}
+\begin{Sinput}
+> ( di.kn <- with( subset(Si,lex.Xst=="Ins" & lex.Cst!=lex.Xst ),
++ c(0,quantile( DMdur+lex.dur, probs=(1:(nk-1))/nk ) )) )
+\end{Sinput}
+\begin{Soutput}
+ 20% 40% 60% 80%
+0.00000000 0.06570842 0.45448323 3.28761123 6.63764545
+\end{Soutput}
+\begin{Sinput}
+> ( dd.kn <- with( subset(Si,lex.Xst=="Dead"),
++ c(0,quantile( DMdur+lex.dur, probs=(1:(nk-1))/nk ) )) )
+\end{Sinput}
+\begin{Soutput}
+ 20% 40% 60% 80%
+0.0000000 0.7687885 2.1327858 4.0465435 6.5232033
+\end{Soutput}
+\begin{Sinput}
+> ( ti.kn <- with( subset(Si,lex.Xst=="Dead(Ins)"),
++ c(0,quantile( t.Ins+lex.dur, probs=(1:(nk-1))/nk ) )) )
+\end{Sinput}
+\begin{Soutput}
+ 20% 40% 60% 80%
+0.0000000 0.3093771 1.1307324 2.5489391 4.9117043
+\end{Soutput}
+\end{Schunk}
+Note that when we tease out the event records for the transition to a
+\emph{transient} state (in this case ``Ins'', that is
+\verb|lex.Xst=="Ins"|), we should add \verb|lex.Cst!=lex.Xst| to
+include only transition records and avoid including records of
+sojourn time in the transient state.
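The two placement schemes can be sketched in a few lines of base R; the event times `ev` here are simulated for illustration, not taken from the `Si` data:

```r
## Base-R sketch (hypothetical event times) of the two knot-placement
## schemes described above: mid-quantiles at (1:nk - 0.5)/nk for age,
## versus a fixed knot at the well-defined 0 plus inner quantiles for
## the duration timescales.
set.seed(1)
ev <- rexp(200, rate = 0.3)  # hypothetical event times on some timescale
nk <- 5
## age-type placement: equal numbers of events between knots,
## half that number beyond each boundary knot
kn.age <- quantile(ev, probs = (1:nk - 0.5)/nk)
## duration-type placement: first knot fixed at 0
kn.dur <- c(0, quantile(ev, probs = (1:(nk - 1))/nk))
```

With `nk = 5` both schemes give 5 knots; they differ only in how the events are distributed around the boundary knots.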
+
+We then fit Poisson models to the transition rates, using the wrapper
+\texttt{Ns} from the \texttt{Epi} package to simplify the
+specification of the rates:
+\begin{Schunk}
+\begin{Sinput}
+> library( splines )
+> DM.Ins <- glm( (lex.Xst=="Ins") ~ Ns( Age , knots=ai.kn ) +
++ Ns( DMdur, knots=di.kn ) +
++ I(Per-2000) + sex,
++ family=poisson, offset=log(lex.dur),
++ data = subset(Si,lex.Cst=="DM") )
+> DM.Dead <- glm( (lex.Xst=="Dead") ~ Ns( Age , knots=ad.kn ) +
++ Ns( DMdur, knots=dd.kn ) +
++ I(Per-2000) + sex,
++ family=poisson, offset=log(lex.dur),
++ data = subset(Si,lex.Cst=="DM") )
+> Ins.Dead <- glm( (lex.Xst=="Dead(Ins)") ~ Ns( Age , knots=ad.kn ) +
++ Ns( DMdur, knots=dd.kn ) +
++ Ns( t.Ins, knots=ti.kn ) +
++ I(Per-2000) + sex,
++ family=poisson, offset=log(lex.dur),
++ data = subset(Si,lex.Cst=="Ins") )
+\end{Sinput}
+\end{Schunk}
+Note the similarity of the code used to fit the three models; it
+mainly amounts to redefining the response variable (the ``to'' state)
+and the subset of the data used (the ``from'' state).
+
+\section{The mortality rates}
+
+This section discusses in some detail how to extract and display the
+mortality rates from the models fitted. But it is not necessary for
+understanding how to use \texttt{simLexis} in practice.
+
+\subsection{Proportionality of mortality rates}
+
+Note that we have fitted separate models for the three transitions, so
+there is no assumption of proportionality between the mortality rates
+from \texttt{DM} and \texttt{Ins}.
+
+However, there is nothing that prevents us from testing this
+assumption; we can just fit a model for the mortality rates in the
+entire data frame \texttt{Si}, and compare the deviance from this with
+the sum of the deviances from the separate models:
+\begin{Schunk}
+\begin{Sinput}
+> with( Si, table(lex.Cst) )
+\end{Sinput}
+\begin{Soutput}
+lex.Cst
+ DM Ins Dead Dead(Ins)
+ 97039 18331 0 0
+\end{Soutput}
+\begin{Sinput}
+> All.Dead <- glm( (lex.Xst %in% c("Dead(Ins)","Dead")) ~
++ Ns( Age , knots=ad.kn ) +
++ Ns( DMdur, knots=dd.kn ) +
++ lex.Cst +
++ I(Per-2000) + sex,
++ family=poisson, offset=log(lex.dur),
++ data = Si )
+> round( ci.exp( All.Dead ), 3 )
+\end{Sinput}
+\begin{Soutput}
+ exp(Est.) 2.5% 97.5%
+(Intercept) 0.049 0.043 0.056
+Ns(Age, knots = ad.kn)1 4.120 3.479 4.879
+Ns(Age, knots = ad.kn)2 4.652 4.054 5.338
+Ns(Age, knots = ad.kn)3 15.460 13.575 17.608
+Ns(Age, knots = ad.kn)4 7.529 6.711 8.447
+Ns(DMdur, knots = dd.kn)1 0.520 0.429 0.629
+Ns(DMdur, knots = dd.kn)2 0.707 0.622 0.803
+Ns(DMdur, knots = dd.kn)3 0.319 0.238 0.428
+Ns(DMdur, knots = dd.kn)4 0.829 0.742 0.926
+lex.CstIns 2.168 1.946 2.414
+I(Per - 2000) 0.965 0.954 0.977
+sexF 0.665 0.614 0.720
+\end{Soutput}
+\end{Schunk}
+From the parameter values we would in a simple setting just claim that
+the start of insulin treatment was associated with slightly more than
+a doubling of mortality.
+
+The model \texttt{All.Dead} assumes that the age- and DM-duration
+effects on mortality in the ``DM'' and ``Ins'' states are the same,
+and moreover that there is no effect of insulin duration, but merely a
+mortality that is larger by a multiplicative constant not depending on
+insulin duration. The model \texttt{DM.Dead} has 8 parameters to
+describe the dependency on age and DM duration; the model
+\texttt{Ins.Dead} has 12 for the same plus the insulin duration (a
+natural spline with $k$ knots gives $k-1$ parameters, and we chose
+$k=5$ above).
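The claim that a natural spline with $k$ knots contributes $k-1$ parameters can be verified directly with the \texttt{splines} package; the knots below are made up, and the outer two are used as boundary knots, mimicking what \texttt{Epi::Ns} does:

```r
## Check that a natural spline basis with 5 knots (outer two used as
## boundary knots) has 5 - 1 = 4 columns, i.e. contributes 4 parameters.
library(splines)
x  <- seq(20, 90, 0.5)
kn <- c(30, 45, 60, 75, 85)  # 5 hypothetical knots
NS <- ns(x, knots = kn[2:4], Boundary.knots = kn[c(1, 5)])
ncol(NS)  # 4
```

This accounts for the parameter counts quoted above: 2 splines of 4 parameters each in `DM.Dead`, plus one more in `Ins.Dead`.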
+
+We can compare the fit of the proportional hazards model with the fit
+of the separate models for the two mortality rates, by adding up the
+deviances and d.f. from these:
+\begin{Schunk}
+\begin{Sinput}
+> what <- c("null.deviance","df.null","deviance","df.residual")
+> ( rD <- unlist( DM.Dead[what] ) )
+\end{Sinput}
+\begin{Soutput}
+null.deviance df.null deviance df.residual
+ 19957.95 97038.00 17849.90 97028.00
+\end{Soutput}
+\begin{Sinput}
+> ( rI <- unlist( Ins.Dead[what] ) )
+\end{Sinput}
+\begin{Soutput}
+null.deviance df.null deviance df.residual
+ 4329.880 18330.000 3674.067 18316.000
+\end{Soutput}
+\begin{Sinput}
+> ( rA <- unlist( All.Dead[what] ) )
+\end{Sinput}
+\begin{Soutput}
+null.deviance df.null deviance df.residual
+ 24300.15 115369.00 21608.79 115358.00
+\end{Soutput}
+\begin{Sinput}
+> round( c( dd <- rA-(rI+rD), "pVal"=1-pchisq(dd[3],dd[4]+1) ), 3 )
+\end{Sinput}
+\begin{Soutput}
+null.deviance df.null deviance df.residual pVal.deviance
+ 12.314 1.000 84.822 14.000 0.000
+\end{Soutput}
+\end{Schunk}
+Thus we see that there is a substantial non-proportionality of the
+mortality rates from the two states; but a test provides no clue
+whatsoever to the particular \emph{shape} of the non-proportionality.
+
+To this end, we shall explore the predicted mortalities under the two
+models quantitatively in more detail. Note that the reason that there
+is a difference in the null deviances (and a difference of 1 in the
+null d.f.) is that the null deviance of \texttt{All.Dead} refers to a
+model with a single intercept, that is a model with constant and
+\emph{identical} mortality rates from the states ``DM'' and ``Ins'',
+whereas the null models for \texttt{DM.Dead} and \texttt{Ins.Dead}
+have constant but \emph{different} mortality rates from the states
+``DM'' and ``Ins''. This is however irrelevant for the comparison of
+the \emph{residual} deviances.
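For the record, the arithmetic of this likelihood-ratio comparison can be redone from the printed deviances alone (numbers copied from the output above):

```r
## Likelihood-ratio test of proportionality: residual deviance of the
## proportional model minus the summed deviances of the separate models,
## on the d.f. difference plus 1 (the extra intercept absorbed by the
## lex.Cst term in All.Dead).
dev.A <- 21608.79; df.A <- 115358   # All.Dead
dev.D <- 17849.90; df.D <-  97028   # DM.Dead
dev.I <- 3674.067; df.I <-  18316   # Ins.Dead
LR <- dev.A - (dev.D + dev.I)       # about 84.8
df <- df.A - (df.D + df.I) + 1      # 15 d.f.
pchisq(LR, df, lower.tail = FALSE)  # vanishingly small
```

This reproduces the 84.822 on 14 + 1 degrees of freedom shown in the output.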
+
+\subsection{How the mortality rates look}
+
+If we want to see how the mortality rates are modelled in
+\texttt{DM.Dead} and \texttt{Ins.Dead} in relation to
+\texttt{All.Dead}, we make a prediction of rates for, say, men
+diagnosed at different ages and going on insulin at different times
+after this. So we consider men diagnosed at ages 40, 50, 60 and 70,
+who either never enter insulin treatment or start it 0, 2 or 5 years
+after diagnosis of DM.
+
+To this end we create a prediction data frame with observation times
+from diagnosis and 12 years onwards (longer would not make sense as
+this is the extent of the data).
+
+But we start by setting up an array to hold the predicted mortality
+rates, classified by diabetes duration, age at diabetes onset, time of
+insulin onset, and of course type of model. What we want to do is to
+plot the age-specific mortality rates for persons not on insulin, and
+for persons starting insulin at different times after DM. The
+mortality curves start at the age where the person gets diabetes and
+continue for 12 years; for persons on insulin they start at the age
+when they initiate insulin.
+\begin{Schunk}
+\begin{Sinput}
+> pr.rates <- NArray( list( DMdur = seq(0,12,0.1),
++ DMage = 4:7*10,
++ r.Ins = c(NA,0,2,5),
++ model = c("DM/Ins","All"),
++ what = c("rate","lo","hi") ) )
+> str( pr.rates )
+\end{Sinput}
+\begin{Soutput}
+ logi [1:121, 1:4, 1:4, 1:2, 1:3] NA NA NA NA NA NA ...
+ - attr(*, "dimnames")=List of 5
+ ..$ DMdur: chr [1:121] "0" "0.1" "0.2" "0.3" ...
+ ..$ DMage: chr [1:4] "40" "50" "60" "70"
+ ..$ r.Ins: chr [1:4] NA "0" "2" "5"
+ ..$ model: chr [1:2] "DM/Ins" "All"
+ ..$ what : chr [1:3] "rate" "lo" "hi"
+\end{Soutput}
+\end{Schunk}
+For convenience the \texttt{Epi} package contains a function that computes
+predicted (log-)rates with c.i.
--- it is merely a wrapper for
+\texttt{predict.glm}:
+\begin{Schunk}
+\begin{Sinput}
+> ci.pred
+\end{Sinput}
+\begin{Soutput}
+function (obj, newdata, Exp = NULL, alpha = 0.05)
+{
+ if (!inherits(obj, "glm"))
+ stop("Not usable for non-glm objects")
+ zz <- predict(obj, newdata = newdata, se.fit = TRUE, type = "link")
+ zz <- cbind(zz$fit, zz$se.fit) %*% ci.mat(alpha = alpha)
+ if (missing(Exp)) {
+ return(obj$family$linkinv(zz))
+ }
+ else {
+ if (Exp) {
+ return(exp(zz))
+ }
+ else if (!Exp)
+ return(zz)
+ }
+}
+
+\end{Soutput}
+\end{Schunk}
+So we set up the prediction data frame and modify it in loops over
+ages at onset and insulin onset. Note that we set \texttt{lex.dur} to
+1000 in the prediction frame, so that we obtain rates in units of
+events per 1000 PY.
+\begin{Schunk}
+\begin{Sinput}
+> nd <- data.frame( DMdur = as.numeric( dimnames(pr.rates)[[1]] ),
++ lex.Cst = factor( 1, levels=1:4,
++ labels=levels(Si$lex.Cst) ),
++ sex = factor( 1, levels=1:2, labels=c("M","F")),
++ lex.dur = 1000 )
+> for( ia in dimnames(pr.rates)[[2]] )
++ {
++ dnew <- transform( nd, Age = as.numeric(ia)+DMdur,
++ Per = 1998+DMdur )
++ pr.rates[,ia,1,"DM/Ins",] <- ci.pred( DM.Dead, newdata = dnew )
++ pr.rates[,ia,1,"All" ,] <- ci.pred( All.Dead, newdata = dnew )
++ for( ii in dimnames(pr.rates)[[3]][-1] )
++ {
++ dnew = transform( dnew, lex.Cst = factor( 2, levels=1:4,
++ labels=levels(Si$lex.Cst) ),
++ t.Ins = ifelse( (DMdur-as.numeric(ii)) >= 0,
++ DMdur-as.numeric(ii), NA ) )
++ pr.rates[,ia, ii ,"DM/Ins",] <- ci.pred( Ins.Dead, newdata = dnew )
++ pr.rates[,ia, ii ,"All" ,] <- ci.pred( All.Dead, newdata = dnew )
++ }
++ }
+\end{Sinput}
+\end{Schunk}
+So for each age at DM onset we make a plot of the mortality as a
+function of current age, both for those with no insulin treatment and
+those that start it 0, 2 and 5 years after, thus 4 curves (with
+c.i.). These curves are replicated with a different color for the
+simplified model.
+\begin{Schunk}
+\begin{Sinput}
+> par( mar=c(3,3,1,1), mgp=c(3,1,0)/1.6, las=1 )
+> plot( NA, xlim=c(40,82), ylim=c(5,300), bty="n",
++ log="y", xlab="Age", ylab="Mortality rate per 1000 PY" )
+> abline( v=seq(40,80,5), h=outer(1:9,10^(0:2),"*"), col=gray(0.8) )
+> for( aa in 4:7*10 ) for( ii in 1:4 )
++ matlines( aa+as.numeric(dimnames(pr.rates)[[1]]),
++ cbind( pr.rates[,paste(aa),ii,"DM/Ins",],
++ pr.rates[,paste(aa),ii,"All" ,] ),
++ type="l", lty=1, lwd=c(3,1,1),
++ col=rep(c("red","limegreen"),each=3) )
+\end{Sinput}
+\end{Schunk}
+\insfig{mort-int}{0.9}{Estimated mortality rates for male diabetes
+ patients with no insulin (lower sets of curves) and insulin (upper
+ curves), with DM onset in age 40, 50, 60 and 70. The red curves are
+ from the models with separate age effects for persons with and
+ without insulin, and a separate effect of insulin duration. The
+ green curves are from the model with common age-effects and only a
+ time-dependent effect of insulin, assuming no effect of insulin
+ duration (the classical time-dependent variable approach). Hence the
+ upper green curve is common for any time of insulin inception.}
+
+From figure \ref{fig:mort-int} we see that there is a substantial
+insulin-duration effect which is not accommodated by the simple model
+with only one time-dependent variable to describe the insulin
+effect. Note that the simple model (green curves) for those on insulin
+does not depend on insulin duration, and hence the mortality curves
+for those on insulin are just parallel to the mortality curves for
+those not on insulin, regardless of diabetes duration (or age) at the
+time of insulin initiation. This is the proportional hazards
+assumption. Thus the effect of insulin initiation is under-estimated
+for short durations of insulin and over-estimated for long durations
+of insulin.
+
+This is the major discrepancy between the two models, and it
+illustrates the importance of being able to accommodate different time
+scales; but there is also a declining overall insulin effect by age
+which is not accommodated by the proportional hazards approach.
+
+Finally, this plot illustrates an important feature in the reporting
+of models with multiple timescales: all timescales must be represented
+in the predicted rates. Reporting the effect of only one timescale,
+conditional on fixed values of the others, is misleading, since all
+timescales by definition advance at the same pace. For example, the
+age-effect for a fixed value of insulin duration really is a misnomer,
+since it does not correspond to any real person's follow-up, but to
+the mortality of persons of different ages who all have the same
+duration of insulin use.
+
+\section{Input to the \texttt{simLexis} function}
+
+In order to simulate from the multistate model with the estimated
+transition rates, and get the follow-up of a hypothetical cohort, we
+must supply \emph{both} the transition rates and the structure of the
+model \emph{and} the initial cohort status to \texttt{simLexis}.
+
+\subsection{The transition object}
+
+We first put the models into an object representing the transitions;
+note this is a list of lists, the latter having \texttt{glm} objects
+as elements:
+\begin{Schunk}
+\begin{Sinput}
+> Tr <- list( "DM" = list( "Ins" = DM.Ins,
++ "Dead" = DM.Dead ),
++ "Ins" = list( "Dead(Ins)" = Ins.Dead ) )
+\end{Sinput}
+\end{Schunk}
+Now we have the description of the rates and of the structure of the
+model. The \texttt{Tr} object defines the states and the models for all
+transitions between them; the object \verb|Tr$A$B| is the model
+for the transition intensity from state \texttt{A} to state
+\texttt{B}.
+
+\subsection{The initial cohort}
+
+We now define an initial \texttt{Lexis} object of persons with all
+relevant covariates defined.
Note that we use \texttt{subset} to get a
+\texttt{Lexis} object; this preserves the \texttt{time.scales} and
+\texttt{time.since} attributes, which are needed for the simulation (the
+usual ``\texttt{[}'' operator does not preserve these attributes when
+you select columns):
+\begin{Schunk}
+\begin{Sinput}
+> str( Si[NULL,1:9] )
+\end{Sinput}
+\begin{Soutput}
+Classes ‘Lexis’ and 'data.frame': 0 obs. of 9 variables:
+ $ lex.id : int
+ $ Per : num
+ $ Age : num
+ $ DMdur : num
+ $ t.Ins : num
+ $ lex.dur: num
+ $ lex.Cst: Factor w/ 4 levels "DM","Ins","Dead",..:
+ $ lex.Xst: Factor w/ 4 levels "DM","Ins","Dead",..:
+ $ sex : Factor w/ 2 levels "M","F":
+ - attr(*, "time.scales")= chr "Per" "Age" "DMdur" "t.Ins"
+ - attr(*, "time.since")= chr "" "" "" "Ins"
+ - attr(*, "breaks")=List of 4
+ ..$ Per : NULL
+ ..$ Age : NULL
+ ..$ DMdur: num 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 ...
+ ..$ t.Ins: NULL
+\end{Soutput}
+\begin{Sinput}
+> ini <- subset(Si,FALSE,select=1:9)
+> str( ini )
+\end{Sinput}
+\begin{Soutput}
+Classes ‘Lexis’ and 'data.frame': 0 obs. of 9 variables:
+ $ lex.id : int
+ $ Per : num
+ $ Age : num
+ $ DMdur : num
+ $ t.Ins : num
+ $ lex.dur: num
+ $ lex.Cst: Factor w/ 4 levels "DM","Ins","Dead",..:
+ $ lex.Xst: Factor w/ 4 levels "DM","Ins","Dead",..:
+ $ sex : Factor w/ 2 levels "M","F":
+ - attr(*, "time.scales")= chr "Per" "Age" "DMdur" "t.Ins"
+ - attr(*, "time.since")= chr "" "" "" "Ins"
+ - attr(*, "breaks")=List of 4
+ ..$ Per : NULL
+ ..$ Age : NULL
+ ..$ DMdur: num 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 ...
+ ..$ t.Ins: NULL
+\end{Soutput}
+\begin{Sinput}
+> ini <- subset(Si,select=1:9)[NULL,]
+> str( ini )
+\end{Sinput}
+\begin{Soutput}
+Classes ‘Lexis’ and 'data.frame': 0 obs.
of 9 variables:
+ $ lex.id : int
+ $ Per : num
+ $ Age : num
+ $ DMdur : num
+ $ t.Ins : num
+ $ lex.dur: num
+ $ lex.Cst: Factor w/ 4 levels "DM","Ins","Dead",..:
+ $ lex.Xst: Factor w/ 4 levels "DM","Ins","Dead",..:
+ $ sex : Factor w/ 2 levels "M","F":
+ - attr(*, "time.scales")= chr "Per" "Age" "DMdur" "t.Ins"
+ - attr(*, "time.since")= chr "" "" "" "Ins"
+ - attr(*, "breaks")=List of 4
+ ..$ Per : NULL
+ ..$ Age : NULL
+ ..$ DMdur: num 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 ...
+ ..$ t.Ins: NULL
+\end{Soutput}
+\end{Schunk}
+We now have an empty \texttt{Lexis} object with attributes reflecting
+the timescales in the multistate model we want to simulate, so we must
+now enter some data to represent the persons whose follow-up we want
+to simulate through the model; we set up an initial dataset with one
+man and one woman:
+\begin{Schunk}
+\begin{Sinput}
+> ini[1:2,"lex.id"] <- 1:2
+> ini[1:2,"lex.Cst"] <- "DM"
+> ini[1:2,"Per"] <- 1995
+> ini[1:2,"Age"] <- 60
+> ini[1:2,"DMdur"] <- 5
+> ini[1:2,"sex"] <- c("M","F")
+> ini
+\end{Sinput}
+\begin{Soutput}
+ lex.id Per Age DMdur t.Ins lex.dur lex.Cst lex.Xst sex
+1 1 1995 60 5 NA NA DM M
+2 2 1995 60 5 NA NA DM F
+\end{Soutput}
+\end{Schunk}
+
+\section{Simulation of the follow-up}
+
+Now we simulate life-courses of 5000 copies of each of these persons
+using the estimated model. The \texttt{t.range} argument gives the
+time range over which the integrated intensities (cumulative rates)
+are evaluated and where linear interpolation is used when simulating
+transition times. Note that this must be given in the same units as
+\texttt{lex.dur} in the \texttt{Lexis} object used for fitting the
+models for the transitions. It is not a parameter that can be easily
+determined from the \texttt{Tr} object, hence it must be supplied by
+the user.
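What `t.range` is used for can be illustrated with a base-R sketch of the inversion step: the cumulative rate is tabulated on a grid spanning `t.range`, and a transition time solving $\Lambda(t)=-\log(u)$ is found by linear interpolation. The rate function $\lambda(t)=0.1t$ below is made up for the illustration:

```r
## Sketch of the inversion step simLexis relies on: tabulate the
## cumulative rate on the t.range grid and invert it by linear
## interpolation of Lambda(t) = -log(u).
set.seed(1)
tt  <- seq(0, 12, 0.1)                     # grid spanning t.range
la  <- 0.1 * tt                            # made-up rate lambda(t) = 0.1*t
Lam <- cumsum(c(0, diff(tt)) * la)         # cumulative rate, approx 0.05*t^2
u   <- runif(1)
t.sim <- approx(Lam, tt, xout = -log(u))$y # simulated transition time
```

If `-log(u)` exceeds the largest tabulated `Lam`, no transition occurs within `t.range`; this is why the range must cover the follow-up one intends to simulate.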
+\begin{Schunk}
+\begin{Sinput}
+> set.seed( 52381764 )
+> Nsim <- 5000
+> system.time( simL <- simLexis( Tr,
++ ini,
++ t.range = 12,
++ N = Nsim ) )
+\end{Sinput}
+\begin{Soutput}
+ user system elapsed
+ 19.076 2.634 20.723
+\end{Soutput}
+\end{Schunk}
+The result is a \texttt{Lexis} object --- a data frame representing
+the simulated follow-up of 10000 persons (5000
+identical men and 5000 identical women) according to the rates
+we estimated from the original dataset.
+\begin{Schunk}
+\begin{Sinput}
+> summary( simL, by="sex" )
+\end{Sinput}
+\begin{Soutput}
+$M
+
+Transitions:
+ To
+From DM Ins Dead Dead(Ins) Records: Events: Risk time: Persons:
+ DM 1431 2059 1510 0 5000 3569 36693.61 5000
+ Ins 0 1410 0 649 2059 649 10931.47 2059
+ Sum 1431 3469 1510 649 7059 4218 47625.08 5000
+
+$F
+
+Transitions:
+ To
+From DM Ins Dead Dead(Ins) Records: Events: Risk time: Persons:
+ DM 2171 1697 1132 0 5000 2829 42283.74 5000
+ Ins 0 1360 0 337 1697 337 9541.50 1697
+ Sum 2171 3057 1132 337 6697 3166 51825.24 5000
+\end{Soutput}
+\end{Schunk}
+
+\subsection{Using other models for simulation}
+
+\subsubsection{Proportional hazards Poisson model}
+
+We fitted a proportional mortality model \texttt{All.Dead} (which
+fitted worse than the two separate models); this is a model for
+\emph{both} the transition from ``DM'' to ``Dead'' \emph{and} from
+``Ins'' to ``Dead(Ins)'', assuming that they are proportional. But it
+can easily be used in the simulation set-up, because the state is
+embedded in the model via the term \texttt{lex.Cst}, which is updated
+during the simulation.
+ +Simulation using this instead just requires that we supply a different +transition object: +\begin{Schunk} +\begin{Sinput} +> Tr.p <- list( "DM" = list( "Ins" = DM.Ins, ++ "Dead" = All.Dead ), ++ "Ins" = list( "Dead(Ins)" = All.Dead ) ) +> system.time( simP <- simLexis( Tr.p, ++ ini, ++ t.range = 12, ++ N = Nsim ) ) +\end{Sinput} +\begin{Soutput} + user system elapsed + 18.898 3.572 21.412 +\end{Soutput} +\begin{Sinput} +> summary( simP, by="sex" ) +\end{Sinput} +\begin{Soutput} +$M + +Transitions: + To +From DM Ins Dead Dead(Ins) Records: Events: Risk time: Persons: + DM 1637 2076 1287 0 5000 3363 37695.03 5000 + Ins 0 1209 0 867 2076 867 10129.68 2076 + Sum 1637 3285 1287 867 7076 4230 47824.71 5000 + +$F + +Transitions: + To +From DM Ins Dead Dead(Ins) Records: Events: Risk time: Persons: + DM 2274 1733 993 0 5000 2726 42968.05 5000 + Ins 0 1201 0 532 1733 532 8624.01 1733 + Sum 2274 2934 993 532 6733 3258 51592.05 5000 +\end{Soutput} +\end{Schunk} + +\subsubsection{Proportional hazards Cox model} + +A third possibility would be to replace the two-time scale +proportional mortality model by a one-time-scale Cox-model, using +diabetes duration as time scale, and age at diagnosis of DM as (fixed) +covariate: +\begin{Schunk} +\begin{Sinput} +> library( survival ) +> Cox.Dead <- coxph( Surv( DMdur, DMdur+lex.dur, ++ lex.Xst %in% c("Dead(Ins)","Dead")) ~ ++ Ns( Age-DMdur, knots=ad.kn ) + ++ I(lex.Cst=="Ins") + ++ I(Per-2000) + sex, ++ data = Si ) +> round( ci.exp( Cox.Dead ), 3 ) +\end{Sinput} +\begin{Soutput} + exp(Est.) 2.5% 97.5% +Ns(Age - DMdur, knots = ad.kn)1 4.172 3.535 4.924 +Ns(Age - DMdur, knots = ad.kn)2 4.503 3.825 5.301 +Ns(Age - DMdur, knots = ad.kn)3 16.077 14.087 18.348 +Ns(Age - DMdur, knots = ad.kn)4 7.479 6.500 8.605 +I(lex.Cst == "Ins")TRUE 2.171 1.949 2.419 +I(Per - 2000) 0.965 0.954 0.977 +sexF 0.667 0.616 0.723 +\end{Soutput} +\begin{Sinput} +> round( ci.exp( All.Dead ), 3 ) +\end{Sinput} +\begin{Soutput} + exp(Est.) 
2.5% 97.5%
+(Intercept) 0.049 0.043 0.056
+Ns(Age, knots = ad.kn)1 4.120 3.479 4.879
+Ns(Age, knots = ad.kn)2 4.652 4.054 5.338
+Ns(Age, knots = ad.kn)3 15.460 13.575 17.608
+Ns(Age, knots = ad.kn)4 7.529 6.711 8.447
+Ns(DMdur, knots = dd.kn)1 0.520 0.429 0.629
+Ns(DMdur, knots = dd.kn)2 0.707 0.622 0.803
+Ns(DMdur, knots = dd.kn)3 0.319 0.238 0.428
+Ns(DMdur, knots = dd.kn)4 0.829 0.742 0.926
+lex.CstIns 2.168 1.946 2.414
+I(Per - 2000) 0.965 0.954 0.977
+sexF 0.665 0.614 0.720
+\end{Soutput}
+\end{Schunk}
+Note that in order for this model to be usable for simulation, it is
+necessary that we use the components of the \texttt{Lexis} object to
+specify the survival. Each record in the data frame \texttt{Si}
+represents follow-up from \texttt{DMdur} to \texttt{DMdur+lex.dur}, so
+the model is a Cox model with diabetes duration as underlying
+timescale and age at diagnosis, \texttt{Age-DMdur}, as covariate.
+
+Also note that we used \texttt{I(lex.Cst=="Ins")} instead of just
+\texttt{lex.Cst}, because \texttt{coxph} assigns design matrix columns
+to all levels of \texttt{lex.Cst}, including those not present in the
+data, which would give \texttt{NA}s among the parameter estimates and
+\texttt{NA}s as mortality outcomes.
+
+We see that the effects of insulin and the other covariates are pretty
+much the same as in the two-time-scale model.
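The point about unused factor levels can be seen in miniature with base-R \texttt{lm} on hypothetical data (\texttt{coxph} behaves the same way):

```r
## Why I(lex.Cst=="Ins") rather than lex.Cst: factor levels that never
## occur in the data yield all-zero design columns, hence NA coefficients.
set.seed(1)
d <- data.frame(y = rnorm(10),
                g = factor(rep(c("DM", "Ins"), 5),
                           levels = c("DM", "Ins", "Dead", "Dead(Ins)")))
fit1 <- lm(y ~ g, data = d)            # gDead and gDead(Ins) come out NA
fit2 <- lm(y ~ I(g == "Ins"), data = d)  # no NAs
```

The indicator version only ever creates one column, so the fit is full rank regardless of how many levels the factor carries.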
We can simulate from this
+model too; there are no restrictions on what type of model is used
+for the different transitions:
+\begin{Schunk}
+\begin{Sinput}
+> Tr.c <- list( "DM" = list( "Ins" = Tr$DM$Ins,
++ "Dead" = Cox.Dead ),
++ "Ins" = list( "Dead(Ins)" = Cox.Dead ) )
+> system.time( simC <- simLexis( Tr.c,
++ ini,
++ t.range = 12,
++ N = Nsim ) )
+\end{Sinput}
+\begin{Soutput}
+ user system elapsed
+ 20.321 3.608 22.891
+\end{Soutput}
+\begin{Sinput}
+> summary( simC, by="sex" )
+\end{Sinput}
+\begin{Soutput}
+$M
+
+Transitions:
+ To
+From DM Ins Dead Dead(Ins) Records: Events: Risk time: Persons:
+ DM 1671 2121 1208 0 5000 3329 37403.75 5000
+ Ins 0 1419 0 702 2121 702 10384.18 2121
+ Sum 1671 3540 1208 702 7121 4031 47787.93 5000
+
+$F
+
+Transitions:
+ To
+From DM Ins Dead Dead(Ins) Records: Events: Risk time: Persons:
+ DM 2272 1807 921 0 5000 2728 42185.28 5000
+ Ins 0 1418 0 389 1807 389 9439.83 1807
+ Sum 2272 3225 921 389 6807 3117 51625.11 5000
+\end{Soutput}
+\end{Schunk}
+
+\section{Reporting the simulation results}
+
+We can now tabulate the number of persons in each state at a
+predefined set of times on a given time scale. Note that in order for
+this to be sensible, the \texttt{from} argument would normally be
+equal to the starting time for the simulated individuals.
+\begin{Schunk}
+\begin{Sinput}
+> system.time(
++ nSt <- nState( subset(simL,sex=="M"),
++ at=seq(0,11,0.2), from=1995, time.scale="Per" ) )
+\end{Sinput}
+\begin{Soutput}
+ user system elapsed
+ 0.747 0.000 0.746
+\end{Soutput}
+\begin{Sinput}
+> nSt[1:10,]
+\end{Sinput}
+\begin{Soutput}
+ State
+when DM Ins Dead Dead(Ins)
+ 1995 5000 0 0 0
+ 1995.2 4933 38 29 0
+ 1995.4 4849 87 62 2
+ 1995.6 4777 128 89 6
+ 1995.8 4709 169 116 6
+ 1996 4632 212 148 8
+ 1996.2 4549 258 183 10
+ 1996.4 4480 292 216 12
+ 1996.6 4411 331 240 18
+ 1996.8 4340 360 277 23
+\end{Soutput}
+\end{Schunk}
+We see that as time goes by, the 5000 men slowly move away from the
+starting state (\texttt{DM}).
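The conversion from such state counts to stacked probabilities amounts to cumulating row-wise proportions across the states in a chosen order; a base-R sketch with a hypothetical 3-state count table:

```r
## What the conversion to stacked state probabilities does to a count
## table: row-wise proportions, cumulated across states, so each row
## ends at 1 and the columns can be plotted as stacked bands.
cnt <- rbind("1995.0" = c(DM = 100, Ins =  0, Dead = 0),
             "1995.5" = c(DM =  80, Ins = 15, Dead = 5))
pSt <- t(apply(cnt / rowSums(cnt), 1, cumsum))
pSt  # columns are DM, DM+Ins, DM+Ins+Dead; the last is 1 in every row
```

In the vignette this bookkeeping is done by `pState`, with the `perm` argument controlling the order in which the states are cumulated.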
+
+Based on this table (\texttt{nSt} is a table) we can now compute the
+fractions in each state, or, more relevantly, the cumulative fractions
+across the states in some specified order, so that a plot of the
+stacked probabilities can be made, using either the default rather
+colorful layout, or a more minimalist version (both in figure \ref{fig:pstate0}):
+\begin{Schunk}
+\begin{Sinput}
+> pM <- pState( nSt, perm=c(1,2,4,3) )
+> head( pM )
+\end{Sinput}
+\begin{Soutput}
+ State
+when DM Ins Dead(Ins) Dead
+ 1995 1.0000 1.0000 1.0000 1
+ 1995.2 0.9866 0.9942 0.9942 1
+ 1995.4 0.9698 0.9872 0.9876 1
+ 1995.6 0.9554 0.9810 0.9822 1
+ 1995.8 0.9418 0.9756 0.9768 1
+ 1996 0.9264 0.9688 0.9704 1
+\end{Soutput}
+\begin{Sinput}
+> par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6 )
+> plot( pM )
+> plot( pM, border="black", col="transparent", lwd=3 )
+> text( rep(as.numeric(rownames(pM)[nrow(pM)-1]),ncol(pM)),
++ pM[nrow(pM),]-diff(c(0,pM[nrow(pM),]))/5,
++ colnames( pM ), adj=1 )
+> box( col="white", lwd=3 )
+> box()
+\end{Sinput}
+\end{Schunk}
+\insfig{pstate0}{1.0}{Default layout of the \textrm{\tt plot.pState}
+ graph (left), and a version with the state probabilities as lines and
+ annotation of states.}
+
+A more useful set-up of the graph would include more thorough
+annotation and a sensible choice of colors, as seen in figure \ref{fig:pstatex}:
+\begin{Schunk}
+\begin{Sinput}
+> clr <- c("limegreen","orange")
+> # expand with a lighter version of the two chosen colors
+> clx <- c( clr, rgb( t( col2rgb( clr[2:1] )*2 + rep(255,3) ) / 3, max=255 ) )
+> par( mfrow=c(1,2), las=1, mar=c(3,3,4,2), mgp=c(3,1,0)/1.6 )
+> # Men
+> plot( pM, col=clx )
+> lines( as.numeric(rownames(pM)), pM[,2], lwd=3 )
+> mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) )
+> mtext( "Survival curve", side=3, line=1.5, adj=0 )
+> mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[2] )
+> mtext( "DM, no insulin", side=3, 
line=0.5, adj=0, col=clr[1] ) +> axis( side=4 ) +> axis( side=4, at=1:19/20, labels=FALSE ) +> axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) +> # Women +> pF <- pState( nState( subset(simL,sex=="F"), ++ at=seq(0,11,0.2), ++ from=1995, ++ time.scale="Per" ), ++ perm=c(1,2,4,3) ) +> plot( pF, col=clx ) +> lines( as.numeric(rownames(pF)), pF[,2], lwd=3 ) +> mtext( "60 year old female, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) +> mtext( "Survival curve", side=3, line=1.5, adj=0 ) +> mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] ) +> mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] ) +> axis( side=4 ) +> axis( side=4, at=1:19/20, labels=FALSE ) +> axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 ) +\end{Sinput} +\end{Schunk} +\insfig{pstatex}{1.0}{\textrm{\tt plot.pState} graphs where persons + ever on insulin are given in orange and persons never on insulin in + green, and the overall survival (dead over the line) as a black line.} + +If we instead wanted to show the results on the age-scale, we would +use age as timescale when constructing the probabilities; otherwise the +code is pretty much the same as before (Figure \ref{fig:pstatey}): +\begin{Schunk} +\begin{Sinput} +> par( mfrow=c(1,2), las=1, mar=c(3,3,4,2), mgp=c(3,1,0)/1.6 ) +> # Men +> pM <- pState( nState( subset(simL,sex=="M"), ++ at=seq(0,11,0.2), ++ from=60, ++ time.scale="Age" ), ++ perm=c(1,2,4,3) ) +> plot( pM, col=clx, xlab="Age" ) +> lines( as.numeric(rownames(pM)), pM[,2], lwd=3 ) +> mtext( "60 year old male, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) ) +> mtext( "Survival curve", side=3, line=1.5, adj=0 ) +> mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[2] ) +> mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[1] ) +> axis( side=4 ) +> axis( side=4, at=1:19/20, labels=FALSE ) +> axis( side=4, at=1:19/20, labels=FALSE, tcl=-0.4 ) +> axis( side=4, at=1:99/100, labels=FALSE, 
tcl=-0.3 )
+> # Women
+> pF <- pState( nState( subset(simL,sex=="F"),
++ at=seq(0,11,0.2),
++ from=60,
++ time.scale="Age" ),
++ perm=c(1,2,4,3) )
+> plot( pF, col=clx, xlab="Age" )
+> lines( as.numeric(rownames(pF)), pF[,2], lwd=3 )
+> mtext( "60 year old female, diagnosed 1990, aged 55", side=3, line=2.5, adj=0, col=gray(0.6) )
+> mtext( "Survival curve", side=3, line=1.5, adj=0 )
+> mtext( "DM, no insulin DM, Insulin", side=3, line=0.5, adj=0, col=clr[1] )
+> mtext( "DM, no insulin", side=3, line=0.5, adj=0, col=clr[2] )
+> axis( side=4 )
+> axis( side=4, at=1:9/10, labels=FALSE )
+> axis( side=4, at=1:19/20, labels=FALSE, tcl=-0.4 )
+> axis( side=4, at=1:99/100, labels=FALSE, tcl=-0.3 )
+\end{Sinput}
+\end{Schunk}
+Note the several statements with \texttt{axis(side=4, ...)}; they are
+necessary to get the fine tick-marks on the right-hand side of the
+plots that you will need in order to read off the probabilities at
+2006 (or at age 71).
+
+\insfig{pstatey}{1.0}{\textrm{\tt plot.pState} graphs where persons
+ ever on insulin are given in orange and persons never on insulin in
+ green, and the overall survival (dead over the line) as a black line.}
+
+\subsection{Comparing predictions from different models}
+
+We have actually fitted different models for the transitions, and we
+have simulated Lexis objects from all three approaches, so we can plot
+these predictions on top of each other:
+\begin{Schunk}
+\begin{Sinput}
+> PrM <- pState( nState( subset(simP,sex=="M"),
++ at=seq(0,11,0.2),
++ from=60,
++ time.scale="Age" ),
++ perm=c(1,2,4,3) )
+> PrF <- pState( nState( subset(simP,sex=="F"),
++ at=seq(0,11,0.2),
++ from=60,
++ time.scale="Age" ),
++ perm=c(1,2,4,3) )
+> CoxM <- pState( nState( subset(simC,sex=="M"),
++ at=seq(0,11,0.2),
++ from=60,
++ time.scale="Age" ),
++ perm=c(1,2,4,3) )
+> CoxF <- pState( nState( subset(simC,sex=="F"),
++ at=seq(0,11,0.2),
++ from=60,
++ time.scale="Age" ),
++ perm=c(1,2,4,3) )
+> par( mfrow=c(1,2), mar=c(3,3,1,1), 
mgp=c(3,1,0)/1.6 )
+> plot( pM, border="black", col="transparent", lwd=3 )
+> lines( PrM, border="blue" , col="transparent", lwd=3 )
+> lines( CoxM, border="red" , col="transparent", lwd=3 )
+> text( 60.5, 0.05, "M" )
+> box( lwd=3 )
+> plot( pF, border="black", col="transparent", lwd=3 )
+> lines( PrF, border="blue" , col="transparent", lwd=3 )
+> lines( CoxF, border="red" , col="transparent", lwd=3 )
+> text( 60.5, 0.05, "F" )
+> box( lwd=3 )
+\end{Sinput}
+\end{Schunk}
+\insfig{comp-0}{1.0}{Comparison of the simulated state occupancy
+ probabilities using separate Poisson models for the mortality rates
+ with and without insulin (black) and using proportional hazards
+ Poisson models (blue) and Cox-models with diabetes duration as
+ timescale and age at diabetes diagnosis as covariate (red).}
+
+From figure \ref{fig:comp-0} it is clear that the two proportional
+hazards models (blue and red curves) produce pretty much the same
+estimates of the state occupancy probabilities over time, but also
+that, relative to the model with separately estimated transition
+intensities, they overestimate the probability of being alive without
+insulin and underestimate the probability of being dead without
+insulin. However, both the overall survival and the fraction of
+persons on insulin agree quite well with the more elaborate
+model. Thus the proportional hazards models overestimate the mortality
+of the insulin-treated diabetes patients relative to the
+non-insulin-treated.
+
+Interestingly, we also see a bump in the estimated probabilities from
+the Cox-based model, but this is entirely an artifact of the
+estimation method for the baseline hazard of the Cox model, which lets
+the (cumulative) hazard have large jumps at event times where the risk
+set is small. Here, too, the assumption of continuous underlying
+hazards leads to more credible estimates.
+ +\chapter{Simulation of transitions in multistate models} + +\section{Theory} + +Suppose that the rate functions for the transitions out of the current +state to, say, 3 different states are $\lambda_1$, $\lambda_2$ and +$\lambda_3$, and the corresponding cumulative rates are $\Lambda_1$, +$\Lambda_2$ and $\Lambda_3$, and we want to simulate an exit time and +an exit state (that is, either 1, 2 or 3). This can be done in two +slightly different ways: +\begin{enumerate} +\item First time, then state: + \begin{enumerate} + \item Compute the survival function, $S(t) = + \exp\bigl(-\Lambda_1(t)-\Lambda_2(t)-\Lambda_3(t)\bigr)$ + \item Simulate a random U(0,1) variate, $u$, say. + \item The simulated exit time is then the solution $t_u$ to + the equation $S(t_u) = u \quad \Leftrightarrow \quad \sum_j\Lambda_j(t_u) = -\log(u)$. + \item A simulated transition at $t_u$ is then found by simulating a + random draw from the multinomial distribution with probabilities + $p_i=\lambda_i(t_u) / \sum_j\lambda_j(t_u)$. + \end{enumerate} +\item Separate cumulative incidences: + \begin{enumerate} + \item Simulate 3 independent U(0,1) random variates $u_1$, $u_2$ and $u_3$. + \item Solve the equations $\Lambda_i(t_i)=-\log(u_i), i=1,2,3$ and get $(t_1,t_2,t_3)$. + \item The simulated survival time is then $\min(t_1,t_2,t_3)$, and + the simulated transition is to the state corresponding to this, + that is $k \in \{1,2,3\}$, where $t_k=\min(t_1,t_2,t_3)$. + \end{enumerate} +\end{enumerate} +The intuitive argument is that with three possible transitions there are +3 independent processes running, but only the first transition is observed. +The latter approach is used in the implementation in \texttt{simLexis}. 
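To make the second recipe concrete, here is a minimal sketch in R for the special case of constant rates, where $\Lambda_i(t)=\lambda_i t$, so the equations $\Lambda_i(t_i)=-\log(u_i)$ solve in closed form to $t_i=-\log(u_i)/\lambda_i$. The function name \texttt{sim.exit} and the rates are made up for the illustration and are not part of \texttt{Epi}:

```r
## Approach 2 with constant rates: Lambda_i(t) = lambda_i * t, so
## solving Lambda_i(t_i) = -log(u_i) gives t_i = -log(u_i)/lambda_i.
## 'sim.exit' and the rates below are made-up names for this sketch.
sim.exit <- function( rates )
{
t <- -log( runif(length(rates)) ) / rates  # one candidate time per target state
c( time  = min(t),          # the simulated exit time
   state = which.min(t) )   # the simulated exit state
}
set.seed(1952)
t( replicate( 5, sim.exit( c(0.2,0.3,0.5) ) ) )
```

With constant rates the exit state is independent of the exit time and has probabilities $\lambda_i/\sum_j\lambda_j$, so over many simulations state 3 should turn up in about half of the cases.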
+ +The formal argument for the equality of the two approaches goes as +follows: +\begin{enumerate} +\item Equality of the transition times: + \begin{enumerate} + \item In the first approach we simulate from a distribution with + cumulative rate $\Lambda_1(t)+\Lambda_2(t)+\Lambda_3(t)$, hence + from a distribution with survival function: + \begin{align*} + S(t) & = \exp\bigl(-(\Lambda_1(t)+\Lambda_2(t)+\Lambda_3(t))\bigr) \\ + & = \exp\bigl(-\Lambda_1(t)\bigr)\times + \exp\bigl(-\Lambda_2(t)\bigr)\times + \exp\bigl(-\Lambda_3(t)\bigr) + \end{align*} + \item In the second approach we choose the smallest of three + independent survival times, with survival functions + $\exp(-\Lambda_i), i=1,2,3$. Now, the survival function for the + minimum of three independent survival times is: + \begin{align*} + S_\text{min}(t) & = \pmat{\min(t_1,t_2,t_3)>t} \\ + & = \pmat{t_1>t} \times + \pmat{t_2>t} \times + \pmat{t_3>t} \\ + & = \exp\bigl(-\Lambda_1(t)\bigr)\times + \exp\bigl(-\Lambda_2(t)\bigr)\times + \exp\bigl(-\Lambda_3(t)\bigr) + \end{align*} + which is the same survival function as derived above. + \end{enumerate} +\item Type of transition: + \begin{enumerate} + \item In the first approach the probability of a transition to state + $i$, conditional on the transition time being $t$, is as known + from standard probability theory: + $\lambda_i(t)/\bigl(\lambda_1(t)+\lambda_2(t)+\lambda_3(t)\bigr)$. + \item In the second approach we choose the transition corresponding + to the smallest of the transition times. So when we condition + on the event that a transition takes place at time $t$, we have to + show that the conditional probability that the smallest of the + three simulated transition times was actually the $i$th, is as above. 
+ + But conditional on \emph{survival} till $t$, the probabilities + that events of type $1,2,3$ take place in the interval $(t,t+\dif + t)$ are $\lambda_1(t)\dif t$, $\lambda_2(t)\dif t$ and + $\lambda_3(t)\dif t$, respectively (assuming that the probability + of more than one event in the interval of length $\dif t$ is + 0). Hence the conditional probabilities \emph{given a transition + time} in $(t,t+\dif t)$ are: + \[ + \frac{\lambda_i(t)\dif t}{\lambda_1(t)\dif t+\lambda_2(t)\dif t+\lambda_3(t)\dif t}= + \frac{\lambda_i(t)}{\lambda_1(t)+\lambda_2(t)+\lambda_3(t)} + \] + --- exactly as above. + \end{enumerate} +\end{enumerate} + +\section{Components of \texttt{simLexis}} + +This section explains the actual code of +\texttt{simLexis}, as it appears in the current version of \texttt{Epi}. + +% The following code sources the originally formatted code in +% simLexis.R in order to make it printable with all the formatting and +% comments in it. However the sourcing does not work when compiling +% the vignettes in R CMD build. So when checking the code we comment +% this out by putting eval=FALSE, and moreover, because it is only +% simLexis that is exported, we also need to bring the functions simX, +% sim1, lint, get.next and chop.lex into the workspace. So depending +% on whether we actually construct the pdf for the inst/doc folder or +% test (upload) the package, one of the following two chunks is run +% with eval=FALSE and the other with eval=TRUE + +% When compiling the published vignette +% When checking the package +The function \texttt{simLexis} takes a \texttt{Lexis} object as +input. This defines the initial state(s) and entry times for a +number of persons. Since the purpose is to simulate a history through +the estimated multistate model, the input values of the outcome +variables \texttt{lex.Xst} and \texttt{lex.dur} are ignored --- the +aim is to simulate values for them. 
+ +Note, however, that the attribute \texttt{time.since} must be present in the +object. This is used for initializing timescales defined as time since +entry into a particular state. It is a character vector of the same +length as the \texttt{time.scales} attribute, with value equal to a +state name if the corresponding time scale is defined as time since +entry into that state. In this example the 4th timescale is time since +entry into the ``Ins'' state, and hence: +\begin{Schunk} +\begin{Sinput} +> cbind( ++ attr( ini, "time.scale" ), ++ attr( ini, "time.since" ) ) +\end{Sinput} +\begin{Soutput} + [,1] [,2] +[1,] "Per" "" +[2,] "Age" "" +[3,] "DMdur" "" +[4,] "t.Ins" "Ins" +\end{Soutput} +\end{Schunk} +\texttt{Lexis} objects will have this attribute set for time scales created +using \texttt{cutLexis}. + +The other necessary argument is a transition object \texttt{Tr}, which +is a list of lists. The elements of the lists should be \texttt{glm} +objects derived by fitting Poisson models to a \texttt{Lexis} object +representing follow-up data in a multistate model. It is assumed (but +not checked) that timescales enter the model via the timescales of +the \texttt{Lexis} object. Formally, there are no assumptions about +how \texttt{lex.dur} enters the model. + +Optional arguments are \texttt{t.range}, \texttt{n.int} or +\texttt{time.pts}, specifying the times after entry at which the +cumulative rates will be computed (the maximum of which will be taken +as the censoring time), and \texttt{N}, a scalar or numeric vector +giving the number of persons with a given initial state that each +record of the \texttt{init} object should represent. + +The central part of the function uses a \texttt{do.call} / +\texttt{lapply} / \texttt{split} construction to do simulations for +the different initial states; this is the construction in the middle that +calls \texttt{simX}. \texttt{simLexis} also calls \texttt{get.next}, +which is further detailed below. 
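As an illustration of the structure of the transition object described above, the following sketch builds a \texttt{Tr} object of the right shape for a model with transient states ``DM'' and ``Ins'' and absorbing state ``Dead''. The data frame and the two Poisson models are dummies invented for the example, not part of the vignette's analysis:

```r
## Dummy follow-up data: an age, a person-time contribution (lex.dur)
## and an event indicator per record -- invented for the illustration.
set.seed(1)
d <- data.frame( Age     = runif( 200, 50, 80 ),
                 lex.dur = rexp( 200, 2 ),
                 D       = rbinom( 200, 1, 0.1 ) )
## Two dummy Poisson rate models with log person-time as offset
m.ins  <- glm( D ~ Age + offset(log(lex.dur)), family=poisson, data=d )
m.dead <- glm( D ~ Age + offset(log(lex.dur)), family=poisson, data=d )
## Tr: a named list (one element per transient state) of named lists
## (one element per state reachable from it) of fitted model objects
Tr <- list( "DM"  = list( "Ins"  = m.ins,
                          "Dead" = m.dead ),
            "Ins" = list( "Dead" = m.dead ) )
sapply( Tr, names )
```

The names of the outer list are the transient states, and the names of each inner list are the states reachable from that state; \texttt{simX} uses \texttt{lex.Cst} to pick the relevant inner list.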
+\begin{Schunk} +\begin{Sinput} +> simLexis +\end{Sinput} +\begin{Soutput} +function (Tr, init, N = 1, lex.id, t.range = 20, n.int = 101, + time.pts = seq(0, t.range, length.out = n.int)) +{ + if (time.pts[1] != 0) + stop("First time point must be 0, time.pts[1:3]= ", time.pts[1:3]) + if (!missing(N)) { + if (length(N) == 1) + init <- init[rep(1:nrow(init), each = N), ] + else init <- init[rep(1:nrow(init), N), ] + } + if (!missing(lex.id)) { + if (length(lex.id) == nrow(init)) + init$lex.id <- lex.id + else init$lex.id <- 1:nrow(init) + } + else init$lex.id <- 1:nrow(init) + if (is.null(tS <- attr(init, "time.scales"))) + stop("No time.scales attribute for init") + if (is.null(tF <- attr(init, "time.since"))) { + attr(init, "time.since") <- tF <- rep("", tS) + warning("'time.since' attribute set to blanks") + } + np <- length(time.pts) + tr.st <- names(Tr) + sf <- NULL + nxt <- init[init$lex.Cst %in% tr.st, ] + if (nrow(nxt) < nrow(init)) { + tt <- table(init$lex.Cst) + tt <- tt[tt > 0] + nt <- length(tt) + warning("\nSome initiators start in a absorbing state\n", + "Initiator states represented are: ", paste(rbind(names(tt), + rep(":", nt), paste(tt), rep(" ", nt)), collapse = ""), + "\n", "Transient states are: ", paste(names(Tr), + coll = " ")) + if (nrow(nxt) == 0) + stop("\nNo initiators in transient states!") + } + while (nrow(nxt) > 0) { + nx <- do.call(rbind.data.frame, lapply(split(nxt, nxt$lex.Cst), + simX, Tr, time.pts, tS)) + sf <- rbind.data.frame(sf, nx) + nxt <- get.next(nx, tr.st, tS, tF) + } + sf$lex.Xst <- factor(sf$lex.Xst, levels = levels(sf$lex.Cst)) + sf$lex.Xst[is.na(sf$lex.Xst)] <- sf$lex.Cst[is.na(sf$lex.Xst)] + nord <- match(c("lex.id", tS, "lex.dur", "lex.Cst", "lex.Xst"), + names(sf)) + noth <- setdiff(1:ncol(sf), nord) + sf <- sf[order(sf$lex.id, sf[, tS[1]]), c(nord, noth)] + rownames(sf) <- NULL + attr(sf, "time.scales") <- tS + attr(sf, "time.since") <- tF + chop.lex(sf, tS, max(time.pts)) +} + + +\end{Soutput} +\end{Schunk} + 
+\subsection{\texttt{simX}} + +\texttt{simX} is called by \texttt{simLexis} and simulates +transition-times and -types for a set of patients assumed to be in the +same state. It is called from \texttt{simLexis} with a data frame as +argument, uses the state in \texttt{lex.Cst} to select the relevant +component of \texttt{Tr} and compute predicted cumulative intensities +for all states reachable from this state. + +Note that it is here the switch between \texttt{glm}, \texttt{coxph} +and objects of class \texttt{function} is made. + +The dataset on which this prediction is done has +\texttt{length(time.pts)} rows per person. +\begin{Schunk} +\begin{Sinput} +> simX +\end{Sinput} +\begin{Soutput} +function (nd, Tr, time.pts, tS) +{ + np <- length(time.pts) + nr <- nrow(nd) + if (nr == 0) + return(NULL) + cst <- as.character(unique(nd$lex.Cst)) + if (length(cst) > 1) + stop("More than one lex.Cst present:\n", cst, "\n") + prfrm <- nd[rep(1:nr, each = np), ] + prfrm[, tS] <- prfrm[, tS] + rep(time.pts, nr) + prfrm$lex.dur <- il <- min(diff(time.pts)) + prfrp <- prfrm + prfrp[, tS] <- prfrp[, tS] + il/2 + rt <- data.frame(lex.id = prfrm$lex.id) + for (i in 1:length(Tr[[cst]])) { + if (inherits(Tr[[cst]][[i]], "glm")) + rt <- cbind(rt, predict(Tr[[cst]][[i]], type = "response", + newdata = prfrp)) + else if (inherits(Tr[[cst]][[i]], "coxph")) + rt <- cbind(rt, predict(Tr[[cst]][[i]], type = "expected", + newdata = prfrm)) + else if (is.function(Tr[[cst]][[i]])) + rt <- cbind(rt, Tr[[cst]][[i]](prfrm)) + else stop("Invalid object supplied as transition, elements of the list must be either:\n", + "- a glm(poisson) object fitted to a Lexis object\n", + "- a coxph object fitted to a Lexis object\n", "- a function that takes a Lexis object as argument and returns\n", + " average rates for each record in the same units as lex.dur.") + } + names(rt)[-1] <- names(Tr[[cst]]) + xx <- match(c("lex.dur", "lex.Xst"), names(nd)) + if (any(!is.na(xx))) + nd <- nd[, -xx[!is.na(xx)]] + 
merge(nd, do.call(rbind, lapply(split(rt, rt$lex.id), sim1, + time.pts)), by = "lex.id") +} + + +\end{Soutput} +\end{Schunk} +As we see, \texttt{simX} calls \texttt{sim1} which simulates the +transition for one person. + +\subsection{\texttt{sim1}} + +The predicted cumulative intensities are fed, person by person, to +\texttt{sim1} --- again via a \texttt{do.call} / \texttt{lapply} / +\texttt{split} construction --- and the resulting time and state is +appended to the \texttt{nd} data frame. This way we have simulated +\emph{one} transition (time and state) for each person: +\begin{Schunk} +\begin{Sinput} +> sim1 +\end{Sinput} +\begin{Soutput} +function (rt, time.pts) +{ + ci <- apply(rbind(0, rt[, -1, drop = FALSE]), 2, cumsum)[1:nrow(rt), + , drop = FALSE] + tt <- uu <- -log(runif(ncol(ci))) + for (i in 1:ncol(ci)) tt[i] <- lint(ci[, i], time.pts, uu[i]) + data.frame(lex.id = rt[1, 1], lex.dur = min(tt, na.rm = TRUE), + lex.Xst = factor(if (min(tt) < max(time.pts)) + colnames(ci)[tt == min(tt)] + else NA)) +} + + +\end{Soutput} +\end{Schunk} +The \texttt{sim1} function uses \texttt{lint} to do linear interpolation. + +\subsection{\texttt{lint}} + +We do not use \texttt{approx} to do the linear interpolation, because +this function does not do the right thing if the cumulative incidences +(\texttt{ci}) are constant across a number of times. 
Therefore we have +a custom-made linear interpolator that does the interpolation, +exploiting the fact that the \texttt{ci} is non-decreasing and +\texttt{tt} is strictly increasing: +\begin{Schunk} +\begin{Sinput} +> lint +\end{Sinput} +\begin{Soutput} +function (ci, tt, u) +{ + if (any(diff(ci) < 0) | any(diff(tt) < 0)) + stop("Non-icreasing arguments") + c.u <- min(c(ci[ci > u], max(ci))) + c.l <- max(c(ci[ci < u], min(ci))) + t.u <- min(c(tt[ci > u], max(tt))) + t.l <- max(c(tt[ci < u], min(tt))) + ifelse(c.u == c.l, t.l, t.l + (u - c.l)/(c.u - c.l) * (t.u - + t.l)) +} + + +\end{Soutput} +\end{Schunk} + +\subsection{\texttt{get.next}} + +We must repeat the simulation operation on those that have a simulated +entry to a transient state, and also make sure that any timescales +defined as time since entry to one of these states are initialized to 0 +before a call to \texttt{simX} is made for these persons. This is +accomplished by the function \texttt{get.next}: +\begin{Schunk} +\begin{Sinput} +> get.next +\end{Sinput} +\begin{Soutput} +function (sf, tr.st, tS, tF) +{ + if (nrow(sf) == 0) + return(sf) + nxt <- sf[sf$lex.Xst %in% tr.st, ] + if (nrow(nxt) == 0) + return(nxt) + nxt[, tS] <- nxt[, tS] + nxt$lex.dur + wh <- tF + for (i in 1:length(wh)) if (wh[i] != "") + nxt[nxt$lex.Xst == wh[i], tS[i]] <- 0 + nxt$lex.Cst <- nxt$lex.Xst + return(nxt) +} + + +\end{Soutput} +\end{Schunk} + +\subsection{\texttt{chop.lex}} + +The operation so far has censored individuals \texttt{max(time.pts)} +after \emph{each} new entry to a transient state. In order to groom +the output data we use \texttt{chop.lex} to censor all persons at the +same designated time after \emph{initial} entry. 
+\begin{Schunk} +\begin{Sinput} +> chop.lex +\end{Sinput} +\begin{Soutput} +function (obj, tS, cens) +{ + zz <- entry(obj, 1, by.id = TRUE) + ww <- merge(obj, data.frame(lex.id = as.numeric(names(zz)), + cens = zz + cens)) + ww <- ww[ww[, tS[1]] < ww$cens, ] + x.dur <- pmin(ww$lex.dur, ww[, "cens"] - ww[, tS[1]]) + ww$lex.Xst[x.dur < ww$lex.dur] <- ww$lex.Cst[x.dur < ww$lex.dur] + ww$lex.dur <- pmin(x.dur, ww$lex.dur) + ww +} + + +\end{Soutput} +\end{Schunk} + +\section{Probabilities from simulated \texttt{Lexis} objects} + +Once we have simulated a Lexis object we will of course want to use it +for estimating probabilities, so basically we will want to enumerate +the number of persons in each state at a pre-specified set of time +points. + +\subsection{\texttt{nState}} + +Since we are dealing with a multistate model with potentially multiple +timescales, it is necessary to define the timescale +(\texttt{time.scale}), the starting point on this timescale +(\texttt{from}) and the points after this at which we compute the number +of occupants in each state (\texttt{at}). 
+\begin{Schunk} +\begin{Sinput} +> nState +\end{Sinput} +\begin{Soutput} +function (obj, at, from, time.scale = 1) +{ + tS <- check.time.scale(obj, time.scale) + TT <- tmat(obj) + absorb <- rownames(TT)[apply(!is.na(TT), 1, sum) == 0] + transient <- setdiff(rownames(TT), absorb) + tab.frm <- obj[rep(1:nrow(obj), each = length(at)), c(tS, + "lex.dur", "lex.Cst", "lex.Xst")] + tab.frm$when <- rep(at, nrow(obj)) + from + tab.tr <- tab.frm[tab.frm[, tS] <= tab.frm$when & tab.frm[, + tS] + tab.frm$lex.dur > tab.frm$when, ] + tab.tr$State <- tab.tr$lex.Cst + tab.ab <- tab.frm[tab.frm[, tS] + tab.frm$lex.dur <= tab.frm$when & + tab.frm$lex.Xst %in% absorb, ] + tab.ab$State <- tab.ab$lex.Xst + with(rbind(tab.ab, tab.tr), table(when, State)) +} + + +\end{Soutput} +\end{Schunk} + +\subsection{\texttt{pState}, \texttt{plot.pState} and \texttt{lines.pState}} + +In order to plot probabilities of state-occupancy it is useful to +compute cumulative probabilities across states in any given order; +this is done by the function \texttt{pState}, which returns a matrix +of class \texttt{pState}: +\begin{Schunk} +\begin{Sinput} +> pState +\end{Sinput} +\begin{Soutput} +function (nSt, perm = 1:ncol(nSt)) +{ + tt <- t(apply(nSt[, perm], 1, cumsum)) + tt <- sweep(tt, 1, tt[, ncol(tt)], "/") + class(tt) <- c("pState", "matrix") + tt +} + +\end{Soutput} +\end{Schunk} +There is also a \texttt{plot} and \texttt{lines} method for the +resulting \texttt{pState} objects: +\begin{Schunk} +\begin{Sinput} +> plot.pState +\end{Sinput} +\begin{Soutput} +function (x, col = rainbow(ncol(x)), border = "transparent", + xlab = "Time", ylim = 0:1, ylab = "Probability", ...) +{ + matplot(as.numeric(rownames(x)), x, type = "n", ylim = ylim, + yaxs = "i", xaxs = "i", xlab = xlab, ylab = ylab, ...) + lines.pState(x, col = col, border = border, ...) +} + +\end{Soutput} +\begin{Sinput} +> lines.pState +\end{Sinput} +\begin{Soutput} +function (x, col = rainbow(ncol(x)), border = "transparent", + ...) 
+{ + nc <- ncol(x) + col <- rep(col, nc)[1:nc] + border <- rep(border, nc)[1:nc] + pSt <- cbind(0, x) + for (i in 2:ncol(pSt)) { + polygon(c(as.numeric(rownames(pSt)), rev(as.numeric(rownames(pSt)))), + c(pSt[, i], rev(pSt[, i - 1])), col = col[i - 1], + border = border[i - 1], ...) + } +} + + +\end{Soutput} +\end{Schunk} + +\bibliographystyle{plain} +\begin{thebibliography}{1} + +\bibitem{Carstensen.2011a} +B~Carstensen and M~Plummer. +\newblock Using {L}exis objects for multi-state models in {R}. +\newblock {\em Journal of Statistical Software}, 38(6):1--18, 1 2011. + +\bibitem{Iacobelli.2013} +S~Iacobelli and B~Carstensen. +\newblock {Multiple time scales in multi-state models}. +\newblock {\em Stat Med}, 32(30):5315--5327, Dec 2013. + +\bibitem{Plummer.2011} +M~Plummer and B~Carstensen. +\newblock Lexis: An {R} class for epidemiological studies with long-term + follow-up. +\newblock {\em Journal of Statistical Software}, 38(5):1--12, 1 2011. + +\end{thebibliography} + +\addcontentsline{toc}{chapter}{References} + +\end{document} Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/sL.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/sL.pdf differ diff -Nru r-cran-epi-2.19/vignettes/toparticle.tex r-cran-epi-2.30/vignettes/toparticle.tex --- r-cran-epi-2.19/vignettes/toparticle.tex 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/toparticle.tex 2018-02-12 06:11:27.000000000 +0000 @@ -0,0 +1,121 @@ + +%---------------------------------------------------------------------- +%\usepackage[inline]{showlabels} +\usepackage[utf8]{inputenc} +\usepackage[T1]{fontenc} +\usepackage[english]{babel} +\usepackage[font=it,labelfont=normalfont]{caption} +\usepackage[colorlinks,urlcolor=blue,linkcolor=red,citecolor=Maroon]{hyperref} +% \usepackage[ae,hyper]{Rd} +\usepackage[dvipsnames]{xcolor} +\usepackage[super]{nth} +% \usepackage[retainorgcmds]{IEEEtrantools} 
+\usepackage{makeidx,Sweave,floatflt,amsmath,amsfonts,amsbsy,enumitem,verbatim,dcolumn,needspace} +\usepackage{booktabs,longtable,rotating,graphicx,verbatim,fancyhdr,datetime,afterpage,setspace} +\usepackage{ifthen,calc,eso-pic,everyshi,pdfpages} +\usepackage[abspath]{currfile} +% \usepackage{mslapa} +\definecolor{blaa}{RGB}{99,99,255} +\DeclareGraphicsExtensions{.png,.pdf,.jpg} +% Make the Sweave output nicer +\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\footnotesize,fontshape=sl,formatcom=\color{Blue}} +\DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\footnotesize,formatcom=\color{Maroon},xleftmargin=0em} +\DefineVerbatimEnvironment{Scode}{Verbatim}{fontsize=\footnotesize} +\fvset{listparameters={\setlength{\topsep}{0pt}}} +\renewenvironment{Schunk}% +{\renewcommand{\baselinestretch}{0.85} \vspace{\topsep}}% +{\renewcommand{\baselinestretch}{1.00} \vspace{\topsep}} + +%---------------------------------------------------------------------- +% The usual usefuls +\input{useful.tex} +\setcounter{secnumdepth}{2} +\setcounter{tocdepth}{2} + +%---------------------------------------------------------------------- +% How to insert a figure +\newcommand{\rwpre}{./graph/gr} +\newcommand{\insfig}[3]{ +\begin{figure}[h] + \centering + \includegraphics[width=#2\textwidth]{\rwpre-#1} + \caption{#3} + \label{fig:#1} +% \afterpage{\clearpage} +\end{figure}} + +%---------------------------------------------------------------------- +% Set up layout of pages +\oddsidemargin 1mm +\evensidemargin 1mm +\topmargin -5mm +\headheight 8mm +\headsep 8mm +\textheight 230mm +\textwidth 165mm +%\footheight 5mm +\footskip 15mm +\renewcommand{\topfraction}{0.9} +\renewcommand{\bottomfraction}{0.9} +\renewcommand{\textfraction}{0.1} +\renewcommand{\floatpagefraction}{0.9} +\renewcommand{\headrulewidth}{0.1pt} +\setcounter{secnumdepth}{4} +\setcounter{tocdepth}{4} + +%---------------------------------------------------------------------- +% Here is the document starting with the 
titlepage +\begin{document} + +%---------------------------------------------------------------------- +% The title page +\setcounter{page}{1} +\pagenumbering{roman} +\pagestyle{plain} +\thispagestyle{empty} +%\vspace*{0.1\textheight} +\flushright +{\Huge \bfseries \Title + +}\ \\[-1.5ex] +\noindent\textcolor{blaa}{\rule[-1ex]{\textwidth}{5pt}}\\[2.5ex] +\large +\Where \\ +\Homepage \\ +\Dates \\ +\Version \\[1em] +\normalsize +Compiled \today,\ \currenttime\\ +from: \texttt{\currfileabspath}\\[1em] +\normalsize +\vfill +\Faculty +% End of titlepage +\newpage +\raggedright +\parindent 1em +\parskip 0ex +% \section*{Preface} +% Here is space for the preface +% \input{preface} + +%---------------------------------------------------------------------- +% Table of contents +\tableofcontents +\newpage +% \listoftables +% \listoffigures +% \cleardoublepage + +%---------------------------------------------------------------------- +% General page style +\pagenumbering{arabic} +\setcounter{page}{1} +\pagestyle{fancy} +\renewcommand{\sectionmark}[1]{\markright{%\thesection\ +\textsl{#1}}{}} +\fancyhead[EL]{\bf \thepage} +\fancyhead[ER]{\sl \Tit} +\fancyhead[OR]{\bf \thepage} +\fancyhead[OL]{\rm \rightmark} +\fancyfoot{} diff -Nru r-cran-epi-2.19/vignettes/topreport.tex r-cran-epi-2.30/vignettes/topreport.tex --- r-cran-epi-2.19/vignettes/topreport.tex 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/topreport.tex 2018-03-09 08:32:21.000000000 +0000 @@ -0,0 +1,141 @@ +%---------------------------------------------------------------------- +% Packages +%\usepackage[inline]{showlabels} +%\usepackage[latin1]{inputenc} +\usepackage[utf8]{inputenc} +\usepackage[T1]{fontenc} +\usepackage[english]{babel} +\usepackage[font=it,labelfont=normalfont]{caption} +\usepackage[colorlinks,urlcolor=blue,linkcolor=red,citecolor=Maroon]{hyperref} +\usepackage[ae,hyper]{Rd} +\usepackage[dvipsnames]{xcolor} +\usepackage[super]{nth} +% \usepackage[retainorgcmds]{IEEEtrantools} 
+\usepackage[noae]{Sweave} +\usepackage{makeidx,floatflt,amsmath,amsfonts,amsbsy,enumitem,dcolumn,needspace} +\usepackage{ifthen,calc,eso-pic,everyshi} +\usepackage{booktabs,longtable,rotating,graphicx,subfig} +\usepackage{pdfpages,verbatim,fancyhdr,datetime,afterpage} +\usepackage[abspath]{currfile} +% \usepackage{times} +\renewcommand{\textfraction}{0.0} +\renewcommand{\topfraction}{1.0} +\renewcommand{\bottomfraction}{1.0} +\renewcommand{\floatpagefraction}{0.9} +% \usepackage{mslapa} +\definecolor{blaa}{RGB}{99,99,255} +\DeclareGraphicsExtensions{.png,.pdf,.jpg} +% Make the Sweave output nicer (slightly mor compact) +\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\small,fontshape=sl,formatcom=\color{BlueViolet}} +\DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\small,formatcom=\color{Sepia},xleftmargin=0em} +\DefineVerbatimEnvironment{Scode}{Verbatim}{fontsize=\small} +\fvset{listparameters={\setlength{\topsep}{-0.1ex}}} +\renewenvironment{Schunk}% +{\renewcommand{\baselinestretch}{0.87} \vspace{\topsep}}% +{\renewcommand{\baselinestretch}{1.00} \vspace{\topsep}} +% \renewenvironment{knitrout} +% {\renewcommand{\baselinestretch}{0.87}} +% {\renewcommand{\baselinestretch}{1.00}} +\input{useful} + +%---------------------------------------------------------------------- +% Set up layout of pages +\oddsidemargin 1mm +\evensidemargin 1mm +\topmargin -10mm +\headheight 8mm +\headsep 5mm +\textheight 240mm +\textwidth 165mm +%\footheight 5mm +\footskip 15mm +\renewcommand{\topfraction}{0.9} +\renewcommand{\bottomfraction}{0.9} +\renewcommand{\textfraction}{0.1} +\renewcommand{\floatpagefraction}{0.9} +\renewcommand{\headrulewidth}{0.1pt} +\setcounter{secnumdepth}{2} +\setcounter{tocdepth}{3} + +%---------------------------------------------------------------------- +% How to insert a figure in a .rnw file +\newcommand{\rwpre}{./graph/gr} +\newcommand{\insfig}[3]{ +\begin{figure}[h] + \centering + \includegraphics[width=#2\textwidth]{\rwpre-#1} +% 
\caption{#3} + \caption{#3\hfill\mbox{\footnotesize \textrm{\tt \rwpre-#1}}} + \label{fig:#1} +% \afterpage{\clearpage} +\end{figure}} +\newcommand{\linput}[1]{ +% \clearpage +\afterpage{\hfill \ldots now input from \texttt{#1.tex}\\} +\fancyfoot[OR,EL]{\footnotesize \texttt{#1.tex}} +\input{#1}} + +%---------------------------------------------------------------------- +% Here is the document starting with the titlepage +\begin{document} + +%---------------------------------------------------------------------- +% The title page +\setcounter{page}{1} +\pagenumbering{roman} +\pagestyle{plain} +\thispagestyle{empty} +% \vspace*{0.05\textheight} +\flushright +% The blank below here is necessary in order not to muck up the +% linespacing in title if it has more than 2 lines +{\Huge \bfseries \Title + +}\ \\[-1.5ex] +\noindent\textcolor{blaa}{\rule[-1ex]{\textwidth}{5pt}}\\[2.5ex] +\large +\Where \\ +\Dates \\ +\Homepage \\ +\Version \\[1em] +\normalsize +Compiled \today,\ \currenttime\\ +from: \texttt{\currfileabspath}\\[1em] +% \input{wordcount} +\normalsize +\vfill +\Faculty +% End of titlepage +% \newpage + +%---------------------------------------------------------------------- +% Table of contents +\tableofcontents +% \listoftables +% \listoffigures +\clearpage +% \begingroup +% \let\clearpage\relax +% \listoftables +% \listoffigures +% \endgroup + +%---------------------------------------------------------------------- +% General text layout +\raggedright +\parindent 1em +\parskip 0ex +\cleardoublepage + +%---------------------------------------------------------------------- +% General page style +\pagenumbering{arabic} +\setcounter{page}{1} +\pagestyle{fancy} +\renewcommand{\chaptermark}[1]{\markboth{\textsl{#1}}{}} +\renewcommand{\sectionmark}[1]{\markright{\thesection\ \textsl{#1}}{}} +\fancyhead[EL]{\bf \thepage \quad \rm \rightmark} +\fancyhead[ER]{\rm \Tit} +\fancyhead[OL]{\rm \leftmark} +\fancyhead[OR]{\rm \rightmark \quad \bf \thepage} +\fancyfoot{} 
diff -Nru r-cran-epi-2.19/vignettes/useful.tex r-cran-epi-2.30/vignettes/useful.tex --- r-cran-epi-2.19/vignettes/useful.tex 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/useful.tex 2018-02-12 06:08:52.000000000 +0000 @@ -0,0 +1,298 @@ +% This is a file of useful extra commands snatched from +% Michael Hills, David Clayton, Bendix Carstensen & Esa Laara. +% + +% Commands to draw observation lines on follow-up diagrams +% +% Horizontal lines +% +\providecommand{\hfail}[1]{\begin{picture}(250,5) + \put(0,0){\line(1,0){#1}} + \put(#1,0){\circle*{5}} + \end{picture}} + +\providecommand{\hcens}[1]{\begin{picture}(250,5) + \put(0,0){\line(1,0){#1}} + \put(#1,0){\line(0,1){2.5}} + \put(#1,0){\line(0,-1){2.5}} + \end{picture}} + +% +% Diagonals for Lexis diagrams +% +\providecommand{\dfail}[1]{\begin{picture}(250,250) + \put(0,0){\line(1,1){#1}} + \put(#1,#1){\circle*{5}} + \end{picture}} + +\providecommand{\dcens}[1]{\begin{picture}(250,250) + \put(0,0){\line(1,1){#1}} +% \put(#1,#1){\line(0,1){2.5}} +% \put(#1,#1){\line(0,-1){2.5}} +% BxC Changed this to an open circle instead of a line + \put(#1,#1){\circle{5}} + \end{picture}} + +% +% Horizontal range diagrams +% +\providecommand{\hrange}[1]{\begin{picture}(200,5) + \put(0,0){\circle*{5}} + \put(0,0){\line(1,0){#1}} + \put(0,0){\line(-1,0){#1}} + \end{picture}} + +% +% Tree drawing +% +\providecommand{\Tree}[3]{\setlength{\unitlength}{#1\unitlength}\begin{picture}(0,0) + \put(0,0){\line(3, 2){1}} + \put(0,0){\line(3,-2){1}} + \put(0.81, 0.54){\makebox(0,0)[br]{\footnotesize #2\ }} + \put(0.81,-0.54){\makebox(0,0)[tr]{\footnotesize #3\ }} +\end{picture}} + +\providecommand{\Wtree}[3]{\setlength{\unitlength}{#1\unitlength}\begin{picture}(0,0) + \put(0,0){\line(1, 1){1}} + \put(0,0){\line(1,-1){1}} + \put(0.8,0.8){\makebox(0,0)[br]{\footnotesize #2\ }} + \put(0.8,-0.8){\makebox(0,0)[tr]{\footnotesize #3\ }} +\end{picture}} + 
+\providecommand{\Ntree}[3]{\setlength{\unitlength}{#1\unitlength}\begin{picture}(0,0) + \put(0,0){\line(2, 1){1}} + \put(0,0){\line(2,-1){1}} + \put(0.8,0.4){\makebox(0,0)[br]{\footnotesize #2\ }} + \put(0.8,-0.4){\makebox(0,0)[tr]{\footnotesize #3\ }} +\end{picture}} + +\providecommand{\Nutree}[3]{\setlength{\unitlength}{#1\unitlength}\begin{picture}(0,0) + \put(0,0){\line(2, 1){#1}} + \put(0,0){\line(2,-1){#1}} + \put(0.8,0.4){\makebox(0,0)[br]{#2\ }} + \put(0.8,-0.4){\makebox(0,0)[tr]{#3\ }} +\end{picture}} + +% +% Tree drawing +% +\providecommand{\tree}[3]{\setlength{\unitlength}{#1}\begin{picture}(0,0) + \put(0,0){\line(3,2){1}} + \put(0,0){\line(3,-2){1}} + \put(0.81,0.54){\makebox(0,0)[br]{\footnotesize #2\ }} + \put(0.81,-0.54){\makebox(0,0)[tr]{\footnotesize #3\ }} +\end{picture}} + +\providecommand{\wtree}[3]{\setlength{\unitlength}{#1}\begin{picture}(0,0) + \put(0,0){\line(1,1){1}} + \put(0,0){\line(1,-1){1}} + \put(0.8,0.8){\makebox(0,0)[br]{\footnotesize #2\ }} + \put(0.8,-0.8){\makebox(0,0)[tr]{\footnotesize #3\ }} +\end{picture}} + +\providecommand{\ntree}[3]{\setlength{\unitlength}{#1}\begin{picture}(0,0) + \put(0,0){\line(2,1){1}} + \put(0,0){\line(2,-1){1}} + \put(0.8,0.4){\makebox(0,0)[br]{\footnotesize #2\ }} + \put(0.8,-0.4){\makebox(0,0)[tr]{\footnotesize #3\ }} +\end{picture}} + +\providecommand{\nutree}[3]{\begin{picture}(0,0) + \put(0,0){\line(2,1){#1}} + \put(0,0){\line(2,-1){#1}} + \put(0.8,0.4){\makebox(0,0)[br]{#2\ }} + \put(0.8,-0.4){\makebox(0,0)[tr]{#3\ }} +\end{picture}} + +% +% Other commands +% +\providecommand{\ip}[2]{\langle #1 \vert #2 \rangle} +\providecommand{\I}{\text{\rm gI}} +\providecommand{\prob}[0]{\text{\rm Pr}} +\providecommand{\nhy}[0]{_{\oslash}} +\providecommand{\true}[0]{_{\text{\rm \tiny T}}} +\providecommand{\hyp}[0]{_{\text{\rm \tiny H}}} +% \providecommand{\mpydiv}[0]{\stackrel{\textstyle \times}{\div}} +% Changed to slightly smaller symbols 
+\providecommand{\mpydiv}[0]{\stackrel{\scriptstyle\times}{\scriptstyle\div}} +\providecommand{\mie}[1]{{\it #1}} +\providecommand{\mycircle}[0]{\circle*{5}} +\providecommand{\smcircle}[0]{\circle*{1}} +\providecommand{\corner}[0]{_{\text{\rm \tiny C}}} +\providecommand{\ind}[0]{\hspace{10pt}} +\providecommand{\gap}[0]{\\[5pt]} +\renewcommand{\S}[0]{section~} +\providecommand{\blank}[0]{$\;\,$} +\providecommand{\vone}{\vspace{1cm}} +\providecommand{\ljust}[1]{\multicolumn{1}{l}{#1}} +\providecommand{\cjust}[1]{\multicolumn{1}{c}{#1}} +\providecommand{\transpose}{^{\text{\sf T}}} +\providecommand{\histog}[5]{\rule{1mm}{#1mm}\,\rule{1mm}{#2mm}\,\rule{1mm}{#3mm}\,\rule{1mm}{#4mm}\,\rule{1mm}{#5mm}} +\providecommand{\pmiss}{P_{\mbox{\tiny miss}}} + +% Below is BxCs commands inserted + +% Only works with hyperref package: +\newcommand{\mailto}[1]{\href{mailto:#1}{\tt #1}} + +\providecommand{\bc}{\begin{center}} +\providecommand{\ec}{\end{center}} +\providecommand{\bd}{\begin{description}} +\providecommand{\ed}{\end{description}} +\providecommand{\bi}{\begin{itemize}} +\providecommand{\ei}{\end{itemize}} +\providecommand{\bn}{\begin{equation}} +\providecommand{\en}{\end{equation}} +\providecommand{\be}{\begin{enumerate}} +\providecommand{\ee}{\end{enumerate}} +\providecommand{\bes}{\begin{eqnarray*}} +\providecommand{\ees}{\end{eqnarray*}} + +\DeclareMathOperator{\Pp}{P} +\DeclareMathOperator{\pp}{p} +% \providecommand{\p}{{\mathrm p}} +\providecommand{\e}{{\mathrm e}} +\providecommand{\D}{{\mathrm D}} +\providecommand{\dif}{{\,\mathrm d}} +\providecommand{\pmat}[1]{\Pp\!\left\{#1\right\}} +\providecommand{\ptxt}[1]{\Pp\!\left\{\text{#1}\right\}} +\providecommand{\E}{\text{\rm E}} +\providecommand{\V}{\text{\rm V}} +\providecommand{\BLUP}{\text{\rm BLUP}} +\providecommand{\se}{\text{\rm s.e.}} +\providecommand{\sem}{\text{\rm s.e.m.}} +\providecommand{\std}{\text{\rm std}} +\providecommand{\sd}{\text{\rm s.d.}} +\providecommand{\Var}{\text{\rm var}} 
+\providecommand{\VAR}{\text{\rm var}} +\providecommand{\var}{\text{\rm var}} +\providecommand{\cov}{\text{\rm cov}} +\providecommand{\corr}{\text{\rm corr}} +\providecommand{\mean}{\text{\rm mean}} +\providecommand{\CV}{\text{\rm CV}} +\providecommand{\median}{\text{\rm median}} +\providecommand{\cv}{\text{\rm c.v.}} +\providecommand{\erf}{\text{\rm erf}} +\providecommand{\ef}{\text{\rm ef}} +\providecommand{\SSD}{\text{\rm SSD}} +\providecommand{\SPD}{\text{\rm SPD}} +\providecommand{\odds}{\text{\rm odds}} +\providecommand{\bin}{\text{\rm binom}} +\providecommand{\half}{\frac{1}{2}} +% \providecommand{\td}[0]{\stackrel{\textstyle \times}{\div}} +% Changed to slightly smaller symbols +\providecommand{\td}[0]{\stackrel{\scriptstyle \times}{\scriptstyle \div}} +\providecommand{\dt}[0]{\stackrel{\scriptstyle \div}{\scriptstyle \times}} +\providecommand{\diag}{\text{\rm diag}} +\providecommand{\det}{\text{\rm det}} +\providecommand{\dim}{\text{\rm dim}} +\providecommand{\spcol}{\text{\rm span}} +\providecommand{\logit}{\text{\rm logit}} +% \providecommand{\link}{\text{\rm link}} +\providecommand{\spn}{\text{\rm span}} +\providecommand{\CI}{\text{\rm CI}} +\providecommand{\IP}{\text{\rm IP}} +\providecommand{\OR}{\text{\rm OR}} +\providecommand{\RR}{\text{\rm RR}} +\providecommand{\ER}{\text{\rm ER}} +\providecommand{\EM}{\text{\rm EM}} +\providecommand{\EF}{\text{\rm EF}} +\providecommand{\RD}{\text{\rm RD}} +\providecommand{\AC}{\text{\rm AC}} +\providecommand{\AF}{\text{\rm AF}} +\providecommand{\PAF}{\text{\rm PAF}} +\providecommand{\AR}{\text{\rm AR}} +\providecommand{\CR}{\text{\rm CR}} +\providecommand{\PAR}{\text{\rm PAR}} +\providecommand{\EL}{\text{\rm EL}} +\providecommand{\ERL}{\text{\rm ERL}} +\providecommand{\YLL}{\text{\rm YLL}} +\providecommand{\SD}{\text{\rm SD}} +\providecommand{\SE}{\text{\rm SE}} +\providecommand{\SEM}{\text{\rm SEM}} +\providecommand{\SR}{\text{\rm SR}} +\providecommand{\SMR}{\text{\rm SMR}} +\providecommand{\RSR}{\text{\rm RSR}} 
+\providecommand{\CMF}{\text{\rm CMF}} +\providecommand{\pvp}{\text{\rm PV$+$}} +\providecommand{\pvn}{\text{\rm PV$-$}} +\providecommand{\R}{{\textsf{\textbf{R}}}} +\providecommand{\sas}{\textsl{\textbf{SAS}}} +\providecommand{\SAS}{\textsl{\textbf{SAS}}} +%\providecommand{\gap}[0]{\\[5pt]} +%\providecommand{\blank}[0]{$\;\,$} +% Conditional independence sign from Philip Dawid +\providecommand{\cip}{\mbox{$\perp\!\!\!\perp$}} + +%%% Commands to comment out parts of the text +\providecommand{\GLEM}[1]{} +\providecommand{\FORGETIT}[1]{} +\providecommand{\OMIT}[1]{} + +%%% Insert output from program in small text +%%% (requires package verbatim) +\providecommand{\insoutsmall}[1]{ +% \small + \footnotesize + \renewcommand{\baselinestretch}{0.8} + \verbatiminput{#1} + \renewcommand{\baselinestretch}{1.0} + \normalsize +} +\providecommand{\insout}[1]{ + \scriptsize + \renewcommand{\baselinestretch}{0.8} + \verbatiminput{#1} + \renewcommand{\baselinestretch}{1.0} + \normalsize +} +\providecommand{\insouttiny}[1]{ +\tiny +\renewcommand{\baselinestretch}{0.8} +\verbatiminput{#1} +\renewcommand{\baselinestretch}{1.0} +\normalsize +} + +% From Esa: +\providecommand{\T}{\text{\rm \small{T}}} +\providecommand{\id}{\text{\rm id}} +\providecommand{\Dev}{\text{\rm Dev}} +\providecommand{\Bin}{\text{\rm Bin}} +\providecommand{\probit}{\text{\rm probit}} +\providecommand{\cloglog}{\text{\rm cloglog}} + +% Special commands to include output from R, Bugs and Stata + +\providecommand{\Rin}[2]{ +\subsection{\texttt{#1.R}} +#2 + +\insout{./R/#1.Rout} + +} + +\providecommand{\Statain}[2]{ +\subsection{\texttt{#1.do}} +#2 + +\insout{./do/#1.log} + +} + +\providecommand{\Bugsin}[2]{ +\subsection{\texttt{#1.bug}} +#2 + +\insout{./bugs/#1.bug} + +} + +\newlength{\wdth} +\providecommand{\fxbl}[1]{\settowidth{\wdth}{#1} \makebox[\wdth]{}} + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: t +%%% End: Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/yll-imm.pdf and 
/tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/yll-imm.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/yll.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/yll.pdf differ diff -Nru r-cran-epi-2.19/vignettes/yll.R r-cran-epi-2.30/vignettes/yll.R --- r-cran-epi-2.19/vignettes/yll.R 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/yll.R 2018-05-03 14:36:26.000000000 +0000 @@ -0,0 +1,211 @@ +### R code from vignette source 'yll' +### Encoding: UTF-8 + +################################################### +### code chunk number 1: yll.rnw:21-24 +################################################### +options( width=90, + SweaveHooks=list( fig=function() + par(mar=c(3,3,1,1),mgp=c(3,1,0)/1.6,las=1,bty="n") ) ) + + +################################################### +### code chunk number 2: states +################################################### +library( Epi ) +TM <- matrix(NA,4,4) +rownames(TM) <- +colnames(TM) <- c("Well","DM","Dead","Dead(DM)") +TM[1,2:3] <- TM[2,4] <- 1 +zz <- boxes( TM, boxpos=list(x=c(20,80,20,80),y=c(80,80,20,20)), wm=1.5, hm=4 ) + + +################################################### +### code chunk number 3: states +################################################### +zz$Arrowtext <- c( expression(lambda), + expression(mu[W]), + expression(mu[D][M]) ) +boxes( zz ) + + +################################################### +### code chunk number 4: yll.rnw:265-266 +################################################### +data( DMepi ) + + +################################################### +### code chunk number 5: yll.rnw:271-273 +################################################### +str( DMepi ) +head( DMepi ) + + +################################################### +### code chunk number 6: yll.rnw:293-297 +################################################### +DMepi <- transform( subset( DMepi, A>30 ), + D.T = D.nD + D.DM, + Y.T = Y.nD + Y.DM ) +head(DMepi) + + 
+###################################################
+### code chunk number 7: yll.rnw:303-329
+###################################################
+# Knots used in all models
+( a.kn <- seq(40,95,,6) )
+( p.kn <- seq(1997,2015,,4) )
+( c.kn <- seq(1910,1976,,6) )
+# Check the number of events between knots
+ae <- xtabs( cbind(D.nD,D.DM,X) ~ cut(A,c(30,a.kn,Inf)) + sex, data=DMepi )
+ftable( addmargins(ae,1), col.vars=3:2 )
+pe <- xtabs( cbind(D.nD,D.DM,X) ~ cut(P,c(1990,p.kn,Inf)) + sex, data=DMepi )
+ftable( addmargins(pe,1), col.vars=3:2 )
+ce <- xtabs( cbind(D.nD,D.DM,X) ~ cut(P-A,c(-Inf,c.kn,Inf)) + sex, data=DMepi )
+ftable( addmargins(ce,1), col.vars=3:2 )
+# Fit an APC-model for all transitions, separately for men and women
+mW.m <- glm( D.nD ~ -1 + Ns(A ,knots=a.kn,int=TRUE) +
+                         Ns( P,knots=p.kn,ref=2005) +
+                         Ns(P-A,knots=c.kn,ref=1950),
+             offset = log(Y.nD),
+             family = poisson,
+             data = subset( DMepi, sex=="M" ) )
+mD.m <- update( mW.m, D.DM ~ . , offset=log(Y.DM) )
+mT.m <- update( mW.m, D.T ~ . , offset=log(Y.T ) )
+lW.m <- update( mW.m, X ~ . 
) +# Model for women +mW.f <- update( mW.m, data = subset( DMepi, sex=="F" ) ) +mD.f <- update( mD.m, data = subset( DMepi, sex=="F" ) ) +mT.f <- update( mT.m, data = subset( DMepi, sex=="F" ) ) +lW.f <- update( lW.m, data = subset( DMepi, sex=="F" ) ) + + +################################################### +### code chunk number 8: yll.rnw:336-373 +################################################### +a.ref <- 30:90 +p.ref <- 1996:2016 +aYLL <- NArray( list( type = c("Imm","Tot","Sus"), + sex = levels( DMepi$sex ), + age = a.ref, + date = p.ref ) ) +str( aYLL ) +system.time( +for( ip in p.ref ) + { + nd <- data.frame( A = seq(30,90,0.2)+0.1, + P = ip, + Y.nD = 1, + Y.DM = 1, + Y.T = 1 ) + muW.m <- ci.pred( mW.m, nd )[,1] + muD.m <- ci.pred( mD.m, nd )[,1] + muT.m <- ci.pred( mT.m, nd )[,1] + lam.m <- ci.pred( lW.m, nd )[,1] + muW.f <- ci.pred( mW.f, nd )[,1] + muD.f <- ci.pred( mD.f, nd )[,1] + muT.f <- ci.pred( mT.f, nd )[,1] + lam.f <- ci.pred( lW.f, nd )[,1] + aYLL["Imm","M",,paste(ip)] <- yll( int=0.2, muW.m, muD.m, lam=NULL, + A=a.ref, age.in=30, note=FALSE )[-1] + aYLL["Imm","F",,paste(ip)] <- yll( int=0.2, muW.f, muD.f, lam=NULL, + A=a.ref, age.in=30, note=FALSE )[-1] + aYLL["Tot","M",,paste(ip)] <- yll( int=0.2, muT.m, muD.m, lam=NULL, + A=a.ref, age.in=30, note=FALSE )[-1] + aYLL["Tot","F",,paste(ip)] <- yll( int=0.2, muT.f, muD.f, lam=NULL, + A=a.ref, age.in=30, note=FALSE )[-1] + aYLL["Sus","M",,paste(ip)] <- yll( int=0.2, muW.m, muD.m, lam=lam.m, + A=a.ref, age.in=30, note=FALSE )[-1] + aYLL["Sus","F",,paste(ip)] <- yll( int=0.2, muW.f, muD.f, lam=lam.f, + A=a.ref, age.in=30, note=FALSE )[-1] + } ) +round( ftable( aYLL[,,seq(1,61,10),], col.vars=c(3,2) ), 1 ) + + +################################################### +### code chunk number 9: imm +################################################### +plyll <- function(wh){ +par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6, bty="n", las=1 ) + +matplot( a.ref, aYLL[wh,"M",,], + type="l", lty=1, 
col="blue", lwd=1:2, + ylim=c(0,12), xlab="Age", + ylab="Years lost to DM", yaxs="i" ) +abline(v=50,h=1:10,col=gray(0.7)) +text( 90, 11, "Men", col="blue", adj=1 ) +text( 40, aYLL[wh,"M","40","1996"], "1996", adj=c(0,0), col="blue" ) +text( 43, aYLL[wh,"M","44","2016"], "2016", adj=c(1,1), col="blue" ) + +matplot( a.ref, aYLL[wh,"F",,], + type="l", lty=1, col="red", lwd=1:2, + ylim=c(0,12), xlab="Age", + ylab="Years lost to DM", yaxs="i" ) +abline(v=50,h=1:10,col=gray(0.7)) +text( 90, 11, "Women", col="red", adj=1 ) +text( 40, aYLL[wh,"F","40","1996"], "1996", adj=c(0,0), col="red" ) +text( 43, aYLL[wh,"F","44","2016"], "2016", adj=c(1,1), col="red" ) +} +plyll("Imm") + + +################################################### +### code chunk number 10: tot +################################################### +plyll("Tot") + + +################################################### +### code chunk number 11: sus +################################################### +plyll("Sus") + + +################################################### +### code chunk number 12: CHANGE1 (eval = FALSE) +################################################### +## source( "../R/erl.R", keep.source=TRUE ) + + +################################################### +### code chunk number 13: CHANGE2 +################################################### +surv1 <- Epi::surv1 +surv2 <- Epi::surv2 +erl1 <- Epi::erl1 +erl <- Epi::erl +yll <- Epi::yll + + +################################################### +### code chunk number 14: yll.rnw:484-485 +################################################### +surv1 + + +################################################### +### code chunk number 15: yll.rnw:489-490 +################################################### +erl1 + + +################################################### +### code chunk number 16: yll.rnw:497-498 +################################################### +surv2 + + +################################################### +### code chunk number 17: 
yll.rnw:502-503 +################################################### +erl + + +################################################### +### code chunk number 18: yll.rnw:507-508 +################################################### +yll + + diff -Nru r-cran-epi-2.19/vignettes/yll.rnw r-cran-epi-2.30/vignettes/yll.rnw --- r-cran-epi-2.19/vignettes/yll.rnw 2017-04-14 14:59:35.000000000 +0000 +++ r-cran-epi-2.30/vignettes/yll.rnw 2018-03-08 11:48:08.000000000 +0000 @@ -1,14 +1,12 @@ \SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE,eps=FALSE} -%\VignetteIndexEntry{Years of life lost: simLexis} +%\VignetteIndexEntry{Years of life lost (YLL)} \documentclass[a4paper,twoside,12pt]{report} -% ---------------------------------------------------------------------- -% General information for the title page and the page headings \newcommand{\Title}{Years of Life Lost (YLL) to disease\\Diabetes in DK as example} \newcommand{\Tit}{YLL} -\newcommand{\Version}{Version 1.2} -\newcommand{\Dates}{February 2017} +\newcommand{\Version}{February 1.2} +\newcommand{\Dates}{November 2017} \newcommand{\Where}{SDC} \newcommand{\Homepage}{\url{http://bendixcarstensen.com/Epi}} \newcommand{\Faculty}{\begin{tabular}{rl} @@ -19,130 +17,7 @@ & \texttt{b@bxc.dk}\\ & \url{http://BendixCarstensen.com} \\[1em] \end{tabular}} -% Packages -\usepackage[utf8]{inputenc} -\usepackage[T1]{fontenc} -\usepackage[english]{babel} -\usepackage[font=it,labelfont=normalfont]{caption} -\usepackage[colorlinks,urlcolor=blue,linkcolor=red,,citecolor=Maroon]{hyperref} -\usepackage[ae,hyper]{Rd} -\usepackage[dvipsnames]{xcolor} -\usepackage[super]{nth} -\usepackage[noae]{Sweave} -\usepackage{makeidx,floatflt,amsmath,amsfonts,amsbsy,enumitem,dcolumn,needspace} -\usepackage{ifthen,calc,eso-pic,everyshi} -\usepackage{booktabs,longtable,rotating,graphicx,subfig} -\usepackage{pdfpages,verbatim,fancyhdr,datetime,% -afterpage} -\usepackage[abspath]{currfile} -% \usepackage{times} -\renewcommand{\textfraction}{0.0} 
-\renewcommand{\topfraction}{1.0} -\renewcommand{\bottomfraction}{1.0} -\renewcommand{\floatpagefraction}{0.9} -% \usepackage{mslapa} -\definecolor{blaa}{RGB}{99,99,255} -\DeclareGraphicsExtensions{.png,.pdf,.jpg} -% Make the Sweave output nicer -\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\small,fontshape=sl,formatcom=\color{Blue}} -\DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\small,formatcom=\color{Maroon},xleftmargin=0em} -\DefineVerbatimEnvironment{Scode}{Verbatim}{fontsize=\small} -\fvset{listparameters={\setlength{\topsep}{-0.1ex}}} -\renewenvironment{Schunk}% -{\renewcommand{\baselinestretch}{0.85} \vspace{\topsep}}% -{\renewcommand{\baselinestretch}{1.00} \vspace{\topsep}} -\providecommand{\ptxt}[1]{\Pp\left\{\text{#1}\right\}} -\providecommand{\dif}{{\,\mathrm d}} -\DeclareMathOperator{\YLL}{YLL} -\DeclareMathOperator{\Pp}{P} - -%---------------------------------------------------------------------- -% Set up layout of pages -\oddsidemargin 1mm -\evensidemargin 1mm -\topmargin -10mm -\headheight 8mm -\headsep 5mm -\textheight 240mm -\textwidth 165mm -%\footheight 5mm -\footskip 15mm -\renewcommand{\topfraction}{0.9} -\renewcommand{\bottomfraction}{0.9} -\renewcommand{\textfraction}{0.1} -\renewcommand{\floatpagefraction}{0.9} -\renewcommand{\headrulewidth}{0.1pt} -\setcounter{secnumdepth}{4} -% \setcounter{tocdepth}{2} - -%---------------------------------------------------------------------- -% How to insert a figure in a .rnw file -\newcommand{\rwpre}{./graph/gr} -\newcommand{\insfig}[3]{ -\begin{figure}[h] - \centering - \includegraphics[width=#2\textwidth]{\rwpre-#1} - \caption{#3} - \label{fig:#1} -% \afterpage{\clearpage} -\end{figure}} - -%---------------------------------------------------------------------- -% Here is the document starting with the titlepage -\begin{document} - -%---------------------------------------------------------------------- -% The title page -\setcounter{page}{1} -\pagenumbering{roman} 
-\pagestyle{plain} -\thispagestyle{empty} -% \vspace*{0.05\textheight} -\flushright -% The blank below here is necessary in order not to muck up the -% linespacing in title if it has more than 2 lines -{\Huge \bfseries \Title - -}\ \\[-1.5ex] -\noindent\textcolor{blaa}{\rule[-1ex]{\textwidth}{5pt}}\\[2.5ex] -\large -\Where \\ -\Dates \\ -\Homepage \\ -\Version \\[1em] -\normalsize -Compiled \today,\ \currenttime\\ -from: \texttt{\currfileabspath}\\[1em] -% \input{wordcount} -\normalsize -\vfill -\Faculty -% End of titlepage -% \newpage - -%---------------------------------------------------------------------- -% Table of contents -\tableofcontents - -%---------------------------------------------------------------------- -% General text layout -\raggedright -\parindent 1em -\parskip 0ex -\cleardoublepage - -%---------------------------------------------------------------------- -% General page style -\pagenumbering{arabic} -\setcounter{page}{1} -\pagestyle{fancy} -\renewcommand{\chaptermark}[1]{\markboth{\textsl{#1}}{}} -\renewcommand{\sectionmark}[1]{\markright{\thesection\ \textsl{#1}}{}} -\fancyhead[EL]{\bf \thepage \quad \rm \leftmark} -\fancyhead[ER,OL]{\sl \Tit} -\fancyhead[OR]{\rm \rightmark \quad \bf \thepage} -\fancyfoot{} - +\input{topreport} <>= options( width=90, SweaveHooks=list( fig=function() @@ -150,12 +25,10 @@ @ % \renewcommand{\rwpre}{./yll} -%---------------------------------------------------------------------- -% Here comes the substance part \chapter{Theory and technicalities} This vignette for the \texttt{Epi} package describes the -probabilistic/demographic background for and technical implementation +probabilistic and demographic background for and technical implementation of the \texttt{erl} and \texttt{yll} functions that computes the expected residual life time and years of life lost in an illness-death model. 
@@ -581,9 +454,25 @@ \section{Function definitions} -<>= +% The following code sources the originally formatted code in erl.R in +% order to make it printable with all the formatting and comments in +% it. However the sourcing does not work when compiling the vignettes +% in R CMD build. So when checking the code we comment this out by +% putting eval=FALSE. So depending on whether we actually construct +% the pdf for the inst/doc folder or test (upload) the package, one of +% the following two chunks are run with eval=FALSE and the other with +% eval=TRUE. +<>= source( "../R/erl.R", keep.source=TRUE ) -@ +@ % +% When checking the package +<>= +surv1 <- Epi::surv1 +surv2 <- Epi::surv2 +erl1 <- Epi::erl1 +erl <- Epi::erl +yll <- Epi::yll +@ % When using the functions it is assumed that the functions $\mu_W$, $\mu_D$ and $\lambda$ are given as vectors corresponding to equidistantly (usually tightly) spaced ages from 0 to $K$ where K is @@ -626,10 +515,10 @@ \bibitem{Carstensen.2007a} B~Carstensen. \newblock Age-{P}eriod-{C}ohort models for the {L}exis diagram. -\newblock {\em Statistics in Medicine}, 26(15):3018--3045, July 2007. +\newblock {\em Statistics in Medicine}, 26(15):3018--3045, 2007. \bibitem{Carstensen.2008c} -B.~Carstensen, J.K. Kristensen, P.~Ottosen, and K.~Borch-Johnsen. +B~Carstensen, JK~Kristensen, P~Ottosen, and K~Borch-Johnsen. \newblock The {D}anish {N}ational {D}iabetes {R}egister: {T}rends in incidence, prevalence and mortality. \newblock {\em Diabetologia}, 51:2187--2196, 2008. diff -Nru r-cran-epi-2.19/vignettes/yll.rwl r-cran-epi-2.30/vignettes/yll.rwl --- r-cran-epi-2.19/vignettes/yll.rwl 1970-01-01 00:00:00.000000000 +0000 +++ r-cran-epi-2.30/vignettes/yll.rwl 2018-05-03 14:36:44.000000000 +0000 @@ -0,0 +1,33 @@ +R version 3.4.4 (2018-03-15) + --------------------------------------------- + Folder: /home/bendix/stat/R/lib.src/Epi/pkg/vignettes + Started: Thursday 03. 
May 2018, 16:36:26 + --------------------------------------------- +Writing to file yll.tex +Processing code chunks with options ... + 1 : keep.source term verbatim (yll.rnw:21) + 2 : keep.source term verbatim pdf (label = states, yll.rnw:87) + 3 : keep.source term verbatim pdf (label = states, yll.rnw:95) + 4 : echo keep.source term verbatim (yll.rnw:265) + 5 : echo keep.source term verbatim (yll.rnw:271) + 6 : echo keep.source term verbatim (yll.rnw:293) + 7 : echo keep.source term verbatim (yll.rnw:303) + 8 : echo keep.source term verbatim (yll.rnw:336) + 9 : echo keep.source term verbatim pdf (label = imm, yll.rnw:378) +10 : echo keep.source term verbatim pdf (label = tot, yll.rnw:402) +11 : echo keep.source term verbatim pdf (label = sus, yll.rnw:405) +12 : keep.source (label = CHANGE1, yll.rnw:465) +13 : keep.source term hide (label = CHANGE2, yll.rnw:469) +14 : echo keep.source term verbatim (yll.rnw:484) +15 : echo keep.source term verbatim (yll.rnw:489) +16 : echo keep.source term verbatim (yll.rnw:497) +17 : echo keep.source term verbatim (yll.rnw:502) +18 : echo keep.source term verbatim (yll.rnw:507) + +You can now run (pdf)latex on ‘yll.tex’ + + --------------------------------------------- + Folder: /home/bendix/stat/R/lib.src/Epi/pkg/vignettes + Ended: Thursday 03. 
May 2018, 16:36:44
 + Elapsed: 00:00:18
 + ---------------------------------------------
Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/yll-states.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/yll-states.pdf differ
Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/yll-sus.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/yll-sus.pdf differ
diff -Nru r-cran-epi-2.19/vignettes/yll.tex r-cran-epi-2.30/vignettes/yll.tex
--- r-cran-epi-2.19/vignettes/yll.tex 1970-01-01 00:00:00.000000000 +0000
+++ r-cran-epi-2.30/vignettes/yll.tex 2018-05-03 14:36:44.000000000 +0000
@@ -0,0 +1,842 @@
+
+%\VignetteIndexEntry{Years of life lost (YLL)}
+\documentclass[a4paper,twoside,12pt]{report}
+
+\newcommand{\Title}{Years of Life Lost (YLL)
+ to disease\\Diabetes in DK as example}
+\newcommand{\Tit}{YLL}
+\newcommand{\Version}{February 1.2}
+\newcommand{\Dates}{November 2017}
+\newcommand{\Where}{SDC}
+\newcommand{\Homepage}{\url{http://bendixcarstensen.com/Epi}}
+\newcommand{\Faculty}{\begin{tabular}{rl}
+Bendix Carstensen
+ & Steno Diabetes Center, Gentofte, Denmark\\
+ & {\small \& Department of Biostatistics,
+ University of Copenhagen} \\
+ & \texttt{b@bxc.dk}\\
+ & \url{http://BendixCarstensen.com} \\[1em]
+ \end{tabular}}
+\input{topreport}
+\renewcommand{\rwpre}{./yll}
+
+\chapter{Theory and technicalities}
+
+This vignette for the \texttt{Epi} package describes the
+probabilistic and demographic background for and technical implementation
+of the \texttt{erl} and \texttt{yll} functions that compute the
+expected residual lifetime and years of life lost in an illness-death
+model.
+
+\section{Years of life lost (YLL)}
+
+\ldots to diabetes or any other disease for that matter.
+
+The general concept in calculation of ``years lost to\ldots'' is the
+comparison of the expected lifetime between two groups of persons; one
+with and one without disease (in this example DM).
The expected
+lifetime is the area under the survival curve, so basically the
+exercise requires that two survival curves that are deemed relevant be
+available.
+
+The years of life lost is therefore just the area between the survival
+curves for those ``Well'', $S_W(t)$, and for those ``Diseased'',
+$S_D(t)$:
+\[
+ \YLL = \int_0^\infty S_W(t) - S_D(t) \dif t
+\]
+The time $t$ could of course be age, but it could also be ``time after
+age 50'', and the survival curves compared would then be survival
+curves \emph{conditional} on survival till age 50, and the YLL would
+be the years of life lost for a 50-year-old person with diabetes.
+
+If we are referring to the expected lifetime we will more precisely use
+the label expected residual lifetime, ERL.
+
+\section{Constructing the survival curves}
+
+YLL can be computed in two different ways, depending on the way the
+survival curve and hence the expected lifetime of a person
+\emph{without} diabetes is computed:
+\begin{itemize}
+\item Assume that the ``Well'' persons are \emph{immune} to disease
+ --- using only the non-DM mortality rates throughout for calculation
+ of expected lifetime.
+\item Assume that the ``Well'' persons \emph{can} acquire the disease and
+ thereby see an increased mortality, thus involving all three rates
+ shown in figure \ref{fig:states}.
+\end{itemize}
+The former gives a higher YLL because the comparison is to persons
+assumed immune to DM (and yet with the same mortality as non-immune
+persons prior to diagnosis); the latter gives a more realistic picture of the
+comparison of groups of persons with and without diabetes at a given
+age, one that can be interpreted in the real world.
+
+The differences can be illustrated by figure \ref{fig:states}; the
+immune approach corresponds to an assumption of $\lambda(t)=0$ in the
+calculation of the survival curve for a person in the ``Well'' state.
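Numerically, the YLL integral is just the accumulated gap between the two survival curves over a fine age grid. The following sketch illustrates the formula with invented constant hazards in plain NumPy; it is an illustration of the definition only, not the Epi package's yll() implementation.

```python
import numpy as np

# Sketch of the YLL formula: area between two survival curves,
# approximated by a rectangle sum on a fine age grid.
# The constant hazards mu_W ("Well") and mu_D ("Diseased") are
# invented for illustration; this is NOT Epi's yll() function.
h = 0.2                              # interval length in years
mid = np.arange(0, 60, h) + h / 2    # midpoints of the age intervals
mu_W = np.full(mid.shape, 0.02)      # assumed mortality rate, "Well"
mu_D = np.full(mid.shape, 0.05)      # assumed mortality rate, "Diseased"
S_W = np.exp(-np.cumsum(mu_W) * h)   # survival curves from the
S_D = np.exp(-np.cumsum(mu_D) * h)   # cumulative hazards
YLL = np.sum(S_W - S_D) * h          # area between the curves
```

With these hazards the two areas under the curves are close to their closed-form values, $(1-\mathrm{e}^{-0.02\cdot 60})/0.02 \approx 34.9$ and $(1-\mathrm{e}^{-0.05\cdot 60})/0.05 \approx 19.0$ years, so the YLL comes out close to 16 years; conditioning on survival to some age $A$ would simply amount to dividing each curve by its value at $A$.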
+
+Calculation of the survival of a diseased person already in the ``DM''
+state is unaffected by assumptions about $\lambda$.
+
+\insfig{states}{0.7}{Illness-death model describing diabetes incidence
+ and mortality.}
+
+\subsection{Total mortality --- a shortcut?}
+
+A practical crude shortcut could be to compare the ERL in the diabetic
+population to the ERL for the \emph{entire} population (that is, use
+the total mortality ignoring diabetes status).
+
+Note however that this approach also counts the mortality of persons
+who acquired the disease earlier, thus making the comparison
+population on average more ill than the population we aim at, namely
+those well at a given time, who only gradually become ill thereafter.
+
+How large these effects are can however be empirically explored, as we
+shall do later.
+
+\subsection{Disease duration}
+
+In the exposition above there is no explicit provision for the effect of
+disease duration, but if we were able to devise mortality rates for
+any combination of age and duration, this could be taken into account.
+
+There are however severe limitations in this, as we would in principle
+want duration effects covering the same range as the age effects ---
+that is, for all $(a,d)$ where $d\leq A$, with $A$ the age at
+which we condition. So even if we were only to compute ERL from
+age, say, 40 we would still need duration effects up to 60 years
+(namely to age 100).
+
+The incorporation of duration effects is in principle trivial from a
+computational point of view, but we would be forced to entertain
+models predicting duration effects well beyond the disease durations
+actually observed in any practical case.
+
+\subsection{Computing integrals}
+
+The practical calculations of survival curves, ERL and YLL involve
+calculation of (cumulative) integrals of rates and functions of these,
+as we shall see below.
This is easy if we have a closed-form
+expression for the function, so its value may be computed at any time
+point --- this will be the case if we model rates by smooth parametric
+functions.
+
+Computing the (cumulative) integral of a function is done as follows:
+\begin{itemize}
+\item Compute the value of the function (mortality rate for example)
+ at the midpoints of a sequence of narrow equidistant intervals ---
+ for example one- or three-month intervals of age.
+\item Take the cumulative sum of these values multiplied by the
+ interval length --- this will be a very close approximation to the
+ cumulative integral evaluated at the end of each interval.
+\item If the intervals are really small (like 1/100 year), the
+ distinction between the value at the middle and at the end of each
+ interval becomes irrelevant.
+\end{itemize}
+Note that in the above it is assumed that the rates are given in units
+corresponding to the interval length --- or more precisely, as the
+cumulative rates over the interval.
+
+\section{Survival functions in the illness-death model}
+
+The survival functions for persons in the ``Well'' state can be
+computed under two fundamentally different scenarios, depending on
+whether persons in the ``Well'' state are assumed to be immune to the
+disease ($\lambda(a)=0$) or not.
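The midpoint recipe above can be sketched in a few lines. The Gompertz-type rate below is an invented example, not fitted to any data; the point is only the mechanics of evaluating at midpoints and taking the cumulative sum times the interval length.

```python
import numpy as np

# Midpoint-rule cumulative integral of a smooth rate, as in the
# itemized recipe. The Gompertz-type mortality rate is an invented
# illustration (per person-year), not a fitted model.
h = 1 / 100                          # narrow interval length (years)
mid = np.arange(0, 80, h) + h / 2    # midpoints of the age intervals
rate = np.exp(-9 + 0.09 * mid)       # smooth parametric rate at midpoints
cumhaz = np.cumsum(rate) * h         # cumulative integral at interval ends
surv = np.exp(-cumhaz)               # survival function S(a) on the grid

# Closed-form value of the integral over (0, 80), for comparison:
exact = (np.exp(-9 + 0.09 * 80) - np.exp(-9)) / 0.09
```

For this smooth rate the midpoint sum at age 80 agrees with the closed-form integral to well within numerical plotting accuracy, which is why the recipe suffices for survival curves, ERL and YLL.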
+
+\subsection{Immune approach}
+
+In this case both survival functions for persons in the two states are
+the usual simple transformations of the cumulative mortality rates:
+\[
+ S_W(a) = \exp\left(-\int_0^a\!\!\mu_W(u) \dif u \right), \qquad
+ S_D(a) = \exp\left(-\int_0^a\!\!\mu_D(u) \dif u \right)
+\]
+
+\subsubsection{Conditional survival functions}
+
+If we want the \emph{conditional} survival functions given survival to
+age $A$, say, they are just:
+\[
+ S_W(a|A) = S_W(a)/S_W(A), \qquad S_D(a|A) = S_D(a)/S_D(A)
+\]
+
+\subsection{Non-immune approach}
+
+For a diseased person, the survival function in this state is the same
+as above, but the survival function for a person without disease (at
+age 0) is (see figure \ref{fig:states}):
+\[
+S(a) = \ptxt{Well}\!(a) + \ptxt{DM}\!(a)
+\]
+In the appendix of the paper \cite{Carstensen.2008c} is an indication
+of how to compute the probability of being in any of the four states
+shown in figure \ref{fig:states}, which I shall repeat here:
+
+In terms of the rates, the probability of being in the ``Well'' box is
+simply the probability of escaping both death (at a rate of $\mu_W(a)$)
+and diabetes (at a rate of $\lambda(a)$):
+\[
+ \ptxt{Well}(a) = \exp\left(\!-\int_0^a\!\!\mu_W(u)+\lambda(u) \dif u \right)
+\]
+The probability of being alive with diabetes at age $a$ is computed given that
+diabetes occurred at age $s$ ($s<a$).
+\begin{Schunk}
+\begin{Sinput}
+> data( DMepi )
+\end{Sinput}
+\end{Schunk}
+The dataset \texttt{DMepi} contains diabetes events, deaths and
+person-years for persons without diabetes, and deaths and person-years
+for persons with diabetes:
+\begin{Schunk}
+\begin{Sinput}
+> str( DMepi )
+\end{Sinput}
+\begin{Soutput}
+'data.frame': 4000 obs. of 8 variables:
+ $ sex : Factor w/ 2 levels "M","F": 1 2 1 2 1 2 1 2 1 2 ...
+ $ A : num 0 0 1 1 2 2 3 3 4 4 ...
+ $ P : num 1996 1996 1996 1996 1996 ...
+ $ X : num 1 9 4 7 7 2 6 5 9 4 ...
+ $ D.nD: num 28 19 23 19 7 8 8 8 6 7 ...
+ $ Y.nD: num 35454 33095 36451 34790 35329 ...
+ $ D.DM: num 0 0 0 0 0 0 0 0 0 0 ... + $ Y.DM: num 0.476 3.877 4.92 7.248 12.474 ... +\end{Soutput} +\begin{Sinput} +> head( DMepi ) +\end{Sinput} +\begin{Soutput} + sex A P X D.nD Y.nD D.DM Y.DM +1 M 0 1996 1 28 35453.65 0 0.4757016 +2 F 0 1996 9 19 33094.86 0 3.8767967 +3 M 1 1996 4 23 36450.73 0 4.9199179 +4 F 1 1996 7 19 34789.99 0 7.2484600 +5 M 2 1996 7 7 35328.92 0 12.4743326 +6 F 2 1996 2 8 33673.43 0 8.0951403 +\end{Soutput} +\end{Schunk} +For each combination of sex, age, period and date of birth in 1 year +age groups, we have the person-years in the ``Well'' (\texttt{Y.nD}) +and the ``DM'' (\texttt{Y.DM}) states, as well as the number of deaths +from these (\texttt{D.nD}, \texttt{D.DM}) and the number of incident +diabetes cases from the ``Well'' state (\texttt{X}). + +In order to compute the years of life lost to diabetes and how this +has changed over time, we fit models for the mortality and incidence +of both groups (and of course, separately for men and women). The +models we use will be age-period-cohort models \cite{Carstensen.2007a} +providing estimated mortality rates for ages 0--99 and dates +1.1.1996--1.1.2016. + +First we transform the age and period variables to reflect the mean +age and period in each of the Lexis triangles. We also compute the +total number of deaths and amount of risk time, as we are going to +model the total mortality as well. 
Finally we restrict the dataset to +ages over 30 only: +\begin{Schunk} +\begin{Sinput} +> DMepi <- transform( subset( DMepi, A>30 ), ++ D.T = D.nD + D.DM, ++ Y.T = Y.nD + Y.DM ) +> head(DMepi) +\end{Sinput} +\begin{Soutput} + sex A P X D.nD Y.nD D.DM Y.DM D.T Y.T +63 M 31 1996 21 51 43909.32 0 291.4107 51 44200.73 +64 F 31 1996 33 16 41376.91 2 287.4969 18 41664.41 +65 M 32 1996 26 67 43159.94 0 299.6571 67 43459.59 +66 F 32 1996 20 23 40706.49 1 275.2615 24 40981.75 +67 M 33 1996 35 54 41251.06 4 321.0397 58 41572.10 +68 F 33 1996 32 23 39102.29 1 277.4463 24 39379.74 +\end{Soutput} +\end{Schunk} +With the correct age and period coding in the Lexis triangles, we fit +models for the mortalities and incidences. Note that we for +comparative purposes also fit a model for the \emph{total} mortality, +ignoring the +\begin{Schunk} +\begin{Sinput} +> # Knots used in all models +> ( a.kn <- seq(40,95,,6) ) +\end{Sinput} +\begin{Soutput} +[1] 40 51 62 73 84 95 +\end{Soutput} +\begin{Sinput} +> ( p.kn <- seq(1997,2015,,4) ) +\end{Sinput} +\begin{Soutput} +[1] 1997 2003 2009 2015 +\end{Soutput} +\begin{Sinput} +> ( c.kn <- seq(1910,1976,,6) ) +\end{Sinput} +\begin{Soutput} +[1] 1910.0 1923.2 1936.4 1949.6 1962.8 1976.0 +\end{Soutput} +\begin{Sinput} +> # Check the number of events between knots +> ae <- xtabs( cbind(D.nD,D.DM,X) ~ cut(A,c(30,a.kn,Inf)) + sex, data=DMepi ) +> ftable( addmargins(ae,1), col.vars=3:2 ) +\end{Sinput} +\begin{Soutput} + D.nD D.DM X + sex M F M F M F +cut(A, c(30, a.kn, Inf)) +(30,40] 8899 4525 564 269 9912 8622 +(40,51] 24686 15296 2886 1399 31668 20769 +(51,62] 57747 38968 10276 4916 53803 34495 +(62,73] 102877 78771 24070 13008 51000 38731 +(73,84] 154804 153842 31006 25414 25444 26804 +(84,95] 97698 175484 13972 21231 4726 7852 +(95,Inf] 5800 20563 545 1522 74 238 +Sum 452511 487449 83319 67759 176627 137511 +\end{Soutput} +\begin{Sinput} +> pe <- xtabs( cbind(D.nD,D.DM,X) ~ cut(P,c(1990,p.kn,Inf)) + sex, data=DMepi ) +> ftable( 
addmargins(pe,1), col.vars=3:2 )
+\end{Sinput}
+\begin{Soutput}
+ D.nD D.DM X
+ sex M F M F M F
+cut(P, c(1990, p.kn, Inf))
+(1990,1997] 51901 54162 6012 5378 12477 10030
+(1997,2003] 145418 157768 21028 17976 43749 34255
+(2003,2009] 133175 144717 25172 20595 56556 43891
+(2009,2015] 122017 130802 31107 23810 63845 49335
+(2015,Inf] 0 0 0 0 0 0
+Sum 452511 487449 83319 67759 176627 137511
+\end{Soutput}
+\begin{Sinput}
+> ce <- xtabs( cbind(D.nD,D.DM,X) ~ cut(P-A,c(-Inf,c.kn,Inf)) + sex, data=DMepi )
+> ftable( addmargins(ce,1), col.vars=3:2 )
+\end{Sinput}
+\begin{Soutput}
+ D.nD D.DM X
+ sex M F M F M F
+cut(P - A, c(-Inf, c.kn, Inf))
+(-Inf,1.91e+03] 19912 49797 1784 3731 536 1143
+(1.91e+03,1.92e+03] 130691 190012 18160 23709 9765 13519
+(1.92e+03,1.94e+03] 154227 146284 32435 24876 34897 32481
+(1.94e+03,1.95e+03] 93397 67909 22921 11437 65012 44292
+(1.95e+03,1.96e+03] 40948 26234 6724 3326 46155 29891
+(1.96e+03,1.98e+03] 12534 6810 1245 654 19293 15311
+(1.98e+03, Inf] 802 403 50 26 969 874
+Sum 452511 487449 83319 67759 176627 137511
+\end{Soutput}
+\begin{Sinput}
+> # Fit an APC-model for all transitions, separately for men and women
+> mW.m <- glm( D.nD ~ -1 + Ns(A ,knots=a.kn,int=TRUE) +
++ Ns( P,knots=p.kn,ref=2005) +
++ Ns(P-A,knots=c.kn,ref=1950),
++ offset = log(Y.nD),
++ family = poisson,
++ data = subset( DMepi, sex=="M" ) )
+> mD.m <- update( mW.m, D.DM ~ . , offset=log(Y.DM) )
+> mT.m <- update( mW.m, D.T ~ . , offset=log(Y.T ) )
+> lW.m <- update( mW.m, X ~ . 
) +> # Model for women +> mW.f <- update( mW.m, data = subset( DMepi, sex=="F" ) ) +> mD.f <- update( mD.m, data = subset( DMepi, sex=="F" ) ) +> mT.f <- update( mT.m, data = subset( DMepi, sex=="F" ) ) +> lW.f <- update( lW.m, data = subset( DMepi, sex=="F" ) ) +\end{Sinput} +\end{Schunk} + +\section{Residual life time and years lost to DM} + +We now collect the estimated years of life lost classified by method +(immune assumption or not), sex, age and calendar time: +\begin{Schunk} +\begin{Sinput} +> a.ref <- 30:90 +> p.ref <- 1996:2016 +> aYLL <- NArray( list( type = c("Imm","Tot","Sus"), ++ sex = levels( DMepi$sex ), ++ age = a.ref, ++ date = p.ref ) ) +> str( aYLL ) +\end{Sinput} +\begin{Soutput} + logi [1:3, 1:2, 1:61, 1:21] NA NA NA NA NA NA ... + - attr(*, "dimnames")=List of 4 + ..$ type: chr [1:3] "Imm" "Tot" "Sus" + ..$ sex : chr [1:2] "M" "F" + ..$ age : chr [1:61] "30" "31" "32" "33" ... + ..$ date: chr [1:21] "1996" "1997" "1998" "1999" ... +\end{Soutput} +\begin{Sinput} +> system.time( ++ for( ip in p.ref ) ++ { ++ nd <- data.frame( A = seq(30,90,0.2)+0.1, ++ P = ip, ++ Y.nD = 1, ++ Y.DM = 1, ++ Y.T = 1 ) ++ muW.m <- ci.pred( mW.m, nd )[,1] ++ muD.m <- ci.pred( mD.m, nd )[,1] ++ muT.m <- ci.pred( mT.m, nd )[,1] ++ lam.m <- ci.pred( lW.m, nd )[,1] ++ muW.f <- ci.pred( mW.f, nd )[,1] ++ muD.f <- ci.pred( mD.f, nd )[,1] ++ muT.f <- ci.pred( mT.f, nd )[,1] ++ lam.f <- ci.pred( lW.f, nd )[,1] ++ aYLL["Imm","M",,paste(ip)] <- yll( int=0.2, muW.m, muD.m, lam=NULL, ++ A=a.ref, age.in=30, note=FALSE )[-1] ++ aYLL["Imm","F",,paste(ip)] <- yll( int=0.2, muW.f, muD.f, lam=NULL, ++ A=a.ref, age.in=30, note=FALSE )[-1] ++ aYLL["Tot","M",,paste(ip)] <- yll( int=0.2, muT.m, muD.m, lam=NULL, ++ A=a.ref, age.in=30, note=FALSE )[-1] ++ aYLL["Tot","F",,paste(ip)] <- yll( int=0.2, muT.f, muD.f, lam=NULL, ++ A=a.ref, age.in=30, note=FALSE )[-1] ++ aYLL["Sus","M",,paste(ip)] <- yll( int=0.2, muW.m, muD.m, lam=lam.m, ++ A=a.ref, age.in=30, note=FALSE )[-1] ++ 
aYLL["Sus","F",,paste(ip)] <- yll( int=0.2, muW.f, muD.f, lam=lam.f, ++ A=a.ref, age.in=30, note=FALSE )[-1] ++ } ) +\end{Sinput} +\begin{Soutput} + user system elapsed + 18.571 8.688 16.366 +\end{Soutput} +\begin{Sinput} +> round( ftable( aYLL[,,seq(1,61,10),], col.vars=c(3,2) ), 1 ) +\end{Sinput} +\begin{Soutput} + age 30 40 50 60 70 80 90 + sex M F M F M F M F M F M F M F +type date +Imm 1996 11.7 10.8 9.8 9.7 7.8 8.2 5.6 6.3 3.4 3.9 1.5 1.6 0.0 0.0 + 1997 11.5 10.6 9.7 9.4 7.7 7.9 5.6 6.1 3.4 3.9 1.4 1.6 0.0 0.0 + 1998 11.3 10.3 9.6 9.2 7.6 7.7 5.5 5.9 3.4 3.8 1.4 1.5 0.0 0.0 + 1999 11.1 10.0 9.4 9.0 7.5 7.5 5.4 5.7 3.3 3.7 1.4 1.5 0.0 0.0 + 2000 10.9 9.8 9.3 8.7 7.4 7.3 5.4 5.6 3.3 3.6 1.4 1.5 0.0 0.0 + 2001 10.7 9.5 9.1 8.5 7.3 7.1 5.3 5.4 3.3 3.4 1.4 1.4 0.0 0.0 + 2002 10.5 9.2 9.0 8.3 7.1 6.9 5.2 5.2 3.2 3.3 1.3 1.4 0.0 0.0 + 2003 10.3 9.0 8.8 8.1 7.0 6.7 5.1 5.1 3.1 3.2 1.3 1.3 0.0 0.0 + 2004 10.0 8.8 8.6 7.8 6.8 6.5 5.0 4.9 3.1 3.1 1.3 1.3 0.0 0.0 + 2005 9.7 8.5 8.4 7.6 6.6 6.3 4.8 4.8 3.0 3.0 1.2 1.3 0.0 0.0 + 2006 9.4 8.3 8.1 7.5 6.5 6.2 4.7 4.6 2.9 2.9 1.2 1.2 0.0 0.0 + 2007 9.1 8.1 7.9 7.3 6.3 6.0 4.6 4.5 2.8 2.8 1.1 1.2 0.0 0.0 + 2008 8.9 7.9 7.7 7.1 6.1 5.9 4.4 4.3 2.7 2.7 1.1 1.1 0.0 0.0 + 2009 8.6 7.7 7.5 6.9 6.0 5.7 4.3 4.2 2.7 2.6 1.1 1.1 0.0 0.0 + 2010 8.4 7.5 7.3 6.8 5.9 5.6 4.2 4.1 2.6 2.5 1.1 1.1 0.0 0.0 + 2011 8.3 7.3 7.2 6.7 5.8 5.5 4.2 4.0 2.6 2.5 1.0 1.0 0.0 0.0 + 2012 8.1 7.2 7.1 6.5 5.7 5.4 4.1 4.0 2.6 2.4 1.0 1.0 0.0 0.0 + 2013 8.0 7.1 7.0 6.4 5.6 5.3 4.1 3.9 2.5 2.4 1.0 1.0 0.0 0.0 + 2014 7.8 6.9 6.9 6.3 5.6 5.3 4.1 3.8 2.5 2.3 1.0 0.9 0.0 0.0 + 2015 7.7 6.8 6.8 6.2 5.5 5.2 4.0 3.8 2.5 2.2 1.0 0.9 0.0 0.0 + 2016 7.6 6.7 6.7 6.1 5.5 5.1 4.0 3.7 2.5 2.2 1.0 0.9 0.0 0.0 +Tot 1996 11.1 10.4 9.3 9.2 7.3 7.7 5.2 5.9 3.1 3.7 1.3 1.5 0.0 0.0 + 1997 10.9 10.1 9.1 9.0 7.2 7.5 5.1 5.7 3.1 3.6 1.3 1.5 0.0 0.0 + 1998 10.7 9.8 9.0 8.7 7.0 7.3 5.0 5.5 3.1 3.5 1.3 1.4 0.0 0.0 + 1999 10.5 9.6 8.8 8.5 6.9 7.1 5.0 5.4 3.0 3.4 1.3 1.4 0.0 0.0 + 2000 
10.3 9.3 8.6 8.3 6.8 6.9 4.9 5.2 3.0 3.3 1.3 1.4 0.0 0.0 + 2001 10.0 9.0 8.5 8.0 6.6 6.6 4.8 5.0 2.9 3.2 1.2 1.3 0.0 0.0 + 2002 9.8 8.8 8.3 7.8 6.5 6.4 4.7 4.8 2.9 3.1 1.2 1.3 0.0 0.0 + 2003 9.5 8.5 8.1 7.6 6.3 6.2 4.5 4.7 2.8 2.9 1.2 1.2 0.0 0.0 + 2004 9.3 8.2 7.8 7.3 6.1 6.0 4.4 4.5 2.7 2.8 1.1 1.2 0.0 0.0 + 2005 9.0 8.0 7.6 7.1 5.9 5.9 4.3 4.3 2.6 2.7 1.1 1.1 0.0 0.0 + 2006 8.7 7.8 7.4 6.9 5.8 5.7 4.1 4.2 2.5 2.6 1.0 1.1 0.0 0.0 + 2007 8.4 7.5 7.1 6.7 5.6 5.5 4.0 4.1 2.4 2.5 1.0 1.1 0.0 0.0 + 2008 8.1 7.3 6.9 6.6 5.4 5.4 3.8 3.9 2.3 2.4 1.0 1.0 0.0 0.0 + 2009 7.8 7.1 6.7 6.4 5.2 5.2 3.7 3.8 2.3 2.3 0.9 1.0 0.0 0.0 + 2010 7.6 7.0 6.5 6.3 5.1 5.1 3.6 3.7 2.2 2.2 0.9 0.9 0.0 0.0 + 2011 7.4 6.8 6.4 6.1 5.0 5.0 3.5 3.6 2.2 2.2 0.9 0.9 0.0 0.0 + 2012 7.3 6.6 6.3 6.0 4.9 4.9 3.5 3.5 2.1 2.1 0.9 0.9 0.0 0.0 + 2013 7.1 6.5 6.1 5.9 4.8 4.8 3.4 3.4 2.1 2.0 0.9 0.8 0.0 0.0 + 2014 7.0 6.3 6.0 5.8 4.8 4.7 3.4 3.4 2.0 2.0 0.8 0.8 0.0 0.0 + 2015 6.8 6.2 5.9 5.6 4.7 4.6 3.3 3.3 2.0 1.9 0.8 0.8 0.0 0.0 + 2016 6.7 6.1 5.8 5.5 4.6 4.6 3.3 3.2 2.0 1.9 0.8 0.7 0.0 0.0 +Sus 1996 10.7 10.1 8.9 9.0 7.1 7.6 5.2 5.9 3.2 3.8 1.4 1.6 0.0 0.0 + 1997 10.5 9.8 8.8 8.8 7.0 7.4 5.1 5.7 3.2 3.7 1.4 1.5 0.0 0.0 + 1998 10.3 9.6 8.7 8.5 6.9 7.2 5.1 5.5 3.2 3.6 1.4 1.5 0.0 0.0 + 1999 10.2 9.3 8.5 8.3 6.8 7.0 5.0 5.4 3.1 3.5 1.4 1.5 0.0 0.0 + 2000 10.0 9.0 8.4 8.1 6.7 6.8 4.9 5.2 3.1 3.4 1.4 1.4 0.0 0.0 + 2001 9.7 8.8 8.2 7.8 6.5 6.6 4.8 5.0 3.1 3.3 1.3 1.4 0.0 0.0 + 2002 9.5 8.5 8.1 7.6 6.4 6.4 4.7 4.9 3.0 3.2 1.3 1.4 0.0 0.0 + 2003 9.2 8.3 7.8 7.4 6.2 6.2 4.6 4.7 2.9 3.1 1.3 1.3 0.0 0.0 + 2004 8.9 8.0 7.6 7.2 6.0 5.9 4.5 4.5 2.9 3.0 1.2 1.3 0.0 0.0 + 2005 8.6 7.7 7.3 6.9 5.8 5.8 4.3 4.4 2.8 2.8 1.2 1.2 0.0 0.0 + 2006 8.3 7.5 7.1 6.7 5.6 5.6 4.2 4.2 2.7 2.7 1.1 1.2 0.0 0.0 + 2007 8.0 7.2 6.8 6.5 5.4 5.4 4.0 4.1 2.6 2.6 1.1 1.1 0.0 0.0 + 2008 7.7 7.0 6.6 6.3 5.2 5.2 3.9 3.9 2.5 2.5 1.1 1.1 0.0 0.0 + 2009 7.4 6.8 6.4 6.2 5.1 5.1 3.8 3.8 2.4 2.4 1.0 1.1 0.0 0.0 + 2010 7.3 6.6 6.2 6.0 5.0 5.0 3.7 3.7 2.4 
2.4 1.0 1.0 0.0 0.0 + 2011 7.1 6.5 6.2 5.9 4.9 4.9 3.7 3.7 2.4 2.3 1.0 1.0 0.0 0.0 + 2012 7.0 6.4 6.1 5.8 4.9 4.9 3.6 3.6 2.3 2.3 1.0 1.0 0.0 0.0 + 2013 7.0 6.3 6.0 5.8 4.9 4.8 3.6 3.6 2.3 2.2 1.0 0.9 0.0 0.0 + 2014 6.9 6.3 6.0 5.7 4.9 4.8 3.6 3.5 2.3 2.2 1.0 0.9 0.0 0.0 + 2015 6.8 6.2 6.0 5.7 4.9 4.8 3.6 3.5 2.3 2.1 1.0 0.9 0.0 0.0 + 2016 6.8 6.1 6.0 5.6 4.9 4.7 3.6 3.5 2.3 2.1 1.0 0.8 0.0 0.0 +\end{Soutput} +\end{Schunk} +We now have the relevant points for the graph showing YLL to diabetes +for men and women by age, and calendar year, both under the immunity +and susceptibility models for the calculation of YLL. +\begin{Schunk} +\begin{Sinput} +> plyll <- function(wh){ ++ par( mfrow=c(1,2), mar=c(3,3,1,1), mgp=c(3,1,0)/1.6, bty="n", las=1 ) ++ ++ matplot( a.ref, aYLL[wh,"M",,], ++ type="l", lty=1, col="blue", lwd=1:2, ++ ylim=c(0,12), xlab="Age", ++ ylab="Years lost to DM", yaxs="i" ) ++ abline(v=50,h=1:10,col=gray(0.7)) ++ text( 90, 11, "Men", col="blue", adj=1 ) ++ text( 40, aYLL[wh,"M","40","1996"], "1996", adj=c(0,0), col="blue" ) ++ text( 43, aYLL[wh,"M","44","2016"], "2016", adj=c(1,1), col="blue" ) ++ ++ matplot( a.ref, aYLL[wh,"F",,], ++ type="l", lty=1, col="red", lwd=1:2, ++ ylim=c(0,12), xlab="Age", ++ ylab="Years lost to DM", yaxs="i" ) ++ abline(v=50,h=1:10,col=gray(0.7)) ++ text( 90, 11, "Women", col="red", adj=1 ) ++ text( 40, aYLL[wh,"F","40","1996"], "1996", adj=c(0,0), col="red" ) ++ text( 43, aYLL[wh,"F","44","2016"], "2016", adj=c(1,1), col="red" ) ++ } +> plyll("Imm") +\end{Sinput} +\end{Schunk} +\begin{Schunk} +\begin{Sinput} +> plyll("Tot") +\end{Sinput} +\end{Schunk} +\begin{Schunk} +\begin{Sinput} +> plyll("Sus") +\end{Sinput} +\end{Schunk} +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{yll-imm} + \caption{Years of life lost to DM: the difference in expected + residual life time at different ages between persons with and + without diabetes, assuming the persons without diabetes at a given + age remain free from 
diabetes (immunity assumption --- not
+ reasonable). The lines refer to date of evaluation; the top lines
+ refer to 1.1.1996, the bottom ones to 1.1.2016. Blue curves are
+ men, red women.}
+ \label{fig:imm}
+\end{figure}
+
+\begin{figure}[h]
+ \centering
+ \includegraphics[width=\textwidth]{yll-sus}
+ \caption{Years of life lost to DM: the difference in expected
+ residual life time at different ages between persons with and
+ without diabetes, allowing the persons without diabetes at a given
+ age to contract diabetes and thus be subject to higher mortality. The
+ lines refer to date of evaluation; the top lines refer to 1.1.1996,
+ the bottom ones to 1.1.2016. Blue curves are men, red women.}
+ \label{fig:sus}
+\end{figure}
+
+\begin{figure}[h]
+ \centering
+ \includegraphics[width=\textwidth]{yll-tot}
+ \caption{Years of life lost to DM: the difference in expected
+ residual life time at different ages between persons with and
+ without diabetes. Allowance for susceptibility is approximated by
+ using the total population mortality instead of non-DM
+ mortality. The lines refer to date of evaluation; the top lines
+ refer to 1.1.1996, the bottom ones to 1.1.2016. Blue curves are
+ men, red women.}
+ \label{fig:tot}
+\end{figure}
+
+From figure \ref{fig:sus} we see that for men aged 50 the years lost to
+diabetes have decreased from a bit over 8 to a bit less than 6 years,
+and for women from 8.5 to 5 years; so a greater improvement for women.
+
+\chapter{Practical implementation}
+
+We have devised functions that wrap these formulae up for practical
+use.
+
+\section{Function definitions}
+
+% The following code sources the originally formatted code in erl.R in
+% order to make it printable with all the formatting and comments in
+% it. However the sourcing does not work when compiling the vignettes
+% in R CMD build. So when checking the code we comment this out by
+% putting eval=FALSE. 
So depending on whether we actually construct +% the pdf for the inst/doc folder or test (upload) the package, one of +% the following two chunks are run with eval=FALSE and the other with +% eval=TRUE. +% When checking the package +When using the functions it is assumed that the functions $\mu_W$, +$\mu_D$ and $\lambda$ are given as vectors corresponding to +equidistantly (usually tightly) spaced ages from 0 to $K$ where K is +the age where everyone can safely be assumed dead. + +\texttt{surv1} is a simple function that computes the survival +function from a vector of mortality rates, and optionally the +conditional survival given being alive at prespecified ages: +\begin{Schunk} +\begin{Sinput} +> surv1 +\end{Sinput} +\begin{Soutput} +function (int, mu, age.in = 0, A = NULL) +{ + if (any(is.na(mu))) { + mu[is.na(mu)] <- 0 + warning("NAs in agument 'mu' set to 0") + } + age <- 0:length(mu) * int + age.in + Mu <- c(0, cumsum(mu) * int) + Sv <- exp(-Mu) + surv <- data.frame(age = age, surv = Sv) + if (cond <- !is.null(A)) { + j <- 0 + cage <- NULL + for (ia in A) { + j <- j + 1 + cA <- which(diff(age > ia) == 1) + surv <- cbind(surv, pmin(1, surv$surv/(surv$surv[cA]))) + cage[j] <- surv$age[cA] + } + } + names(surv)[-1] <- paste("A", c(age.in, if (cond) cage else NULL), + sep = "") + rownames(surv) <- NULL + return(surv) +} + + +\end{Soutput} +\end{Schunk} +\texttt{erl1} basically just expands the result of \texttt{surv1} with +a column of expected residual life times: +\begin{Schunk} +\begin{Sinput} +> erl1 +\end{Sinput} +\begin{Soutput} +function (int, mu, age.in = 0) +{ + age <- 0:length(mu) * int + age.in + musmuc <- function(x) rev(cumsum(rev(x))) + surv <- surv1(int = int, mu = mu, age.in = age.in)[, 2] + cbind(age = age, surv = surv, erl = c(musmuc((surv[-1] - + diff(surv)/2))/surv[-length(surv)], 0) * int) +} + +\end{Soutput} +\end{Schunk} +We also define a function, \texttt{surv2}, that computes the survival +function for a non-diseased person that may 
become diseased with rate +\texttt{lam} and after that die at a rate of \texttt{muD} +(corresponding to the formulae above). This is the sane way of +handling years of life lost to a particular illness: +\begin{Schunk} +\begin{Sinput} +> surv2 +\end{Sinput} +\begin{Soutput} +function (int, muW, muD, lam, age.in = 0, A = NULL) +{ + if (length(muW) != length(muD) | length(muD) != length(lam)) + stop("Vectors with rates must have same length:\n", "length(muW)=", + length(muW), ", length(muD)=", length(muD), ", length(lam)=", + length(lam)) + if (!is.null(lam)) { + if (any(is.na(lam))) { + lam[is.na(lam)] <- 0 + warning("NAs in agument 'lam' set to 0") + } + } + if (any(is.na(muD))) { + muD[is.na(muD)] <- 0 + warning("NAs in agument 'muD' set to 0") + } + if (any(is.na(muW))) { + muW[is.na(muW)] <- 0 + warning("NAs in agument 'muW' set to 0") + } + wsurv2 <- function(int, muW, muD, lam, age.in = 0, A = 0) { + age <- 0:length(muW) * int + age.in + MuW <- cumsum(c(0, muW) * (age > A)) * int + MuD <- cumsum(c(0, muD) * (age > A)) * int + Lam <- cumsum(c(0, lam) * (age > A)) * int + pW <- exp(-(Lam + MuW)) + Dis <- c(0, lam) * (age > A) * exp(-(Lam + MuW)) * int + pDM <- Dis * 0 + for (ia in 1:length(age)) pDM[ia] <- sum(Dis[1:ia] * + exp(-(MuD[ia] - MuD[1:ia]))) + surv <- data.frame(age = age, surv = pDM + pW) + return(surv) + } + surv <- wsurv2(int, muW, muD, lam, age.in = age.in, A = 0) + if (!is.null(A)) { + for (j in 1:length(A)) { + surv <- cbind(surv, wsurv2(int, muW, muD, lam, age.in = age.in, + A = A[j])[, 2]) + } + } + Al <- A + for (i in 1:length(A)) Al[i] <- max(surv$age[surv$age <= + A[i]]) + colnames(surv)[-1] <- paste("A", c(age.in, Al), sep = "") + return(surv) +} + + +\end{Soutput} +\end{Schunk} +Finally we devised a function using these to compute the expected +residual lifetime at select ages: +\begin{Schunk} +\begin{Sinput} +> erl +\end{Sinput} +\begin{Soutput} +function (int, muW, muD, lam = NULL, age.in = 0, A = NULL, immune = is.null(lam), + yll = 
TRUE, note = TRUE) +{ + trsum <- function(x) { + x[c(diff(x) == 0, TRUE)] <- NA + sum((x[-length(x)] + x[-1])/2, na.rm = TRUE) + } + if (!immune & is.null(lam)) + stop("'lam' is required when immune=FALSE\n") + if (!is.null(lam)) { + if (any(is.na(lam))) { + lam[is.na(lam)] <- 0 + warning("NAs in agument 'lam' set to 0") + } + } + if (any(is.na(muD))) { + muD[is.na(muD)] <- 0 + warning("NAs in agument 'muD' set to 0") + } + if (any(is.na(muW))) { + muW[is.na(muW)] <- 0 + warning("NAs in agument 'muW' set to 0") + } + sD <- surv1(int = int, muD, age.in = age.in, A = A) + if (immune) + sW <- surv1(int = int, muW, age.in = age.in, A = A) + else sW <- surv2(int = int, muW, muD, lam, age.in = age.in, + A = A) + erl <- cbind(apply(sW[, -1], 2, trsum), apply(sD[, -1], 2, + trsum)) * int + colnames(erl) <- c("Well", "Dis") + rownames(erl) <- colnames(sW)[-1] + if (yll) + erl <- cbind(erl, YLL = erl[, "Well"] - erl[, "Dis"]) + if (immune) { + attr(erl, "NOTE") <- "Calculations assume that Well persons cannot get Ill (quite silly!)." + if (note) + cat("NOTE:", attr(erl, "NOTE"), "\n") + } + return(erl) +} + + +\end{Soutput} +\end{Schunk} +\ldots and a wrapper for this if we only want the years of life lost +returned: +\begin{Schunk} +\begin{Sinput} +> yll +\end{Sinput} +\begin{Soutput} +function (int, muW, muD, lam = NULL, age.in = 0, A = NULL, immune = is.null(lam), + note = TRUE) +erl(int = int, muW = muW, muD = muD, lam = lam, age.in = age.in, + A = A, immune = immune, yll = TRUE, note = note)[, "YLL"] + +\end{Soutput} +\end{Schunk} + +\bibliographystyle{plain} + +\begin{thebibliography}{1} + +\bibitem{Carstensen.2007a} +B~Carstensen. +\newblock Age-{P}eriod-{C}ohort models for the {L}exis diagram. +\newblock {\em Statistics in Medicine}, 26(15):3018--3045, 2007. + +\bibitem{Carstensen.2008c} +B~Carstensen, JK~Kristensen, P~Ottosen, and K~Borch-Johnsen. +\newblock The {D}anish {N}ational {D}iabetes {R}egister: {T}rends in incidence, + prevalence and mortality. 
+\newblock {\em Diabetologia}, 51:2187--2196, 2008. + +\end{thebibliography} + +\addcontentsline{toc}{chapter}{References} + +\end{document} Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/yll-tot.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/yll-tot.pdf differ Binary files /tmp/tmp1TClAk/RJn9eX0x2u/r-cran-epi-2.19/vignettes/yl.pdf and /tmp/tmp1TClAk/HK_qO2S7tg/r-cran-epi-2.30/vignettes/yl.pdf differ