diff -Nru mgcv-1.8-26/ChangeLog mgcv-1.8-27/ChangeLog --- mgcv-1.8-26/ChangeLog 2018-11-20 20:28:17.000000000 +0000 +++ mgcv-1.8-27/ChangeLog 2019-02-02 18:46:31.000000000 +0000 @@ -17,11 +17,44 @@ * OpenBLAS 0.3.2/3 appears not to be thread safe at present - can cause problems when openmp multithreading. +1.8-27 + +** Added routine 'ginla' for fully Bayesian inference based on an integrated + nested Laplace approximation (INLA) approach. See ?ginla. + +* Tweedie location scale family added: 'twlss'. + +* gam.fit5 modified to distinguish more carefully between +ve semi definite + and +ve definite. Previously could fail, claiming indefiniteness when it + should not have. Affects general families. + +* bam was ignoring supplied scale parameter in extended family cases - fixed. + +* work around in list formula handling for reformulate sometimes setting + response to a name in place of a call. + +* preinitialize in general families is now a function, not an expression. See + cox.ph for an example. + +* added routine cholup for rank one modification of Cholesky factor. + +* two stage gam/bam fitting now allows 'sp' to be modified. + +* predict.gam could fail with type="response" for families requiring the + response to be provided in this case (e.g. cox.ph). Fixed. + +* sp.vcov defaults to extracting edge corrected log sp cov matrix, if + gam(...,gam.control(edge.correct=TRUE)) used for fitting. + +* gam(...,gam.control(edge.correct=TRUE)) could go into infinite loop if + sp was effectively zero. Corrected. + + 1.8-26 * LINPACK dependency removed. -* Added service routine chol.down to down date a Cholesky factor on row/col +* Added service routine choldrop to down date a Cholesky factor on row/col deletion. * liu2 had a check messed up when vectorized. Fix to stop vector being diff -Nru mgcv-1.8-26/debian/changelog mgcv-1.8-27/debian/changelog --- mgcv-1.8-26/debian/changelog 2018-11-21 14:00:26.000000000 +0000 +++ mgcv-1.8-27/debian/changelog 2019-02-09 14:47:04.000000000 +0000 @@ -1,3 +1,12 @@ +mgcv (1.8-27-1) unstable; urgency=medium + + * New upstream release + + * debian/control: Set Build-Depends: to current R version + * debian/control: Set Standards-Version: to current version + + -- Dirk Eddelbuettel Sat, 09 Feb 2019 08:47:04 -0600 + mgcv (1.8-26-1) unstable; urgency=medium * New upstream release diff -Nru mgcv-1.8-26/debian/control mgcv-1.8-27/debian/control --- mgcv-1.8-26/debian/control 2018-11-02 22:24:06.000000000 +0000 +++ mgcv-1.8-27/debian/control 2019-02-09 14:46:49.000000000 +0000 @@ -2,8 +2,8 @@ Section: gnu-r Priority: optional Maintainer: Dirk Eddelbuettel -Build-Depends: debhelper (>= 7.0.0), r-base-dev (>= 3.5.1), r-cran-nlme, r-cran-matrix, dh-r -Standards-Version: 4.2.1 +Build-Depends: debhelper (>= 7.0.0), r-base-dev (>= 3.5.2), r-cran-nlme, r-cran-matrix, dh-r +Standards-Version: 4.3.0 Vcs-Browser: https://salsa.debian.org/edd/r-cran-mgcv Vcs-Git: https://salsa.debian.org/edd/r-cran-mgcv.git Homepage: https://cran.r-project.org/package=mgcv diff -Nru mgcv-1.8-26/DESCRIPTION mgcv-1.8-27/DESCRIPTION --- mgcv-1.8-26/DESCRIPTION 2018-11-21 11:50:02.000000000 +0000 +++ mgcv-1.8-27/DESCRIPTION 2019-02-06 15:00:03.000000000 +0000 @@ -1,5 +1,5 @@ Package: mgcv -Version: 1.8-26 +Version: 1.8-27 Author: Simon Wood Maintainer: Simon Wood Title: Mixed GAM Computation Vehicle with Automatic Smoothness @@ -7,17 +7,19 @@ Description: Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by
(Restricted) Marginal Likelihood, - Generalized Cross Validation and similar. Includes a gam() - function, a wide variety of smoothers, JAGS support and - distributions beyond the exponential family. + Generalized Cross Validation and similar, or using iterated + nested Laplace approximation for fully Bayesian inference. See + Wood (2017) for an overview. + Includes a gam() function, a wide variety of smoothers, 'JAGS' + support and distributions beyond the exponential family. Priority: recommended Depends: R (>= 2.14.0), nlme (>= 3.1-64) -Imports: methods, stats, graphics, Matrix -Suggests: splines, parallel, survival, MASS +Imports: methods, stats, graphics, Matrix, splines, utils +Suggests: parallel, survival, MASS LazyLoad: yes ByteCompile: yes License: GPL (>= 2) NeedsCompilation: yes -Packaged: 2018-11-20 20:40:48 UTC; sw283 +Packaged: 2019-02-06 10:57:24 UTC; sw283 Repository: CRAN -Date/Publication: 2018-11-21 11:50:02 UTC +Date/Publication: 2019-02-06 15:00:03 UTC diff -Nru mgcv-1.8-26/man/bam.Rd mgcv-1.8-27/man/bam.Rd --- mgcv-1.8-26/man/bam.Rd 2018-11-17 10:56:00.000000000 +0000 +++ mgcv-1.8-27/man/bam.Rd 2018-12-11 20:35:40.000000000 +0000 @@ -151,7 +151,7 @@ involving factor variables you might want to turn this off. Only do so if you know what you are doing.} \item{G}{if not \code{NULL} then this should be the object returned by a previous call to \code{bam} with -\code{fit=FALSE}. Causes all other arguments to be ignored except \code{chunk.size}, \code{gamma},\code{nthreads}, \code{cluster}, \code{rho}, \code{gc.level}, \code{samfrac}, \code{use.chol} and \code{method}.} +\code{fit=FALSE}. Causes all other arguments to be ignored except \code{sp}, \code{chunk.size}, \code{gamma},\code{nthreads}, \code{cluster}, \code{rho}, \code{gc.level}, \code{samfrac}, \code{use.chol}, \code{method} and \code{scale} (if >0).} \item{fit}{if \code{FALSE} then the model is set up for fitting but not estimated, and an object is returned, suitable for passing as the \code{G} argument to \code{bam}.} diff -Nru mgcv-1.8-26/man/chol.down.Rd mgcv-1.8-27/man/chol.down.Rd --- mgcv-1.8-26/man/chol.down.Rd 2018-11-20 20:40:37.000000000 +0000 +++ mgcv-1.8-27/man/chol.down.Rd 2018-12-04 12:29:15.000000000 +0000 @@ -1,25 +1,33 @@ \name{choldrop} \alias{choldrop} +\alias{cholup} %- Also NEED an `\alias' for EACH other topic documented here. -\title{Down date a Cholesky factor on dropping a row/col} -\description{Given a Cholesky factor, \code{R}, of a matrix, \code{A}, finds the Cholesky factor of \code{A[-k,-k]}, -where \code{k} is an integer. +\title{Deletion and rank one Cholesky factor update} +\description{Given a Cholesky factor, \code{R}, of a matrix, \code{A}, \code{choldrop} finds the Cholesky factor of \code{A[-k,-k]}, +where \code{k} is an integer. \code{cholup} finds the factor of \eqn{A + uu^T}{A+uu'} (update) or \eqn{A - uu^T}{A-uu'} (downdate). } \usage{ choldrop(R,k) +cholup(R,u,up) } %- maybe also `usage' for other objects documented here. \arguments{ \item{R}{Cholesky factor of a matrix, \code{A}.} \item{k}{row and column of \code{A} to drop.} +\item{u}{vector defining rank one update.} +\item{up}{if \code{TRUE} compute update, otherwise downdate.} } -\details{If \code{R} is upper triangular then \code{t(R[,-k])\%*\%R[,-k] == A[-k,-k]}, but \code{R[,-k]} has elements on the first sub-diagonal, from its kth column onwards. To get from this to a triangular Cholesky factor of \code{A[-k,-k]} we can apply a sequence of Given rotations from the left to eliminate the sub-diagonal elements. 
The routine does this. If \code{R} is a lower triangular factor then Givens rotations from the right are needed to remove the extra elements. If \code{n} is the dimension of \code{R} then the update has O(n^2) computational cost. +\details{First consider \code{choldrop}. If \code{R} is upper triangular then \code{t(R[,-k])\%*\%R[,-k] == A[-k,-k]}, but \code{R[,-k]} has elements on the first sub-diagonal, from its kth column onwards. To get from this to a triangular Cholesky factor of \code{A[-k,-k]} we can apply a sequence of Givens rotations from the left to eliminate the sub-diagonal elements. The routine does this. If \code{R} is a lower triangular factor then Givens rotations from the right are needed to remove the extra elements. If \code{n} is the dimension of \code{R} then the update has \eqn{O(n^2)}{O(n^2)} computational cost. -Note that the update is vector oriented, and is hence not susceptible to speed up by use of an optimized BLAS. The update is set up to be relatively Cache friendly, in that in the upper triangular case successive Givens rotations are stored for sequential application columnwise, rather than being applied rowwise as soon as they are computed. Even so, the upper triangular update is slightly slower than the lower triangular update. +\code{cholup} (which assumes \code{R} is upper triangular) updates based on the observation that \eqn{ R^TR + uu^T = [u,R^T][u,R^T]^T = [u,R^T]Q^TQ[u,R^T]^T}{R'R + uu' = [u,R'][u,R']' = [u,R']Q'Q[u,R']'}, and therefore we can construct \eqn{Q}{Q} so that \eqn{Q[u,R^T]^T=[0,R_1^T]^T}{Q[u,R']'=[0,R1']'}, where \eqn{R_1}{R1} is the modified factor. \eqn{Q}{Q} is constructed from a sequence of Givens rotations in order to zero the elements of \eqn{u}{u}. Downdating is similar except that hyperbolic rotations have to be used in place of Givens rotations --- see Golub and Van Loan (2013, section 6.5.4) for details. Downdating only works if \eqn{A - uu^T}{A-uu'} is positive definite. Again the computational cost is \eqn{O(n^2)}{O(n^2)}. +Note that the updates are vector oriented, and are hence not susceptible to speed up by use of an optimized BLAS. The updates are set up to be relatively cache friendly, in that in the upper triangular case successive Givens rotations are stored for sequential application column-wise, rather than being applied row-wise as soon as they are computed. Even so, the upper triangular update is slightly slower than the lower triangular update. +} + +\references{ +Golub, G.H. and C.F. Van Loan (2013) Matrix Computations (4th edition). Johns Hopkins. } - \author{ Simon N. Wood \email{simon.wood@r-project.org}} @@ -39,6 +47,19 @@ Ld <- choldrop(L,k) range(Ld-t(chol(A[-k,-k]))) Ld;t(chol(A[-k,-k])) + + ## Rank one update example + u <- runif(n) + R <- cholup(R0,u,TRUE) + Ru <- chol(A+u \%*\% t(u)) ## direct for comparison + R;Ru + range(R-Ru) + + ## Downdate - just going back from R to R0 + Rd <- cholup(R,u,FALSE) + R0;Rd + range(Rd-R0) + } \keyword{models} \keyword{smooth} \keyword{regression}%-- one or more .. diff -Nru mgcv-1.8-26/man/coxph.Rd mgcv-1.8-27/man/coxph.Rd --- mgcv-1.8-26/man/coxph.Rd 2018-05-30 16:18:39.000000000 +0000 +++ mgcv-1.8-27/man/coxph.Rd 2018-12-04 07:05:22.000000000 +0000 @@ -32,8 +32,8 @@ Stratification is possible, allowing for different baseline hazards in different strata. In that case the response has two columns: the first is event/censoring time and the second is a numeric stratum index. See below for an example.
Prediction from the fitted model object (using the \code{predict} method) with \code{type="response"} will predict on the -survivor function scale. See example code below for extracting the cumulative baseline hazard/survival directly. -Martingale or deviance +survivor function scale. This requires evaluation times to be provided as well as covariates (see example). Also see example code +below for extracting the cumulative baseline hazard/survival directly. Martingale or deviance residuals can be extracted. The \code{fitted.values} stored in the model object are survival function estimates for each subject at their event/censoring time. diff -Nru mgcv-1.8-26/man/coxpht.Rd mgcv-1.8-27/man/coxpht.Rd --- mgcv-1.8-26/man/coxpht.Rd 2018-09-10 06:09:18.000000000 +0000 +++ mgcv-1.8-27/man/coxpht.Rd 2019-02-05 10:50:22.000000000 +0000 @@ -51,13 +51,13 @@ ## Now do the interpolation of covariates to event times... um <- data.frame(lapply(X=di,FUN=app,t=di[[t]],to=tr)) ## Mark the actual event... - if (um[[et]][1]==max(tr)&&um[[status]]==1) um[[event]][nrow(um)] <- 1 + if (um[[et]][1]==max(tr)&&um[[status]][1]==1) um[[event]][nrow(um)] <- 1 um[[et]] <- tr ## reset time to relevant event times dap[start:(start-1+nrow(um)),] <- um ## copy to dap start <- start + nrow(um) if (inter) setTxtProgressBar(prg, i) } - close(prg) + if (inter) close(prg) dap[1:(start-1),] } ## tdpois diff -Nru mgcv-1.8-26/man/FFdes.Rd mgcv-1.8-27/man/FFdes.Rd --- mgcv-1.8-26/man/FFdes.Rd 1970-01-01 00:00:00.000000000 +0000 +++ mgcv-1.8-27/man/FFdes.Rd 2018-12-08 16:29:19.000000000 +0000 @@ -0,0 +1,45 @@ +\name{FFdes} +\alias{FFdes} +%- Also NEED an `\alias' for EACH other topic documented here. +\title{Level 5 fractional factorial designs} +\description{Computes level 5 fractional factorial designs for up to 120 factors +using the algorithm of Sanchez and Sanchez (2005), and optionally central composite designs. +} +\usage{ +FFdes(size=5,ccd=FALSE) +} +%- maybe also `usage' for other objects documented here. +\arguments{ + \item{size}{number of factors up to 120.} +\item{ccd}{if \code{TRUE}, adds points along each axis at the same distance from the origin as the points in the +fractional factorial design, to create the outer points of a central composite design. Add central points to complete.} +} + +\details{Basically a translation of the code provided in the appendix of Sanchez and Sanchez (2005). +} + +\references{ +Sanchez, S. M. & Sanchez, P. J. (2005) Very large fractional factorial and central composite designs. +ACM Transactions on Modeling and Computer Simulation. 15: 362-377 +} + +\author{ Simon N. Wood \email{simon.wood@r-project.org}} + +\examples{ + require(mgcv) + plot(rbind(0,FFdes(2,TRUE)),xlab="x",ylab="y", + col=c(2,1,1,1,1,4,4,4,4),pch=19,main="CCD") + FFdes(5) + FFdes(5,TRUE) +} + +\keyword{models} \keyword{smooth} \keyword{regression}%-- one or more ..
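[Editorial sketch, not part of the patch: a minimal R illustration of completing the central composite design that the FFdes.Rd text above describes, by appending the centre point as the \code{ccd} argument documentation suggests; the size value and object names here are arbitrary examples.]
  require(mgcv)              ## assumes mgcv >= 1.8-27, which provides FFdes
  k <- 3                     ## example number of factors
  D <- FFdes(k, ccd=TRUE)    ## fractional factorial plus axial (star) points
  D <- rbind(D, rep(0, k))   ## add the centre point to complete the CCD
  nrow(D)                    ## rows = FF points + 2*k axial points + 1 centre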
+ + + + + + + + + diff -Nru mgcv-1.8-26/man/gam.Rd mgcv-1.8-27/man/gam.Rd --- mgcv-1.8-26/man/gam.Rd 2018-09-21 09:16:36.000000000 +0000 +++ mgcv-1.8-27/man/gam.Rd 2018-11-21 09:29:41.000000000 +0000 @@ -137,7 +137,7 @@ \item{G}{Usually \code{NULL}, but may contain the object returned by a previous call to \code{gam} with \code{fit=FALSE}, in which case all other arguments are ignored except for -\code{gamma}, \code{in.out}, \code{scale}, \code{control}, \code{method} \code{optimizer} and \code{fit}.} +\code{sp}, \code{gamma}, \code{in.out}, \code{scale}, \code{control}, \code{method} \code{optimizer} and \code{fit}.} \item{in.out}{optional list for initializing outer iteration. If supplied then this must contain two elements: \code{sp} should be an array of diff -Nru mgcv-1.8-26/man/ginla.Rd mgcv-1.8-27/man/ginla.Rd --- mgcv-1.8-26/man/ginla.Rd 1970-01-01 00:00:00.000000000 +0000 +++ mgcv-1.8-27/man/ginla.Rd 2018-12-15 18:53:23.000000000 +0000 @@ -0,0 +1,140 @@ +\name{ginla} +\alias{ginla} +%- Also NEED an `\alias' for EACH other topic documented here. +\title{GAM Integrated Nested Laplace Approximation Newton Enhanced} +\description{Apply Integrated Nested Laplace Approximation (INLA, Rue et al. 2009) to models estimable by \code{\link{gam}} or \code{\link{bam}}, using the INLA variant described in Wood (2018). Produces marginal posterior densities for each coefficient, selected coefficients or linear transformations of the coefficient vector. +} +\usage{ +ginla(G,A=NULL,nk=16,nb=100,J=1,interactive=FALSE,int=0,approx=0) +} +%- maybe also `usage' for other objects documented here. +\arguments{ + \item{G}{A pre-fit gam object, as produced by \code{gam(...,fit=FALSE)} or \code{bam(...,discrete=TRUE,fit=FALSE)}.} +\item{A}{Either a matrix of transforms of the coefficients that are of interest, or an array of indices of the parameters of interest. If \code{NULL} then distributions are produced for all coefficients.} +\item{nk}{Number of values of each coefficient at which to evaluate its log marginal posterior density. These points are then spline interpolated.} +\item{nb}{Number of points at which to evaluate posterior density of coefficients for returning as a gridded function.} +\item{J}{How many determinant updating steps to take in the log determinant approximation step. Not recommended to increase this. } +\item{interactive}{If this is \code{>0} or \code{TRUE} then every approximate posterior is plotted in red, overlaid on the simple Gaussian approximate posterior. If \code{2} then waits for user to press return between each plot. Useful for judging whether anything is gained by using INLA approach. } +\item{int}{0 to skip integration and just use the posterior modal smoothing parameter. >0 for integration using the CCD approach proposed in Rue et al. (2009).} +\item{approx}{0 for full approximation; 1 to update Hessian, but use approximate modes; 2 as 1 and assume constant Hessian. See details.} +} + +\value{ +A list with elements \code{beta} and \code{density}, both of which are matrices. Each row relates to one coefficient (or linear coefficient combination) of interest. Both matrices have \code{nb} columns. If \code{int!=0} then a further element \code{reml} gives the integration weights used in the CCD integration, with the central point weight given first. +} + +\details{Let \eqn{\beta}{b}, \eqn{\theta}{h} and \eqn{y}{y} denote the model coefficients, hyperparameters/smoothing parameters and response data, respectively. 
In principle, INLA employs Laplace approximations for \eqn{\pi(\beta_i|\theta,y)}{p(b_i|h,y)} and \eqn{\pi(\theta|y)}{p(h|y)} and then obtains the marginal posterior distribution \eqn{\pi(\beta_i|y)}{p(b_i|y)} by integrating the approximations to \eqn{\pi(\beta_i|\theta,y)\pi(\theta|y)}{p(b_i|h,y)p(h|y)} w.r.t. \eqn{\theta}{h} (marginals for the hyperparameters can also be produced). In practice the Laplace approximation for \eqn{\pi(\beta_i|\theta,y)}{p(b_i|h,y)} is too expensive to compute for each \eqn{\beta_i}{b_i} and must itself be approximated. To this end, there are two quantities that have to be computed: the posterior mode \eqn{\beta^*|\beta_i}{b*|b_i} and the determinant of the Hessian of the joint log density \eqn{\log \pi(\beta,\theta,y)}{log p(b,h,y)} w.r.t. \eqn{\beta}{b} at the mode. Rue et al. (2009) originally approximated the posterior conditional mode by the conditional mode implied by a simple Gaussian approximation to the posterior \eqn{\pi(\beta|y)}{p(b|y)}. They then approximated the log determinant of the Hessian as a function of \eqn{\beta_i}{b_i} using a first order Taylor expansion, which is cheap to compute for the sparse model representation that they use, but not when using the dense low rank basis expansions used by \code{\link{gam}}. They also offer a more expensive alternative approximation based on computing the log determinant with respect only to those elements of \eqn{\beta}{b} with sufficiently high correlation with \eqn{\beta_i}{b_i} according to the simple Gaussian posterior approximation: efficiency again seems to rest on sparsity. Wood (2018) suggests computing the required posterior modes exactly, and basing the log determinant approximation on a BFGS update of the Hessian at the unconditional mode. The latter is efficient with or without sparsity, whereas the former is a `for free' improvement. Both steps are efficient because it is cheap to obtain the Cholesky factor of \eqn{H[-i,-i]}{H[-i,-i]} from that of \eqn{H}{H} - see \code{\link{choldrop}}. This is the approach taken by this routine. + +The \code{approx} argument allows two further approximations to speed up computations. For \code{approx==1} the exact posterior conditional modes are not used, but instead the conditional modes implied by the simple Gaussian posterior approximation. For \code{approx==2} the same approximation is used for the modes and the Hessian is assumed constant. The latter is quite fast as no log joint density gradient evaluations are required. + + +Note that for many models the INLA estimates are very close to the usual Gaussian approximation to the posterior; the \code{interactive} argument is useful for investigating this issue. + +\code{\link{bam}} models are only supported with the \code{discrete=TRUE} option. The \code{discrete=FALSE} approach would be too inefficient. AR1 models are not supported (related arguments are simply ignored). +} + +\references{ +Rue, H, Martino, S. & Chopin, N. (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society, Series B. 71: 319-392. + +Wood (2018, submitted) Simplified Integrated Laplace Approximation. +} + +\section{WARNINGS}{ +This routine is still somewhat experimental, so details are liable to change. Also currently not all steps are optimally efficient. + +The routine is written for relatively expert users. + +\code{ginla} is not designed to deal with rank deficient models. + +} + +\author{ Simon N.
Wood \email{simon.wood@r-project.org}} + +\examples{ + require(mgcv); require(MASS) + + ## example using a scale location model for the motorcycle data. A simple plotting + ## routine is produced first... + + plot.inla <- function(x,inla,k=1,levels=c(.025,.1,.5,.9,.975), + lcol = c(2,4,4,4,2),lwd = c(1,1,2,1,1),lty=c(1,1,1,1,1), + xlab="x",ylab="y",cex.lab=1.5) { + ## a simple effect plotter, when distributions of function values of + ## 1D smooths have been computed + require(splines) + p <- length(x) + betaq <- matrix(0,length(levels),p) ## storage for beta quantiles + for (i in 1:p) { ## work through x and betas + j <- i + k - 1 + p <- cumsum(inla$density[j,])*(inla$beta[j,2]-inla$beta[j,1]) + ## getting quantiles of function values... + betaq[,i] <- approx(p,y=inla$beta[j,],levels)$y + } + xg <- seq(min(x),max(x),length=200) + ylim <- range(betaq) + ylim <- 1.1*(ylim-mean(ylim))+mean(ylim) + for (j in 1:length(levels)) { ## plot the quantiles + din <- interpSpline(x,betaq[j,]) + if (j==1) { + plot(xg,predict(din,xg)$y,ylim=ylim,type="l",col=lcol[j], + xlab=xlab,ylab=ylab,lwd=lwd[j],cex.lab=1.5,lty=lty[j]) + } else lines(xg,predict(din,xg)$y,col=lcol[j],lwd=lwd[j],lty=lty[j]) + } + } ## plot.inla + + ## set up the model with a `gam' call... + + G <- gam(list(accel~s(times,k=20,bs="ad"),~s(times)), + data=mcycle,family=gaulss(),fit=FALSE) + b <- gam(G=G,method="REML") ## regular GAM fit for comparison + + ## Now use ginla to get posteriors of estimated effect values + ## at evenly spaced times. Create A matrix for this... + + rat <- range(mcycle$times) + pd0 <- data.frame(times=seq(rat[1],rat[2],length=20)) + X0 <- predict(b,newdata=pd0,type="lpmatrix") + X0[,21:30] <- 0 + pd1 <- data.frame(times=seq(rat[1],rat[2],length=10)) + X1 <- predict(b,newdata=pd1,type="lpmatrix") + X1[,1:20] <- 0 + A <- rbind(X0,X1) ## A maps coefs to required function values + + ## call ginla. Set int to 1 for integrated version. + ## Set interactive = 1 or 2 to plot marginal posterior distributions + ## (red) and simple Gaussian approximation (black). + + inla <- ginla(G,A,int=0) + + par(mfrow=c(1,2),mar=c(5,5,1,1)) + fv <- predict(b,se=TRUE) ## usual Gaussian approximation, for comparison + + ## plot inla mean smooth effect... + plot.inla(pd0$times,inla,k=1,xlab="time",ylab=expression(f[1](time))) + + ## overlay simple Gaussian equivalent (in grey) ... + points(mcycle$times,mcycle$accel,col="grey") + lines(mcycle$times,fv$fit[,1],col="grey",lwd=2) + lines(mcycle$times,fv$fit[,1]+2*fv$se.fit[,1],lty=2,col="grey",lwd=2) + lines(mcycle$times,fv$fit[,1]-2*fv$se.fit[,1],lty=2,col="grey",lwd=2) + + ## same for log sd smooth... + plot.inla(pd1$times,inla,k=21,xlab="time",ylab=expression(f[2](time))) + lines(mcycle$times,fv$fit[,2],col="grey",lwd=2) + lines(mcycle$times,fv$fit[,2]+2*fv$se.fit[,2],col="grey",lty=2,lwd=2) + lines(mcycle$times,fv$fit[,2]-2*fv$se.fit[,2],col="grey",lty=2,lwd=2) + + ## ... notice some real differences for the log sd smooth, especially + ## at the lower and upper ends of the time interval. +} + +\keyword{models} \keyword{smooth} \keyword{regression}%-- one or more .. + + + + + + + + + diff -Nru mgcv-1.8-26/man/mgcv-package.Rd mgcv-1.8-27/man/mgcv-package.Rd --- mgcv-1.8-26/man/mgcv-package.Rd 2018-05-30 16:18:39.000000000 +0000 +++ mgcv-1.8-27/man/mgcv-package.Rd 2019-02-03 08:16:35.000000000 +0000 @@ -15,7 +15,7 @@ smooths can be added. A Bayesian approach to confidence/credible interval calculation is provided. 
Linear functionals of smooths, penalization of parametric model terms and linkage of smoothing parameters are all supported. Lower level routines for generalized ridge -regression and penalized linearly constrained least squares are also available. +regression and penalized linearly constrained least squares are also available. In addition to the main modelling functions, \code{\link{jagam}} provides facilities to ease the set up of models for use with JAGS, while \code{\link{ginla}} provides marginal inference via a version of Integrated Nested Laplace Approximation. } \details{ \code{mgcv} provides generalized additive modelling functions \code{\link{gam}}, diff -Nru mgcv-1.8-26/man/plot.gam.Rd mgcv-1.8-27/man/plot.gam.Rd --- mgcv-1.8-26/man/plot.gam.Rd 2018-05-30 16:18:49.000000000 +0000 +++ mgcv-1.8-27/man/plot.gam.Rd 2019-01-14 08:21:43.000000000 +0000 @@ -147,7 +147,8 @@ always easy to achieve. Note that these plots can not be modified to the same extent as the other plot. For 2-d smooths \code{scheme==1} produces a perspective plot, while \code{scheme==2} produces a heatmap, -with overlaid contours and \code{scheme==3} a greyscale heatmap. +with overlaid contours and \code{scheme==3} a greyscale heatmap (\code{contour.col} controls the +contour colour). Smooths of 3 and 4 variables are displayed as tiled heatmaps with overlaid contours. In the 3 variable case the third variable is discretized and a contour plot of the first 2 variables is produced for each discrete value. The panels in the lower and upper rows are labelled with the corresponding third variable value. The lowest value is bottom left, and highest at top right. For 4 variables, two of the variables are coarsely discretized and a square array of image plots is produced for each combination of the discrete values. The first two arguments of the smooth are the ones used for the image/contour plots, unless a tensor product term has 2D marginals, in which case the first 2D marginal is image/contour plotted. \code{n3} controls the number of panels. See also \code{\link{vis.gam}}. diff -Nru mgcv-1.8-26/man/sp.vcov.Rd mgcv-1.8-27/man/sp.vcov.Rd --- mgcv-1.8-26/man/sp.vcov.Rd 2017-04-11 14:08:22.000000000 +0000 +++ mgcv-1.8-27/man/sp.vcov.Rd 2018-11-22 09:07:24.000000000 +0000 @@ -7,27 +7,30 @@ that evaluated the required Hessian. } \usage{ -sp.vcov(x) +sp.vcov(x,edge.correct=TRUE,reg=1e-3) } %- maybe also `usage' for other objects documented here. \arguments{ \item{x}{ a fitted model object of class \code{gam} as produced by \code{gam()}.} +\item{edge.correct}{ if the model was fitted with \code{edge.correct=TRUE} (see \code{\link{gam.control}}), then the returned covariance matrix will be for the edge corrected log smoothing parameters.} +\item{reg}{regularizer for Hessian - default is equivalent to prior variance of 1000 on log smoothing parameters.} } \details{ Just extracts the inverse of the hessian matrix of the negative (restricted) log likelihood w.r.t. the log smoothing parameters, if this has been obtained as part of fitting. } \value{ A matrix corresponding to the estimated covariance matrix of the log smoothing parameter estimators, -if this can be extracted, otherwise \code{NULL}. If the scale parameter has been (RE)ML estimated (i.e.
if the method was \code{"ML"} or \code{"REML"} and the scale parameter was unknown) then the +last row and column relate to the log scale parameter. If \code{edge.correct=TRUE} and this was used in fitting then the edge corrected smoothing parameters are in attribute \code{lsp} of the returned matrix. } \author{Simon N. Wood \email{simon.wood@r-project.org} } -\references{ -Wood, S.N. (2006) On confidence intervals for generalized additive models based on penalized regression splines. Australian and New Zealand Journal of Statistics. 48(4): 445-464. +\references{Wood, S.N., N. Pya and B. Saefken (2016), Smoothing parameter and +model selection for general smooth models (with discussion). +Journal of the American Statistical Association 111, 1548-1575 +\url{http://dx.doi.org/10.1080/01621459.2016.1180986} } diff -Nru mgcv-1.8-26/man/twlss.Rd mgcv-1.8-27/man/twlss.Rd --- mgcv-1.8-26/man/twlss.Rd 1970-01-01 00:00:00.000000000 +0000 +++ mgcv-1.8-27/man/twlss.Rd 2019-02-02 19:06:45.000000000 +0000 @@ -0,0 +1,84 @@ +\name{twlss} +\alias{twlss} +%- Also NEED an `\alias' for EACH other topic documented here. +\title{Tweedie location scale family} +\description{Tweedie family in which the mean, power and scale parameters can all depend on smooth linear predictors. Restricted to estimation via the extended Fellner Schall method of Wood and Fasiolo (2017). Only usable with \code{\link{gam}}. Tweedie distributions are exponential family with variance given by \eqn{\phi \mu^p}{s*m^p} where \eqn{\phi}{s} is a scale parameter, \eqn{p}{p} a parameter (here between 1 and 2) and \eqn{\mu}{m} is the mean. +} + +\usage{ +twlss(link=list("log","identity","identity"),a=1.01,b=1.99) +} +\arguments{ + +\item{link}{The link function list: currently no choise.} + +\item{a}{lower limit on the power parameter relating variance to mean.} + +\item{b}{upper limit on power parameter.} + +} +\value{ +An object inheriting from class \code{general.family}. 
+} + +\details{ A Tweedie random variable with 11) { ## estimate theta - scale1 <- if (!is.null(family$scale)) family$scale else scale - if (family$n.theta>0||scale<0) theta <- estimate.theta(theta,family,y,mu,scale=scale1,wt=G$w,tol=1e-7) - if (!is.null(family$scale) && family$scale<0) { + #scale1 <- if (!is.null(family$scale)) family$scale else scale + if (family$n.theta>0||scale1<0) theta <- estimate.theta(theta,family,y,mu,scale=scale1,wt=G$w,tol=1e-7) + if (!is.null(family$scale) && scale1<0) { scale <- exp(theta[family$n.theta+1]) theta <- theta[1:family$n.theta] } @@ -791,8 +793,9 @@ bgam.fit <- function (G, mf, chunk.size, gp ,scale ,gamma,method, coef=NULL,etastart = NULL, mustart = NULL, offset = rep(0, nobs), control = gam.control(), intercept = TRUE, - cl = NULL,gc.level=0,use.chol=FALSE,nobs.extra=0,samfrac=1,npt=1) -{ y <- mf[[gp$response]] + cl = NULL,gc.level=0,use.chol=FALSE,nobs.extra=0,samfrac=1,npt=1) { + #y <- mf[[gp$response]] + y <- G$y weights <- G$w conv <- FALSE nobs <- nrow(mf) @@ -805,7 +808,8 @@ pini <- if (is.null(G$family$preinitialize)) NULL else G$family$preinitialize(y,G$family) if (!is.null(pini$Theta)) G$family$putTheta(pini$Theta) if (!is.null(pini$y)) y <- pini$y - scale <- if (is.null(G$family$scale)) 1 else G$family$scale + if (is.null(G$family$scale)) scale <- 1 else scale <- if (G$family$scale<0) scale else G$family$scale + scale1 <-scale if (scale < 0) scale <- var(y) *.1 ## initial guess } else efam <- FALSE @@ -1086,9 +1090,9 @@ } if (efam && iter>1) { ## estimate theta - scale1 <- if (!is.null(family$scale)) family$scale else scale - if (family$n.theta>0||scale<0) theta <- estimate.theta(theta,family,G$y,linkinv(eta),scale=scale1,wt=G$w,tol=1e-7) - if (!is.null(family$scale) && family$scale<0) { + #scale1 <- if (!is.null(family$scale)) family$scale else scale + if (family$n.theta>0||scale1<0) theta <- estimate.theta(theta,family,G$y,linkinv(eta),scale=scale1,wt=G$w,tol=1e-7) + if (!is.null(family$scale) && scale1<0) { scale <- exp(theta[family$n.theta+1]) theta <- theta[1:family$n.theta] } @@ -1377,9 +1381,9 @@ bam.fit <- function(G,mf,chunk.size,gp,scale,gamma,method,rho=0, - cl=NULL,gc.level=0,use.chol=FALSE,npt=1) + cl=NULL,gc.level=0,use.chol=FALSE,npt=1) { ## function that does big additive model fit in strictly additive case -{ ## first perform the QR decomposition, blockwise.... + ## first perform the QR decomposition, blockwise.... 
n <- nrow(mf) if (rho!=0) { ## AR1 error model ld <- 1/sqrt(1-rho^2) ## leading diagonal of root inverse correlation @@ -1526,7 +1530,7 @@ yX.last <- res[[n.threads]]$yX.last } G$n <- n - G$y <- mf[[gp$response]] + #G$y <- mf[[gp$response]] } else { ## n <= chunk.size if (rho==0) qrx <- qr.update(sqrt(G$w)*G$X,sqrt(G$w)*(G$y-G$offset),use.chol=use.chol,nt=npt) else { @@ -1946,9 +1950,6 @@ if (inherits(family,"extended.family")) { family <- fix.family.link(family); efam <- TRUE } else efam <- FALSE - #else { - #if (inherits(family,"extended.family")) stop("used bam(...,discrete=TRUE) with extended families") - #} if (method%in%c("fREML")&&!is.null(min.sp)) { min.sp <- NULL @@ -1995,11 +1996,7 @@ if (gc.level>0) gc() mf <- eval(mf, parent.frame()) # the model frame now contains all the data - # if ("matrix"%in%unlist(lapply(mf,class))) { - # mfattr <- attributes(mf) - # mf <- lapply(mf,drop) # avoid single column matrices - # mfattr -> attributes(mf) - # } + if (nrow(mf)<2) stop("Not enough (non-NA) data to do anything meaningful") terms <- attr(mf,"terms") if (gc.level>0) gc() @@ -2121,7 +2118,6 @@ rind <- k:(k+dt[kb] - 1 - by.present) dk$nr[rind] <- dk$nr[k+G$smooth[[i]]$rind-1] G$ks[rind,] <- G$ks[k+G$smooth[[i]]$rind-1,] # either this line or next not both - #G$kd[,rind] <- G$kd[,k+G$smooth[[i]]$rind-1] } for (j in 1:nmar) { G$Xd[[k]] <- G$smooth[[i]]$margin[[j]]$X[1:dk$nr[k],,drop=FALSE] @@ -2175,7 +2171,8 @@ if (is.null(mf$"(weights)")) G$w<-rep(1,n) else G$w<-mf$"(weights)" - + + G$y <- mf[[gp$response]] G$offset <- model.offset(mf) if (is.null(G$offset)) G$offset <- rep(0,n) @@ -2198,17 +2195,28 @@ environment(G$formula) <- .BaseNamespaceEnv } else { ## G supplied - scale <- G$scale + if (scale<=0) scale <- G$scale efam <- G$efam mf <- G$mf; G$mf <- NULL gp <- G$gp; G$gp <- NULL na.action <- G$na.action; G$na.action <- NULL + if (!is.null(sp)&&any(sp>=0)) { ## request to modify smoothing parameters + if (is.null(G$L)) G$L <- diag(length(G$sp)) + if (length(sp)!=ncol(G$L)) stop('length of sp must be number of free smoothing parameters in original model') + ind <- sp>=0 ## which smoothing parameters are now fixed + spind <- log(sp[ind]); + spind[!is.finite(spind)] <- -30 ## set any zero parameters to effective zero + G$lsp0 <- G$lsp0 + drop(G$L[,ind,drop=FALSE] %*% spind) ## add fix to lsp0 + G$L <- G$L[,!ind,drop=FALSE] ## drop the cols of G + G$sp <- rep(-1,ncol(G$L)) + } } ## end of G setup if (!fit) { G$efam <- efam G$scale <- scale G$mf <- mf;G$na.action <- na.action;G$gp <- gp + class(G) <- "bam.prefit" return(G) } diff -Nru mgcv-1.8-26/R/coxph.r mgcv-1.8-27/R/coxph.r --- mgcv-1.8-26/R/coxph.r 2018-06-01 12:36:23.000000000 +0000 +++ mgcv-1.8-27/R/coxph.r 2018-12-04 18:34:16.000000000 +0000 @@ -25,11 +25,29 @@ ## initialization is tough here... need data frame in reverse time order, ## and intercept removed from X... - preinitialize <- expression({ +# preinitialize <- expression({ ## code to evaluate in estimate.gam... 
## sort y (time) into decending order, and ## re-order weights and rows of X accordingly ## matrix y has strat as second column +# G$family$data <- list() +# y.order <- if (is.matrix(G$y)) order(G$y[,2],G$y[,1],decreasing=TRUE) else +# order(G$y,decreasing=TRUE) +# G$family$data$y.order <- y.order +# G$y <- if (is.matrix(G$y)) G$y[y.order,] else G$y[y.order] +# attrX <- attributes(G$X) +# G$X <- G$X[y.order,,drop=FALSE] +# attributes(G$X) <- attrX +# G$w <- G$w[y.order] +# G$offset <- G$offset[y.order] +# }) + + preinitialize <- function(G) { + ## G is a gam pre-fit object. Pre-initialize can manipulate some of its + ## elements, returning a named list of the modified ones. + ## sort y (time) into decending order, and + ## re-order weights and rows of X accordingly + ## matrix y has strat as second column G$family$data <- list() y.order <- if (is.matrix(G$y)) order(G$y[,2],G$y[,1],decreasing=TRUE) else order(G$y,decreasing=TRUE) @@ -40,7 +58,8 @@ attributes(G$X) <- attrX G$w <- G$w[y.order] G$offset <- G$offset[y.order] - }) + list(family=G$family,y=G$y,X=G$X,w=G$w,offset=G$offset) + } ## preinitialize postproc <- expression({ ## code to evaluate in estimate.gam, to do with data ordering and @@ -59,6 +78,7 @@ object$null.deviance <- ## sum of squares of null deviance residuals 2*sum(abs((object$prior.weights + log(s0.base) + object$prior.weights*(log(-log(s0.base)))))) ## and undo the re-ordering... + y.order <- G$family$data$y.order object$linear.predictors[y.order] <- object$linear.predictors object$fitted.values[y.order] <- object$fitted.values if (is.matrix(object$y)) object$y[y.order,] <- object$y else object$y[y.order] <- object$y diff -Nru mgcv-1.8-26/R/efam.r mgcv-1.8-27/R/efam.r --- mgcv-1.8-26/R/efam.r 2018-05-30 16:18:39.000000000 +0000 +++ mgcv-1.8-27/R/efam.r 2019-01-24 09:40:07.000000000 +0000 @@ -45,11 +45,19 @@ } else H <- NULL list(nll=nll,g=g,H=H) } ## nlogl + + if (scale>=0&&family$n.theta==0) stop("erroneous call to estimate.theta - no free parameters") + n.theta <- length(theta) ## dimension of theta vector (family$n.theta==0 => all fixed) + del.ind <- 1:n.theta if (scale<0) theta <- c(theta,log(var(y)*.1)) nll <- nlogl(theta,family,y,mu,scale,wt,2) + g <- if (family$n.theta==0) nll$g[-del.ind] else nll$g + H <- if (family$n.theta==0) nll$H[-del.ind,-del.ind,drop=FALSE] else nll$H step.failed <- FALSE for (i in 1:100) { ## main Newton loop - eh <- eigen(nll$H,symmetric=TRUE) + #H <- if (family$n.theta==0) nll$H[-del.ind,-del.ind,drop=FALSE] else nll$H + #g <- if (family$n.theta==0) nll$g[-del.ind] else nll$g + eh <- eigen(H,symmetric=TRUE) pdef <- sum(eh$values <= 0)==0 if (!pdef) { ## Make the Hessian pos def eh$values <- abs(eh$values) @@ -57,10 +65,11 @@ eh$values[eh$values4) step <- step*4/ms + if (ms>4) step <- step*4/ms nll1 <- nlogl(theta+step,family,y,mu,scale,wt,2) iter <- 0 @@ -76,11 +85,13 @@ theta <- theta + step ## accept updated theta ## accept log lik and derivs ... nll <- if (iter>0) nlogl(theta,family,y,mu,scale,wt,2) else nll1 + g <- if (family$n.theta==0) nll$g[-del.ind] else nll$g + H <- if (family$n.theta==0) nll$H[-del.ind,-del.ind,drop=FALSE] else nll$H ## convergence checking... 
- if (sum(abs(nll$g) > tol*abs(nll$nll))==0) break + if (sum(abs(g) > tol*abs(nll$nll))==0) break } ## main Newton loop if (step.failed) warning("step failure in theta estimation") - if (attachH) attr(theta,"H") <- nll$H + if (attachH) attr(theta,"H") <- H#nll$H theta } ## estimate.theta diff -Nru mgcv-1.8-26/R/fast-REML.r mgcv-1.8-27/R/fast-REML.r --- mgcv-1.8-26/R/fast-REML.r 2018-09-28 18:18:28.000000000 +0000 +++ mgcv-1.8-27/R/fast-REML.r 2019-02-01 20:00:20.000000000 +0000 @@ -123,6 +123,7 @@ Sl[[b]]$start <- G$smooth[[i]]$first.para + sbStart[j]-1 Sl[[b]]$stop <- G$smooth[[i]]$first.para + sbStop[j]-1 Sl[[b]]$rank <- G$smooth[[i]]$rank[j] + Sl[[b]]$lambda <- 1 ## dummy here Sl[[b]]$repara <- TRUE ## signals ok to linearly reparameterize if (!is.null(G$smooth[[i]]$g.index)) { ## then some parameters are non-linear - can't re-param if (any(G$smooth[[i]]$g.index[ind])) Sl[[b]]$repara <- FALSE @@ -774,7 +775,6 @@ } XX.db <- t(X)%*%(X%*%db) S.db <- Sl.mult(Sl,db,k=0) -## Sk.db <- Sl.termMult(Sl,db,full=TRUE) ## Sk.db[[j]][,k] is S_j d beta / d rho_k rss2 <- bSb2 <- matrix(0,nd,nd) for (k in 1:nd) { ## second derivative loop @@ -805,39 +805,16 @@ ## timing close to identical - modified for parallel exec D <- matrix(unlist(Skb),nrow(XX),nd) bSb1 <- colSums(beta*D) - #D <- D[piv,]/d[piv] D1 <- .Call(C_mgcv_Rpforwardsolve,R,D[piv,]/d[piv],nt) ## note R transposed internally unlike forwardsolve db[piv,] <- -.Call(C_mgcv_Rpbacksolve,R,D1,nt)/d[piv] - #db[piv,] <- -backsolve(R,forwardsolve(t(R),D))/d[piv] - - ## original serial - a bit slow with very large numbers of smoothing - ## parameters.... - #Rt <- t(R) ## essential to do this first, or t(R) dominates cost in loop - #for (i in 1:nd) { ## compute the first derivatives - # db[piv,i] <- -backsolve(R,forwardsolve(Rt,Skb[[i]][piv]/d[piv]))/d[piv] ## d beta/ d rho_i - # bSb1[i] <- sum(beta*Skb[[i]]) ## d b'Sb / d_rho_i - #} ## XX.db <- XX%*%db XX.db <- .Call(C_mgcv_pmmult2,XX,db,0,0,nt) - #XX.db[piv,] <- d[piv]*(t(R)%*%(R%*%(d[piv]*db[piv,]))) ## X'Xdb - + S.db <- Sl.mult(Sl,db,k=0) - ##Sk.db <- Sl.termMult(Sl,db,full=TRUE) ## Sk.db[[j]][,k] is S_j d beta / d rho_k - ## following loop very slow if large numbers of smoothing parameters... 
- #rss2 <- bSb2 <- matrix(0,nd,nd) - #for (k in 1:nd) { ## second derivative loop - # for (j in k:nd) { - # rss2[j,k] <- rss2[k,j] <- 2 * sum(db[,j]*XX.db[,k]) - # bSb2[j,k] <- bSb2[k,j] <- (k==j)*sum(beta*Skb[[k]]) + 2*(sum(db[,k]*(Skb[[j]]+S.db[,j])) + - # sum(db[,j]*Skb[[k]])) - # } - #} - ## rss2 <- 2 * t(db) %*% XX.db rss2 <- 2 * .Call(C_mgcv_pmmult2,db,XX.db,1,0,nt) bSb2 <- diag(x=colSums(beta*D),nrow=nd) - ## bSb2 <- bSb2 + 2*(t(db)%*%(D+S.db) + t(D)%*%db) bSb2 <- bSb2 + 2 * (.Call(C_mgcv_pmmult2,db,D+S.db,1,0,nt) + .Call(C_mgcv_pmmult2,D,db,1,0,nt)) list(bSb=sum(beta*Sb),bSb1=bSb1,bSb2=bSb2, d1b=db ,rss1=rss1,rss2=rss2) diff -Nru mgcv-1.8-26/R/gam.fit3.r mgcv-1.8-27/R/gam.fit3.r --- mgcv-1.8-26/R/gam.fit3.r 2018-09-14 15:10:19.000000000 +0000 +++ mgcv-1.8-27/R/gam.fit3.r 2019-02-04 16:21:47.000000000 +0000 @@ -1262,6 +1262,7 @@ ## initial fit + initial.lsp <- lsp ## used if edge correcting to set direction of correction b<-gam.fit3(x=X, y=y, sp=L%*%lsp+lsp0,Eb=Eb,UrS=UrS, offset = offset,U1=U1,Mp=Mp,family = family,weights=weights,deriv=2, control=control,gamma=gamma,scale=scale,printWarn=FALSE,start=start, @@ -1627,18 +1628,20 @@ REML <- b$REML alpha <- if (is.logical(edge.correct)) .02 else abs(edge.correct) ## target RE/ML change per sp b1 <- b; lsp1 <- lsp - if (length(flat)) for (i in flat) { - REML <- b1$REML + alpha - while (b1$REML < REML) { - lsp1[i] <- lsp1[i] - 1 - b1 <- gam.fit3(x=X, y=y, sp=L%*%lsp1+lsp0,Eb=Eb,UrS=UrS, + if (length(flat)) { + step <- as.numeric(initial.lsp - lsp)*2-1 + for (i in flat) { + REML <- b1$REML + alpha + while (b1$REML < REML) { + lsp1[i] <- lsp1[i] + step[i] + b1 <- gam.fit3(x=X, y=y, sp=L%*%lsp1+lsp0,Eb=Eb,UrS=UrS, offset = offset,U1=U1,Mp=Mp,family = family,weights=weights,deriv=0, control=control,gamma=gamma,scale=scale,printWarn=FALSE,start=start, mustart=mustart,scoreType=scoreType,null.coef=null.coef, pearson.extra=pearson.extra,dev.extra=dev.extra,n.true=n.true,Sl=Sl,...) + } } - } - + } ## if length(flat) b1 <- gam.fit3(x=X, y=y, sp=L%*%lsp1+lsp0,Eb=Eb,UrS=UrS, offset = offset,U1=U1,Mp=Mp,family = family,weights=weights,deriv=2, control=control,gamma=gamma,scale=scale,printWarn=FALSE,start=start, @@ -2837,7 +2840,7 @@ ## evaluates log Tweedie density for 1<=p<=2, using series summation of ## Dunn & Smyth (2005) Statistics and Computing 15:267-280. n <- length(y) - if (!is.na(rho)&&!is.na(theta)) { ## use rho and theta and get derivs w.r.t. these + if (all(!is.na(rho))&&all(!is.na(theta))) { ## use rho and theta and get derivs w.r.t. 
these #if (length(rho)>1||length(theta)>1) stop("only scalar `rho' and `theta' allowed.") if (a>=b||a<=1||b>=2) stop("10) stop("non finite values in Hessian") - indefinite <- FALSE - if (sum(D <= 0)) { ## Hessian indefinite, for sure - D <- rep(1,ncol(Hp)) + + if (min(D)<0) { ## 2/2/19 replaces any D<0 indicating indef + Dthresh <- max(D)*sqrt(.Machine$double.eps) + if (-min(D) < Dthresh) { ## could be indef or +ve semi def + indefinite <- FALSE + D[D0) { - # Sb <- St%*%coef Skb <- Sl.termMult(rp$Sl,fcoef,full=TRUE) d1bSb <- rep(0,m) for (i in 1:m) { Skb[[i]] <- Skb[[i]][!bdrop] - d1bSb[i] <- # 2*sum(d1b[,i]*Sb) + # cancels - sum(coef*Skb[[i]]) + d1bSb[i] <- sum(coef*Skb[[i]]) } } if (deriv>1) { d2bSb <- matrix(0,m,m) - # k <- 0 for (i in 1:m) { Sd1b <- St%*%d1b[,i] for (j in i:m) { - k <- k + 1 - d2bSb[j,i] <- d2bSb[i,j] <- 2*sum( # d2b[,k]*Sb + # cancels + d2bSb[j,i] <- d2bSb[i,j] <- 2*sum( d1b[,i]*Skb[[j]] + d1b[,j]*Skb[[i]] + d1b[,j]*Sd1b) } d2bSb[i,i] <- d2bSb[i,i] + sum(coef*Skb[[i]]) @@ -1217,12 +1218,13 @@ if (!is.null(REML1)) cat("REML1 =",REML1,"\n") } REML2 <- if (deriv<2) NULL else -( (d2l - d2bSb/2)/gamma + rp$ldet2/2 - d2ldetH/2 ) - ## bSb <- t(coef)%*%St%*%coef + + ## Get possibly multiple linear predictors lpi <- attr(x,"lpi") - if (is.null(lpi)) { + if (is.null(lpi)) { ## only one... linear.predictors <- if (is.null(offset)) as.numeric(x%*%coef) else as.numeric(x%*%coef+offset) fitted.values <- family$linkinv(linear.predictors) - } else { + } else { ## multiple... fitted.values <- linear.predictors <- matrix(0,nrow(x),length(lpi)) if (!is.null(offset)) offset[[length(lpi)+1]] <- 0 for (j in 1:length(lpi)) { @@ -1280,6 +1282,7 @@ fit <- gam.fit5(x=x,y=y,lsp=lsp,Sl=Sl,weights=weights,offset=offset,deriv=0,family=family, control=control,Mp=Mp,start=start,gamma=1) score.hist <- rep(0,200) + tiny <- .Machine$double.eps^.5 ## used to bound above zero for (iter in 1:200) { start <- fit$coefficients ## obtain Vb... @@ -1305,8 +1308,9 @@ for (i in 1:length(Sb)) { bSb[i] <- sum(start*Sb[[i]]) } - a <- pmax(0,fit$S1*exp(-lsp) - trVS) - r <- a/pmax(0,bSb) + + a <- pmax(tiny,fit$S1*exp(-lsp) - trVS) + r <- a/pmax(tiny,bSb) r[a==0&bSb==0] <- 1 r[!is.finite(r)] <- 1e6 lsp1 <- pmin(lsp + log(r)*mult,control$efs.lspmax) diff -Nru mgcv-1.8-26/R/gamlss.r mgcv-1.8-27/R/gamlss.r --- mgcv-1.8-26/R/gamlss.r 2018-05-30 16:18:39.000000000 +0000 +++ mgcv-1.8-27/R/gamlss.r 2019-02-02 19:03:20.000000000 +0000 @@ -1,4 +1,4 @@ -## (c) Simon N. Wood (2013-2016) distributed under GPL2 +## (c) Simon N. Wood (2013-2019) distributed under GPL2 ## Code for the gamlss families. ## idea is that there are standard functions converting ## derivatives w.r.t. mu to derivatives w.r.t. eta, given @@ -1572,19 +1572,31 @@ ## been tested against raw translated code. exp1 <- exp(1); ## facilitates lazy auto-translation - aa0 <- (xi*(y-mu))/exp1^rho # added - log.aa1 <- log1p(aa0) # added + ymu <- y - mu + aa0 <- (xi*ymu)/exp1^rho # added + ind <- which(aa0 <= -1) ## added + if (FALSE&&length(ind)>0) { ## all added + xii <- xi[ind] ## this idea is really not a good one - messes up derivatives when triggered + erho <- exp1^rho[ind] + eps1 <- 1-.Machine$double.eps^.25 + ymu[ind] <- -erho/xii*eps1 + aa0[ind] <- -eps1 + } + log.aa1 <- log1p(aa0) ## added aa1 <- aa0 + 1 # (xi*(y-mu))/exp1^rho+1; aa2 <- 1/xi; l <- sum((-aa2*(1+xi)*log.aa1)-1/aa1^aa2-rho); + #if (length(ind)>0) cat(aa0[ind]," l = ",l,"\n") if (deriv>0) { ## first derivatives m, r, x... 
bb1 <- 1/exp1^rho; - bb2 <- bb1*xi*(y-mu)+1; + bb2 <- bb1*xi*ymu+1; + l1[,1] <- (bb1*(xi+1))/bb2-bb1*bb2^((-1/xi)-1); - cc2 <- y-mu; + cc2 <- ymu; cc0 <- bb1*xi*cc2 ## added + log.cc3 <- log1p(cc0) ## added cc3 <- cc0 + 1 ##bb1*xi*cc2+1; l1[,2] <- (-bb1*cc2*cc3^((-1/xi)-1))+(bb1*(xi+1)*cc2)/cc3-1; @@ -1847,3 +1859,212 @@ available.derivs = 2 ## can use full Newton here ),class = c("general.family","extended.family","family")) } ## end gevlss + + + +## attempt at a Tweedie gamlss family, suitable for extended FS +## estimation (i.e. avoiding horrendous 4th derivative series +## summations) + + +tw.null.fit <- function(y,a=1.001,b=1.999) { +## MLE of c(mu,p,phi) for sample of Tweedie data, y. +## Stabilized, step controlled, Newton + th <- c(0,0,0) + ld <- ldTweedie(y,exp(th[1]),rho=th[3],theta=th[2],a=a,b=b,all.derivs=TRUE) + lds <- colSums(ld) + for (i in 1:50) { + g <- lds[c(7,4,2)] + if (sum(abs(g)>1e-9*abs(lds[1]))==0) break + g[1] <- g[1] * exp(th[1]) ## work on log scale for mu + H <- matrix(0,3,3) ## mu, th, rh + diag(H) <- c(lds[8],lds[5],lds[3]) + H[1,2] <- H[2,1] <- lds[9] + H[1,3] <- H[3,1] <- lds[10] + H[2,3] <- H[3,2] <- lds[6] + H[,1] <- H[,1]*exp(th[1]) + H[1,-1] <- H[1,-1] * exp(th[1]) + eh <- eigen(H,symmetric=TRUE) + tol <- max(abs(eh$values))*1e-7 + eh$values[eh$values>-tol] <- -tol + step <- as.numeric(eh$vectors%*%((t(eh$vectors)%*%g)/eh$values)) + ms <- max(abs(step)) + if (ms>3) step <- step*3/ms + ok <- FALSE + while (!ok) { + th1 <- th - step + ld1 <- ldTweedie(y,exp(th1[1]),rho=th1[3],theta=th1[2],a=a,b=b,all.derivs=TRUE) + if (sum(ld1[,1])0) (b + a*exp(-th[2]))/(1+exp(-th[2])) else (b*exp(th[2])+a)/(exp(th[2])+1) + c(exp(th[1]),p,exp(th[3])) # mu, p, sigma +} ## tw.null.fit + + +twlss <- function(link=list("log","identity","identity"),a=1.01,b=1.99) { +## General family for Tweedie location scale model... +## so mu is mu1, rho = log(sigma) is mu2 and transformed p is mu3 +## Need to redo ldTweedie to allow vector p and phi +## -- advantage is that there is no point buffering +## 1. get derivatives wrt mu, rho and p. +## 2. get required link derivatives and tri indices. +## 3. transform derivs to derivs wrt eta (gamlss.etamu). +## 4. get the grad and Hessian etc for the model +## via a call to gamlss.gH + + ## first deal with links and their derivatives... 
+ if (length(link)!=3) stop("gevlss requires 3 links specified as character strings") + okLinks <- list(c("log", "identity", "sqrt"),"identity",c("identity")) + stats <- list() + for (i in 1:3) { + if (link[[i]] %in% okLinks[[i]]) stats[[i]] <- make.link(link[[i]]) else + stop(link[[i]]," link not available for mu parameter of twlss") + fam <- structure(list(link=link[[i]],canonical="none",linkfun=stats[[i]]$linkfun, + mu.eta=stats[[i]]$mu.eta), + class="family") + fam <- fix.family.link(fam) + stats[[i]]$d2link <- fam$d2link + stats[[i]]$d3link <- fam$d3link + stats[[i]]$d4link <- fam$d4link + } + + env <- new.env(parent = .GlobalEnv) + assign(".a",a, envir = env);assign(".b",b, envir = env) + + residuals <- function(object,type=c("deviance","pearson","response")) { + + a <- get(".a");b <- get(".b") + type <- match.arg(type) + mu <- object$fitted.values[,1] + p <- object$fitted.values[,2] + ind <- p > 0; + ethi <- exp(-p[ind]);ethni <- exp(p[!ind]) + p[ind] <- (b+a*ethi)/(1+ethi) + p[!ind] <- (b*ethni+a)/(ethni+1) + phi <- exp(object$fitted.values[,3]) + if (type=="pearson") { + rsd <- (object$y-mu)/sqrt(phi*mu^p) ## Pearson + } else if (type=="response") rsd <- object$y-mu else { + y1 <- object$y + (object$y == 0) + theta <- (y1^(1 - p) - mu^(1 - p))/(1 - p) + kappa <- (object$y^(2 - p) - mu^(2 - p))/(2 - p) + rsd <- sign(object$y-mu)*sqrt(pmax(2 * (object$y * theta - kappa) * object$prior.weights/phi,0)) + } + return(rsd) ## (y-mu)/sigma + } + + postproc <- expression({ + ## code to evaluate in estimate.gam, to evaluate null deviance + ## used for dev explained - really a mean scale concept. + ## makes no sense to use single scale param here... + tw.para <- tw.null.fit(object$y) ## paramaters mu, p and phi + tw.y1 <- object$y + (object$y == 0) + tw.theta <- (tw.y1^(1 - tw.para[2]) - tw.para[1]^(1 - tw.para[2]))/(1 - tw.para[2]) + tw.kappa <- (object$y^(2 - tw.para[2]) - tw.para[1]^(2 - tw.para[2]))/(2 - tw.para[2]) + object$null.deviance <- sum(pmax(2 * (object$y * tw.theta - tw.kappa) * object$prior.weights/exp(object$fitted.values[,3]),0)) + }) + + ll <- function(y,X,coef,wt,family,offset=NULL,deriv=0,d1b=0,d2b=0,Hp=NULL,rank=0,fh=NULL,D=NULL) { + ## function defining the gamlss Tweedie model log lik. + ## deriv: 0 - eval + ## 1 - grad and Hess + ## 2 - diagonal of first deriv of Hess + ## 3 - first deriv of Hess + ## 4 - everything. 
+ ## This family does not have code for 3 and 4 + if (is.null(offset)) offset <- list(0,0,0) else offset[[4]] <- 0 + for (i in 1:3) if (is.null(offset[[i]])) offset[[i]] <- 0 + jj <- attr(X,"lpi") ## extract linear predictor index + eta <- X[,jj[[1]],drop=FALSE]%*%coef[jj[[1]]] + offset[[1]] + mu <- family$linfo[[1]]$linkinv(eta) ## mean + theta <- X[,jj[[2]],drop=FALSE]%*%coef[jj[[2]]] +offset[[2]] ## transformed p + rho <- X[,jj[[3]],drop=FALSE]%*%coef[jj[[3]]] +offset[[3]] ## log scale parameter + a <- get(".a");b <- get(".b") + + ld <- ldTweedie(y,mu=mu,p=NA,phi=NA,rho=rho,theta=theta,a=a,b=b,all.derivs=TRUE) + ## m, t, r ; mm, mt, mr, tt, tr, rr + l <- sum(ld[,1]) + l1 <- cbind(ld[,7],ld[,4],ld[,2]) + l2 <- cbind(ld[,8],ld[,9],ld[,10],ld[,5],ld[,6],ld[,3]) + + ig1 <- cbind(family$linfo[[1]]$mu.eta(eta),family$linfo[[2]]$mu.eta(theta), + family$linfo[[3]]$mu.eta(rho)) + g2 <- cbind(family$linfo[[1]]$d2link(mu),family$linfo[[2]]$d2link(theta), + family$linfo[[3]]$d2link(rho)) + + n <- length(y) + + l3 <- l4 <- g3 <- g4 <- 0 ## defaults + + + if (deriv) { + i2 <- family$tri$i2;#i3 <- i4 <- 0 + i3 <- family$tri$i3;i4 <- family$tri$i4 + + ## transform derivates w.r.t. mu to derivatives w.r.t. eta... + de <- gamlss.etamu(l1,l2,l3,l4,ig1,g2,g3,g4,i2,i3,i4,0) + + ## get the gradient and Hessian... + ret <- gamlss.gH(X,jj,de$l1,de$l2,i2,l3=de$l3,i3=i3,l4=de$l4,i4=i4, + d1b=d1b,d2b=d2b,deriv=0,fh=fh,D=D) + } else ret <- list() + ret$l <- l; ret + } ## end ll twlss + + initialize <- expression({ + ## idea is to regress g(y) on model matrix for mean, and then + ## to regress the corresponding log absolute scaled residuals on + ## the model matrix for log(sigma) - may be called in both + ## gam.fit5 and initial.spg... note that appropriate E scaling + ## for full calculation may be inappropriate for initialization + ## which is basically penalizing something different here. + ## best we can do here is to use E only as a regularizer. + ## initial theta params are zero for p = 1.5 + n <- rep(1, nobs) + ## should E be used unscaled or not?.. 
+ use.unscaled <- if (!is.null(attr(E,"use.unscaled"))) TRUE else FALSE + if (is.null(start)) { + jj <- attr(x,"lpi") + start <- rep(0,ncol(x)) + yt1 <- if (family$link[[1]]=="identity") y else + family$linfo[[1]]$linkfun(abs(y)+max(y)*1e-7) + x1 <- x[,jj[[1]],drop=FALSE] + e1 <- E[,jj[[1]],drop=FALSE] ## square root of total penalty + #ne1 <- norm(e1); if (ne1==0) ne1 <- 1 + if (use.unscaled) { + qrx <- qr(rbind(x1,e1)) + x1 <- rbind(x1,e1) + startji <- qr.coef(qr(x1),c(yt1,rep(0,nrow(E)))) + startji[!is.finite(startji)] <- 0 + } else startji <- pen.reg(x1,e1,yt1) + ## now the scale parameter + start[jj[[1]]] <- startji + mu1 <- family$linfo[[1]]$linkinv(x[,jj[[1]],drop=FALSE]%*%start[jj[[1]]]) + lres1 <- log(abs((y-mu1)/mu1^1.5)) + x1 <- x[,jj[[3]],drop=FALSE];e1 <- E[,jj[[3]],drop=FALSE] + #ne1 <- norm(e1); if (ne1==0) ne1 <- 1 + if (use.unscaled) { + x1 <- rbind(x1,e1) + startji <- qr.coef(qr(x1),c(lres1,rep(0,nrow(E)))) + startji[!is.finite(startji)] <- 0 + } else startji <- pen.reg(x1,e1,lres1) + start[jj[[3]]] <- startji + } ## is.null(start) + }) ## initialize twlss + + environment(ll) <- environment(residuals) <- env + + structure(list(family="twlss",ll=ll,link=paste(link),nlp=3, + tri = trind.generator(3), ## symmetric indices for accessing derivative arrays + initialize=initialize,postproc=postproc,residuals=residuals, + linfo = stats, ## link information list + d2link=1,d3link=1,d4link=1, ## signals to fix.family.link that all done + ls=1, ## signals that ls not needed here + available.derivs = 0 ## no higher derivs + ),class = c("general.family","extended.family","family")) +} ## end twlss diff -Nru mgcv-1.8-26/R/inla.r mgcv-1.8-27/R/inla.r --- mgcv-1.8-26/R/inla.r 1970-01-01 00:00:00.000000000 +0000 +++ mgcv-1.8-27/R/inla.r 2019-01-24 19:45:54.000000000 +0000 @@ -0,0 +1,435 @@ +## (c) Simon Wood 2018. Released under GPL2. +## Implements the version of INLA (Rue et al. 2009, JRSSB) described in +## Wood "Simplified Integrated Nested Laplace Approximation" (submitted, 2018) + + +FFdes <- function (size=5,ccd=FALSE) { +## creates level 5 fractional factorial designs, up to size=120 +## according to Sanchez and Sanchez (2005) +## ACM Transactions on Modeling and Computer Simulation 15(4), 362-377 +## If ccd==TRUE, appends this to make the outer points of a +## Central Composite design (origin is added for full design). 
+ + fwt <- function(x) { + ## fast Walsh transform + lag <- 1 + while (lag < length(x)) { + offset <- lag * 2 + ngroups <- length(x)/offset + for (group in 0:(ngroups-1)) { ## vectorized + j <- 1:lag + group*offset + k <- j + lag + xj <- x[j];xk <- x[k] + x[j] <- xj + xk + x[k] <- xj - xk + } + lag <- offset + } ## while lag + x + } ## fwt + + index <- c(1, 2, 4, 8, 15, 16, 32, 51, 64, 85, 106, 128, + 150, 171, 219, 237, 247, 256, 279, 297, 455, 512, 537, + 557, 594, 643, 803, 863, 998, 1024, 1051, 1070, 1112, + 1169, 1333, 1345, 1620, 1866, 2048, 2076, 2085, 2185, + 2372, 2456, 2618, 2800, 2873, 3127, 3284, 3483, 3557, + 3763, 4096, 4125, 4135, 4174, 4435, 4459, 4469, 4497, + 4752, 5255, 5732, 5804, 5915, 6100, 6369, 6907, 7069, + 8192, 8263, 8351, 8422, 8458, 8571, 8750, 8858, 9124, + 9314, 9500, 10026, 10455, 10556, 11778, 11885, 11984, + 13548, 14007, 14514, 14965, 15125, 15554, 16384, 16457, + 16517, 16609, 16771, 16853, 17022, 17453, 17891, 18073, + 18562, 18980, 19030, 19932, 20075, 20745, 21544, 22633, + 23200, 24167, 25700, 26360, 26591, 26776, 28443, 28905, + 29577, 32705) + + power <- index;p <- 1 + for (i in 1:length(index)) { + if (index[i]>=p) p <- p * 2 + power[i] <- p + } + + if (size > 120||size<1) stop("size must be in [1,120]") + design <- matrix(0,power[size],size) + for (i in 1:size) { + design[index[i]+1,i] <- 1 + design[,i] <- fwt(design[,i]) + } + if (ccd&&size>1) { + design <- rbind(design,diag(size)*sqrt(size),-diag(size)*sqrt(size)) + + } + design +} ## FFdes + + +logf <- function(beta,b,Bi=NULL,Xm=NULL,deriv=0) { +## get log joint density and first deriv w.r.t. coefs for a gam... +## Bi is a matrix mapping from interesting parameters to model parameters + ## first deal with the log likelihood... + if (is.null(Xm)) Xm <- if (is.null(b$X)) model.matrix(b) else b$X + dd <- NULL + if (!is.null(Bi)) beta <- drop(Bi %*% beta) + if (inherits(b$family,"general.family")) { + foo <- b$family$ll(b$y,Xm,beta,b$prior.weights,b$family,offset=b$offset,deriv=1) + sll <- -2*foo$l + dd <- -foo$lb + } else if (inherits(b$family,"extended.family")) { + theta <- b$family$getTheta() + eta <- if (is.null(b$Xd)) as.numeric(Xm%*%beta + b$offset) else Xbd(b$Xd,beta,b$kd,b$ks,b$ts,b$dt,b$v,b$qc,b$drop) + b$offset + mu <- b$family$linkinv(eta) + sll <- sum(b$family$dev.resids(b$y,mu,b$prior.weights,theta)) ## deviance + if (deriv) { + #dd <- colSums((b$family$mu.eta(eta)*b$family$Dd(b$y,mu,theta,b$prior.weights)$Dmu)*Xm)/2 + dd <- if (is.null(b$Xd)) drop((b$family$mu.eta(eta)*b$family$Dd(b$y,mu,theta,b$prior.weights)$Dmu) %*% Xm)/2 else + XWyd(b$Xd,b$family$mu.eta(eta),b$family$Dd(b$y,mu,theta,b$prior.weights)$Dmu,b$kd,b$ks,b$ts,b$dt,b$v,b$qc,b$drop)/2 + } + } else { ## regular exponential family + eta <- if (is.null(b$Xd)) as.numeric(Xm%*%beta + b$offset) else Xbd(b$Xd,beta,b$kd,b$ks,b$ts,b$dt,b$v,b$qc,b$drop) + b$offset + mu <- b$family$linkinv(eta) + sll <- sum(b$family$dev.resids(b$y,mu,b$prior.weights)) ## deviance + if (deriv) { + ##dd <- -colSums(b$prior.weights*(b$family$mu.eta(eta)*(b$y-mu)/b$family$variance(mu))*Xm) + dd <- if (is.null(b$Xd)) -drop((b$prior.weights*(b$family$mu.eta(eta)*(b$y-mu)/b$family$variance(mu))) %*% Xm) else + -XWyd(b$Xd,b$prior.weights,b$family$mu.eta(eta)*(b$y-mu)/b$family$variance(mu),b$kd,b$ks,b$ts,b$dt,b$v,b$qc,b$drop) + } + } ## deviance done + ## now the smoothing prior/penalty + ## NOTE: id's, fixed sp ??? 
+ if (length(b$smooth)) { + k <- 1;pen <- 0 + for (i in 1:length(b$smooth)) for (j in 1:length(b$smooth[[i]]$S)) { + ind <- b$smooth[[i]]$first.para:b$smooth[[i]]$last.para + b0 <- beta[ind] + Sb <- b$smooth[[i]]$S[[j]] %*% b0 * b$sp[k] + pen <- pen + sum(b0*Sb) + if (deriv) dd[ind] <- dd[ind] + Sb + k <- k + 1 + } + } + if (!is.null(Bi)) dd <- drop(t(dd) %*% Bi) + list(ll =(sll + pen)/(2*b$sig2),dd=dd) ## on neg log lik scale +} ## logf + +glogf <- function(beta,b,Bi=NULL,Xm=NULL) { + logf(beta,b,Bi=Bi,Xm=Xm,deriv=1)$dd +} + +flogf <- function(beta,b,Bi=NULL,Xm=NULL) { + logf(beta,b,Bi=Bi,Xm=Xm,deriv=0)$ll +} + +Acomp <- function(A,ortho=TRUE) { +## simple matrix completion, if A is p by n, p <= n +## then returns full rank n by n matrix, B, whose last n-p +## rows are orthogonal to A, and its inverse Bi. + p <- nrow(A);n<- ncol(A) + if (ortho) { ## use orthogonal methods - very stable + qra <- qr(t(A)) + R <- qr.R(qra) + if (Rrank(R) 0) lsp <- c(b$family$getTheta(),lsp) + } else n.theta <- 0 + + ## check that family supports enough derivatives to allow integration step, + ## otherwise just use empirical Bayes. + if (!is.null(G$family$available.derivs)&&G$family$available.derivs==0&&int>0) { + int <- 0 + warning("integration not available with this family - insufficient derivatives") + } + + ## Gaussian approximation is that log(sp) ~ N(lsp,V) + if (int>0) { ## integration requested + rV <- chol(V) ## Rv'z + lsp gives trial sp + ip <- dg(ncol(rV)) ## integration points + nip <- nrow(ip$D) + } else nip <- 0 + + if (!is.null(G$family$preinitialize)) { + if (inherits(G$family,"general.family")) { + Gmod <- G$family$preinitialize(G) + for (gnam in names(Gmod)) G[[gnam]] <- Gmod[[gnam]] ## copy these into G + } else { + ## extended family - just initializes theta and possibly y + pini <- G$family$preinitialize(G$y,G$family) + #if (!is.null(pini$Theta)) G$family$putTheta(pini$Theta) ## DON'T - will reset stored theta! + if (!is.null(pini$y)) G$y <- pini$y + } + } + + X <- G$X + G$prior.weights <- G$w + G$sp <- b$sp + reml <- rep(0,nip) + dens <- list() ## list of lists of splines representing target density + beta.lim <- matrix(0,ncol(X),2) ## evaluation limits for each A beta + p <- ncol(b$Vp) + if (!is.null(A)) { ## a transformation of original parameters is required + if (is.matrix(A)||length(A)==p) { ## linear transforms needed + B <- Acomp(A,is.null(G$Xd)) ## use orthogonal method only with gam fitting + pa <- nrow(A) + kind <- 1:pa + } else { ## just a list of elements of beta + A <- round(A) + pa <- length(A) + if (max(A)>p||min(A)<1||pa>p) stop("something wrong with A index vector") + kind <- A + B <- list() + } + } else { pa=p;kind <- 1:pa;B <- list()} + iprog <- 0 + if (prog) prg <- txtProgressBar(min = 0, max = (nip+1)*pa, initial = 0, + char = "=",width = NA, title="Progress", style = 3) + for (qq in 0:nip) { ## integration loop + dens[[qq+1]] <- list() ## elements will be log densities for each target parameter + if (qq>0) { ## then a new fit is needed + sp <- drop(t(rV) %*% ip$D[qq,]) + lsp + sp <- pmin(pmax(sp,lsp-10),lsp+10) ## avoid +/- inf + wprior <- -sum((sp-lsp)^2/(200)) ## include the prior assumed in sp.vcov + if (n.theta>0) { ## set family hyper-parameters + ii <- 1:n.theta + G$family$putTheta(sp[ii]) + G$n.theta <- 0 ## fixed not estimated now. 
+ sp <- sp[-ii] + } + if (scale.estimated) { + scale <- exp(sp[length(sp)]) + sp <- sp[-length(sp)] + } else scale <- 1 + sp <- exp(sp) + if (inherits(G,"gam.prefit")) b <- gam(G=G0,method="REML",sp=sp,scale=scale) else + b <- bam(G=G0,sp=sp,scale=scale) + G$sp <- sp + reml[qq] <- -b$gcv.ubre + wprior + } else max.reml <- -b$gcv.ubre ## maximum LAML + G$family <- b$family + G$sig2 <- b$sig2 + + beta <- coef(b) + if (!is.null(B$B)) { ## a transformation of original parameters is required + b$Vp <- B$B%*%b$Vp%*%t(B$B) + beta <- drop(B$B%*%beta) + } + if (approx<2) { + H <- cholinv(b$Vp) ## get Hessian - would be better returned directly by gam/bam + dpc <- 1/sqrt(diag(H)) ## diagonal pre-conditioning + R1 <- chol(dpc*t(H*dpc),pivot=TRUE) + piv <- attr(R1,"pivot") + } + sd <- diag(b$Vp)^.5 ## standard dev of Gaussian approximation + BM <- matrix(0,p,nk) ## storage for beta[k] conditional posterior modes + inla <- list(density=matrix(0,pa,nb),beta=matrix(0,pa,nb)) ## storage for marginals + ldet <- dens0 <- rep(0,nk) + eps <- .0001 + qn <- qnorm(seq(eps,1-eps,length=nk)) + kk <- 0 + for (k in kind) { + kk <- kk + 1 ## counter for output arrays + if (approx<2) { ## need R'R = H[-k,-k] (pivoted & pre-conditioned) + kd <- which(piv==k) ## identify column of pivoted R1 corresponding to col k in H + R <- choldrop(R1,kd) ## update R + pivk <- piv[-kd]; pivk[pivk>k] <- pivk[pivk>k]-1 ## pivots updated + attr(R,"pivot") <- pivk ## pivots of updated R + attr(R,"dpc") <- dpc[-k] ## diagonal pre-conditioner + ldetH <- 2*(sum(log(diag(R)))-sum(log(dpc[-k]))) ## log det of H[-k,-k] + } + bg <- qn*sd[k]+beta[k] + BM[k,] <- bg + BM[-k,] <- beta[-k] + b$Vp[-k,k]%*%((t(bg)-beta[k])/b$Vp[k,k]) ## Gaussian approx. + if (approx==0) { ## get actual modes + db <- db0 <- beta*0 + for (i in c((nk/2):1,(nk/2):nk)) { + beta0 <- BM[,i] + db0 + nn <- logf(beta0,G,B$Bi,X,deriv=1) + if (is.finite(nn$ll)) for (j in 1:20) { ## newton loop + if (max(abs(nn$dd[-k]))<1e-4*abs(nn$ll)) break + # db[-k] <- -backsolve(R,forwardsolve(Rt,nn$dd[-k])) + db[-k] <- -Rsolve(R,nn$dd[-k]) + beta1 <- beta0 + db + nn1 <- logf(beta1,G,B$Bi,X,deriv=1) + get.deriv <- FALSE + hstep <- 0 + while (!is.finite(nn1$ll) || nn1$ll>nn$ll) { + db <- db/2; + hstep <- hstep+1 + beta1 <- beta0 + db + nn1 <- logf(beta1,G,B$Bi,X,deriv=0) + get.deriv <- TRUE + } + if (get.deriv) nn1 <- logf(beta1,G,B$Bi,X,deriv=1) + nn <- nn1 + beta0 <- beta1 + } ## newton loop + db0 <- if (i==1) 0 else beta0 - BM[,i] + BM[,i] <- beta0 + dens0[i] <- nn$ll + } + } else for (i in 1:nk) dens0[i] <- logf(BM[,i],G,B$Bi,X,deriv=0)$ll + ## now get the log determinant correction... + if (approx<2) { + if (J>1) { + vb <- apply(BM,1,var);vb[k] <- 0 + j <- length(vb) + del <- which(rank(vb)%in%(j:(j-J+2))) + } + step.length <- mean(colSums((BM - beta)^2)^.5)/20 + D <- rep(c(-1,1),J) + ## create matrix of steps and matrix of evaluated gradient at steps + for (i in 1:nk) if (is.finite(dens0[i])) { + bm <- BM[,i] + db <- beta - bm;db[k] <- 0 + db <- db/sqrt(sum(db^2))*step.length + for (j in 1:J) { + h = H[-k,-k] %*% db[-k] + if (j>1) u%*%(D[1:(2*(j-1))]*(t(u)%*%db[-k])) else 0 + g1 <- glogf(bm+db/2,G,B$Bi,X) - glogf(bm-db/2,G,B$Bi,X) + v <- cbind(h/sqrt(sum(db[-k]*h)),g1[-k]/sqrt(sum(db*g1))) + u <- if (j>1) cbind(v,u) else v + db <- -db; if (jmaxd*5e-3) { + ok <- FALSE + bg0 <- bg0 - sd[k] + } + if (inla$density[kk,nb]>maxd*5e-3) { + ok <- FALSE + bg1 <- bg1 + sd[k] + } + } + ## normalizing interpolant as well... 
+ din$coefficients[,1] <- din$coefficients[,1] - log(n.const) + dens[[qq+1]][[kk]] <- din + if (qq==0||bg0<beta.lim[k,1]) beta.lim[k,1] <- bg0 + if (qq==0||bg1>beta.lim[k,2]) beta.lim[k,2] <- bg1 + ## normalize + iprog <- iprog + 1 + if (prog) setTxtProgressBar(prg, iprog) + if (interactive) + { plot(inla$beta[kk,],inla$density[kk,],ylim=range(inla$density[kk,])*1.2,type="l",col=2, + xlab=bquote(beta[.(k)]),ylab="density") + lines(inla$beta[kk,],dnorm(inla$beta[kk,],mean=beta[k],sd=sd[k])) + if (interactive==2) readline() + } + } ## beta loop + } ## integration loop + ## now the actual integration, if nip>0, otherwise we are done + if (nip) { + ## normalize the integration weights + reml <- c(ip$k0,ip$k1*exp(reml-max.reml)) + reml <- reml/sum(reml) + for (k in 1:pa) { ## parameter loop + inla$beta[k,] <- seq(beta.lim[k,1],beta.lim[k,2],length=nb) ## output beta sequence + inla$density[k,] <- exp(predict(dens[[1]][[k]],inla$beta[k,])$y)*reml[1] ## normalized density + } + for (qq in 2:(nip+1)) { + for (k in 1:pa) { ## parameter loop + inla$density[k,] <- inla$density[k,] + + exp(predict(dens[[qq]][[k]],inla$beta[k,])$y)*reml[qq] ## normalized density + } + } + inla$reml <- reml + } ## if nip + if (prog) cat("\n") + inla +} ## ginla or gam inla newton enhanced (ginlane) diff -Nru mgcv-1.8-26/R/mgcv.r mgcv-1.8-27/R/mgcv.r --- mgcv-1.8-26/R/mgcv.r 2018-11-13 12:24:33.000000000 +0000 +++ mgcv-1.8-27/R/mgcv.r 2019-02-01 22:55:51.000000000 +0000 @@ -369,8 +369,13 @@ } av <- unique(av) ## strip out duplicate variable names pav <- unique(pav) - ret$fake.formula <- if (length(av)>0) reformulate(av,response=ret[[1]]$response) else - ret[[1]]$fake.formula ## create fake formula containing all variables + if (length(av)>0) { + ## work around - reformulate with response = "log(x)" will treat log(x) as a name, + ## not the call it should be... + fff <- formula(paste(ret[[1]]$response,"~ .")) + ret$fake.formula <- reformulate(av,response=ret[[1]]$response) + ret$fake.formula[[2]] <- fff[[2]] ## fix messed up response + } else ret$fake.formula <- ret[[1]]$fake.formula ## create fake formula containing all variables ret$pred.formula <- if (length(pav)>0) reformulate(pav) else ~1 ## predictor only formula ret$response <- ret[[1]]$response ret$nlp <- nlp ## number of linear predictors @@ -1596,7 +1601,7 @@ method <- "REML" ## any method you like as long as it's REML G$Sl <- Sl.setup(G) ## prepare penalty sequence - ## Xorig <- G$X ## store original X incase it is needed by family - poor option pre=proc can manipulate G$X + G$X <- Sl.initial.repara(G$Sl,G$X,both.sides=FALSE) ## re-parameterize accordingly if (!is.null(start)) start <- Sl.initial.repara(G$Sl,start,inverse=FALSE,both.sides=FALSE) @@ -1690,7 +1695,10 @@ ## extended family may need to manipulate G...
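Taking the new inla.r file as a whole, the code above is driven through the single user-facing routine ginla. A minimal usage sketch follows; the calling pattern (a "gam.prefit" object from gam(..., fit=FALSE)) is inferred from the code above and the gam() change further below, default arguments are assumed, and ?ginla remains the authoritative reference:

    library(mgcv)
    set.seed(1)
    dat <- gamSim(1, n = 200, scale = 2)                    ## standard mgcv simulation data
    G <- gam(y ~ s(x0) + s(x1), data = dat, fit = FALSE)    ## set up but do not estimate: "gam.prefit" object
    ga <- ginla(G)                                          ## marginal posteriors for all coefficients
    ## row i of ga$beta / ga$density holds the evaluation grid and density for coefficient i
    plot(ga$beta[1, ], ga$density[1, ], type = "l", xlab = "coefficient 1", ylab = "density")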
if (!is.null(G$family$preinitialize)) { - if (inherits(G$family,"general.family")) eval(G$family$preinitialize) else { + if (inherits(G$family,"general.family")) { + Gmod <- G$family$preinitialize(G) ## modifies some elements of G + for (gnam in names(Gmod)) G[[gnam]] <- Gmod[[gnam]] ## copy these into G + } else { ## extended family - just initializes theta and possibly y pini <- G$family$preinitialize(G$y,G$family) if (!is.null(pini$Theta)) G$family$putTheta(pini$Theta) @@ -2014,9 +2022,23 @@ G$formula <- formula G$pred.formula <- gp$pred.formula environment(G$formula)<-environment(formula) + } else { ## G not null + if (!is.null(sp)&&any(sp>=0)) { ## request to modify smoothing parameters + if (is.null(G$L)) G$L <- diag(length(G$sp)) + if (length(sp)!=ncol(G$L)) stop('length of sp must be number of free smoothing parameters in original model') + ind <- sp>=0 ## which smoothing parameters are now fixed + spind <- log(sp[ind]); + spind[!is.finite(spind)] <- -30 ## set any zero parameters to effective zero + G$lsp0 <- G$lsp0 + drop(G$L[,ind,drop=FALSE] %*% spind) ## add fix to lsp0 + G$L <- G$L[,!ind,drop=FALSE] ## drop the cols of G + G$sp <- rep(-1,ncol(G$L)) + } } - if (!fit) return(G) + if (!fit) { + class(G) <- "gam.prefit" + return(G) + } if (ncol(G$X)>nrow(G$X)) stop("Model has more coefficients than data") @@ -2932,7 +2954,7 @@ #lpi <- list();pst <- c(pstart,ncol(X)+1) #for (i in 1:(length(pst)-1)) lpi[[i]] <- pst[i]:(pst[i+1]-1) attr(X,"lpi") <- lpi - ffv <- fam$predict(fam,se.fit,y=response,X=X,beta=object$coefficients, + ffv <- fam$predict(fam,se.fit,y=response[start:stop],X=X,beta=object$coefficients, off=offs,Vb=object$Vp) if (is.matrix(fit)&&!is.matrix(ffv[[1]])) { fit <- fit[,1]; if (se.fit) se <- se[,1] @@ -2959,7 +2981,7 @@ if (se.fit) se[start:stop]<-se[start:stop]*abs(dmu.deta(fit[start:stop])) fit[start:stop] <- linkinv(fit[start:stop]) } else { ## family has its own prediction code for response case - ffv <- fam$predict(fam,se.fit,y=response,X=X,beta=object$coefficients,off=offs,Vb=object$Vp) + ffv <- fam$predict(fam,se.fit,y=response[start:stop],X=X,beta=object$coefficients,off=offs,Vb=object$Vp) if (is.null(fit1)&&is.matrix(ffv[[1]])) { fit1 <- matrix(0,np,ncol(ffv[[1]])) if (se.fit) se1 <- fit1 @@ -3840,11 +3862,17 @@ } -sp.vcov <- function(x) { +sp.vcov <- function(x,edge.correct=TRUE,reg=1e-3) { ## get cov matrix of smoothing parameters, if available if (!inherits(x,"gam")) stop("argument is not a gam object") if (x$method%in%c("ML","P-ML","REML","P-REML","fREML")&&!is.null(x$outer.info$hess)) { - return(solve(x$outer.info$hess)) + hess <- x$outer.info$hess + p <- ncol(hess) + if (edge.correct&&!is.null(attr(hess,"hess1"))) { + V <- solve(attr(hess,"hess1")+diag(p)*reg) + attr(V,"lsp") <- attr(hess,"lsp1") + return(V) + } else return(solve(hess+reg)) } else return(NULL) } diff -Nru mgcv-1.8-26/R/misc.r mgcv-1.8-27/R/misc.r --- mgcv-1.8-26/R/misc.r 2018-11-20 20:38:53.000000000 +0000 +++ mgcv-1.8-27/R/misc.r 2018-11-26 17:09:05.000000000 +0000 @@ -242,6 +242,19 @@ Rup } ## choldrop +cholup <- function(R,u,up=TRUE) { +## routine to update Cholesky factor R to the factor of R'R + uu' (up == TRUE) +## or R'R - uu' (up=FALSE). 
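In use, this wrapper and the companion choldrop behave as follows; a minimal sketch, assuming both routines are exported in the released package and R is upper triangular as returned by chol():

    set.seed(2)
    A <- crossprod(matrix(rnorm(30), 6, 5)) + diag(5)   ## a positive definite test matrix
    R <- chol(A)                                        ## upper triangular, crossprod(R) == A
    u <- runif(5)
    R1 <- cholup(R, u, up = TRUE)                       ## factor of A + u %*% t(u)
    range(crossprod(R1) - (A + tcrossprod(u)))          ## ~ machine zero
    R2 <- choldrop(R, 2)                                ## factor of A[-2,-2]
    range(crossprod(R2) - A[-2, -2])                    ## ~ machine zero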
+ n <- as.integer(ncol(R)) + up <- as.integer(up) + eps <- as.double(.Machine$double.eps) + R1 <- R * 1.0 + .Call(C_mgcv_chol_up,R1,u,n,up,eps) + if (up==0) if ((n>1 && R1[2,1] < -1)||(n==1&&u[1]>R[1])) stop("update not positive definite") + R1 +} ## cholup + + vcorr <- function(dR,Vr,trans=TRUE) { ## Suppose b = sum_k op(dR[[k]])%*%z*r_k, z ~ N(0,Ip), r ~ N(0,Vr). vcorr returns cov(b). ## dR is a list of p by p matrices. 'op' is 't' if trans=TRUE and I() otherwise. diff -Nru mgcv-1.8-26/R/mvam.r mgcv-1.8-27/R/mvam.r --- mgcv-1.8-26/R/mvam.r 2017-04-11 14:08:23.000000000 +0000 +++ mgcv-1.8-27/R/mvam.r 2018-12-04 18:18:34.000000000 +0000 @@ -88,10 +88,44 @@ ## initialization has to add in the extra parameters of ## the cov matrix... - preinitialize <- expression({ +# preinitialize <- expression({ ## code to evaluate in estimate.gam... ## extends model matrix with dummy columns and ## finds initial coefficients +# ydim <- ncol(G$y) ## dimension of response +# nbeta <- ncol(G$X) +# ntheta <- ydim*(ydim+1)/2 ## number of cov matrix factor params +# lpi <- attr(G$X,"lpi") +# #offs <- attr(G$X,"offset") +# XX <- crossprod(G$X) +# G$X <- cbind(G$X,matrix(0,nrow(G$X),ntheta)) ## add dummy columns to G$X +# #G$cmX <- c(G$cmX,rep(0,ntheta)) ## and corresponding column means +# G$term.names <- c(G$term.names,paste("R",1:ntheta,sep=".")) +# attr(G$X,"lpi") <- lpi +# #offs -> attr(G$X,"offset") +# attr(G$X,"XX") <- XX +# ## pad out sqrt of balanced penalty matrix to account for extra params +# attr(G$Sl,"E") <- cbind(attr(G$Sl,"E"),matrix(0,nbeta,ntheta)) +# G$family$data <- list(ydim = ydim,nbeta=nbeta) +# G$family$ibeta = rep(0,ncol(G$X)) +# ## now get initial parameters and store in family... +# for (k in 1:ydim) { +# sin <- G$off %in% lpi[[k]] + # #Sk <- G$S[sin] +# um <- magic(G$y[,k],G$X[,lpi[[k]]],rep(-1,sum(sin)),G$S[sin], +# match(G$off[sin],lpi[[k]]), ## turn G$off global indices into indices for this predictor +# nt=control$nthreads) +# G$family$ibeta[lpi[[k]]] <- um$b +# G$family$ibeta[nbeta+1] <- -.5*log(um$scale) ## initial log root precision +# nbeta <- nbeta + ydim - k + 1 +# } +# }) + + preinitialize <- function(G) { + ## G is a gam pre-fit object. Pre-initialize can manipulate some of its + ## elements, returning a named list of the modified ones. + ## extends model matrix with dummy columns and + ## finds initial coefficients ydim <- ncol(G$y) ## dimension of response nbeta <- ncol(G$X) ntheta <- ydim*(ydim+1)/2 ## number of cov matrix factor params @@ -105,7 +139,7 @@ #offs -> attr(G$X,"offset") attr(G$X,"XX") <- XX ## pad out sqrt of balanced penalty matrix to account for extra params - attr(G$Sl,"E") <- cbind(attr(G$Sl,"E"),matrix(0,nbeta,ntheta)) + if (!is.null(G$Sl)) attr(G$Sl,"E") <- cbind(attr(G$Sl,"E"),matrix(0,nbeta,ntheta)) G$family$data <- list(ydim = ydim,nbeta=nbeta) G$family$ibeta = rep(0,ncol(G$X)) ## now get initial parameters and store in family... 
@@ -113,15 +147,17 @@ sin <- G$off %in% lpi[[k]] #Sk <- G$S[sin] um <- magic(G$y[,k],G$X[,lpi[[k]]],rep(-1,sum(sin)),G$S[sin], - match(G$off[sin],lpi[[k]]), ## turn G$off global indices into indices for this predictor - nt=control$nthreads) + match(G$off[sin],lpi[[k]])) # , ## turn G$off global indices into indices for this predictor + #nt=control$nthreads) G$family$ibeta[lpi[[k]]] <- um$b G$family$ibeta[nbeta+1] <- -.5*log(um$scale) ## initial log root precision nbeta <- nbeta + ydim - k + 1 } - }) - - postproc <- expression({ + list(X=G$X,term.names=G$term.names,family=G$family) + } ## preinitialize + + + postproc <- expression({ ## code to evaluate in estimate.gam, to do with estimated factor of ## precision matrix, etc... ydim <- G$family$data$ydim diff -Nru mgcv-1.8-26/R/smooth.r mgcv-1.8-27/R/smooth.r --- mgcv-1.8-26/R/smooth.r 2018-09-15 12:51:56.000000000 +0000 +++ mgcv-1.8-27/R/smooth.r 2019-01-18 14:12:15.000000000 +0000 @@ -1,4 +1,4 @@ -## R routines for the package mgcv (c) Simon Wood 2000-2016 +## R routines for the package mgcv (c) Simon Wood 2000-2019 ## This file is primarily concerned with defining classes of smoother, ## via constructor methods and prediction matrix methods. There are diff -Nru mgcv-1.8-26/src/coxph.c mgcv-1.8-27/src/coxph.c --- mgcv-1.8-26/src/coxph.c 2018-11-20 20:40:48.000000000 +0000 +++ mgcv-1.8-27/src/coxph.c 2018-12-04 06:56:50.000000000 +0000 @@ -22,7 +22,7 @@ /* Function to predict the survivor function for the new data in X (n by p), t, given fit results in a, h, q, Vb, and original event times tr (length nt). - The new data are in descending order on entry, as is tr. + The new data are in descending time order on entry, as is tr. On exit n - vectors s and se contain the estimated survival function and its se. */ double eta,*p1,*p2,*p3,*v,*pv,*pa,x,vVv,hi,exp_eta; diff -Nru mgcv-1.8-26/src/init.c mgcv-1.8-27/src/init.c --- mgcv-1.8-26/src/init.c 2018-11-20 20:40:48.000000000 +0000 +++ mgcv-1.8-27/src/init.c 2018-11-26 17:01:53.000000000 +0000 @@ -11,7 +11,8 @@ {"mgcv_pmmult2", (DL_FUNC) &mgcv_pmmult2,5}, {"mgcv_Rpiqr", (DL_FUNC) &mgcv_Rpiqr,5}, { "mgcv_tmm",(DL_FUNC)&mgcv_tmm,5}, - { "mgcv_chol_down",(DL_FUNC)&mgcv_chol_down,5}, + { "mgcv_chol_down",(DL_FUNC)&mgcv_chol_down,5}, + { "mgcv_chol_up",(DL_FUNC)&mgcv_chol_up,5}, { "mgcv_Rpbsi",(DL_FUNC)&mgcv_Rpbsi,2}, { "mgcv_RPPt",(DL_FUNC)&mgcv_RPPt,3}, { "mgcv_Rpchol",(DL_FUNC)&mgcv_Rpchol,4}, diff -Nru mgcv-1.8-26/src/mat.c mgcv-1.8-27/src/mat.c --- mgcv-1.8-26/src/mat.c 2018-11-20 20:40:48.000000000 +0000 +++ mgcv-1.8-27/src/mat.c 2018-11-26 17:07:06.000000000 +0000 @@ -1722,6 +1722,15 @@ chol_down(R,Rup,n,k,ut); } +inline double hypot(double x, double y) { +/* stable computation of sqrt(x^2 + y^2) */ + double t; + x = fabs(x);y=fabs(y); + if (y>x) { t = x;x = y; y = t;} + if (x==0) return(y); else t = y/x; + return(x*sqrt(1+t*t)); +} /* hypot */ + void chol_down(double *R,double *Rup, int *n,int *k,int *ut) { /* R is an n by n choleski factor of an n by n matrix A. We want the downdated factor for A[-k,-k] returned in n-1 by n-1 matrix Rup. @@ -1731,7 +1740,6 @@ columnwise. Calls from R should ideally be made from a wrapper called from .Call, since otherwise copying can be the dominant cost. - Currently called directly with .C for accuracy testing. Assumes Rup zeroed on entry. 
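The point of the guarded form of the hypot helper above is floating point range: squaring the arguments directly can overflow even when the true result is representable. The same rescaling trick (divide through by the larger magnitude, as the C code does) is easy to demonstrate from R — a rough sketch:

    x <- 3e200; y <- 4e200
    sqrt(x^2 + y^2)               ## Inf: x^2 already overflows double precision
    m <- max(abs(x), abs(y))
    m * sqrt((x/m)^2 + (y/m)^2)   ## 5e200, the correct value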
*/ int i,j,n1; double x,*Ri1,*Ri,*Rj,c,s,*Re,*ca,*sa,*sp,*cp; @@ -1778,7 +1786,8 @@ for (i = *k;i R[j,j] */ + z0 = hypot(z,*x); /* sqrt(z^2+R[j,j]^2) */ + c0 = *x/z0; s0 = z/z0; /* need to zero z */ + /* now apply this rotation and this column is finished (so no need to update z) */ + *x = s0 * z + c0 * *x; + } else for (j1=-1,j=0;j<*n;j++,u++,j1++) { /* loop over columns of R for down-dating */ + z = *u; /* initial element of u */ + x = R + *n * j; /* current column */ + c = R + 2;s = R + *n + 2; /* Storage for first n-2 hyperbolic rotations */ + for (c1=c+j1;c R[j,j] */ + z0 = z / *x; /* sqrt(z^2+R[j,j]^2) */ + if (fabs(z0>=1)) { /* downdate not +ve def */ + //Rprintf("j = %d d = %g ",j,z0); + if (*n>1) R[1] = -2.0;return; /* signals error */ + } + if (z0 > 1 - *eps) z0 = 1 - *eps; + c0 = 1/sqrt(1-z0*z0);s0 = c0 * z0; + /* now apply this rotation and this column is finished (so no need to update z) */ + *x = -s0 * z + c0 * *x; + } + + /* now zero c and s storage */ + c = R + 2;s = R + *n + 2; + for (x = c + *n - 2;c