bam.update
Gaussian models with identity link fitted by bam can be updated efficiently as new data become available, by updating the QR decomposition
on which estimation is based and re-optimizing the smoothing parameters, starting from the previous estimates.
This routine implements that update.
bam.update(b,data,chunk.size=10000)
b: A gam object fitted by bam and representing a strictly additive model (i.e. gaussian errors, identity link).
data: Extra data to augment the original data used to obtain b. Must include a weights column if the original fit was weighted, and an AR.start column if AR.start was non-NULL in the original fit.
chunk.size: Size of subsets of data to process in one go when getting fitted values.
bam.update returns an object of class "gam", as described in gamObject.
bam.update updates the QR decomposition of the (weighted) model matrix of the GAM represented by b to take
account of the new data. The orthogonal factor multiplied by the response vector is also updated. Given these updates the model
and smoothing parameters can be re-estimated, as if the whole dataset (original and the new data) had been fitted in one go. The
function will use the same AR1 model for the residuals as that employed in the original model fit (see rho parameter
of bam).
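The updating idea described above can be illustrated for an unpenalized least squares problem. The following is a minimal base R sketch, not the code used by bam.update: it assumes no penalties, weights or AR1 structure, and all object names (X0, y0, X1, y1 and so on) are made up for the illustration. It keeps only the R factor and the first p elements of Q'y from the original fit, folds in the new rows, and compares the result with a fit to all the data at once.

```r
## Sketch only: QR updating for ordinary least squares in base R.
## Names are illustrative and are not mgcv internals.
set.seed(1)
n0 <- 200; n1 <- 50; p <- 4
X0 <- cbind(1, matrix(rnorm(n0*(p-1)), n0, p-1)) # original model matrix
X1 <- cbind(1, matrix(rnorm(n1*(p-1)), n1, p-1)) # model matrix rows for the new data
beta <- c(1, -2, 0.5, 3)
y0 <- drop(X0 %*% beta) + rnorm(n0)
y1 <- drop(X1 %*% beta) + rnorm(n1)

## original fit: keep R (p by p) and the first p elements of Q'y
qr0 <- qr(X0)
R0 <- qr.R(qr0)
f0 <- qr.qty(qr0, y0)[1:p]

## update: QR decompose R0 stacked on the new rows, and transform c(f0, y1)
qr1 <- qr(rbind(R0, X1))
R1 <- qr.R(qr1)
f1 <- qr.qty(qr1, c(f0, y1))[1:p]
beta.update <- backsolve(R1, f1)

## reference: all the data fitted in one go
beta.full <- qr.coef(qr(rbind(X0, X1)), c(y0, y1))

## the two sets of coefficients should agree to rounding error
max(abs(beta.update - beta.full))
```

In this sketch only the p by p factor and a vector of length p have to be kept between updates, rather than the original data.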
Note that there may be small numerical differences in fit between fitting the data all at once, and fitting in stages by updating, if the smoothing bases used have any of their details set with reference to the data (e.g. default knot locations).
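As a rough illustration of why data-dependent basis details can cause such differences, the base R sketch below places evenly spaced knots over the observed covariate range. The helper even_knots is hypothetical and only imitates data-dependent knot placement in general; it is not mgcv's knot placement code.

```r
## Sketch only: knots placed with reference to the data differ between
## a first batch of data and the complete data.
## even_knots is a hypothetical helper, not part of mgcv.
even_knots <- function(x, k) seq(min(x), max(x), length.out = k)

set.seed(2)
x.old <- runif(4000, 0.1, 0.9) # covariate values available for the original fit
x.new <- runif(1000, 0, 1)     # later values covering a wider range

kn.staged <- even_knots(x.old, 10)         # knots set from the first batch only
kn.full <- even_knots(c(x.old, x.new), 10) # knots set from all the data

range(kn.staged)
range(kn.full)
```

When the new data extend the covariate range, the two sets of knots differ, so a basis fixed at the first fit would not be expected to match exactly the one built from all the data.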
AIC computation does not currently take account of the AR model, if one is used.
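For readers unfamiliar with the AR1 residual model, the following base R sketch shows what the correlation parameter means. It is a generic illustration under the assumption of a known rho, and makes no claim about how bam handles rho internally; the object names are made up for the illustration.

```r
## Sketch only: AR1 errors and the effect of differencing with a known rho.
set.seed(3)
n <- 2000; rho <- 0.7
eps <- rnorm(n)  # independent innovations
e <- numeric(n)
e[1] <- eps[1]
for (i in 2:n) e[i] <- rho*e[i-1] + eps[i] # AR1 errors

## subtracting rho times the previous error recovers the innovations
u <- e[-1] - rho*e[-n]

r.e <- cor(e[-1], e[-n])           # lag 1 correlation of the AR1 errors
r.u <- cor(u[-1], u[-length(u)])   # lag 1 correlation after the transformation
c(r.e, r.u)
```

The transformed series u has lag 1 correlation close to zero, which is the sense in which a model with the correct rho reduces the problem to one with independent errors.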
library(mgcv)
## following is not *very* large, for obvious reasons...
set.seed(8)
n <- 5000
dat <- gamSim(1,n=n,dist="normal",scale=5)
#> Gu & Wahba 4 term additive model
dat[c(50,13,3000,3005,3100),]<- NA
dat1 <- dat[(n-999):n,]
dat0 <- dat[1:(n-1000),]
## basis type and dimension are given literally in each formula, so that the
## smooth terms can be re-evaluated by bam.update without looking up
## variables named bs and k
method <- "GCV.Cp"
b <- bam(y ~ s(x0,bs="ps",k=20)+s(x1,bs="ps",k=20)+s(x2,bs="ps",k=20)+
         s(x3,bs="ps",k=20),data=dat0,method=method)
b1 <- bam.update(b,dat1)
b2 <- bam.update(bam.update(b,dat1[1:500,]),dat1[501:1000,])
b3 <- bam(y ~ s(x0,bs="ps",k=20)+s(x1,bs="ps",k=20)+s(x2,bs="ps",k=20)+
          s(x3,bs="ps",k=20),data=dat,method=method)
b1;b2;b3
## example with AR1 errors...
e <- rnorm(n)
for (i in 2:n) e[i] <- e[i-1]*.7 + e[i]
dat$y <- dat$f + e*3
dat[c(50,13,3000,3005,3100),] <- NA
dat1 <- dat[(n-999):n,]
dat0 <- dat[1:(n-1000),]
b <- bam(y ~ s(x0,bs="ps",k=20)+s(x1,bs="ps",k=20)+s(x2,bs="ps",k=20)+
         s(x3,bs="ps",k=20),data=dat0,rho=0.7)
b1 <- bam.update(b,dat1)
summary(b1);summary(b2);summary(b3)