lms.bcn.RdLMS quantile regression with the Box-Cox transformation to normality.
A numerical vector containing values between 0 and 100, which are the quantiles. They will be returned as `fitted values'.
Can be an integer-valued vector specifying which
linear/additive predictors are modelled as intercepts only.
The values must be from the set {1,2,3}.
The default value usually increases the chance of successful
convergence.
Setting zero = NULL means they all are
functions of the covariates.
For more information see CommonVGAMffArguments.
Parameter link functions applied to the first, second and third
linear/additive predictors.
See Links for more choices,
and CommonVGAMffArguments.
Degrees of freedom for the cubic smoothing spline fit applied to
get an initial estimate of mu.
See vsmooth.spline.
Degrees of freedom for the cubic smoothing spline fit applied to
get an initial estimate of sigma.
See vsmooth.spline.
This argument may be assigned NULL to get an initial value
using some other algorithm.
Initial value for lambda. If necessary, it is recycled to be a vector of length \(n\) where \(n\) is the number of (independent) observations.
Optional initial value for sigma.
If necessary, it is recycled to be a vector of length \(n\).
The default value, NULL, means an initial value is
computed in the @initialize slot of the family function.
Small positive number, the tolerance for testing if lambda is equal to zero.
Given a value of the covariate, this function applies a Box-Cox transformation to the response to best obtain normality. The parameters chosen to do this are estimated by maximum likelihood or penalized maximum likelihood.
In more detail,
the basic idea behind this method is that, for a fixed
value of \(x\), a Box-Cox transformation of the
response \(Y\)
is applied to obtain standard normality. The 3 parameters
(\(\lambda\), \(\mu\), \(\sigma\),
which start with the letters “L-M-S”
respectively, hence its name) are chosen to maximize a penalized
log-likelihood (with vgam). Then the
appropriate quantiles of the standard normal distribution
are back-transformed onto the original scale to get the
desired quantiles.
The three parameters may vary as a smooth function of \(x\).
The Box-Cox power transformation here of the \(Y\), given \(x\), is $$Z = [(Y/\mu(x))^{\lambda(x)} - 1]/(\sigma(x)\,\lambda(x))$$ for \(\lambda(x) \neq 0\). (The singularity at \(\lambda(x) = 0\) is handled by a simple function involving a logarithm.) Then \(Z\) is assumed to have a standard normal distribution. The parameter \(\sigma(x)\) must be positive, therefore VGAM chooses \(\eta(x)^T = (\lambda(x), \mu(x), \log(\sigma(x)))\) by default. The parameter \(\mu\) is also positive, but while \(\log(\mu)\) is available, it is not the default because \(\mu\) is more directly interpretable. Given the estimated linear/additive predictors, the \(100\alpha\) percentile can be estimated by inverting the Box-Cox power transformation at the \(100\alpha\) percentile of the standard normal distribution.
Of the three functions, it is often a good idea to allow
\(\mu(x)\) to be more flexible because the functions
\(\lambda(x)\) and \(\sigma(x)\)
usually vary more smoothly with \(x\). This is somewhat
reflected in the default value for the argument zero,
viz. zero = c(1, 3).
An object of class "vglmff"
(see vglmff-class).
The object is used by modelling functions
such as vglm,
rrvglm
and vgam.
Cole, T. J. and Green, P. J. (1992). Smoothing Reference Centile Curves: The LMS Method and Penalized Likelihood. Statistics in Medicine, 11, 1305–1319.
Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach, London: Chapman & Hall.
Yee, T. W. (2004). Quantile regression via vector generalized additive models. Statistics in Medicine, 23, 2295–2315.
The response must be positive because the Box-Cox transformation cannot handle negative values. In theory, the LMS-Yeo-Johnson-normal method can handle both positive and negative values.
In general, the lambda and sigma functions should be more
smoother than the mean function.
Having zero = 1, zero = 3
or zero = c(1, 3)
is often a good idea. See the example below.
The computations are not simple, therefore convergence may
fail. Set trace = TRUE to monitor convergence if it
isn't set already. Convergence failure will occur if, e.g.,
the response is bimodal at any particular value of \(x\).
In case of convergence failure, try different starting values.
Also, the estimate may diverge quickly near the solution, in
which case try prematurely stopping the iterations by assigning
maxits to be the iteration number corresponding to the
highest likelihood value.
One trick is to fit a simple model and use it to provide initial values for a more complex model; see in the examples below.
lms.bcg,
lms.yjn,
qtplot.lmscreg,
deplot.lmscreg,
cdf.lmscreg,
eCDF,
extlogF1,
alaplace1,
amlnormal,
denorm,
CommonVGAMffArguments.
if (FALSE) require("VGAMdata")
mysub <- subset(xs.nz, sex == "M" & ethnicity == "Maori" & study1)
#> Error: object 'xs.nz' not found
mysub <- transform(mysub, BMI = weight / height^2)
#> Error: object 'mysub' not found
BMIdata <- na.omit(mysub)
#> Error: object 'mysub' not found
BMIdata <- subset(BMIdata, BMI < 80 & age < 65,
select = c(age, BMI)) # Delete an outlier
#> Error: object 'BMIdata' not found
summary(BMIdata)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'summary': object 'BMIdata' not found
fit <- vgam(BMI ~ s(age, df = c(4, 2)), lms.bcn(zero = 1), BMIdata)
#> Error: object 'BMIdata' not found
par(mfrow = c(1, 2))
plot(fit, scol = "blue", se = TRUE) # The two centered smooths
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'plot': object 'fit' not found
head(predict(fit))
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'predict': object 'fit' not found
head(fitted(fit))
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'fitted': object 'fit' not found
head(BMIdata)
#> Error: object 'BMIdata' not found
head(cdf(fit)) # Person 46 is probably overweight, given his age
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'cdf': object 'fit' not found
100 * colMeans(c(depvar(fit)) < fitted(fit)) # Empirical proportions
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'depvar': object 'fit' not found
# Correct for "vgam" objects but not very elegant:
fit@family@linkinv(eta = predict(fit, data.frame(age = 60)),
extra = list(percentiles = c(10, 50)))
#> Error: object 'fit' not found
if (FALSE) {
# These work for "vglm" objects:
fit2 <- vglm(BMI ~ bs(age, df = 4), lms.bcn(zero = 3), BMIdata)
predict(fit2, percentiles = c(10, 50),
newdata = data.frame(age = 60), type = "response")
head(fitted(fit2, percentiles = c(10, 50))) # Different percentiles
}
# Convergence problems? Use fit0 for initial values for fit1
fit0 <- vgam(BMI ~ s(age, df = 4), lms.bcn(zero = c(1, 3)), BMIdata)
#> Error: object 'BMIdata' not found
fit1 <- vgam(BMI ~ s(age, df = c(4, 2)), lms.bcn(zero = 1), BMIdata,
etastart = predict(fit0))
#> Error: object 'BMIdata' not found
# \dontrun{}
if (FALSE) # Quantile plot
par(bty = "l", mar = c(5, 4, 4, 3) + 0.1, xpd = TRUE)
qtplot(fit, percentiles = c(5, 50, 90, 99), main = "Quantiles",
xlim = c(15, 66), las = 1, ylab = "BMI", lwd = 2, lcol = 4)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'qtplot': object 'fit' not found
# Density plot
ygrid <- seq(15, 43, len = 100) # BMI ranges
par(mfrow = c(1, 1), lwd = 2)
(aa <- deplot(fit, x0 = 20, y = ygrid, xlab = "BMI", col = "black",
main = "PDFs at Age = 20 (black), 42 (red) and 55 (blue)"))
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'deplot': object 'fit' not found
aa <- deplot(fit, x0 = 42, y = ygrid, add = TRUE, llty = 2, col = "red")
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'deplot': object 'fit' not found
aa <- deplot(fit, x0 = 55, y = ygrid, add = TRUE, llty = 4, col = "blue",
Attach = TRUE)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'deplot': object 'fit' not found
aa@post$deplot # Contains density function values
#> Error: object 'aa' not found
# \dontrun{}