ordpoisson.Rd
Fits a Poisson regression where the response is ordinal (the Poisson counts are grouped between known cutpoints).
ordpoisson(cutpoints, countdata = FALSE, NOS = NULL,
Levels = NULL, init.mu = NULL, parallel = FALSE,
zero = NULL, link = "loglink")
cutpoints
Numeric. The cutpoints, \(K_l\).
These must be non-negative integers.
Inf values may be included.
See below for further details.
countdata
Logical. Is the response (LHS of formula) in count-data format?
If not, then the response is a matrix or vector with values 1,
2, ..., L, say, where L is the number of
levels. Such input can be generated with cut
with argument labels = FALSE. If countdata = TRUE then
the response is expected to be in the same format as depvar(fit)
where fit is a fitted model with ordpoisson as the
VGAM family function. That is, the response is a matrix of counts
with L columns (if NOS = 1).
NOS
Integer. The number of species, or more generally, the number of
response random variates.
This argument must be specified when countdata = TRUE.
Usually NOS = 1.
Levels
Integer vector, recycled to length NOS if necessary.
The number of levels for each response random variate.
This argument should agree with cutpoints.
This argument must be specified when countdata = TRUE.
init.mu
Numeric. Initial values for the means of the Poisson regressions.
Recycled to length NOS if necessary.
Use this argument if the default initial values fail (the
default is to compute an initial value internally).
See poissonff.
parallel, zero, link
See CommonVGAMffArguments for information.
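The two response formats described for countdata can be illustrated with a small base-R sketch. The data values and the object names y, ystar and ymat are made up for illustration; the indicator matrix is only meant to mimic the layout of depvar(fit) for NOS = 1, and is not taken from the VGAM source.

```r
# Hypothetical ungrouped counts and two finite cutpoints (20 and 30)
y <- c(3, 25, 31, 20, 21)

# Level codes 1, 2, ..., L: the format expected when countdata = FALSE.
# cut() uses right-closed intervals by default, i.e. K_{l-1} < y <= K_l.
ystar <- cut(y, breaks = c(-Inf, 20, 30, Inf), labels = FALSE)

# One indicator column per level: the layout expected when countdata = TRUE
L <- 3
ymat <- 1 * outer(ystar, seq_len(L), "==")
ymat
```

Each row of ymat has a single 1 in the column of the observed level, so its row sums are all 1.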
This VGAM family function uses maximum likelihood estimation (Fisher scoring) to fit a Poisson regression to each column of a matrix response. The data, however, is ordinal, and is obtained from known integer cutpoints. Here, \(l=1,\ldots,L\) where \(L\) (\(L \geq 2\)) is the number of levels. In more detail, let \(Y^*=l\) if \(K_{l-1} < Y \leq K_{l}\) where the \(K_l\) are the cutpoints. We have \(K_0=-\infty\) and \(K_L=\infty\). The response for this family function corresponds to \(Y^*\) but we are really interested in the Poisson regression of \(Y\).
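Under the definition above, the probability of level \(l\) is \(P(K_{l-1} < Y \leq K_l)\), which for a Poisson \(Y\) is a difference of two cumulative distribution function values. The following base-R sketch illustrates that relationship. The helper grouped_pois_probs is hypothetical (it is not part of VGAM), and the code is not claimed to be how ordpoisson evaluates its likelihood internally.

```r
# Hypothetical helper: level probabilities for a single Poisson mean
grouped_pois_probs <- function(mu, cutpoints) {
  K <- c(-Inf, cutpoints)                 # K_0 = -Inf
  if (!is.infinite(K[length(K)])) {
    K <- c(K, Inf)                        # K_L = Inf if not supplied
  }
  diff(ppois(K, lambda = mu))             # P(K_{l-1} < Y <= K_l), l = 1..L
}

probs <- grouped_pois_probs(mu = 25, cutpoints = c(20, 30))
probs        # three probabilities, one per level
sum(probs)   # the levels partition the support of Y
```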
If NOS=1 then
the argument cutpoints is a vector \((K_1,K_2,\ldots,K_L)\)
where the last value (Inf) is optional. If NOS>1 then
the vector should have NOS-1 Inf values separating
the cutpoints. For example, if there are NOS=3 responses, then
something like
ordpoisson(cut = c(0, 5, 10, Inf, 20, 30, Inf, 0, 10, 40, Inf))
is valid.
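To see how such a combined vector breaks down by response, the following base-R sketch splits it at each Inf. The object names are illustrative only, and the splitting logic is an assumption about how the vector is meant to be read, not code from VGAM.

```r
cutpoints <- c(0, 5, 10, Inf, 20, 30, Inf, 0, 10, 40, Inf)

# Each Inf closes the block of cutpoints for one response
ends   <- which(is.infinite(cutpoints))
starts <- c(1, head(ends, -1) + 1)
per_response <- Map(function(s, e) cutpoints[s:e], starts, ends)

# Number of levels per response, counting the closing Inf
Levels <- lengths(per_response)
per_response
Levels
```

Here the three responses have 4, 3 and 4 levels respectively, which is what the Levels argument would need to agree with.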
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm
and vgam.
Yee, T. W. (2020). Ordinal ordination with normalizing link functions for count data (in preparation).
Sometimes there are no observations between two cutpoints. If so,
the arguments Levels and NOS need to be specified too.
See below for an example.
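Empty levels are easy to miss because table() on the raw level codes silently omits them. A minimal base-R sketch of one way to detect them before fitting follows; the simulated data and object names are assumptions for illustration.

```r
set.seed(1)
y    <- rpois(200, lambda = 30)                        # simulated counts
brks <- c(-Inf, 0, 9, 10, 20, 70, 200, 201, Inf)
ystar <- cut(y, breaks = brks, labels = FALSE)

# Tabulate against the full set of levels so empty ones show up as 0
Levels <- length(brks) - 1
counts <- table(factor(ystar, levels = seq_len(Levels)))
counts
which(counts == 0)   # levels with no observations
```

If any level has a zero count, the number of levels cannot be inferred from the observed codes alone, which is why it has to be supplied explicitly.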
The input requires care as little to no checking is done.
If fit is the fitted object, have a look at fit@extra and
depvar(fit) to check.
library(VGAM)  # provides vglm(), ordpoisson(), depvar()
set.seed(123) # Example 1
x2 <- runif(n <- 1000); x3 <- runif(n)
mymu <- exp(3 - 1 * x2 + 2 * x3)
y1 <- rpois(n, lambda = mymu)
cutpts <- c(-Inf, 20, 30, Inf)
fcutpts <- cutpts[is.finite(cutpts)] # finite cutpoints
ystar <- cut(y1, breaks = cutpts, labels = FALSE)
if (FALSE) {  # Not run
  plot(x2, x3, col = ystar, pch = as.character(ystar))
}
table(ystar) / sum(table(ystar))
#> ystar
#>     1     2     3
#> 0.260 0.194 0.546
fit <- vglm(ystar ~ x2 + x3, family = ordpoisson(cutpoints = fcutpts))
head(depvar(fit)) # This can be input if countdata = TRUE
#>      [,1] [,2] [,3]
#> [1,]    1    0    0
#> [2,]    0    1    0
#> [3,]    1    0    0
#> [4,]    0    0    1
#> [5,]    0    0    1
#> [6,]    0    0    1
head(fitted(fit))
#> mu
#> 1 26.37788
#> 2 29.70400
#> 3 18.82698
#> 4 44.47479
#> 5 41.56519
#> 6 49.54660
head(predict(fit))
#> loglink(mu)
#> [1,] 3.272526
#> [2,] 3.391282
#> [3,] 2.935291
#> [4,] 3.794923
#> [5,] 3.727263
#> [6,] 3.902914
coef(fit, matrix = TRUE)
#>             loglink(mu)
#> (Intercept)   3.0324949
#> x2           -0.9879523
#> x3            1.9155716
fit@extra
#> $NOS
#> [1] 1
#>
#> $Levels
#> [1] 3
#>
#> $y.integer
#> [1] TRUE
#>
#> $ncoly
#> [1] 3
#>
#> $countdata
#> [1] FALSE
#>
#> $cutpoints
#> [1] 20 30 Inf
#>
#> $n
#> [1] 1000
#>
# Example 2: multivariate and there are no obsns between some cutpoints
cutpts2 <- c(-Inf, 0, 9, 10, 20, 70, 200, 201, Inf)
fcutpts2 <- cutpts2[is.finite(cutpts2)] # finite cutpoints
y2 <- rpois(n, lambda = mymu) # Same model as y1
ystar2 <- cut(y2, breaks = cutpts2, labels = FALSE)
table(ystar2) / sum(table(ystar2))
#> ystar2
#>     2     3     4     5     6
#> 0.037 0.029 0.214 0.571 0.149
fit <- vglm(cbind(ystar, ystar2) ~ x2 + x3,
            family = ordpoisson(cutpoints = c(fcutpts, Inf, fcutpts2, Inf),
                                Levels = c(length(fcutpts) + 1,
                                           length(fcutpts2) + 1),
                                parallel = TRUE),
            trace = TRUE)
#> Iteration 1: loglikelihood = -3421.6299
#> Iteration 2: loglikelihood = -975.56347
#> Iteration 3: loglikelihood = -763.12349
#> Iteration 4: loglikelihood = -759.96202
#> Iteration 5: loglikelihood = -759.9608
#> Iteration 6: loglikelihood = -759.9608
coef(fit, matrix = TRUE)
#>             loglink(mu1) loglink(mu2)
#> (Intercept)     2.993586     2.993586
#> x2             -1.017975    -1.017975
#> x3              2.032580     2.032580
fit@extra
#> $NOS
#> [1] 2
#>
#> $Levels
#> [1] 3 8
#>
#> $y.integer
#> [1] TRUE
#>
#> $ncoly
#> [1] 11
#>
#> $countdata
#> [1] FALSE
#>
#> $cutpoints
#> [1] 20 30 Inf 0 9 10 20 70 200 201 Inf
#>
#> $n
#> [1] 1000
#>
constraints(fit)
#> $`(Intercept)`
#> [,1]
#> [1,] 1
#> [2,] 1
#>
#> $x2
#> [,1]
#> [1,] 1
#> [2,] 1
#>
#> $x3
#> [,1]
#> [1,] 1
#> [2,] 1
#>
summary(depvar(fit)) # Some columns have all zeros
#> V1 V2 V3 V4 V5
#> Min. :0.00 Min. :0.000 Min. :0.000 Min. :0 Min. :0.000
#> 1st Qu.:0.00 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0 1st Qu.:0.000
#> Median :0.00 Median :0.000 Median :1.000 Median :0 Median :0.000
#> Mean :0.26 Mean :0.194 Mean :0.546 Mean :0 Mean :0.037
#> 3rd Qu.:1.00 3rd Qu.:0.000 3rd Qu.:1.000 3rd Qu.:0 3rd Qu.:0.000
#> Max. :1.00 Max. :1.000 Max. :1.000 Max. :0 Max. :1.000
#> V6 V7 V8 V9 V10
#> Min. :0.000 Min. :0.000 Min. :0.000 Min. :0.000 Min. :0
#> 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0
#> Median :0.000 Median :0.000 Median :1.000 Median :0.000 Median :0
#> Mean :0.029 Mean :0.214 Mean :0.571 Mean :0.149 Mean :0
#> 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:1.000 3rd Qu.:0.000 3rd Qu.:0
#> Max. :1.000 Max. :1.000 Max. :1.000 Max. :1.000 Max. :0
#> V11
#> Min. :0
#> 1st Qu.:0
#> Median :0
#> Mean :0
#> 3rd Qu.:0
#> Max. :0