garma.RdFits GARMA models to time series data.
garma(link = "identitylink", p.ar.lag = 1, q.ma.lag = 0,
coefstart = NULL, step = 1)Link function applied to the mean response.
The default is suitable for continuous responses.
The link loglink should be chosen if the data
are counts.
The link reciprocal can be chosen if the data
are counts
and the variance assumed for this is \(\mu^2\).
The links logitlink, probitlink,
clogloglink, and
cauchitlink are supported and suitable for
binary responses.
Note that when the log or logit link is chosen:
for log and logit,
zero values can be replaced by bvalue.
See loglink and logitlink etc. for
specific information about each link function.
A positive integer, the lag for the autoregressive component. Called \(p\) below.
A non-negative integer, the lag for the moving-average component. Called \(q\) below.
Starting values for the coefficients.
Assigning this argument is highly recommended.
For technical reasons, the
argument coefstart in vglm cannot be used.
Numeric. Step length, e.g., 0.5 means half-stepsizing.
This function draws heavily on Benjamin et al. (1998).
See also Benjamin et al. (2003).
GARMA models extend the ARMA time series model to generalized
responses in the exponential family, e.g., Poisson counts,
binary responses. Currently, this function is rudimentary and
can handle only certain continuous, count and binary responses only.
The user must choose an appropriate link for the link argument.
The GARMA(\(p, q\)) model is defined by firstly having a response belonging to the exponential family $$f(y_t|D_t) = \exp \left\{ \frac{y_t \theta_t - b(\theta_t)}{\phi / A_t} + c(y_t, \phi / A_t) \right\}$$ where \(\theta_t\) and \(\phi\) are the canonical and scale parameters respectively, and \(A_t\) are known prior weights. The mean \(\mu_t=E(Y_t|D_t)=b'(\theta_t)\) is related to the linear predictor \(\eta_t\) by the link function \(g\). Here, \(D_t=\{x_t,\ldots,x_1,y_{t-1},\ldots,y_1,\mu_{t-1},\ldots,\mu_1\}\) is the previous information set. Secondly, the GARMA(\(p, q\)) model is defined by $$g(\mu_t) = \eta_t = x_t^T \beta + \sum_{k=1}^p \phi_k (g(y_{t-k}) - x_{t-k}^T \beta) + \sum_{k=1}^q \theta_k (g(y_{t-k}) - \eta_{t-k}).$$ Parameter vectors \(\beta\), \(\phi\) and \(\theta\) are estimated by maximum likelihood.
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm.
Benjamin, M. A., Rigby, R. A. and Stasinopoulos, M. D. (1998). Fitting Non-Gaussian Time Series Models. Pages 191–196 in: Proceedings in Computational Statistics COMPSTAT 1998 by Payne, R. and P. J. Green. Physica-Verlag.
Benjamin, M. A., Rigby, R. A. and Stasinopoulos, M. D. (2003). Generalized Autoregressive Moving Average Models. Journal of the American Statistical Association, 98: 214–223.
Zeger, S. L. and Qaqish, B. (1988). Markov regression models for time series: a quasi-likelihood approach. Biometrics, 44: 1019–1031.
This function is unpolished and is requires lots of improvements.
In particular, initialization is very poor.
Results appear very sensitive to quality of initial values.
A limited amount of experience has shown that half-stepsizing is
often needed for convergence, therefore choosing crit = "coef"
is not recommended.
Overdispersion is not handled.
For binomial responses it is currently best to input a vector
of 1s and 0s rather than the cbind(successes, failures)
because the initialize slot is rudimentary.
This VGAM family function is 'non-standard' in that the model does need some coercing to get it into the VGLM framework. Special code is required to get it running. A consequence is that some methods functions may give wrong results when applied to the fitted object.
gdata <- data.frame(interspike = c(68, 41, 82, 66, 101, 66, 57, 41, 27, 78,
59, 73, 6, 44, 72, 66, 59, 60, 39, 52,
50, 29, 30, 56, 76, 55, 73, 104, 104, 52,
25, 33, 20, 60, 47, 6, 47, 22, 35, 30,
29, 58, 24, 34, 36, 34, 6, 19, 28, 16,
36, 33, 12, 26, 36, 39, 24, 14, 28, 13,
2, 30, 18, 17, 28, 9, 28, 20, 17, 12,
19, 18, 14, 23, 18, 22, 18, 19, 26, 27,
23, 24, 35, 22, 29, 28, 17, 30, 34, 17,
20, 49, 29, 35, 49, 25, 55, 42, 29, 16)) # See Zeger and Qaqish (1988)
gdata <- transform(gdata, spikenum = seq(interspike))
bvalue <- 0.1 # .Machine$double.xmin # Boundary value
fit <- eval(substitute(
vglm(interspike ~ 1, trace = TRUE, data = gdata,
garma(paste0("loglink(bvalue = ", .bvalue , ")"),
p = 2, coefstart = c(4, 0.3, 0.4))),
list( .bvalue = bvalue)))
#> Iteration 1: loglikelihood = -33517179
#> Taking a modified step.....
#> Iteration 1 : loglikelihood = 9063.765
#> Iteration 2: loglikelihood = 9039.4617
#> Taking a modified step.
#> Iteration 2 : loglikelihood = 9294.1272
#> Iteration 3: loglikelihood = 8776.2807
#> Taking a modified step..
#> Iteration 3 : loglikelihood = 9311.6036
#> Iteration 4: loglikelihood = 9236.8891
#> Taking a modified step..
#> Iteration 4 : loglikelihood = 9325.2546
#> Iteration 5: loglikelihood = 9303.4811
#> Taking a modified step..
#> Iteration 5 : loglikelihood = 9327.5058
#> Iteration 6: loglikelihood = 9323.9082
#> Taking a modified step.
#> Iteration 6 : loglikelihood = 9327.635
#> Iteration 7: loglikelihood = 9318.4243
#> Taking a modified step...
#> Iteration 7 : loglikelihood = 9328.5283
#> Iteration 8: loglikelihood = 9329.2394
#> Iteration 9: loglikelihood = 9316.0423
#> Taking a modified step..
#> Iteration 9 : loglikelihood = 9329.4185
#> Iteration 10: loglikelihood = 9329.8789
#> Iteration 11: loglikelihood = 9330.6516
#> Iteration 12: loglikelihood = 9330.6545
#> Iteration 13: loglikelihood = 9330.6617
#> Iteration 14: loglikelihood = 9330.6617
summary(fit)
#>
#> Call:
#> vglm(formula = interspike ~ 1, family = garma(paste0("loglink(bvalue = ",
#> 0.1, ")"), p = 2, coefstart = c(4, 0.3, 0.4)), data = gdata,
#> trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 3.71331 0.04117 90.191 < 2e-16 ***
#> (lag1) 0.33723 0.03076 10.962 < 2e-16 ***
#> (lag2) 0.24180 0.02988 8.094 5.79e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Name of linear predictor: loglink(mu)
#>
#> Log-likelihood: 9330.662 on 95 degrees of freedom
#>
#> Number of Fisher scoring iterations: 14
#>
#> No Hauck-Donner effect found in any of the estimates
#>
coef(fit, matrix = TRUE)
#> loglink(mu)
#> (Intercept) 3.7133065
#> (lag1) 0.3372322
#> (lag2) 0.2418008
Coef(fit) # A bug here
#> mu <NA> <NA>
#> 40.989114 1.401064 1.273541
if (FALSE) with(gdata, plot(interspike, ylim = c(0, 120), las = 1,
xlab = "Spike Number", ylab = "Inter-Spike Time (ms)", col = "blue"))
with(gdata, lines(spikenum[-(1:fit@misc$plag)], fitted(fit), col = "orange"))
#> Error in plot.xy(xy.coords(x, y), type = type, ...): plot.new has not been called yet
abline(h = mean(with(gdata, interspike)), lty = "dashed", col = "gray") # \dontrun{}
#> Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...): plot.new has not been called yet