paretoff.Rd

Estimates one of the parameters of the Pareto(I) distribution by maximum likelihood estimation. Also includes the upper truncated Pareto(I) distribution.
paretoff(scale = NULL, lshape = "loglink")
truncpareto(lower, upper, lshape = "loglink", ishape = NULL, imethod = 1)

lshape:
Parameter link function applied to the parameter \(k\).
See Links for more choices.
A log link is the default because \(k\) is positive.
scale:
Numeric.
The parameter \(\alpha\) below.
If the user inputs a number then it is assumed known with this value.
The default (NULL) means it is estimated by maximum likelihood
estimation, in which case min(y) is used,
where y is the response vector.
lower, upper:
Numeric. Lower and upper limits for the truncated Pareto distribution. Each must be positive and of length 1. They are called \(\alpha\) and \(U\) below.
ishape:
Numeric.
Optional initial value for the shape parameter.
A NULL means a value is obtained internally.
If failure to converge occurs, try specifying a value, e.g., 1 or 2.
imethod:
See CommonVGAMffArguments for information.
If failure to converge occurs then try specifying a value for
ishape.
A random variable \(Y\) has a Pareto distribution if $$P[Y>y] = C / y^{k}$$ for some positive \(k\) and \(C\). This model is important in many applications due to the power law probability tail, especially for large values of \(y\).
The Pareto distribution, which is widely used in economics,
has a probability density function that can be written
$$f(y;\alpha,k) = k \alpha^k / y^{k+1}$$
for \(0 < \alpha < y\) and \(0<k\).
The \(\alpha\) is called the scale parameter, and
it is either assumed known or else min(y) is used.
The parameter \(k\) is called the shape parameter.
The mean of \(Y\) is
\(\alpha k/(k-1)\) provided \(k > 1\).
Its variance is
\(\alpha^2 k /((k-1)^2 (k-2))\)
provided \(k > 2\).
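As a quick numerical sketch of these moment formulas (assuming the VGAM package is available, which provides rpareto()), simulated moments can be compared against the theoretical values:

```r
library(VGAM)                        # assumed available; provides rpareto()
set.seed(1)
alpha <- 2; kay <- 5                 # scale and shape, with kay > 2
y <- rpareto(n = 1e5, scale = alpha, shape = kay)
c(mean(y), alpha * kay / (kay - 1))                   # simulated vs theoretical mean
c(var(y), alpha^2 * kay / ((kay - 1)^2 * (kay - 2)))  # simulated vs theoretical variance
```

The two entries of each pair should agree closely for a sample of this size.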
The upper truncated Pareto distribution has a probability density function that can be written $$f(y) = k \alpha^k / [y^{k+1} (1-(\alpha/U)^k)]$$ for \(0 < \alpha < y < U < \infty\) and \(k>0\). Arguably better names for \(k\) are the index or tail parameter. Here, \(\alpha\) and \(U\) are known. The mean of \(Y\) is \(k \alpha^k (U^{1-k}-\alpha^{1-k}) / [(1-k)(1-(\alpha/U)^k)]\).
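The truncated density above is simply the Pareto(I) density renormalized by the truncated mass; a minimal sketch checking the formula against the VGAM implementation (assuming dtruncpareto() is available):

```r
library(VGAM)                  # assumed available; provides dtruncpareto()
alpha <- 2; U <- 8; kay <- 3   # lower limit, upper limit, and shape
y <- 5
# Density formula above, evaluated directly:
kay * alpha^kay / (y^(kay + 1) * (1 - (alpha / U)^kay))
# The VGAM implementation; the two values should coincide:
dtruncpareto(y, lower = alpha, upper = U, shape = kay)
```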
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm
and vgam.
Forbes, C., Evans, M., Hastings, N. and Peacock, B. (2011). Statistical Distributions, Hoboken, NJ, USA: John Wiley and Sons, Fourth edition.
Aban, I. B., Meerschaert, M. M. and Panorska, A. K. (2006). Parameter estimation for the truncated Pareto distribution, Journal of the American Statistical Association, 101(473), 270–277.
Outside of economics, the Pareto distribution is known as the Bradford distribution.
For paretoff,
if the estimate of \(k\) is less than or equal to unity
then the fitted values will be NAs.
Also, paretoff fits the Pareto(I) distribution.
See paretoIV for the more general Pareto(IV/III/II)
distributions, but there is a slight change in notation: \(s = k\)
and \(b=\alpha\).
In some applications the Pareto law is truncated by a
natural upper bound on the probability tail.
The upper truncated Pareto distribution has three parameters (called
\(\alpha\), \(U\) and \(k\) here) but the family function
truncpareto() estimates only \(k\).
With known lower and upper limits, the ML estimator of \(k\) has
the usual properties of MLEs.
Aban et al. (2006) discuss other inferential details.
The usual or unbounded Pareto distribution has two
parameters (called \(\alpha\) and \(k\) here)
but the family function paretoff estimates only
\(k\) using iteratively reweighted least squares.
The MLE of the \(\alpha\) parameter lies on the
boundary and is min(y) where y is the
response. Consequently, using the default argument
values, the standard errors are incorrect when one does a
summary on the fitted object. If the user inputs
a value for scale then it is assumed known with
that value, and then summary on the fitted object
should be correct. Numerical problems may occur for small
\(k\), e.g., \(k < 1\).
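When \(\alpha\) is known, the MLE of \(k\) has the closed form \(\hat{k} = n / \sum_i \log(y_i/\alpha)\) (a standard result; see, e.g., Aban et al. (2006)). A sketch comparing it with the iteratively reweighted least squares fit (assuming the VGAM package is available):

```r
library(VGAM)                  # assumed available; provides rpareto(), vglm()
set.seed(123)
alpha <- 2; kay <- exp(3)
y <- rpareto(n = 1000, scale = alpha, shape = kay)
khat <- length(y) / sum(log(y / alpha))   # closed-form MLE for known alpha
fit <- vglm(y ~ 1, paretoff(scale = alpha))
c(khat, Coef(fit))             # the two estimates should agree
```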
Pareto,
Truncpareto,
paretoIV,
gpd,
benini1.
alpha <- 2; kay <- exp(3)
pdata <- data.frame(y = rpareto(n = 1000, scale = alpha, shape = kay))
fit <- vglm(y ~ 1, paretoff, data = pdata, trace = TRUE)
#> Iteration 1: loglikelihood = 1263.173
#> Iteration 2: loglikelihood = 1264.3282
#> Iteration 3: loglikelihood = 1264.3288
#> Iteration 4: loglikelihood = 1264.3288
fit@extra # The estimate of alpha is here
#> $scale
#> [1] 2.00001
#>
#> $scaleEstimated
#> [1] TRUE
#>
head(fitted(fit))
#> [,1]
#> 1 2.104041
#> 2 2.104041
#> 3 2.104041
#> 4 2.104041
#> 5 2.104041
#> 6 2.104041
with(pdata, mean(y))
#> [1] 2.104035
coef(fit, matrix = TRUE)
#> loglink(shape)
#> (Intercept) 3.006925
summary(fit) # Standard errors are incorrect!!
#>
#> Call:
#> vglm(formula = y ~ 1, family = paretoff, data = pdata, trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 3.00692 0.03162 95.09 <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Name of linear predictor: loglink(shape)
#>
#> Log-likelihood: 1264.329 on 999 degrees of freedom
#>
#> Number of Fisher scoring iterations: 4
#>
#> No Hauck-Donner effect found in any of the estimates
#>
# Here, alpha is assumed known
fit2 <- vglm(y ~ 1, paretoff(scale = alpha), data = pdata, trace = TRUE)
#> Iteration 1: loglikelihood = 1263.0715
#> Iteration 2: loglikelihood = 1264.2262
#> Iteration 3: loglikelihood = 1264.2268
#> Iteration 4: loglikelihood = 1264.2268
fit2@extra # alpha stored here
#> $scale
#> [1] 2
#>
#> $scaleEstimated
#> [1] FALSE
#>
head(fitted(fit2))
#> [,1]
#> 1 2.104042
#> 2 2.104042
#> 3 2.104042
#> 4 2.104042
#> 5 2.104042
#> 6 2.104042
coef(fit2, matrix = TRUE)
#> loglink(shape)
#> (Intercept) 3.006823
summary(fit2) # Standard errors are okay
#>
#> Call:
#> vglm(formula = y ~ 1, family = paretoff(scale = alpha), data = pdata,
#> trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 3.00682 0.03162 95.08 <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Name of linear predictor: loglink(shape)
#>
#> Log-likelihood: 1264.227 on 999 degrees of freedom
#>
#> Number of Fisher scoring iterations: 4
#>
#> No Hauck-Donner effect found in any of the estimates
#>
# Upper truncated Pareto distribution
lower <- 2; upper <- 8; kay <- exp(2)
pdata3 <- data.frame(y = rtruncpareto(n = 100, lower = lower,
upper = upper, shape = kay))
fit3 <- vglm(y ~ 1, truncpareto(lower, upper), data = pdata3, trace = TRUE)
#> Iteration 1: loglikelihood = 17.149972
#> Iteration 2: loglikelihood = 17.150482
#> Iteration 3: loglikelihood = 17.150482
coef(fit3, matrix = TRUE)
#> loglink(shape)
#> (Intercept) 1.999591
c(fit3@misc$lower, fit3@misc$upper)
#> [1] 2 8