Short Summary

beyondpareto: Optimal Extreme Value Index Estimation based on Rank-size Regression and Asymptotic Mean Squared Error Minimization for Threshold Choice of Upper Order Statistics.

Longer Summary

Consider a regularly varying cumulative distribution function F, so that, for sufficiently large y and γ > 0,

F(y) = 1 − l(y) × y^(−1/γ)

where l denotes a slowly varying nuisance function, meaning that l(ty)/l(y) → 1 as y → ∞ for every t > 0. The parameter γ > 0 is called the extreme value index, and its reciprocal α ≡ 1/γ is the Pareto or tail index. The objective is to estimate γ.
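To see why this tail behaviour suggests a regression approach, it can help to take logarithms of the survival function. The short derivation below is only an illustration: the constants c and y_0 are hypothetical symbols introduced here for the exact Pareto special case and do not appear in the package.

```latex
% Log survival function implied by the model for large y
\log\bigl(1 - F(y)\bigr) = \log l(y) - \frac{1}{\gamma}\,\log y

% Exact Pareto special case: l(y) \equiv c = y_0^{1/\gamma} (illustrative constants)
1 - F(y) = \left(\frac{y}{y_0}\right)^{-1/\gamma}, \qquad y \ge y_0,
\qquad\text{so}\qquad
\log\bigl(1 - F(y)\bigr) = \frac{1}{\gamma}\,\log y_0 - \frac{1}{\gamma}\,\log y
```

In the exact Pareto case the log survival function is exactly linear in log y with slope −1/γ. In the general case the term log l(y) varies slowly relative to log y, so the relationship is only approximately linear far out in the tail.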

The Stata command beyondpareto estimates an optimally selected extreme value index (gamma) following the method in Schluter (2018, 2021). The selection takes place via the minimization of the Asymptotic Mean Squared Error (AMSE), which is a weighted sum of the bias and variance of the estimated tail coefficient.

The basic structure of the program is the following. First, the dataset is ordered from the largest to the smallest value of the variable passed in varname. Second, for each of the observations ranked in this way, taken in turn as a candidate threshold for the upper order statistics, the extreme value index gamma and the AMSE are calculated (Beirlant et al., 1996). Third, the observation with the minimum AMSE is selected, and the optimal extreme value index estimate is displayed along with its standard error and the 95% normal confidence interval. Fourth, if requested, three diagnostic plots are displayed.

The extreme value index gamma is obtained by running a least-squares regression on the coordinates of the Pareto quantile-quantile (QQ) plot, which is a type of rank-size plot. This estimation technique has the advantage of being more efficient than the standard maximum likelihood estimator and other regression-based techniques. If the dataset comes with sampling weights (as survey data often do), then the estimator of gamma is computed using a weighted distribution function.
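The first two steps can be illustrated with a minimal Python sketch. It is not the package's Stata code. It assumes one common formulation of the rank-size regression, namely a least-squares fit through the origin of log(x_(j)) − log(x_(k+1)) on log((k+1)/j) for the k largest observations; the package's exact specification may differ. The function name rank_size_gamma and the simulated sample are hypothetical, and the AMSE computation, the standard error, and sampling weights are deliberately left out.

```python
import math
import random


def rank_size_gamma(values, k):
    """Least-squares slope through the origin on the Pareto QQ-plot.

    Assumed formulation (illustrative only): regress
    log(x_(j)) - log(x_(k+1)) on log((k+1)/j) for j = 1..k,
    where x_(1) >= x_(2) >= ... are the descending order statistics.
    """
    x = sorted(values, reverse=True)      # step 1: largest to smallest
    if not 1 <= k < len(x):
        raise ValueError("k must satisfy 1 <= k < number of observations")
    if x[k] <= 0:
        raise ValueError("the k+1 largest values must be positive")
    log_threshold = math.log(x[k])        # x_(k+1) acts as the threshold
    num = 0.0
    den = 0.0
    for j in range(1, k + 1):
        q = math.log((k + 1) / j)         # QQ-plot abscissa for rank j
        num += q * (math.log(x[j - 1]) - log_threshold)
        den += q * q
    return num / den


# Simulated exact Pareto data: U^(-gamma) has survival y^(-1/gamma), y >= 1.
random.seed(12345)
true_gamma = 0.5
sample = [(1.0 - random.random()) ** (-true_gamma) for _ in range(5000)]

# Step 2 (estimate only): gamma for several candidate numbers of upper
# order statistics. Choosing among them by AMSE is not implemented here.
for k in (50, 200, 1000):
    print(k, round(rank_size_gamma(sample, k), 3))
```

Small values of k use few observations and give noisy estimates, while large values of k reach further into the body of the distribution, where the Pareto approximation may be poor. This is the bias-variance trade-off that the AMSE criterion is meant to resolve.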

Installation

The download instructions will follow in due course.