This Replication Kit details all important computations carried out for the paper

  • König, Schluter, Schröder, Retter, Beckmannshagen, “The beyondpareto command for optimal extreme value index estimation”, forthcoming in The Stata Journal.

  • For further explanations, developments and applications, consult the vignette.

1 [Section 4.1]: Synthetic data examples

These examples, which are also included in the beyondpareto help file, are presented in detail in the vignette.

2 [Section 4.2 Example 2]: Top wealth in Germany

The data set used here is the German Socio-Economic Panel (SOEP). We are not allowed to share the data. However, it is available free of charge to researchers worldwide upon entering into a data contract; see details.

The full replication code (data generation, all computations) will be deposited in the SOEP Research Data Center (SOEP-RDC), which can be accessed by authorised users with a valid SOEP data contract.

The wealth variable is w01110, which we rename wealth:

qui{ 
ren w01110 wealth  
//keep households with a non-missing, positive weight
keep if hhrf!=. & hhrf>0

//only positive wealth
keep if wealth>0 & wealth!=.

//drop migration samples
drop if inlist(psample,17,18,19)

//collapse to HH level
collapse (sum) wealth (mean) hhrf hhrfao hhrfp psample , by(hid)

//SOEP and SOEP-P
gen w_SOEPP=hhrf
}

We replicate Figure 3:

qui {
preserve 

  //have labels for plots ready
  mat labs=J(1,5,0) 
  local cnt=1
  foreach q of numlist 100000 1000000  10000000 100000000 {
      global lab`cnt'=log(`q')
      local cnt=1+`cnt'
  }

  gen samp = .
  replace samp = 2 if psample == 22 /* P 2019 Top Shareholders  */
  replace samp = 1 if psample != 22

    local weight w_SOEPP
 
    keep if `weight'!=.

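    //rank from the top: the largest wealth value gets K=1
    sort wealth, stable
    gen K=_N-(_n-1)

    //log of the weighted relative rank: (total weight + 1) over the running sum of weights, richest first
    gsort - wealth
    sum `weight' 
    gen F=log((r(sum)+1)/(sum(`weight'))) 

    gen double ljnw=F 
    gen double Xnw=log(wealth)
    //plot only observations with wealth of at least 100,000
    local thr=log(100000)

    //shift the samp==1 points down by 0.2 on the log-wealth axis
    replace Xnw= Xnw-.2 if samp==1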

    tw scatter Xnw ljnw if samp==1 & Xnw>=`thr' & inrange(ljnw,4,13) ,  mcolor(gs12) msym(oh) msize(medium) || ///
    scatter Xnw ljnw if samp==2 & Xnw>=`thr' & inrange(ljnw,4,13),  mcolor(black) msym(+) msize(medsmall)  ///
    scheme(s2mono) xtitle("Relative Rank") ytitle("Net Wealth") legend(off) /// 
    ylabel( ${lab1} "0.1M" ${lab2} "1M" ${lab3} "10M" ${lab4} "100M") ///
    graphregion(color(white))  title(" ") 

restore 
}
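
To make the construction of the horizontal axis concrete, here is a minimal sketch of the weighted log relative rank on made-up data. The three households and their weights are hypothetical and are not SOEP values; the variable w is an illustrative stand-in for w_SOEPP. Run it in a separate session, since it clears the data in memory.

```stata
* Toy data: three hypothetical households (not SOEP values)
clear
input wealth w
100  3
500  2
2000 1
end

* Sort richest first, then divide (total weight + 1) by the running sum of weights
gsort - wealth
quietly summarize w
gen double ljnw = log((r(sum) + 1) / sum(w))
gen double Xnw  = log(wealth)

list wealth w ljnw Xnw
```

The richest household has the smallest running sum of weights and therefore the largest value of ljnw, which is why the top of the distribution appears at the right-hand side of the plot.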

We replicate Table 7 and Figure 4:

beyondpareto wealth [w=w_SOEPP], fracrange(.003,.3) rho(-0.5) plot(all)
di "alpha (1/gamma): " 1/e(gamma)

qui graph export pics/QQ2.png, replace
(sampling weights assumed)
Using the given sampling weights.
Considering the top 4178 of 13925 values, starting from the first 42 and testing 4137 values for k base.

----------------------------------------------------------------------
  Results |      Ybase       gamma        S.E.  [95% Conf.   Interval]
----------+-----------------------------------------------------------
   wealth |     402200   .60069975     .011569    .5780244    .6233751
----------------------------------------------------------------------
Optimal k base: 3370

alpha (1/gamma): 1.6647252
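
Since alpha = 1/gamma is a decreasing function of gamma, the confidence bounds reported for gamma can be carried over to alpha by inverting and swapping them. The following is a minimal sketch using the values printed in the table above; the scalar names are illustrative, and the resulting interval is a simple transformation of the reported one, not additional output of beyondpareto.

```stata
* Values copied from the results table above
scalar g_hat = .60069975
scalar g_lo  = .5780244
scalar g_hi  = .6233751

* alpha = 1/gamma is decreasing in gamma, so the bounds swap
scalar a_hat = 1/g_hat
scalar a_lo  = 1/g_hi
scalar a_hi  = 1/g_lo

di "alpha: " a_hat "   interval: [" a_lo ", " a_hi "]"
```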

3 [Section 4.3 Example 3]: The German city size distribution

clear

quietly {
  clear
  import excel using "https://www.destatis.de/DE/Themen/Laender-Regionen/Regionales/Gemeindeverzeichnis/Administrativ/Archiv/GVAuszugJ/31122000_Auszug_GV.xlsx?__blob=publicationFile", sheet(Gemeindedaten) cellrange(J8:J16155)

  rename J citysize
  drop if citysize == .
  keep if citysize > 1
  gen weight = 1
  gsort - citysize
}

We replicate Table 8 and Figure 5:

beyondpareto citysize, rho(-0.5) fracrange(0.001, 0.5) plot(all)

qui graph export pics/QQ3.png, replace
No sampling weights given. Using w=1.
Considering the top 6919 of 13837 values, starting from the first 14 and testing 6906 values for k base.

----------------------------------------------------------------------
  Results |      Ybase       gamma        S.E.  [95% Conf.   Interval]
----------+-----------------------------------------------------------
 citysize |      16042   .76189049    .0283468    .7063308    .8174502
----------------------------------------------------------------------
Optimal k base: 903
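
As a consistency check, the counts reported in the two logs can be reproduced from the sample sizes and the fracrange() arguments. The sketch below assumes that each fraction is multiplied by the number of observations and rounded up; this is an assumption that matches the printed numbers, not documented behaviour of beyondpareto.

```stata
* Each string holds: N, lower fraction, upper fraction (taken from the two examples above)
foreach spec in "13925 .003 .3" "13837 .001 .5" {
    tokenize `spec'
    * ceil() is an assumption of this sketch
    local klo = ceil(`2' * `1')
    local khi = ceil(`3' * `1')
    local n   = `khi' - `klo' + 1
    di "N = `1': k from `klo' to `khi', `n' candidate values"
}
```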