Dealing with Censored Earnings in Register Data
“Dealing with Censored Earnings in Register Data”, Jahrbücher für Nationalökonomie und Statistik, 2025.
Abstract: Earnings are often top-coded (right-censored) in administrative registers. The censoring threshold in the case of Germany is the limit value for social security contributions, leading to a substantial fraction of censoring: For example, about 12 % of male workers in West Germany are affected, rising to above 30 % for highly educated prime-aged workers. This missing right tail of the earnings distribution constitutes a major problem for researchers studying earnings inequality and top incomes. We overcome this challenge by taking a distributional approach and semi-parametrically modelling the right tail as being Pareto-like. Non-censored earnings survey data matched to administrative records, derived from the SOEP-RV project, let us operate in a laboratory-like setting in which the targets are known. Our approach outperforms alternative imputation methods based on Tobit regressions.
Cite:
@article{BeckmannshagenKönigRetterSchluterSchröderTchokni+2025,
url = {https://doi.org/10.1515/jbnst-2024-0037},
title = {Dealing with Censored Earnings in Register Data},
author = {Mattis Beckmannshagen and Johannes König and Isabella Retter and Christian Schluter and Carsten Schröder and Yogam Tchokni},
journal = {Jahrbücher für Nationalökonomie und Statistik},
doi = {10.1515/jbnst-2024-0037},
year = {2025}
}
The left panel shows the histogram of the administrative (IA) earnings distribution, with the mass point at the assessment ceiling drawn transparently, and the tail fitted with beyondpareto (tail index estimate: 0.24). The right panel shows the empirical survey earnings in the SOEP-RV sample, the Tobit estimate, and our estimate based on beyondpareto. Our tail estimate clearly outperforms the usual Tobit estimate.
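To make the idea of fitting a Pareto-like tail to top-coded earnings concrete, here is a minimal Python sketch. It is not the paper's semi-parametric estimator and not the beyondpareto routine. It assumes an exact Pareto tail above a known starting point and uses the textbook maximum-likelihood estimator under right-censoring. The function names `censored_pareto_gamma` and `impute_above_ceiling`, the tail start, and the simulated data are all hypothetical. The simulation borrows the value 0.24 from the caption purely for illustration, reading it as gamma = 1/alpha (the inverse of the Pareto shape), which is an assumption.

```python
import math
import random


def censored_pareto_gamma(earnings, tail_start, ceiling):
    """MLE of gamma = 1/alpha for an exact Pareto tail above `tail_start`
    when values are top-coded (right-censored) at `ceiling`.

    Assumption: exact Pareto tail, known tail start. Not the paper's method.
    """
    # Keep only tail observations; anything above the ceiling is treated as top-coded.
    tail = [min(x, ceiling) for x in earnings if x > tail_start]
    n_uncensored = sum(1 for x in tail if x < ceiling)
    if n_uncensored == 0:
        raise ValueError("need at least one uncensored tail observation")
    # Censored observations contribute log(ceiling / tail_start) to the sum,
    # but only uncensored observations count in the denominator.
    log_excess = sum(math.log(x / tail_start) for x in tail)
    return log_excess / n_uncensored


def impute_above_ceiling(n, ceiling, gamma, rng):
    """Draw n values from the fitted tail conditional on exceeding `ceiling`.

    For a Pareto tail, X | X > ceiling is again Pareto with scale `ceiling`,
    so inverse-CDF sampling gives ceiling * U ** (-gamma) with U in (0, 1].
    """
    return [ceiling * (1.0 - rng.random()) ** (-gamma) for _ in range(n)]


# Hypothetical simulation: Pareto earnings, about 30 % top-coded.
rng = random.Random(0)
true_gamma = 0.24
tail_start = 1.0
ceiling = tail_start * 0.3 ** (-true_gamma)  # P(X > ceiling) = 0.3
latent = [tail_start * (1.0 - rng.random()) ** (-true_gamma) for _ in range(20000)]
observed = [min(x, ceiling) for x in latent]  # what a register would record

gamma_hat = censored_pareto_gamma(observed, tail_start, ceiling)
n_censored = sum(1 for x in observed if x >= ceiling)
imputed = impute_above_ceiling(n_censored, ceiling, gamma_hat, rng)
print(f"estimated gamma: {gamma_hat:.3f}, censored share: {n_censored / len(observed):.2f}")
```

The estimator divides the summed log excesses by the number of uncensored observations only, which is how the likelihood accounts for the mass point at the ceiling. A Tobit-style imputation would instead rely on a normality assumption for (log) earnings.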