Juan Aparicio (Miguel Hernández University of Elche; Valencian Graduate School and Research Network of Artificial Intelligence (valgrAI)) and Miriam Esteve (Miguel Hernández University of Elche)

Abstract:

Data Envelopment Analysis (DEA) has the typical characteristics of a data-driven approach, with the specific objective of determining technical efficiency and production frontiers in Engineering and Microeconomics. However, by construction, the frontier estimator generated by DEA suffers from overfitting, which contrasts with currently accepted models in machine learning. In this regard, DEA can be seen as a preliminary stage of a more complex approach whose aim is to avoid overfitting and thereby obtain a proper description of the underlying Data Generating Process behind the observations of a production process. In this paper, we introduce a possible solution to the overfitting problem associated with DEA that is based on cross-validation. This process “peels” the standard DEA frontier (removing certain supporting hyperplanes) until a new convex technology is determined, one that also satisfies free disposability in inputs and outputs but not the principle of minimal extrapolation. Our approach is tested through a computational experiment. Additionally, we illustrate how the new method could be used as a complement to the standard DEA technique through an empirical application based on a PISA (Programme for International Student Assessment) dataset.
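
For reference, the three properties named in the abstract (convexity, free disposability in inputs and outputs, and minimal extrapolation) are the ones that characterize the standard DEA technology under variable returns to scale. A minimal statement of that set is sketched below. The notation is an assumption introduced here for illustration and does not come from the paper: $n$ observed units, with input vectors $x_j \in \mathbb{R}^m_+$ and output vectors $y_j \in \mathbb{R}^s_+$.

```latex
% Assumed notation (not from the source): n observed units, with
% input vectors x_j in R^m_+ and output vectors y_j in R^s_+.
% Standard DEA technology under variable returns to scale.
\[
T_{\mathrm{DEA}} = \Bigl\{ (x, y) \in \mathbb{R}^{m+s}_{+} :
  x \ge \sum_{j=1}^{n} \lambda_j x_j,\;
  y \le \sum_{j=1}^{n} \lambda_j y_j,\;
  \sum_{j=1}^{n} \lambda_j = 1,\;
  \lambda_j \ge 0,\ j = 1, \dots, n \Bigr\}
\]
```

Under minimal extrapolation, $T_{\mathrm{DEA}}$ is the smallest set with these properties that contains every observation, so its frontier is pinned to the extreme points of the sample. This is the sense in which the estimator fits the data as tightly as possible.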