Guerrero, N. M. (Center of Operations Research, Miguel Hernández University of Elche), Aparicio, J. (Center of Operations Research, Miguel Hernández University of Elche), Moragues, R. (Center of Operations Research, Miguel Hernández University of Elche), & Valero-Carreras, D. (Center of Operations Research, Miguel Hernández University of Elche)
Abstract:
Data Envelopment Analysis (DEA) is a widely used nonparametric technique for measuring technical efficiency. It builds a production possibility set that satisfies certain microeconomic and mathematical axioms, such as convexity and free disposability in inputs and outputs, and it determines the most conservative estimate of the technical inefficiency of each assessed unit via the minimal extrapolation principle (i.e., the Occam’s Razor view). From a statistical point of view, and given a data sample, this principle amounts to estimating technical inefficiency by minimizing the empirical error alone, which results in overfitting to the data as a by-product. To overcome this statistical problem when the objective is to measure technical inefficiency beyond the data sample, a methodology based on the Structural Risk Minimization (SRM) principle has recently been introduced. It controls both the empirical error and the generalization (prediction) error of the model. This methodology, called Data Envelopment Analysis-based Machines (DEAM), was introduced for the single-output setting (Guerrero, Aparicio, & Valero-Carreras, 2022, Combining Data Envelopment Analysis and Machine Learning, Mathematics, 10, 909). In this chapter, we extend the DEAM approach for the estimation of production functions to the multi-output production framework, evaluate technical efficiency with respect to a variety of measures, and illustrate its performance via some empirical applications. In this way, we provide some examples of the use of multi-output machine learning techniques for measuring technical efficiency on real-world data.
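As an illustration of the standard DEA estimator that the abstract takes as its starting point, the following minimal sketch computes an output-oriented radial efficiency score under convexity and free disposability (the variable-returns-to-scale technology) by solving one linear program per unit. It is not the DEAM method of the chapter. The function name `dea_output_score` and the toy data are assumptions made up for this example, and the linear program is solved with SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def dea_output_score(X, Y, o):
    """Output-oriented radial DEA score of unit `o` (hypothetical helper).

    Solves: max phi
            s.t. sum_j lambda_j * x_j <= x_o          (inputs)
                 sum_j lambda_j * y_j >= phi * y_o    (outputs)
                 sum_j lambda_j = 1, lambda >= 0      (convexity)
    A score of 1 means the unit lies on the estimated frontier;
    a score above 1 is the factor by which all outputs could be expanded.
    """
    X = np.asarray(X, dtype=float)  # shape (n_units, n_inputs)
    Y = np.asarray(Y, dtype=float)  # shape (n_units, n_outputs)
    n = X.shape[0]

    # Decision vector z = [phi, lambda_1, ..., lambda_n]; linprog minimizes,
    # so maximizing phi means minimizing -phi.
    c = np.concatenate(([-1.0], np.zeros(n)))

    # Input rows: sum_j lambda_j * x_ji <= x_oi
    A_in = np.hstack([np.zeros((X.shape[1], 1)), X.T])
    b_in = X[o]

    # Output rows: phi * y_or - sum_j lambda_j * y_jr <= 0
    A_out = np.hstack([Y[o].reshape(-1, 1), -Y.T])
    b_out = np.zeros(Y.shape[1])

    # Convexity row: sum_j lambda_j = 1
    A_eq = np.concatenate(([0.0], np.ones(n))).reshape(1, -1)

    res = linprog(
        c,
        A_ub=np.vstack([A_in, A_out]),
        b_ub=np.concatenate([b_in, b_out]),
        A_eq=A_eq,
        b_eq=[1.0],
        bounds=[(0, None)] * (n + 1),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[0]


# Made-up toy sample: four units, one input, two outputs.
X = [[1.0], [1.0], [1.0], [1.0]]
Y = [[4.0, 1.0], [1.0, 4.0], [3.0, 3.0], [1.0, 1.0]]

scores = [dea_output_score(X, Y, o) for o in range(len(X))]
print(scores)
```

In this toy sample the first three units span the estimated frontier, while the fourth unit uses the same input as the third but produces a third of each output. Because the frontier is the tightest envelope of the observed data, every score is computed relative to the sample itself, which is the source of the overfitting that the SRM-based approach is meant to address.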