Daniel Valero, Juan Aparicio and Nadia Guerrero (University Miguel Hernández of Elche)
Abstract: In microeconomics, a topic of interest is the estimation of production functions. By definition, a production function is a non-decreasing function that envelops all the observations (firms) from above in the input-output space, capturing the extreme behavior of the data. These characteristics are far from those usually assumed by machine learning techniques such as Support Vector Regression (SVR), within the Support Vector Machines framework. In SVR, the function to be estimated relates the response variable to the covariates in terms of the mean rather than the extremes and, additionally, the technique tries to fit the data as closely as possible, yielding a function that increases and decreases following a data-driven process. In this paper, we introduce an adaptation of SVR, called Support Vector Frontiers (SVF), with the objective of estimating production functions. To do so, and seeking meeting points between SVR and the standard non-parametric techniques for estimating production functions, mainly Free Disposal Hull (FDH) and Data Envelopment Analysis (DEA), we define an estimator through a specific input transformation function. In contrast to FDH and DEA, however, SVF overcomes the overfitting problems associated with these techniques. Additionally, we show that standard FDH and DEA can be reinterpreted, in some sense, as Support Vector Regression techniques. Moreover, a new robust notion of efficiency is introduced, called ε-insensitive technical efficiency, directly inherited from Support Vector Machines. Finally, the performance of SVF is measured through several experiments using synthetic data, showing that the new approach considerably reduces the bias and mean squared error associated with the estimation of the true production function in comparison with standard FDH and DEA, although at the expense of a greater computational burden.
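To make the two ingredients named in the abstract concrete, the following is a minimal illustrative sketch and not the SVF estimator itself. It shows the textbook FDH frontier (the largest observed output among firms that use no more of any input than the evaluation point) and the standard ε-insensitive loss used in SVR. The function names `fdh_frontier` and `epsilon_insensitive_loss`, the toy data, and the convention of returning 0 when no firm is dominated are assumptions made for this example.

```python
from typing import Sequence


def fdh_frontier(
    inputs: Sequence[Sequence[float]],
    outputs: Sequence[float],
    x: Sequence[float],
) -> float:
    """Textbook FDH estimate at x (single output).

    Returns the largest observed output among firms whose input vector is
    componentwise <= x. Returning 0.0 when no firm qualifies is an
    assumption of this sketch.
    """
    dominated = [
        y
        for xi, y in zip(inputs, outputs)
        if all(a <= b for a, b in zip(xi, x))
    ]
    return max(dominated, default=0.0)


def epsilon_insensitive_loss(residual: float, eps: float) -> float:
    """Standard SVR loss: residuals inside the eps-tube cost nothing."""
    return max(0.0, abs(residual) - eps)


# Hypothetical toy data: four firms, one input, one output.
firm_inputs = [[1.0], [2.0], [3.0], [4.0]]
firm_outputs = [1.0, 3.0, 2.0, 4.0]

# Evaluate the FDH frontier at each observed input.
frontier_at_firms = [
    fdh_frontier(firm_inputs, firm_outputs, xi) for xi in firm_inputs
]
```

On this toy data the frontier envelops every firm from above and never decreases as the input grows, which are the two defining properties of a production function mentioned in the abstract. The step shape that passes exactly through the best observations is also what makes FDH prone to overfitting.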