Nowadays, a large amount of varied data is being generated which, when made available
to the decision maker, constitutes a valuable resource in optimization problems.
These data, however, are not free from uncertainty about the physical, economic or
social context, system or process from which they originate, and the decision maker
must take this uncertainty into account in the decision-making process.
The objective of this PhD dissertation is to develop theoretical foundations and
investigate methods for solving optimization problems where there is a great diversity
of data on uncertain phenomena. Today’s decision makers not only collect observations
of the uncertainties directly affecting their decision-making processes, but also gather
prior information about the data-generating distribution of the uncertainty. This
information allows the decision maker to specify more accurately the set of plausible
probability distributions, the so-called ambiguity set in distributionally robust optimization.
We therefore aim to develop a purely data-driven methodology, within the framework
of distributionally robust optimization based on the optimal transportation
problem, that exploits additional or prior information about the random phenomenon.
This additional information falls along two axes concerning the nature of the random phenomenon:
first, prior information about, for example, the shape or structure of the probability
distribution; second, conditional information, such as that given by various covariates,
which help explain the random phenomenon underlying the optimization problem
without resorting to prior regression techniques.