This paper addresses stochastic programs affected by ambiguity about the probability law that governs their uncertain parameters. Using optimal transport theory, we construct an ambiguity set that exploits two sources of knowledge about the distribution of the uncertain parameters: (1) sample data and (2) a-priori information on the order among the probabilities that the true data-generating distribution assigns to some regions of its support set. This order is enforced by means of order cone constraints and can encode a wide range of information on the shape of the probability distribution of the uncertain parameters, such as monotonicity or multi-modality. We seek decisions that are distributionally robust with respect to this ambiguity set. In a number of practical cases, the resulting distributionally robust optimization (DRO) problem can be reformulated as a finite convex problem in which the a-priori information translates into linear constraints. In addition, our method inherits the finite-sample performance guarantees of the Wasserstein-metric-based DRO approach proposed by Mohajerin Esfahani and Kuhn (Math Program 171(1–2):115–166. https://doi.org/10.1007/s10107-017-1172-1, 2018), while generalizing this and other popular DRO approaches. Finally, we design numerical experiments to analyze the performance of our approach on the newsvendor problem and on the problem of a strategic firm competing à la Cournot in a market.