Browsing LCC - Articles by subject "Machine learning (Artificial intelligence)"

Showing items 1-1 of 1

A temporal difference method for multi-objective reinforcement learning

Ruiz-Montiel, Manuela; Mandow-Andaluz, Lorenzo; Pérez-de-la-Cruz-Molina, José Luis (2019-10-17)

This work describes MPQ-learning, a temporal-difference method that approximates the set of all non-dominated policies in multi-objective Markov decision problems, where rewards are vectors and each component stands for ...