A temporal difference method for multi-objective reinforcement learning
dc.contributor.author | Ruiz-Montiel, Manuela | |
dc.contributor.author | Mandow-Andaluz, Lorenzo | |
dc.contributor.author | Pérez-de-la-Cruz-Molina, José Luis | |
dc.date.accessioned | 2019-10-17T11:54:09Z | |
dc.date.available | 2019-10-17T11:54:09Z | |
dc.date.created | 2017 | |
dc.date.issued | 2019-10-17 | |
dc.identifier.uri | https://hdl.handle.net/10630/18596 | |
dc.description.abstract | This work describes MPQ-learning, a temporal-difference method that approximates the set of all non-dominated policies in multi-objective Markov decision problems, where rewards are vectors and each component stands for an objective to maximize. Unlike other approximations to Multi-objective Reinforcement Learning, MPQ-learning does not require additional parameters or preference information, and can be applied to non-convex Pareto frontiers. We also present the results of the application of MPQ-learning to some benchmark problems and compare it to a linearization procedure. | en_US |
dc.description.sponsorship | This work is partially funded by grants TIN2009-14179 (Spanish Government, Plan Nacional de I+D+i) and TIN2016-80774-R (AEI/FEDER, UE) (Spanish Government, Agencia Estatal de Investigación; and European Union, Fondo Europeo de Desarrollo Regional). Manuela Ruiz-Montiel is funded by the Spanish Ministry of Education through the National F.P.U. Program. | en_US |
dc.language.iso | eng | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.subject | Machine learning (Artificial intelligence) | en_US |
dc.subject.other | Reinforcement learning | en_US |
dc.subject.other | Multi-objective optimization | en_US |
dc.subject.other | MOMDPs | en_US |
dc.subject.other | Q-learning | en_US |
dc.title | A temporal difference method for multi-objective reinforcement learning | en_US |
dc.type | info:eu-repo/semantics/article | es_ES |
dc.centro | E.T.S.I. Informática | en_US |
dc.identifier.doi | https://doi.org/10.1016/j.neucom.2016.10.100 | |
dc.type.hasVersion | info:eu-repo/semantics/submittedVersion | es_ES |
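
The abstract refers to non-dominated reward vectors and to non-convex Pareto frontiers, which a linearization procedure may not fully recover. The following is a minimal Python sketch of those two ideas only; it is not MPQ-learning and is not taken from the paper. The candidate vectors and the helper names (`dominates`, `non_dominated`, `best_by_weights`) are hypothetical and chosen purely for illustration, assuming every objective is to be maximized.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Vector, b: Vector) -> bool:
    """True if a is at least as good as b in every objective and
    strictly better in at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(vectors: Sequence[Vector]) -> List[Vector]:
    """Keep only the vectors that no other vector dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]


def best_by_weights(vectors: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Linear scalarization: pick the vector with the largest weighted sum."""
    return max(vectors, key=lambda v: sum(w * x for w, x in zip(weights, v)))


# Hypothetical two-objective value vectors (illustrative, not from the paper).
candidates = [(10.0, 0.0), (4.0, 4.0), (0.0, 10.0), (3.0, 3.0)]

# (3, 3) is dominated by (4, 4); the other three form the Pareto front.
front = non_dominated(candidates)

# Sweep the weight on the first objective from 0.0 to 1.0 in steps of 0.1.
found = {best_by_weights(front, (w / 10, 1 - w / 10)) for w in range(11)}

print("Pareto front:", front)
print("Found by weighted sums:", sorted(found))
```

In this toy front, (4, 4) is non-dominated but lies in a non-convex region: its weighted sum is 4 for any weights that sum to 1, while one of the two extreme points always scores at least 5, so the weight sweep never selects it.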