A temporal difference method for multi-objective reinforcement learning
| dc.centro | E.T.S.I. Informática | en_US |
| dc.contributor.author | Ruiz-Montiel, Manuela | |
| dc.contributor.author | Mandow-Andaluz, Lorenzo | |
| dc.contributor.author | Pérez-de-la-Cruz-Molina, José Luis | |
| dc.date.accessioned | 2019-10-17T11:54:09Z | |
| dc.date.available | 2019-10-17T11:54:09Z | |
| dc.date.created | 2017 | |
| dc.date.issued | 2019-10-17 | |
| dc.departamento | Lenguajes y Ciencias de la Computación | |
| dc.description.abstract | This work describes MPQ-learning, a temporal-difference method that approximates the set of all non-dominated policies in multi-objective Markov decision problems, where rewards are vectors and each component stands for an objective to maximize. Unlike other approximations to Multi-objective Reinforcement Learning, MPQ-learning does not require additional parameters or preference information, and can be applied to non-convex Pareto frontiers. We also present the results of the application of MPQ-learning to some benchmark problems and compare it to a linearization procedure. | en_US |
| dc.description.sponsorship | This work is partially funded by grants TIN2009-14179 (Spanish Government, Plan Nacional de I+D+i) and TIN2016-80774-R (AEI/FEDER, UE) (Spanish Government, Agencia Estatal de Investigación; and European Union, Fondo Europeo de Desarrollo Regional). Manuela Ruiz-Montiel is funded by the Spanish Ministry of Education through the National F.P.U. Program. | en_US |
| dc.identifier.doi | https://doi.org/10.1016/j.neucom.2016.10.100 | |
| dc.identifier.uri | https://hdl.handle.net/10630/18596 | |
| dc.language.iso | eng | en_US |
| dc.rights.accessRights | open access | en_US |
| dc.subject | Machine learning (Artificial intelligence) | en_US |
| dc.subject.other | Reinforcement learning | en_US |
| dc.subject.other | Multi-objective optimization | en_US |
| dc.subject.other | MOMDPs | en_US |
| dc.subject.other | Q-learning | en_US |
| dc.title | A temporal difference method for multi-objective reinforcement learning | en_US |
| dc.type | journal article | es_ES |
| dc.type.hasVersion | SMUR | es_ES |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | b4b11711-73ab-4cd0-854c-8ab2735e829d | |
| relation.isAuthorOfPublication | b7e65043-46cc-445b-8d8f-b4c7ad4f1c06 | |
| relation.isAuthorOfPublication.latestForDiscovery | b4b11711-73ab-4cd0-854c-8ab2735e829d |
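
The abstract contrasts approximating the full set of non-dominated policies with a linearization procedure, and notes that non-convex Pareto frontiers matter. The sketch below is not MPQ-learning; it only illustrates, under assumed values, what "non-dominated" means for reward vectors and why a weighted sum can miss a point in a non-convex region of the front. The names `dominates`, `non_dominated`, `best_by_weights` and the `candidates` vectors are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Set, Tuple

Vector = Tuple[float, ...]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u is at least as good as v in every objective and strictly better in one (maximization)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def non_dominated(vectors: List[Vector]) -> List[Vector]:
    """Keep only the vectors that no other vector dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]


def best_by_weights(vectors: List[Vector], weights: Sequence[float]) -> Vector:
    """Linear scalarization: pick the vector with the largest weighted sum."""
    return max(vectors, key=lambda v: sum(w * x for w, x in zip(weights, v)))


# Assumed two-objective value vectors (illustrative only, not from the paper).
candidates: List[Vector] = [(1.0, 10.0), (4.0, 4.0), (10.0, 1.0), (3.0, 3.0)]

front = non_dominated(candidates)

# Sweep weights (w, 1 - w) and collect every vector a weighted sum can select.
reachable: Set[Vector] = {
    best_by_weights(front, (w / 100, 1 - w / 100)) for w in range(101)
}

# Non-dominated points that no weight vector in the sweep selects.
missed = [v for v in front if v not in reachable]
print("front:", front)
print("missed by linearization:", missed)
```

In this assumed example, `(3.0, 3.0)` is dominated by `(4.0, 4.0)` and drops out of the front, while `(4.0, 4.0)` stays in the front but sits in a non-convex region, so no weight in the sweep selects it.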
Files
Original bundle
- Name: manuscript.pdf
- Size: 333.94 KB
- Format: Adobe Portable Document Format

