Browse LCC - Articles by author "Mandow-Andaluz, Lorenzo"


Showing items 1-6 of 6

An evaluation of best compromise search in graphs

Machuca, Enrique; Mandow-Andaluz, Lorenzo; Galand, Lucie (Springer, 2013-09)

This work evaluates two different approaches for multicriteria graph search problems using compromise preferences. This approach focuses search on a single solution that represents a balanced tradeoff between objectives, ...
Multi-objective dynamic programming with limited precision

Mandow-Andaluz, Lorenzo; Pérez-de-la-Cruz-Molina, José Luis; Pozas García, Nicolás (Springer, 2021-11-02)

This paper addresses the problem of approximating the set of all solutions for Multi-objective Markov Decision Processes. We show that in the vast majority of interesting cases, the number of solutions is exponential or ...
PQ-learning: aprendizaje por refuerzo multiobjetivo [PQ-learning: multi-objective reinforcement learning]

Ruiz-Montiel, Manuela; Mandow-Andaluz, Lorenzo; Pérez-de-la-Cruz-Molina, José Luis (2013-09)

In this article we describe and analyze PQ-learning, an algorithm for multi-objective reinforcement learning problems. The algorithm is an extension of Q-learning, an algorithm for problems of learning ...
Pruning dominated policies in multiobjective Pareto Q-learning

Mandow-Andaluz, Lorenzo; Pérez-de-la-Cruz-Molina, José Luis (2019-10-18)

The solution for a Multi-Objective Reinforcement Learning problem is a set of Pareto-optimal policies. MPQ-learning is a recent algorithm that approximates the set of all Pareto-optimal deterministic policies by ...
Randomness and control in design processes: an empirical study with architecture students.

Belmonte-Martínez, María Victoria; Millán-Valldeperas, Eva; Ruiz-Montiel, Manuela; Badillo, Reyes; Boned-Purkiss, Francisco Javier; Mandow-Andaluz, Lorenzo; Pérez-de-la-Cruz-Molina, José Luis [et al.] (2014-02-12)

The aim of this study is to explore designers' preferences between randomness and control in the generation of architectural forms. To this end, a generative computer tool was implemented that allows both random and ...
A temporal difference method for multi-objective reinforcement learning

Ruiz-Montiel, Manuela; Mandow-Andaluz, Lorenzo; Pérez-de-la-Cruz-Molina, José Luis (2019-10-17)

This work describes MPQ-learning, a temporal-difference method that approximates the set of all non-dominated policies in multi-objective Markov decision problems, where rewards are vectors and each component stands for ...
