Lightweight asynchronous scheduling in heterogeneous reconfigurable systems
dc.contributor.author | Rodríguez-Moreno, Andrés | |
dc.contributor.author | González-Navarro, María Ángeles | |
dc.contributor.author | Nikov, Kris | |
dc.contributor.author | Nunez-Yanez, José | |
dc.contributor.author | Gran-Tejero, Rubén | |
dc.contributor.author | Suárez Gracia, Darío | |
dc.contributor.author | Asenjo-Plaza, Rafael | |
dc.date.accessioned | 2022-05-16T09:25:45Z | |
dc.date.available | 2022-05-16T09:25:45Z | |
dc.date.issued | 2022-03 | |
dc.identifier.citation | Andrés Rodríguez, Angeles Navarro, Kris Nikov, Jose Nunez-Yanez, Rubén Gran, Darío Suárez Gracia, Rafael Asenjo, Lightweight asynchronous scheduling in heterogeneous reconfigurable systems, Journal of Systems Architecture, Volume 124, 2022, 102398, ISSN 1383-7621, https://doi.org/10.1016/j.sysarc.2022.102398. | es_ES |
dc.identifier.uri | https://hdl.handle.net/10630/24124 | |
dc.description.abstract | The trend for heterogeneous embedded systems is the integration of accelerators and general-purpose CPU cores on the same die. In these integrated architectures, like the Zynq UltraScale+ board (CPU+FPGA) that we target in this work, hardware support for shared memory and low-overhead synchronization between the accelerator and the CPU cores makes the case for exploring strategies that exploit a tight collaboration between the CPUs and the accelerator. In this paper we propose a novel lightweight scheduling strategy, FastFit, targeted to FPGA accelerators, and a new scheduler based on it, named MultiFastFit, which asynchronously tackles heterogeneous systems comprising a variety of CPU cores and FPGA IPs. Our strategy significantly reduces the overhead of automatically computing near-optimal chunk sizes when compared to a previous state-of-the-art auto-tuned approach, which makes our approach more suitable for fine-grained applications. Additionally, our scheduler MultiFastFit has been designed to enable the efficient co-execution of work among compute devices in such a way that all the devices are kept busy while minimizing load imbalance. Our approaches have been evaluated using four benchmarks carefully tuned for the low-power UltraScale+ platform. Our experiments demonstrate that the FastFit strategy always finds the near-optimal FPGA chunk size for any device configuration at a reasonable cost, even for fine-grained and irregular applications, and that heterogeneous CPU+FPGA co-executions that exploit all the compute devices are usually faster and more energy efficient than the CPU-only and FPGA-only executions. We have also compared MultiFastFit with other state-of-the-art scheduling strategies, finding that it outperforms another auto-tuned approach by up to 2x and that it achieves results similar to those of manually tuned schedulers without requiring an offline search for the ideal CPU-FPGA partition or FPGA chunk granularity. | es_ES |
dc.description.sponsorship | This work was partially supported by the Spanish projects PID2019-105396RB-I00, UMA18-FEDERJA-108, and UK EPSRC projects ENEAC (EP/N002539/1), HOPWARE (EP/V040863/1) and RS MINET (INF\R2\192044). Funding for open access charge: Universidad de Málaga / CBUA. | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | Elsevier | es_ES |
dc.rights | info:eu-repo/semantics/openAccess | es_ES |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.subject | Heterogeneous computing | es_ES |
dc.subject.other | Heterogeneous architecture | es_ES |
dc.subject.other | FPGA | es_ES |
dc.subject.other | Heterogeneous scheduling | es_ES |
dc.subject.other | Throughput model | es_ES |
dc.subject.other | Energy efficiency | es_ES |
dc.title | Lightweight asynchronous scheduling in heterogeneous reconfigurable systems | es_ES |
dc.type | info:eu-repo/semantics/article | es_ES |
dc.centro | E.T.S.I. Informática | es_ES |
dc.identifier.doi | https://doi.org/10.1016/j.sysarc.2022.102398 | |
dc.rights.cc | Attribution 4.0 International | * |