RT Conference Proceedings
T1 Efficient OpenCL-based concurrent tasks offloading on accelerators
A1 Lázaro-Muñoz, Antonio J.
A1 González-Linares, José María
A1 Gómez-Luna, Juan
A1 Guil-Mata, Nicolás
K1 Heterogeneous computing
AB Current heterogeneous platforms with CPUs and accelerators can launch several independent tasks simultaneously to exploit concurrency among them. These tasks typically consist of data transfer commands and kernel computation commands. In this paper we develop a runtime approach to optimize the concurrency between data transfers and kernel computation commands in a multithreaded scenario where each CPU thread offloads tasks to the accelerator. It deploys a heuristic based on a temporal execution model for concurrent tasks. It is able to establish a near-optimal task execution order that significantly reduces the total execution time, including data transfers. Our approach has been evaluated with five different benchmarks composed of real kernel-dominant and transfer-dominant tasks. In these experiments our heuristic achieves speedups of up to 1.5x on AMD R9 and NVIDIA K20c accelerators and 1.3x on an Intel Xeon Phi (KNC) device.
PB Procedia Computer Science
YR 2017
FD 2017
LK http://hdl.handle.net/10630/13908
UL http://hdl.handle.net/10630/13908
LA eng
NO Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.
DS RIUMA. Repositorio Institucional de la Universidad de Málaga
RD 21 Jan 2026