Leveraging SYCL for Heterogeneous cDTW Computation on CPU, GPU, and FPGA
Publisher
Wiley
Abstract
One of the most time-consuming kernels of a recent epileptic seizure detection application is the computation of the constrained Dynamic Time Warping (cDTW) distance matrix. In this paper, we explore the design space of heterogeneous CPU, GPU, and FPGA implementations of this kernel using SYCL as the programming model. First, we optimize the CPU implementation by leveraging the SIMD capabilities of SYCL and compare it with the C++26 SIMD library. Next, we tune the SYCL code to run on an integrated on-chip GPU (iGPU) and on a discrete NVIDIA GPU (dGPU). We also develop a SYCL implementation for an Intel FPGA. In addition, we exploit simultaneous co-processing on CPU+GPU and CPU+FPGA platforms by extending a previous heterogeneous scheduling framework to support 2D partitioning strategies. Our evaluations indicate that SYCL is well suited to exploiting the SIMD capabilities of modern CPU cores and shows promising results on accelerator devices, in terms of both performance and energy efficiency. Moreover, our scheduler enables efficient co-execution of work among the computing devices, and dynamic and adaptive partitioning strategies perform efficiently, with overheads below 4%.
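For readers unfamiliar with the kernel named in the abstract, the following is a minimal sequential C++ sketch of a cDTW distance and a pairwise distance matrix. It is not the authors' SYCL implementation. It assumes a Sakoe-Chiba band of half-width `window` as the constraint, a squared-difference local cost with no final square root, and a matrix of pairwise distances among a set of series. The function names `cdtw` and `cdtw_distance_matrix` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Illustrative cDTW distance between series a and b (hypothetical helper).
// Assumption: the constraint is a Sakoe-Chiba band, so only cells with
// |i - j| <= window are evaluated. Local cost is the squared difference.
double cdtw(const std::vector<double>& a, const std::vector<double>& b,
            std::size_t window) {
    assert(!a.empty() && !b.empty());
    const std::size_t n = a.size(), m = b.size();
    const double inf = std::numeric_limits<double>::infinity();
    // Widen the band if the lengths differ, so the end cell stays reachable.
    const std::size_t w = std::max(window, n > m ? n - m : m - n);

    // Two rolling rows of the (n+1) x (m+1) cumulative-cost table.
    std::vector<double> prev(m + 1, inf), curr(m + 1, inf);
    prev[0] = 0.0;
    for (std::size_t i = 1; i <= n; ++i) {
        std::fill(curr.begin(), curr.end(), inf);
        const std::size_t lo = i > w ? i - w : 1;
        const std::size_t hi = std::min(m, i + w);
        for (std::size_t j = lo; j <= hi; ++j) {
            const double diff = a[i - 1] - b[j - 1];
            // Best predecessor: cell above, cell to the left, or diagonal.
            const double best = std::min({prev[j], curr[j - 1], prev[j - 1]});
            curr[j] = diff * diff + best;
        }
        std::swap(prev, curr);
    }
    return prev[m];
}

// Pairwise distance matrix over a set of series (assumed layout).
// Only the upper triangle is computed; the result is mirrored.
std::vector<std::vector<double>> cdtw_distance_matrix(
    const std::vector<std::vector<double>>& series, std::size_t window) {
    const std::size_t k = series.size();
    std::vector<std::vector<double>> d(k, std::vector<double>(k, 0.0));
    for (std::size_t i = 0; i < k; ++i) {
        for (std::size_t j = i + 1; j < k; ++j) {
            d[i][j] = d[j][i] = cdtw(series[i], series[j], window);
        }
    }
    return d;
}
```

In this sketch the band limits the work per pair to roughly `n * (2 * window + 1)` cells instead of `n * m`. Each matrix entry is independent of the others, which is the kind of structure that lends itself to partitioning work across devices.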
Bibliographic citation
Campos, C., Asenjo, R., Hormigo, J., & Navarro, A. (2025). Leveraging SYCL for Heterogeneous cDTW Computation on CPU, GPU, and FPGA. Concurrency and Computation: Practice and Experience, 37(15–17).
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NonCommercial 4.0 International