Optimization of massive data applications on heterogeneous architectures

Romero Moreno, José Carlos

dc.centro	E.T.S. de Arquitectura	es_ES
dc.contributor.advisor	Asenjo-Plaza, Rafael
dc.contributor.advisor	Rodríguez-Moreno, Andrés
dc.contributor.author	Romero Moreno, José Carlos
dc.date.accessioned	2023-01-30T13:15:31Z
dc.date.available	2023-01-30T13:15:31Z
dc.date.issued	2023-01
dc.date.submitted	2022-09-15
dc.departamento	Arquitectura de Computadores
dc.description	Our experimental results show that our heterogeneous CPU+GPU approaches always outperform state-of-the-art CPU-only and GPU-only implementations, by up to 6.86x and 5.19x, respectively, and that they come within 6% of ideal peak performance.	es_ES
dc.description.abstract	In the last few years, heterogeneous architectures have become dominant in every part of the computing industry: from heterogeneous GPU accelerators joining multi-core CPUs within the same chip, to Systems on Chip that integrate DSPs or. The main motivation of this thesis is that no optimal implementation for heterogeneous architectures exists for two complex, real-life, massive-data problems widely used in big data fields: Time Series and the Skyline problem. First, we focus on the motif/discord discovery problem for Time Series, taking as a starting point the state-of-the-art algorithm, the Matrix Profile. We present the first heterogeneous implementations of the Matrix Profile computation for CPU + GPU architectures and for CPU + FPGA architectures, the latter using a High Performance FPGA with integrated High Bandwidth Memory (HBM). We propose Fastfit, a hierarchical scheduler that efficiently balances the workload between the FPGA and the CPU cores and computes an even partition so that all FPGA IPs complete their assignments at the same time. We validate the accuracy of our models and find that Fastfit outperforms previous state-of-the-art schedulers, achieving up to 99.4% of ideal performance. Second, we tackle the problem of computing the Skyline operator over a stream of independent data queries, targeting a heterogeneous CPU + GPU architecture. We contribute a novel heterogeneous implementation, based on oneAPI, of the state-of-the-art SkyAlign algorithm. We design a graph-based engine, SkyFlow, and propose two heterogeneous approaches for Skyline computation over a stream of data queries: the first keeps two Skyline computations running in parallel, one per device, and the second splits a single Skyline computation between the CPU and the GPU.	es_ES
dc.identifier.uri	https://hdl.handle.net/10630/25823
dc.language.iso	eng	es_ES
dc.publisher	UMA Editorial	es_ES
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.accessRights	open access	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Computer architecture - Doctoral theses	es_ES
dc.subject.other	Skyline	es_ES
dc.subject.other	Time Series	es_ES
dc.subject.other	High Performance FPGA	es_ES
dc.subject.other	Heterogeneous computing	es_ES
dc.subject.other	oneAPI	es_ES
dc.title	Optimization of massive data applications on heterogeneous architectures	es_ES
dc.type	doctoral thesis	es_ES
dspace.entity.type	Publication
relation.isAdvisorOfPublication	6ea008bf-69ee-4104-a942-2033b5b07ab8
relation.isAdvisorOfPublication	b215fbc9-d0f2-4bbb-a17c-e6055e984f68
relation.isAdvisorOfPublication.latestForDiscovery	6ea008bf-69ee-4104-a942-2033b5b07ab8

Files

Original bundle

Name: TD_ROMERO_MORENO_Jose_Carlos.pdf
Size: 7.54 MB
Format: Adobe Portable Document Format

Collections

Doctoral theses