On the performance of SQL scalable systems on Kubernetes: a comparative study

dc.centro: E.T.S.I. Informática [es_ES]
dc.contributor.author: Cardas Ezeiza, Cristian
dc.contributor.author: Aldana Martín, José Francisco
dc.contributor.author: Burgueño Romero, Antonio Manuel
dc.contributor.author: Nebro-Urbaneja, Antonio Jesús
dc.contributor.author: Mateos, Jose M.
dc.contributor.author: Sánchez-Martínez, Juan José
dc.date.accessioned: 2022-09-12T12:13:13Z
dc.date.available: 2022-09-12T12:13:13Z
dc.date.issued: 2022-09-09
dc.departamento: Instituto de Tecnología e Ingeniería del Software de la Universidad de Málaga
dc.description.abstract: The popularization of Hadoop as the de facto standard platform for data analytics in the context of Big Data applications has led to the upsurge of SQL-on-Hadoop systems, which provide scalable query execution engines allowing the use of SQL queries on data stored in HDFS. In this context, Kubernetes appears as the leading choice to simplify the deployment and scaling of containerized applications; however, there is a lack of studies about the performance of SQL-on-Hadoop systems deployed on Kubernetes, and this is the gap we intend to fill in this paper. We present an experimental study involving four representative SQL scalable platforms: Apache Drill, Apache Hive, Apache Spark SQL and Trino. Concretely, we use the TPC-H benchmark to analyze the performance of these systems when they are deployed on a Hadoop cluster with Kubernetes. The results of our study can help practitioners and users understand what performance to expect if they plan to take advantage of Kubernetes to deploy applications using the analyzed SQL scalable platforms. [es_ES]
dc.description.sponsorship: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. Funding for open access charge: Universidad de Málaga / CBUA. This work has been partially funded by the Spanish Ministry of Science and Innovation via Grant PID2020-112540RB-C41 (AEI/FEDER, UE), by the Andalusian PAIDI program with grant P18-RT-2799, and by the project “Evolución y desarrollo de la plataforma DOP de Big Data” (702C2000044) under the Andalusian “Programa de Apoyo a la I+D+i Empresarial”. [es_ES]
dc.identifier.citation: Cardas, C., Aldana-Martín, J.F., Burgueño-Romero, A.M. et al. On the performance of SQL scalable systems on Kubernetes: a comparative study. Cluster Comput (2022). https://doi.org/10.1007/s10586-022-03718-9 [es_ES]
dc.identifier.doi: 10.1007/s10586-022-03718-9
dc.identifier.uri: https://hdl.handle.net/10630/24957
dc.language.iso: eng [es_ES]
dc.rights: Attribution 4.0 International [*]
dc.rights.accessRights: open access [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.subject: SQL (Programming language) [es_ES]
dc.subject.other: Scalable SQL systems [es_ES]
dc.subject.other: Hadoop [es_ES]
dc.subject.other: Kubernetes [es_ES]
dc.subject.other: Apache Spark [es_ES]
dc.subject.other: Trino [es_ES]
dc.subject.other: Apache Drill [es_ES]
dc.subject.other: Hive MR3 [es_ES]
dc.title: On the performance of SQL scalable systems on Kubernetes: a comparative study [es_ES]
dc.type: journal article [es_ES]
dc.type.hasVersion: VoR [es_ES]
dspace.entity.type: Publication
relation.isAuthorOfPublication: eddeb2e3-acaf-483e-bb13-cebb22c18413
relation.isAuthorOfPublication.latestForDiscovery: eddeb2e3-acaf-483e-bb13-cebb22c18413

Files

Original bundle

Name: s10586-022-03718-9.pdf
Size: 1.31 MB
Format: Adobe Portable Document Format
