BIGOWL4DQ: ontology-driven approach for Big Data quality meta-modelling, selection and reasoning

dc.centroE.T.S.I. Informáticaes_ES
dc.contributor.authorBarba-González, Cristóbal
dc.contributor.authorCaballero, Ismael
dc.contributor.authorVarela-Vaca, Ángel Jesús
dc.contributor.authorCruz-Lemus, José A.
dc.contributor.authorGómez López, María Teresa
dc.contributor.authorNavas-Delgado, Ismael
dc.date.accessioned2024-01-22T11:43:45Z
dc.date.available2024-01-22T11:43:45Z
dc.date.issued2023-11-27
dc.departamentoInstituto de Tecnología e Ingeniería del Software de la Universidad de Málaga
dc.description.abstractContext: Data quality should be at the core of many Artificial Intelligence initiatives from the very first moment in which data is required for a successful analysis. Measurement and evaluation of the level of quality are crucial to determining whether data can be used for the tasks at hand. Conscious of this importance, industry and academia have proposed several data quality measurements and assessment frameworks over the last two decades. Unfortunately, there is no common and shared vocabulary for data quality terms. Thus, it is difficult and time-consuming to integrate data quality analysis within a (Big) Data workflow for performing Artificial Intelligence tasks. One of the main reasons is that, except for a reduced number of proposals, the presented vocabularies are neither machine-readable nor processable, needing human processing to be incorporated. Objective: This paper proposes a unified data quality measurement and assessment information model. This model can be used in different environments and contexts to describe data quality measurement and evaluation concerns. Method: The model has been developed as an ontology to make it interoperable and machine-readable. For better interoperability and applicability, this ontology, BIGOWL4DQ, has been developed as an extension of a previously developed ontology for describing knowledge management in Big Data analytics. Conclusions: This extended ontology provides a data quality measurement and assessment framework required when designing Artificial Intelligence workflows and integrated reasoning capacities. Thus, BIGOWL4DQ can be used to describe Big Data analysis and assess the data quality before the analysis. Result: Our proposal has been validated with two use cases. First, the semantic proposal has been assessed using an academic use case. Second, a real-world case study within an Artificial Intelligence workflow has been conducted to endorse our work.es_ES
dc.description.sponsorshipFunding for open access charge: Universidad de Málaga/CBUA. This publication is part of the R+D+d projects PID2020-112540RB-C41 (AETHER-UMA), PID2020-112540RB-C42 (AETHER-UCLM) and PID2020-112540RB-C44 (AETHER-US): A smart data holistic approach for context-aware data analytics, all of which are funded by MCIN/AEI/ 10.13039/501100011033/. Also, it has been partially funded by the R&D projects METAMORFOSIS, Spain (US-1381375) from Junta de Andalucía, and ADAGIO, Alarcos’ DAta Governance framework and systems generation, Spain (SBPLY/21/180501/000061), funded by the Consejería de Educación, Cultura y Deportes of the Junta de Comunidades de Castilla-La Mancha (Spain).es_ES
dc.identifier.citationCristóbal Barba-González, Ismael Caballero, Ángel Jesús Varela-Vaca, José A. Cruz-Lemus, María Teresa Gómez-López, Ismael Navas-Delgado, BIGOWL4DQ: Ontology-driven approach for Big Data quality meta-modelling, selection and reasoning, Information and Software Technology, Volume 167, 2024, 107378, ISSN 0950-5849, https://doi.org/10.1016/j.infsof.2023.107378. (https://www.sciencedirect.com/science/article/pii/S0950584923002331)es_ES
dc.identifier.doi10.1016/j.infsof.2023.107378
dc.identifier.urihttps://hdl.handle.net/10630/28974
dc.language.isoenges_ES
dc.publisherElsevieres_ES
dc.rightsAttribution 4.0 International*
dc.rights.accessRightsopen accesses_ES
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/*
dc.subjectBig dataes_ES
dc.subjectOntologyes_ES
dc.subjectSoftware engineeringes_ES
dc.subject.otherData quality evaluation and measurementes_ES
dc.subject.otherData quality information modeles_ES
dc.subject.otherBig Dataes_ES
dc.subject.otherOntologyes_ES
dc.subject.otherDecision model and notationes_ES
dc.titleBIGOWL4DQ: ontology-driven approach for Big Data quality meta-modelling, selection and reasoninges_ES
dc.typejournal articlees_ES
dc.type.hasVersionVoRes_ES
dspace.entity.typePublication
relation.isAuthorOfPublicatione8971462-20b8-442f-aeea-797c6233b905
relation.isAuthorOfPublication4e298ef9-8825-4aa8-be87-ac0f8adbf1b7
relation.isAuthorOfPublication.latestForDiscoverye8971462-20b8-442f-aeea-797c6233b905

Files

Original bundle

Name: 1-s2.0-S0950584923002331-main.pdf
Size: 3.37 MB
Format: Adobe Portable Document Format