Semantics in Big Data Analytics.

Reading date

2024-01-24

Authors

Benítez-Hidalgo, Antonio

Publisher

UMA Editorial

Abstract

Through the development of the TITAN platform, we aim to provide a tool for managing the lifecycle of workflows, integrating semantics to facilitate more intelligent and efficient workflows. TITAN was built with a flexible architecture, allowing for the implementation of new functionalities. In this regard, we developed NORA, a tool designed to provide reasoning over large ontologies. Using NORA with TITAN, efficient and scalable reasoning can be performed on semantically rich workflows, leveraging NoSQL database technologies to ensure scalability and reliability. NORA uses Apache Spark as its computational engine to implement inference rules, allowing the reasoning process to be evaluated iteratively until no new inferred knowledge is derived. In the biological domain, we introduce SALON, an ontology that provides a consistent understanding and use of multiple sequence alignments. SALON eases the development of Linked Data repositories to offer uniform access to diverse information essential for bioinformatics researchers. This ontology can also serve as a mediator schema for integrating data from various sources and validating sequence alignments by defining SWRL rules. Furthermore, we explore a methodology to inject semantic knowledge (expressed via ontologies) into analysis algorithms using the META ontology. This ontology allows algorithms to be enriched with domain-specific information, resulting in more informed and accurate decisions. Several use cases demonstrate META's effectiveness in enhancing the analysis process, including its use for mapping domain knowledge and constraints into machine learning models. Through META, algorithms can be guided by expert knowledge and domain-specific considerations.
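The iterative reasoning described for NORA, in which inference rules are applied repeatedly until no new knowledge is derived, can be illustrated with a minimal sketch. This is plain Python rather than Apache Spark, and it is not NORA's actual code: the function name `infer_closure`, the triple representation, and the sample data are assumptions made for illustration. The two rules shown are the standard RDFS subclass-transitivity and type-inheritance entailments.

```python
# Minimal fixpoint-reasoning sketch (hypothetical, not NORA's implementation).
# Triples are (subject, predicate, object) tuples of plain strings.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer_closure(triples):
    """Apply two RDFS-style rules repeatedly until no new triple is derived."""
    known = set(triples)
    while True:
        new = set()
        # All (subclass, superclass) pairs known so far.
        sub = [(s, o) for s, p, o in known if p == SUBCLASS]

        # Rule 1: subclass transitivity.
        # (A subClassOf B) and (B subClassOf C) => (A subClassOf C)
        for a, b in sub:
            for c, d in sub:
                if b == c:
                    new.add((a, SUBCLASS, d))

        # Rule 2: type inheritance.
        # (x type A) and (A subClassOf B) => (x type B)
        for s, p, o in known:
            if p == TYPE:
                for a, b in sub:
                    if o == a:
                        new.add((s, TYPE, b))

        # Keep only triples that were not already known.
        new -= known
        if not new:
            # Fixpoint reached: another pass would derive nothing new.
            return known
        known |= new


# Illustrative data only; these class and instance names are made up.
example = {
    ("Protein", SUBCLASS, "Molecule"),
    ("Molecule", SUBCLASS, "Entity"),
    ("p53", TYPE, "Protein"),
}
closure = infer_closure(example)
```

In a distributed setting the per-rule loops would typically become joins over partitioned triple sets, but the stopping condition stays the same: stop when a pass adds no new triples.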

Description

Lastly, we identify several promising directions for future work. These include enhancing the semantic capabilities of TITAN, extending NORA's functionalities, and developing intuitive interfaces for META to make semantics in Big Data more accessible and efficient for a broad range of users.

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International