High-performance computing in bioinformatics: accelerating de novo assembly.

Reading date

2025-10-10

Authors

Espinosa García, Elena María

Publisher

UMA Editorial

Abstract

Advances in sequencing technologies, currently divided into short-read and long-read platforms, have enabled the study of genomes from a large number of organisms with high coverage and resolution. These advances have also created an urgent need for efficient, scalable, and specialized software that evolves continually to keep pace with rapid improvements in sequencing methods. In particular, de novo assembly has been one of the most significant challenges in bioinformatics because of its complexity and high computational cost. At the same time, advances in sequencing technologies have significantly improved the accuracy of genome assemblies and made the process more feasible, while new data structures, efficient algorithms, and computational techniques have been developed to address these challenges. Despite these improvements, de novo assembly still demands substantial computational resources and remains an active area of research, with diverse methods and strategies under ongoing development. In this thesis, we conduct an in-depth study of de novo genome assembly and its main bottlenecks. Building on these findings, we propose software- and hardware-level solutions to accelerate de novo genome assembly and make its computation as energy-efficient as possible. To this end, this work includes a comprehensive review of de novo genome assembly, a benchmark analysis of the most widely used assemblers to identify key bottlenecks, and the proposal of two acceleration tools: SeqMatcher and GenTEK.

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International