Analyzing the differences between reads and contigs when performing a taxonomic assignment comparison in metagenomics

Publisher

Springer, Cham

Abstract

Metagenomics is an inherently complex field in which one of the primary goals is to determine which organisms are present in an environmental sample. To this end, diverse tools have been developed that rely on the similarity search results obtained from comparing a set of sequences against a database. However, several issues remain to be solved before this goal can be achieved, such as dealing with genomic variants and detecting repeated sequences that could belong to different species when the organisms in a sample are unevenly represented and their proportions are unknown. Hence, the question arises of whether analyzing a sample with reads provides a better understanding of the metagenome than analyzing it with contigs. Assembly yields larger genomic fragments but bears the risk of producing chimeric contigs. Reads, on the other hand, are shorter, so their statistical significance is harder to assess, but there are many more of them. Consequently, we have developed a workflow to assess and compare the quality of each of these alternatives. Synthetic read datasets belonging to previously identified organisms are generated in order to validate the results. Afterwards, we assemble these reads into a set of contigs and perform a taxonomic analysis on both datasets. The tools we have developed show that analyzing reads provides a more trustworthy representation of the species in a sample than analyzing contigs, especially in cases of high genomic variability.
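
The comparison step described above can be illustrated with a minimal sketch. It is not the authors' workflow: it assumes that each read or contig has already been given a single taxon label, that the true composition of the synthetic sample is known, and that the L1 distance between abundance profiles is an acceptable error measure. The function names, the species labels, and the toy numbers are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def relative_abundance(assignments):
    """Map each taxon label to its share of the assigned sequences."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


def l1_distance(profile_a, profile_b):
    """Sum of absolute abundance differences over all taxa in either profile."""
    taxa = set(profile_a) | set(profile_b)
    return sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)


# Hypothetical ground truth for a synthetic sample (made-up values).
truth = {"species_A": 0.5, "species_B": 0.5}

# Hypothetical taxon labels, one per read and one per contig (made-up values).
read_labels = ["species_A"] * 48 + ["species_B"] * 52
contig_labels = ["species_A"] * 3 + ["species_B"] * 1

# Lower distance means the estimated profile is closer to the known composition.
read_error = l1_distance(relative_abundance(read_labels), truth)
contig_error = l1_distance(relative_abundance(contig_labels), truth)

print(f"reads:   L1 distance to truth = {read_error:.2f}")
print(f"contigs: L1 distance to truth = {contig_error:.2f}")
```

Counting each contig once, as done here, ignores how many reads it was assembled from; weighting contigs by length or coverage would be an equally plausible choice and could change the comparison.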

Bibliographic citation

Rodríguez-Brazzarola P., Pérez-Wohlfeil E., Díaz-del-Pino S., Holthausen R., Trelles O. (2018) Analyzing the Differences Between Reads and Contigs When Performing a Taxonomic Assignment Comparison in Metagenomics. In: Rojas I., Ortuño F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2018. Lecture Notes in Computer Science, vol 10813. Springer, Cham
