Advancing edge-based clustering and graph embedding for biological network analysis: a case study in RASopathies

dc.centro: Facultad de Ciencias [es_ES]
dc.contributor.author: García-Criado, Federico
dc.contributor.author: Seoane, Pedro
dc.contributor.author: Rojano, Elena
dc.contributor.author: García-Ranea, Juan Antonio
dc.contributor.author: Perkins, James Richard
dc.date.accessioned: 2025-07-07T10:28:35Z
dc.date.available: 2025-07-07T10:28:35Z
dc.date.issued: 2025-07-07
dc.departamento: Biología Molecular y Bioquímica [es_ES]
dc.description.abstract: Understanding and predicting biological processes from protein–protein interaction (PPI) networks requires accurate and efficient representations of their structure. However, many existing methods fail to capture the complex, overlapping modular structure of biological systems. To address this, we propose a network embedding strategy that improves both biological interpretability and predictive power. By transforming networks into a low-dimensional space while preserving key topological properties, embedding enables the discovery of novel functional relationships. Pre-clustering a network before embedding enhances representation quality, i.e. the ability to preserve meaningful structural and functional properties in the embedding space. However, traditional non-overlapping clustering methods can introduce bias by ignoring the overlapping nature of biological communities. We overcome this limitation by integrating the Hierarchical Link Clustering (HLC) algorithm into an embedding workflow tailored for large, weighted, undirected networks. First, we introduce two optimized HLC implementations for Python and R, both outperforming existing methods in clustering accuracy and scalability. Then, by restricting random walks to HLC-defined communities, we improve the representation of biological pathways, as shown using Reactome on the human PPI network. We also apply our full cluster embedding workflow to analyze RASopathies, a group of interrelated disorders with a diverse range of phenotypes, caused by mutations in genes from the RAS/MAPK pathway. This approach was used not only to represent known pathways, but also to identify potential novel gene candidates associated with RASopathies, including Noonan and Costello syndromes. HLC implementations are available in the CDLIB library (https://github.com/GiulioRossetti/cdlib), and at https://github.com/jimrperkins/linkcomm for Python and R, respectively. [es_ES]
dc.description.sponsorship: Funding for open access charge: Universidad de Málaga / CBUA [es_ES]
dc.identifier.citation: Federico García-Criado, Pedro Seoane, Elena Rojano, Juan A G Ranea, James R Perkins, Advancing edge-based clustering and graph embedding for biological network analysis: a case study in RASopathies, Briefings in Bioinformatics, Volume 26, Issue 4, July 2025, bbaf320, https://doi.org/10.1093/bib/bbaf320 [es_ES]
dc.identifier.doi: 10.1093/bib/bbaf320
dc.identifier.uri: https://hdl.handle.net/10630/39249
dc.language.iso: eng [es_ES]
dc.publisher: Oxford University Press [es_ES]
dc.rights: Attribution-NonCommercial 4.0 International [*]
dc.rights.accessRights: open access [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ [*]
dc.subject: Molecular biology [es_ES]
dc.subject: Biochemistry [es_ES]
dc.subject: ras oncogenes [es_ES]
dc.subject: Genetics [es_ES]
dc.subject: Hereditary diseases [es_ES]
dc.subject: Proteins [es_ES]
dc.subject.other: RASopathies [es_ES]
dc.subject.other: Network embedding [es_ES]
dc.subject.other: Overlap community [es_ES]
dc.subject.other: HLC [es_ES]
dc.subject.other: Protein-protein interaction [es_ES]
dc.title: Advancing edge-based clustering and graph embedding for biological network analysis: a case study in RASopathies [es_ES]
dc.type: journal article [es_ES]
dc.type.hasVersion: VoR [es_ES]
dspace.entity.type: Publication
relation.isAuthorOfPublication: 8c8b05f2-a296-4ec5-aa57-f77f60a303a8
relation.isAuthorOfPublication.latestForDiscovery: 8c8b05f2-a296-4ec5-aa57-f77f60a303a8

Files

Original bundle

Name: bbaf320.pdf
Size: 1.85 MB
Format: Adobe Portable Document Format
Description:

Collections