Enhancing cross-encoders using knowledge graph hierarchy for medical entity linking in zero- and few-shot scenarios

dc.centro: E.T.S.I. Informática (es_ES)
dc.contributor.author: Gallego, Fernando
dc.contributor.author: Ruas, Pedro
dc.contributor.author: Couto, Francisco M.
dc.contributor.author: Veredas-Navarro, Francisco Javier
dc.date.accessioned: 2025-05-15T09:31:55Z
dc.date.available: 2025-05-15T09:31:55Z
dc.date.issued: 2025-03-01
dc.departamento: Lenguajes y Ciencias de la Computación (es_ES)
dc.description.abstract: Medical Entity Linking (MEL) is a common task in natural language processing, focusing on the normalization of recognized entities from clinical texts using large knowledge bases (KBs). This task presents significant challenges, especially when working with electronic health records that often lack annotated clinical notes, even in languages like English. The difficulty increases in few-shot or zero-shot scenarios, where models must operate with minimal or no training data, a common issue when dealing with less-documented languages such as Spanish. Existing solutions that combine contrastive learning with external sources, like the Unified Medical Language System (UMLS), have shown competitive results. However, most of these methods focus on individual concepts from the KBs, ignoring relationships such as synonymy or hierarchical links between concepts. In this paper, we propose leveraging these relationships to enrich the training triplets used for contrastive learning, improving performance in MEL tasks. Specifically, we fine-tune several BERT-based cross-encoders using enriched triplets on three clinical corpora in Spanish: DisTEMIST, MedProcNER, and SympTEMIST. Our approach addresses the complexity of real-world data, where unseen mentions and concepts are frequent. The results show a notable improvement in lower top-k accuracies, surpassing the state of the art by up to 5.5 percentage points for unseen mentions and by up to 5.9 points for unseen concepts. This improvement reduces the number of candidate concepts required for cross-encoders, enabling more efficient semi-automatic annotation and decreasing human effort. Additionally, our findings underscore the importance of leveraging not only the concept-level information in KBs but also the relationships between those concepts. (es_ES)
dc.description.sponsorship: Funding for open access charge: Universidad de Málaga / CBUA. The authors acknowledge support from the Ministerio de Ciencia e Innovación (MICINN), Spain, under projects AEI/10.13039/501100011033 and PID2020-119266RA-I00, and from FCT (Fundação para a Ciência e a Tecnologia), Portugal, through funding of the PhD scholarship with ref. 2020.05393.BD and of the LASIGE Research Unit, ref. UIDB/00408/2020 (https://doi.org/10.54499/UIDB/00408/2020) and ref. UIDP/00408/2020 (https://doi.org/10.54499/UIDP/00408/2020). (es_ES)
dc.identifier.citation: Fernando Gallego, Pedro Ruas, Francisco M. Couto, Francisco J. Veredas, Enhancing cross-encoders using knowledge graph hierarchy for medical entity linking in zero- and few-shot scenarios, Knowledge-Based Systems, Volume 314, 2025, 113211, ISSN 0950-7051, https://doi.org/10.1016/j.knosys.2025.113211. (es_ES)
dc.identifier.doi: 10.1016/j.knosys.2025.113211
dc.identifier.uri: https://hdl.handle.net/10630/38624
dc.language.iso: eng (es_ES)
dc.publisher: Elsevier (es_ES)
dc.rights: Attribution 4.0 International
dc.rights.accessRights: open access (es_ES)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Expert systems (es_ES)
dc.subject: Knowledge representation (es_ES)
dc.subject.other: Knowledge enrichment (es_ES)
dc.subject.other: Medical entity linking (es_ES)
dc.subject.other: Contrastive learning (es_ES)
dc.subject.other: Knowledge bases (es_ES)
dc.subject.other: Candidate reranking (es_ES)
dc.subject.other: Zero- and few-shot (es_ES)
dc.title: Enhancing cross-encoders using knowledge graph hierarchy for medical entity linking in zero- and few-shot scenarios (es_ES)
dc.type: journal article (es_ES)
dc.type.hasVersion: VoR (es_ES)
dspace.entity.type: Publication
relation.isAuthorOfPublication: b8ab3a42-65ef-4349-9230-798e19f78426
relation.isAuthorOfPublication.latestForDiscovery: b8ab3a42-65ef-4349-9230-798e19f78426

Files

Original bundle (1 file)

Name: 1-s2.0-S0950705125002588-main.pdf
Size: 1.08 MB
Format: Adobe Portable Document Format