Enhancing cross-encoders using knowledge graph hierarchy for medical entity linking in zero- and few-shot scenarios

Publisher

Elsevier

Abstract

Medical Entity Linking (MEL) is a common task in natural language processing, focusing on the normalization of recognized entities from clinical texts using large knowledge bases (KBs). This task presents significant challenges, especially when working with electronic health records, which often lack annotated clinical notes, even in languages like English. The difficulty increases in few-shot or zero-shot scenarios, where models must operate with minimal or no training data, a common issue when dealing with less-documented languages such as Spanish. Existing solutions that combine contrastive learning with external sources, like the Unified Medical Language System (UMLS), have shown competitive results. However, most of these methods focus on individual concepts from the KBs, ignoring relationships such as synonymy or hierarchical links between concepts. In this paper, we propose leveraging these relationships to enrich the training triplets used for contrastive learning, improving performance in MEL tasks. Specifically, we fine-tune several BERT-based cross-encoders using enriched triplets on three clinical corpora in Spanish: DisTEMIST, MedProcNER, and SympTEMIST. Our approach addresses the complexity of real-world data, where unseen mentions and concepts are frequent. The results show a notable improvement in top-k accuracy at low values of k, surpassing the state of the art by up to 5.5 percentage points for unseen mentions and by up to 5.9 percentage points for unseen concepts. This improvement reduces the number of candidate concepts required for cross-encoders, enabling more efficient semi-automatic annotation and decreasing human effort. Additionally, our findings underscore the importance of leveraging not only the concept-level information in KBs but also the relationships between those concepts.
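
The abstract does not spell out how the enriched triplets are built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a toy KB in which each concept carries synonyms and parent codes, treats the gold concept's name, its synonyms, and its direct parents as positives, and treats the remaining candidates as negatives. The `Concept` class, the `build_triplets` and `top_k_accuracy` helpers, and the codes `C0` to `C3` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Concept:
    """Toy KB entry; the field layout is an illustrative assumption."""
    code: str
    name: str
    synonyms: List[str] = field(default_factory=list)
    parents: List[str] = field(default_factory=list)  # codes of parent concepts


def build_triplets(
    mention: str, gold_code: str, kb: Dict[str, Concept], candidates: List[str]
) -> List[Tuple[str, str, str]]:
    """Return (anchor, positive, negative) text triplets for one mention.

    Positives: the gold concept's name, its synonyms and (assumption)
    the names of its direct parents in the hierarchy.
    Negatives: names of candidates other than the gold concept and its parents.
    """
    gold = kb[gold_code]
    positives = [gold.name, *gold.synonyms]
    positives += [kb[p].name for p in gold.parents if p in kb]
    excluded = {gold_code, *gold.parents}
    negatives = [kb[c].name for c in candidates if c in kb and c not in excluded]
    return [(mention, pos, neg) for pos in positives for neg in negatives]


def top_k_accuracy(ranked: List[List[str]], gold: List[str], k: int) -> float:
    """Fraction of mentions whose gold code is among the first k ranked candidates."""
    if not gold:
        return 0.0
    hits = sum(g in r[:k] for r, g in zip(ranked, gold))
    return hits / len(gold)


# Hypothetical toy KB; codes and terms are made up for the example.
kb = {
    "C0": Concept("C0", "diabetes mellitus"),
    "C1": Concept("C1", "type 2 diabetes mellitus", ["adult-onset diabetes"], ["C0"]),
    "C2": Concept("C2", "hypertension"),
    "C3": Concept("C3", "type 1 diabetes mellitus", [], ["C0"]),
}
triplets = build_triplets("T2DM", "C1", kb, ["C1", "C0", "C2", "C3"])
```

In this sketch, each triplet could then be fed to a cross-encoder trained to score the (mention, positive) pair above the (mention, negative) pair, and `top_k_accuracy` mirrors the metric reported in the abstract: a mention counts as correct when its gold concept appears among the first k ranked candidates.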

Bibliographic citation

Fernando Gallego, Pedro Ruas, Francisco M. Couto, Francisco J. Veredas, Enhancing cross-encoders using knowledge graph hierarchy for medical entity linking in zero- and few-shot scenarios, Knowledge-Based Systems, Volume 314, 2025, 113211, ISSN 0950-7051, https://doi.org/10.1016/j.knosys.2025.113211.

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution 4.0 International