Explainable clinical coding with in-domain adapted transformers

López-García, Guillermo; Jerez-Aragonés, José Manuel; Ribelles, Nuria; Alba-Conejo, Emilio; Veredas-Navarro, Francisco Javier

doi:https://doi.org/10.1016/j.jbi.2023.104323

Files

1-s2.0-S1532046423000448-main.pdf (946.58 KB)

Identifiers

URI: https://hdl.handle.net/10630/26275

DOI: https://doi.org/10.1016/j.jbi.2023.104323

Publication date

2023

Authors

López-García, Guillermo

Jerez-Aragonés, José Manuel

Ribelles, Nuria

Alba-Conejo, Emilio

Veredas-Navarro, Francisco Javier

Publisher

Elsevier

Center

E.T.S.I. Informática

Department/Institute

Lenguajes y Ciencias de la Computación

Keywords

Medicine - Data processing

Abstract

Background and Objective: Automatic clinical coding is a crucial task in extracting relevant information from the unstructured medical documents contained in Electronic Health Records (EHR). However, most existing computer-based methods for clinical coding act as "black boxes": they do not describe in detail the reasons for their coding assignments, which greatly limits their applicability to real-world medical scenarios. The objective of this study is to use transformer-based models to effectively tackle explainable clinical coding. To this end, we require the models not only to assign clinical codes to medical cases but also to provide the reference in the text that justifies each coding assignment.

Methods: We examine the performance of 3 transformer-based architectures on 3 different explainable clinical-coding tasks. For each transformer, we compare the performance of the original general-domain version with an in-domain version of the model adapted to the specificities of the medical domain. We address the explainable clinical-coding problem as a dual medical named entity recognition (MER) and medical named entity normalization (MEN) task. For this purpose, we have developed two different approaches, namely a multi-task and a hierarchical-task strategy.

Results: For each analyzed transformer, the clinical-domain version significantly outperforms the corresponding general-domain model across the 3 explainable clinical-coding tasks analyzed in this study. Furthermore, the hierarchical-task approach performs significantly better than the multi-task strategy. Specifically, the combination of the hierarchical-task strategy with an ensemble approach leveraging the predictive capabilities of the 3 distinct clinical-domain transformers
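The dual MER and MEN formulation described in the abstract can be illustrated with a small sketch. This is only one possible reading of a two-stage (hierarchical-style) pipeline, not the authors' implementation: the transformer models are replaced by a toy dictionary lookup, and every name here (CodedMention, TOY_LEXICON, recognize_mentions, normalize_mention, code_note) is hypothetical. The point is the shape of the output: each assigned code carries the text span that justifies it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CodedMention:
    start: int  # character offset where the supporting text begins
    end: int    # character offset one past the end of the supporting text
    text: str   # the supporting text itself (the "explanation")
    code: str   # the clinical code assigned to this mention


# Hypothetical toy lexicon (surface form -> code), for illustration only.
# In the study this role is played by transformer models, not a dictionary.
TOY_LEXICON = {
    "type 2 diabetes": "E11",
    "hypertension": "I10",
}


def recognize_mentions(note: str) -> List[Tuple[int, int]]:
    """Stage 1 (stand-in for MER): locate character spans of candidate mentions."""
    lowered = note.lower()
    spans = []
    for surface in TOY_LEXICON:
        pos = lowered.find(surface)
        while pos != -1:
            spans.append((pos, pos + len(surface)))
            pos = lowered.find(surface, pos + 1)
    return sorted(spans)


def normalize_mention(mention: str) -> Optional[str]:
    """Stage 2 (stand-in for MEN): map one recognized mention to a code."""
    return TOY_LEXICON.get(mention.lower())


def code_note(note: str) -> List[CodedMention]:
    """Run stage 1, then feed each recognized span to stage 2."""
    results = []
    for start, end in recognize_mentions(note):
        code = normalize_mention(note[start:end])
        if code is not None:
            results.append(CodedMention(start, end, note[start:end], code))
    return results


if __name__ == "__main__":
    for mention in code_note("Patient with hypertension and type 2 diabetes."):
        print(mention)
```

In this sketch the second stage only ever sees spans produced by the first, which is the sense in which it is sequential; a multi-task variant would instead predict spans and codes jointly from a shared model.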

Bibliographic citation

Guillermo López-García, José M. Jerez, Nuria Ribelles, Emilio Alba, Francisco J. Veredas, Explainable clinical coding with in-domain adapted transformers, Journal of Biomedical Informatics, Volume 139, 2023, 104323, ISSN 1532-0464, https://doi.org/10.1016/j.jbi.2023.104323. (https://www.sciencedirect.com/science/article/pii/S1532046423000448)

Collections

Articles

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International
