Detection of tumor morphology mentions in clinical reports in Spanish using transformers
dc.contributor.author | López-García, Guillermo | |
dc.contributor.author | Jerez-Aragonés, José Manuel | |
dc.contributor.author | Ribelles, Nuria | |
dc.contributor.author | Alba-Conejo, Emilio | |
dc.contributor.author | Veredas-Navarro, Francisco Javier | |
dc.date.accessioned | 2021-07-23T06:19:22Z | |
dc.date.available | 2021-07-23T06:19:22Z | |
dc.date.created | 2021-07-22 | |
dc.date.issued | 2021 | |
dc.identifier.uri | https://hdl.handle.net/10630/22685 | |
dc.description.abstract | The aim of this study is to systematically examine the performance of transformer-based models for the detection of tumor morphology mentions in clinical documents in Spanish. For this purpose, we analyzed 3 transformer models supporting the Spanish language, namely multilingual BERT, BETO and XLM-RoBERTa. By means of a transfer-learning-based approach, the models were first pretrained on a collection of real-world oncology clinical cases with the goal of adapting transformers to the distinctive features of the Spanish oncology domain. The resulting models were further fine-tuned on the Cantemist-NER task, addressing the detection of tumor morphology mentions as a multi-class sequence-labeling problem. To evaluate the effectiveness of the proposed approach, we compared the obtained results by the domain-specific version of the examined transformers with the performance achieved by the general-domain version of the models. The results obtained in this paper empirically demonstrated that, for every analyzed transformer, the clinical version outperformed the corresponding general-domain model on the detection of tumor morphology mentions in clinical case reports in Spanish. Additionally, the combination of the transfer-learning-based approach with an ensemble strategy exploiting the predictive capabilities of the distinct transformer architectures yielded the best obtained results, achieving a precision value of 0.893, a recall of 0.887 and an F1-score of 0.89, which remarkably surpassed the prior state-of-the-art performance for the Cantemist-NER task. | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | Springer | es_ES |
dc.subject | Oncology | es_ES |
dc.subject.other | Clinical coding | es_ES |
dc.subject.other | Deep learning | es_ES |
dc.subject.other | Natural Language Processing | es_ES |
dc.subject.other | Text Classification | es_ES |
dc.subject.other | Transformers | es_ES |
dc.title | Detection of tumor morphology mentions in clinical reports in Spanish using transformers | es_ES |
dc.type | journal article | es_ES |
dc.centro | E.T.S.I. Informática | es_ES |
dc.relation.eventtitle | International Work Conference on Artificial Neural Networks (IWANN 2021) | es_ES |
dc.relation.eventplace | Madeira, Portugal | es_ES |
dc.relation.eventdate | June 2021 | es_ES |
dc.departamento | Lenguajes y Ciencias de la Computación | |
dc.rights.accessRights | open access | es_ES |
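
The abstract reports a precision of 0.893, a recall of 0.887 and an F1-score of 0.89 for the best ensemble. As a quick consistency check, the sketch below recomputes F1 from the two reported values, assuming the F1-score is the standard harmonic mean of precision and recall (the record does not state the averaging scheme). The function name `f1_score` and the variable names are illustrative, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract for the best-performing ensemble.
reported_precision = 0.893
reported_recall = 0.887

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # 0.89
```

Under that assumption, the recomputed value rounds to the 0.89 stated in the abstract.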