dc.contributor.author: Moreno-Barea, Francisco J.
dc.contributor.author: Mesa, Héctor
dc.contributor.author: Ribelles, Nuria
dc.contributor.author: Alba-Conejo, Emilio
dc.contributor.author: Jerez-Aragonés, José Manuel
dc.date.accessioned: 2023-07-24T09:19:01Z
dc.date.available: 2023-07-24T09:19:01Z
dc.date.issued: 2023-06-29
dc.identifier.citation: Moreno-Barea, F.J. et al. (2023). Clinical Text Classification in Cancer Real-World Data in Spanish. In: Rojas, I., Valenzuela, O., Rojas Ruiz, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2023. Lecture Notes in Computer Science, vol 13919. Springer, Cham. https://doi.org/10.1007/978-3-031-34953-9_38 [es_ES]
dc.identifier.uri: https://hdl.handle.net/10630/27352
dc.description.abstract: Healthcare systems currently store a large amount of clinical data, mostly unstructured textual information, such as electronic health records (EHRs). Manually extracting valuable information from these documents is costly for healthcare professionals. For example, when a patient first arrives at an oncology clinical analysis unit, clinical staff must extract information about the type of neoplasm in order to assign the appropriate clinical specialist. Automating this task is equivalent to text classification in natural language processing (NLP). In this study, we have attempted to extract the neoplasm type by processing Spanish clinical documents. A private corpus of 23,704 real clinical cases has been processed to extract the three most common types of neoplasms in the Spanish territory: breast, lung and colorectal neoplasms. We have developed methodologies based on state-of-the-art text classification strategies: machine learning with bag-of-words, embedding models in a supervised task, and bidirectional recurrent neural networks with convolutional layers (C-BiRNN). The results obtained show that the application of NLP methods is extremely helpful in performing the task of neoplasm type extraction. In particular, the 2-BiGRU model with convolutional layer and pre-trained fastText embedding obtained the best performance, with macro-averaged scores (more representative than micro-averaged ones because the data are unbalanced) of 0.981 for precision, 0.984 for recall and 0.982 for F1-score. [es_ES]
dc.description.sponsorship: The authors acknowledge the support from the Ministerio de Ciencia e Innovación (MICINN) under project PID2020-116898RB-I00, from Universidad de Málaga and Junta de Andalucía through grants UMA20-FEDERJA-045 and PYC20-046-UMA (all including FEDER funds), and from the Malaga-Pfizer consortium for AI research in Cancer - MAPIC. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. [es_ES]
dc.language.iso: eng [es_ES]
dc.subject.other: Text Classification [es_ES]
dc.subject.other: Natural Language Processing [es_ES]
dc.subject.other: Electronic Health Records [es_ES]
dc.subject.other: Neoplasm cancer [es_ES]
dc.subject.other: Spanish [es_ES]
dc.title: Clinical text classification in Cancer Real-World Data in Spanish [es_ES]
dc.type: conference output [es_ES]
dc.centro: E.T.S.I. Informática [es_ES]
dc.relation.eventtitle: International Work-Conference on Bioinformatics and Biomedical Engineering (IWBBIO 2023) [es_ES]
dc.relation.eventplace: Maspalomas, Gran Canaria, España [es_ES]
dc.relation.eventdate: 12/07/2023-14/07/2023 [es_ES]
dc.departamento: Lenguajes y Ciencias de la Computación
dc.rights.accessRights: open access [es_ES]


