gApp: a text preprocessing system to improve the neural machine translation of discontinuous multiword expressions

dc.centro: Facultad de Filosofía y Letras [es_ES]
dc.contributor.author: Hidalgo Ternero, Carlos Manuel
dc.contributor.author: Zhou Lian, Xiaoqing
dc.date.accessioned: 2022-12-20T12:28:41Z
dc.date.available: 2022-12-20T12:28:41Z
dc.date.issued: 2022
dc.departamento: Traducción e Interpretación
dc.description.abstract: In this paper we present research results with gApp, a text-preprocessing system designed to automatically detect discontinuous multiword expressions (MWEs) and convert them into their continuous forms so as to improve the performance of current neural machine translation (NMT) systems (see Hidalgo-Ternero, 2021 & 2022; Hidalgo-Ternero & Corpas Pastor, 2020, 2022a & 2022b; Hidalgo-Ternero, Lista & Corpas Pastor, 2022; Hidalgo-Ternero & Zhou-Lian, 2022a & 2022b). To test its effectiveness, eight experiments have been carried out with several NMT systems (DeepL, Google Translate, ModernMT and VIP) in different language directions (ES/FR/IT > ES/EN/DE/FR/IT/PT/ZH) for the translation of somatisms, i.e., MWEs containing lexemes referring to human or animal body parts (Mellado Blanco, 2004). More specifically, we have analysed both flexible verb-noun idiomatic constructions (VNICs) and flexible verb + prepositional phrase (VPP) constructions. The promising results obtained for these types of MWEs throughout experiments 1-8 shed some light on new avenues for enhancing MWE-aware NMT systems. [es_ES]
dc.description.sponsorship: Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. [es_ES]
dc.identifier.uri: https://hdl.handle.net/10630/25650
dc.language.iso: eng [es_ES]
dc.relation.eventdate: 24/11/2022 [es_ES]
dc.relation.eventplace: Luxembourg, Luxembourg [es_ES]
dc.relation.eventtitle: Translating and the Computer conference - TC44 [es_ES]
dc.rights: Attribution 4.0 International [*]
dc.rights.accessRights: open access [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.subject: Machine translation [es_ES]
dc.subject.other: Neural machine translation [es_ES]
dc.subject.other: Text-preprocessing system [es_ES]
dc.subject.other: Multiword expressions [es_ES]
dc.title: gApp: a text preprocessing system to improve the neural machine translation of discontinuous multiword expressions [es_ES]
dc.type: conference output [es_ES]
dspace.entity.type: Publication

Files

Original bundle

Name: TC44_HidalgoTernero&ZhouLian_finalversion.pdf
Size: 282.24 KB
Format: Adobe Portable Document Format