Visual Object Detection with DETR to Support Video-Diagnosis Using Conference Tools

dc.contributor.author: Biró, Attila
dc.contributor.author: Jánosi-Rancz, Katalin Tünde
dc.contributor.author: Szilágyi, László
dc.contributor.author: Cuesta-Vargas, Antonio
dc.contributor.author: Martín-Martín, Jaime
dc.contributor.author: Szilágyi, Sándor Miklós
dc.date.accessioned: 2024-09-19T10:30:43Z
dc.date.available: 2024-09-19T10:30:43Z
dc.date.issued: 2022-06-12
dc.departamento: Salud Pública y Psiquiatría
dc.description.abstract: This text discusses the need for real-time multilingual sentence detection during online video presentations, particularly in the healthcare sector for remote diagnosis. The use of visual (textual) object detection and preprocessing is essential for subsequent analysis. The researchers propose using the DEtection TRansformer (DETR) model to achieve accurate and real-time detection of textual objects. The development of real-time videoconference translation supported by artificial intelligence has become especially important during the COVID-19 pandemic. The challenge lies in the variety of languages spoken by specialists, which requires human translators or AI-based technological channels. The accuracy of visual localization of textual elements depends on the complexity, quality, and variety of the training datasets. The researchers compare the performance of the DETR model with other real-time object detectors like YOLO4 and Detectron2, and introduce AI-based innovations through collaborative solutions combined with OCR. The researchers conducted evaluations using training datasets and achieved higher-than-expected accuracy in terms of visual text detection range, with an average accuracy of 0.4 to 0.65. [es_ES]
dc.description.sponsorship: This research was supported by ITware, Hungary. The work of K.T. Jánosi-Rancz and L. Szilágyi was supported by Sapientia Foundation - Institute for Scientific Research. [es_ES]
dc.identifier.citation: Biró, A.; Jánosi-Rancz, K.T.; Szilágyi, L.; Cuesta-Vargas, A.I.; Martín-Martín, J.; Szilágyi, S.M. Visual Object Detection with DETR to Support Video-Diagnosis Using Conference Tools. Appl. Sci. 2022, 12, 5977. https://doi.org/10.3390/app12125977 [es_ES]
dc.identifier.doi: 10.3390/app12125977
dc.identifier.uri: https://hdl.handle.net/10630/32668
dc.language.iso: eng [es_ES]
dc.publisher: MDPI [es_ES]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.accessRights: open access [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Diagnostic imaging [es_ES]
dc.subject.other: DETR [es_ES]
dc.subject.other: YOLO4 [es_ES]
dc.subject.other: Detectron2 [es_ES]
dc.subject.other: Object visual detection [es_ES]
dc.subject.other: Multilingual OCR [es_ES]
dc.subject.other: Real-time translation [es_ES]
dc.subject.other: Remote diagnostics [es_ES]
dc.subject.other: Realtime text detection [es_ES]
dc.subject.other: Assessment [es_ES]
dc.title: Visual Object Detection with DETR to Support Video-Diagnosis Using Conference Tools [es_ES]
dc.type: journal article [es_ES]
dc.type.hasVersion: VoR [es_ES]
dspace.entity.type: Publication
relation.isAuthorOfPublication: 94126d4b-371d-4727-a252-f4182972d4b6
relation.isAuthorOfPublication: af904741-d538-4bf8-a882-d00782271171
relation.isAuthorOfPublication.latestForDiscovery: 94126d4b-371d-4727-a252-f4182972d4b6

Files

Original bundle

Name: applsci-12-05977-v2-3.pdf
Size: 3.54 MB
Format: Adobe Portable Document Format
Description: Main article
