Visual Object Detection with DETR to Support Video-Diagnosis Using Conference Tools
| dc.contributor.author | Biró, Attila | |
| dc.contributor.author | Jánosi-Rancz, Katalin Tünde | |
| dc.contributor.author | Szilágyi, László | |
| dc.contributor.author | Cuesta-Vargas, Antonio | |
| dc.contributor.author | Martín-Martín, Jaime | |
| dc.contributor.author | Szilágyi, Sándor Miklós | |
| dc.date.accessioned | 2024-09-19T10:30:43Z | |
| dc.date.available | 2024-09-19T10:30:43Z | |
| dc.date.issued | 2022-06-12 | |
| dc.departamento | Salud Pública y Psiquiatría | |
| dc.description.abstract | This article addresses the need for real-time multilingual sentence detection during online video presentations, particularly in healthcare, where it supports remote diagnosis. Visual (textual) object detection and preprocessing are essential for subsequent analysis. The authors propose the DEtection TRansformer (DETR) model for accurate, real-time detection of textual objects. AI-supported real-time videoconference translation became especially important during the COVID-19 pandemic. The challenge lies in the variety of languages spoken by specialists, which requires either human translators or AI-based technological channels. The accuracy of visual localization of textual elements depends on the complexity, quality, and variety of the training datasets. The authors compare the performance of DETR with other real-time object detectors such as YOLO4 and Detectron2, and introduce AI-based innovations through collaborative solutions combined with OCR. In evaluations using training datasets, visual text detection achieved higher-than-expected accuracy, with an average accuracy of 0.4 to 0.65. | es_ES |
| dc.description.sponsorship | This research was supported by ITware, Hungary. The work of K.T. Jánosi-Rancz and L. Szilágyi was supported by Sapientia Foundation—Institute for Scientific Research. | es_ES |
| dc.identifier.citation | Biró, A.; Jánosi-Rancz, K.T.; Szilágyi, L.; Cuesta-Vargas, A.I.; Martín-Martín, J.; Szilágyi, S.M. Visual Object Detection with DETR to Support Video-Diagnosis Using Conference Tools. Appl. Sci. 2022, 12, 5977. https://doi.org/10.3390/app12125977 | es_ES |
| dc.identifier.doi | 10.3390/app12125977 | |
| dc.identifier.uri | https://hdl.handle.net/10630/32668 | |
| dc.language.iso | eng | es_ES |
| dc.publisher | MDPI | es_ES |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
| dc.rights.accessRights | open access | es_ES |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
| dc.subject | Diagnostic imaging | es_ES |
| dc.subject.other | DETR | es_ES |
| dc.subject.other | YOLO4 | es_ES |
| dc.subject.other | Detectron2 | es_ES |
| dc.subject.other | Object visual detection | es_ES |
| dc.subject.other | Multilingual OCR | es_ES |
| dc.subject.other | Real-time translation | es_ES |
| dc.subject.other | Remote diagnostics | es_ES |
| dc.subject.other | Realtime text detection | es_ES |
| dc.subject.other | Assessment | es_ES |
| dc.title | Visual Object Detection with DETR to Support Video-Diagnosis Using Conference Tools | es_ES |
| dc.type | journal article | es_ES |
| dc.type.hasVersion | VoR | es_ES |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 94126d4b-371d-4727-a252-f4182972d4b6 | |
| relation.isAuthorOfPublication | af904741-d538-4bf8-a882-d00782271171 | |
| relation.isAuthorOfPublication.latestForDiscovery | 94126d4b-371d-4727-a252-f4182972d4b6 |
Files
Original bundle
- Name: applsci-12-05977-v2-3.pdf
- Size: 3.54 MB
- Format: Adobe Portable Document Format
- Description: Main article

