Real-time unsupervised video object detection on the edge
Publisher
Elsevier
Abstract
Object detection in video is an essential computer vision task, and considerable effort has been devoted to developing precise and fast deep-learning models for it. These models are commonly deployed on powerful discrete GPUs to meet both frame rate and detection accuracy requirements. Furthermore, model training is usually strongly supervised, so samples must first be labelled by humans in a slow and costly process. In this paper, we develop a real-time implementation of unsupervised object detection in video on a low-power device. We improve on typical object detection approaches by using information supplied by optical flow to detect moving objects. In addition, we use an unsupervised clustering algorithm to group similar detections, which avoids manual object labelling. Finally, we propose a methodology to optimize the deployment of the resulting framework on an embedded heterogeneous platform, and we illustrate how all the computational resources of a Jetson AGX Xavier (CPU, GPU, and DLAs) can be used to fulfil frame rate, accuracy, and energy consumption requirements. Three data representations (FP32, FP16, and INT8) are studied for the pipeline networks in order to evaluate the impact of each on our pipeline. The results show that the proposed optimizations reduce energy consumption by up to 23.6x and execution time by up to 32.2x with respect to the non-optimized pipeline, without penalizing the original mAP (59.44). This reduction in computational cost is achieved through knowledge distillation, FP16 data precision, and the deployment of concurrent tasks on different computing units.
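The abstract does not say how the optical flow is turned into detections, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a dense flow field is already available as a grid of (dx, dy) displacements, marks pixels whose flow magnitude exceeds a threshold as moving, and returns one box enclosing them. The function name `moving_object_box` and the `threshold` value are hypothetical.

```python
import math


def moving_object_box(flow, threshold=1.0):
    """Return (x_min, y_min, x_max, y_max) enclosing all pixels whose
    flow magnitude exceeds `threshold`, or None if nothing moves.

    `flow` is a list of rows; each entry is a (dx, dy) displacement.
    Both the input layout and the threshold are assumptions made for
    this sketch, not details taken from the paper.
    """
    xs, ys = [], []
    for y, row in enumerate(flow):
        for x, (dx, dy) in enumerate(row):
            # Magnitude of the displacement vector at this pixel.
            if math.hypot(dx, dy) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 4x5 flow field: static background with two moving pixels.
flow = [[(0.0, 0.0)] * 5 for _ in range(4)]
flow[1][2] = (3.0, 0.0)
flow[2][3] = (0.0, -2.5)

box = moving_object_box(flow)
print(box)
```

A real pipeline would obtain the flow field from an optical flow estimator and would typically separate several moving regions rather than enclosing them in a single box.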
Bibliographic citation
Ruiz-Barroso, P., Castro, F. M., & Guil, N. (2025). Real-time unsupervised video object detection on the edge. Future Generation Computer Systems, 167, 107737.
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NonCommercial 4.0 International










