Artificial intelligence for automatically detecting animals in camera trap images: a combination of MegaDetector and YOLOv5
Abstract
Camera traps have become highly popular for collecting animal images in a cost-effective and non-invasive manner, but manually examining these large volumes of images to extract valuable data is a laborious and costly process. Deep learning, specifically object detection techniques, constitutes a powerful tool for automating this task. Here, we describe the development and results of a deep-learning workflow based on MegaDetector and YOLOv5 for automatically detecting animals in camera trap images. For the development, we first used MegaDetector, which automatically generated bounding boxes for 93.2% of the images in the training set, differentiating animals, humans, vehicles, and empty photos. This annotation phase allowed us to discard useless images. Then, we used the images containing animals within the training dataset to train four YOLOv5 models, each one built for a group of species of similar appearance as defined by a human expert. Using four expert models instead of one reduces the complexity and variance between species, allowing for more precise learning within each of the groups. The final result is a workflow where the end-user enters the camera trap images into a global model, which redirects the images towards the appropriate expert model. The final classification of each animal into a particular species is then based on the confidence rates provided by a weighted voting system implemented among the expert models. We validated this workflow using a dataset of 120,000 images collected by 100 camera traps over five years in Andalusian National Parks (Spain), with a representation of 24 mammal species. Our workflow improved the global classification F1-score from 0.92 to 0.96. It also increased the precision for distinguishing similar species, for example from 0.41 to 0.96 for C. capreolus and from 0.24 to 0.73 for D. dama, which is often confounded with other ungulate species. These results demonstrate the workflow's potential for animal detection in images.
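
The weighted voting step described above can be illustrated with a minimal Python sketch. The abstract does not specify the voting formula, the weights, or the names of the expert models, so everything here is an assumption made for illustration: the `Prediction` type, the `weighted_vote` function, the expert names, and the rule itself (each expert's confidence is multiplied by a per-expert weight and the products are summed per species).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Prediction:
    """One expert model's top guess for a detected animal (hypothetical type)."""
    expert: str        # name of the expert model; illustrative only
    species: str       # predicted species label
    confidence: float  # confidence score in [0, 1]


def weighted_vote(
    predictions: list[Prediction],
    weights: dict[str, float],
) -> tuple[Optional[str], float]:
    """Return the winning species and its share of the total weighted score.

    Assumed rule: score(species) = sum of weight(expert) * confidence over
    all experts that predicted that species. Experts missing from `weights`
    default to a weight of 1.0.
    """
    scores: dict[str, float] = defaultdict(float)
    for p in predictions:
        scores[p.species] += weights.get(p.expert, 1.0) * p.confidence

    if not scores:
        return None, 0.0

    total = sum(scores.values())
    best = max(scores, key=scores.get)
    share = scores[best] / total if total > 0 else 0.0
    return best, share


# Illustrative values only; these numbers do not come from the study.
example_predictions = [
    Prediction("expert_a", "C. capreolus", 0.6),
    Prediction("expert_b", "D. dama", 0.9),
    Prediction("expert_c", "C. capreolus", 0.3),
]
example_weights = {"expert_a": 1.0, "expert_b": 0.5, "expert_c": 1.0}

winner, share = weighted_vote(example_predictions, example_weights)
print(winner, round(share, 3))
```

In this toy case two experts with moderate confidence outvote a single down-weighted expert with high confidence, which is the kind of behaviour a weighted scheme is meant to allow. How the weights would be chosen in practice (for example, from each expert's validation performance) is not stated in the abstract.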











