The automatic detection and classification of vehicles in traffic sequences is a common task in many practical video surveillance systems. The advent of deep learning has facilitated the design of these systems. However, the limited resolution of surveillance cameras means that vehicles are not clearly defined in the incoming video frames, which hampers the classification performance of deep Convolutional Neural Networks. This paper presents a multi-step method to overcome this challenge. An initial segmentation is followed by postprocessing of the segmented images to handle overlapping vehicles and differing vehicle sizes. Then, a super-resolution algorithm is employed to improve the definition of the image windows supplied to the neural networks. Finally, the outputs of an ensemble of such networks are integrated to obtain improved recognition performance through the consensus of the networks in the ensemble.
Several computational tests on well-known benchmarks demonstrate the effectiveness of the proposal, even under challenging conditions.
Therefore, our vehicle classification system overcomes many limitations of a naive application of Convolutional Neural Networks, since each proposed subsystem tackles a different difficulty that arises in real traffic video data.
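
As an illustration of the final consensus step, the following minimal sketch combines the outputs of several networks by averaging their class probabilities (soft voting) and selecting the class with the highest mean. The abstract does not specify the integration rule, so soft voting is an assumption here; the function name `ensemble_consensus`, the four vehicle classes, and the probability values are hypothetical and serve only as an example.

```python
from typing import Sequence


def ensemble_consensus(prob_vectors: Sequence[Sequence[float]]) -> int:
    """Return the class index with the highest mean probability.

    Assumption: each network outputs one probability per class and the
    ensemble is combined by soft voting (an unweighted mean). This rule
    is illustrative and may differ from the one used in the paper.
    """
    if not prob_vectors:
        raise ValueError("at least one network output is required")
    n_classes = len(prob_vectors[0])
    if any(len(p) != n_classes for p in prob_vectors):
        raise ValueError("all networks must output the same number of classes")

    # Mean probability of each class across all networks.
    mean = [
        sum(p[c] for p in prob_vectors) / len(prob_vectors)
        for c in range(n_classes)
    ]
    # Index of the class with the largest mean probability.
    return max(range(n_classes), key=mean.__getitem__)


# Hypothetical outputs of three networks for one image window,
# over the hypothetical classes (car, motorcycle, truck, van).
outputs = [
    [0.50, 0.05, 0.15, 0.30],
    [0.35, 0.05, 0.10, 0.50],
    [0.60, 0.05, 0.10, 0.25],
]
predicted = ensemble_consensus(outputs)
```

In this example the second network alone would choose the last class, but the averaged probabilities favor the first one, which is how a consensus can correct the error of a single network.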