Deploying deep learning models on edge devices offers advantages in data security and communication latency. However, optimizing these models to achieve fast inference without sacrificing accuracy is challenging, especially in video surveillance applications where real-time processing is crucial. In this study, we formulate the deployment of gait recognition models as a multi-objective selection problem in which we seek to simultaneously minimize several objectives, such as latency and energy consumption, while maintaining accuracy. The decision space of this problem comprises all models that can be built by varying parameters such as model size, device operating frequency, and operation precision. From this problem definition, a subset of Pareto-optimal models can be selected for deployment on the target device. We conducted experiments with two gait recognition models on the NVIDIA Jetson Orin Nano and Jetson AGX Orin to explore their decision spaces. In addition, we investigated strategies to increase the throughput of the deployed models by exploiting batching and concurrent execution. Together, these techniques allowed us to design real-time gait recognition solutions for scenarios with multiple subjects. These solutions can process between 42 and 188 simultaneous subjects at 25 inferences per second, with an energy consumption ranging from 6.31 to 9.71 mJ per inference, depending on the device and the deployed model.
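
To make the selection step concrete, the following Python sketch filters a set of candidate deployment configurations down to their Pareto-optimal subset. It is a minimal illustration, not the authors' implementation: the configuration names and measured values are hypothetical, and the objectives (latency and energy to minimize, accuracy to maximize) follow the problem definition above.

```python
# Minimal sketch of Pareto-optimal model selection (illustrative only).
# Each candidate stands for a hypothetical deployment configuration with
# measured latency (ms), energy per inference (mJ), and accuracy.

from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str           # e.g. a model-size / frequency / precision combination
    latency_ms: float   # objective to minimize
    energy_mj: float    # objective to minimize
    accuracy: float     # objective to maximize

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on every objective
    and strictly better on at least one."""
    no_worse = (a.latency_ms <= b.latency_ms
                and a.energy_mj <= b.energy_mj
                and a.accuracy >= b.accuracy)
    strictly_better = (a.latency_ms < b.latency_ms
                       or a.energy_mj < b.energy_mj
                       or a.accuracy > b.accuracy)
    return no_worse and strictly_better

def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical measurements for three configurations of one model.
candidates = [
    Candidate("large/fp16/max-freq", 4.0, 9.7, 0.95),
    Candidate("small/int8/max-freq", 1.5, 6.3, 0.92),
    Candidate("large/fp16/low-freq", 8.0, 9.9, 0.95),  # dominated by the first
]
print([c.name for c in pareto_front(candidates)])
```

In this toy example the third configuration is dominated (slower and more energy-hungry than the first at equal accuracy), so the Pareto front contains only the first two; the remaining trade-off between them would be resolved by the deployment's latency, energy, and accuracy requirements.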