Resource-Efficient Visual Multiobject Tracking on Embedded Device
- Author
- Jingzheng Tu, Qimin Xu, Cailian Chen, Xinping Guan, and Bo Yang
- Subjects
Computer Networks and Communications, Computer Science Applications, Hardware and Architecture, Signal Processing, Information Systems, Real-time computing, Cloud computing, Video tracking, Frame rate, Latency, Active appearance model, Parallel processing, Graph (abstract data type)
- Abstract
Multi-object tracking (MOT) is a crucial technology for security surveillance, and it is computationally intensive because a large number of video streams must be processed with low latency in practice. Conventionally, the input video streams of MOT are processed in a cloud computing center with abundant computational capability, which puts heavy pressure on delivering the video streams to the cloud. Recent advances in Internet of Things (IoT) technology provide edge-computing-based solutions for video analytics at scale. However, the gap between MOT’s high computational demand and the resource-constrained nature of IoT devices remains significant. In this paper, a resource-efficient multi-object tracking method (REMOT) is proposed for real-time surveillance on embedded IoT devices. REMOT includes an affinity measurement based on an appearance model with angular triplet loss, and a motion association that replaces the time-consuming graph-based data association stage. Considering the trade-off between latency and accuracy, we design an optimization strategy for the parallel processing of deep learning model layers to accelerate inference with little accuracy loss. In addition, we employ a model compression strategy to reduce model size. Experiments on the MOT16 and MOT17 benchmarks demonstrate that REMOT reduces latency by 2.4x compared with the original implementation and achieves a running speed of 81 frames per second (fps) on an embedded device with only a marginal accuracy loss (6%), meeting the real-time processing and low-latency response requirements of surveillance.
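As a rough illustration of the angular-margin idea named in the abstract (not the authors' implementation, whose details are not given here), a triplet loss over cosine distance pulls embeddings of the same identity together and pushes different identities apart by a margin. A minimal sketch, assuming unit-normalized appearance embeddings and a hypothetical margin value:

```python
import numpy as np

def angular_triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss on angular (cosine) distance.

    anchor/positive share an identity; negative is a different identity.
    The margin value here is an illustrative assumption, not from the paper.
    """
    def cos_dist(a, b):
        # 1 - cosine similarity of the normalized vectors
        a = a / np.linalg.norm(a)
        b = b / np.linalg.norm(b)
        return 1.0 - float(np.dot(a, b))

    # Loss is zero once the positive is closer than the negative by >= margin
    return max(0.0, cos_dist(anchor, positive) - cos_dist(anchor, negative) + margin)

# Identical anchor/positive with an orthogonal negative: loss clamps to 0
a = np.array([1.0, 0.0])
p = np.array([1.0, 0.0])
n = np.array([0.0, 1.0])
loss_easy = angular_triplet_loss(a, p, n)   # 0 - 1 + 0.3 -> clamped to 0.0

# Swapped roles: positive is far, negative is close, so the loss is positive
loss_hard = angular_triplet_loss(a, n, p)   # 1 - 0 + 0.3 = 1.3
```

In training, such a loss would be minimized over mined triplets of detections so that the learned appearance embeddings support the affinity measurement used for association.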
- Published
- 2022