6,430 results for "Object Tracking"
Search Results
202. High-Magnification Object Tracking with Ultra-Fast View Adjustment and Continuous Autofocus Based on Dynamic-Range Focal Sweep.
- Author
-
Zhang, Tianyi, Shimasaki, Kohei, Ishii, Idaku, and Namiki, Akio
- Subjects
- *
DEPTH of field , *OBJECT tracking (Computer vision) , *LIGHT filters , *K-means clustering , *SINE waves , *CAPABILITIES approach (Social sciences) - Abstract
Active vision systems (AVSs) have been widely used to obtain high-resolution images of objects of interest. However, tracking small objects in high-magnification scenes is challenging due to shallow depth of field (DoF) and narrow field of view (FoV). To address this, we introduce a novel high-speed AVS with a continuous autofocus (C-AF) approach based on dynamic-range focal sweep and a high-frame-rate (HFR) frame-by-frame tracking pipeline. Our AVS leverages an ultra-fast pan-tilt mechanism based on a Galvano mirror, enabling high-frequency view direction adjustment. Specifically, the proposed C-AF approach uses a 500 fps high-speed camera and a focus-tunable liquid lens driven by a sine wave, providing a 50 Hz focal sweep around the object's optimal focus. During each focal sweep, 10 images with varying focus are captured, and the one with the highest focus value is selected, resulting in a stable output of well-focused images at 50 fps. Simultaneously, the object's depth is measured using the depth-from-focus (DFF) technique, allowing dynamic adjustment of the focal sweep range. Importantly, because the remaining images are only slightly less focused, all 500 fps images can be utilized for object tracking. The proposed tracking pipeline combines deep-learning-based object detection, K-means color clustering, and HFR tracking based on color filtering, achieving 500 fps frame-by-frame tracking. Experimental results demonstrate the effectiveness of the proposed C-AF approach and the advanced capabilities of the high-speed AVS for magnified object tracking. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
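The sweep-and-select step in the abstract above — capture 10 frames per focal sweep and keep the one with the highest focus value — can be sketched as follows. The focus measure used here (mean squared horizontal gradient) is an assumed stand-in, since the abstract does not specify one:

```python
def focus_value(img):
    # Mean squared horizontal gradient: sharper frames have stronger
    # local contrast. This Tenengrad-style measure is an assumption;
    # the paper does not name its focus metric.
    diffs = [(row[i + 1] - row[i]) ** 2 for row in img for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def select_sharpest(sweep_frames):
    # From one focal sweep (e.g. 10 frames captured at 500 fps, giving
    # a 50 fps stream of well-focused output), return the index of the
    # best-focused frame.
    scores = [focus_value(f) for f in sweep_frames]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy grayscale 'frames': the middle one has much stronger contrast.
blurry = [[10, 11, 10, 11], [11, 10, 11, 10]]
sharp = [[0, 50, 0, 50], [50, 0, 50, 0]]
best = select_sharpest([blurry, sharp, blurry])
```

In the real system this selection runs once per sine-wave period of the liquid lens, and the depth of the selected frame feeds back to recenter the sweep range.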
203. Object modeling through weightless tracking.
- Author
-
do Nascimento, Daniel N. and França, Felipe M. G.
- Subjects
- *
ARTIFICIAL neural networks , *ARTIFICIAL satellite tracking , *PRIOR learning - Abstract
This paper presents a method to perform the real-time creation of models that represent aspects of tracked objects in video frames. Object modeling is done during the task of tracking previously unseen selected objects, and both tracking and model creation are implemented using the WiSARD weightless neural network and occur in real time, starting from no prior knowledge. The main purpose of this work is to track an object through camera images and, simultaneously, create a model that describes the observed appearances along with the transitions between each learned aspect. To achieve this goal, an object tracker based on the ClusWiSARD weightless neural network model was used to determine the states that describe the observed objects. In this way, it is possible to obtain a system that captures knowledge about the visual structures of the learned objects, builds relationships between the possible appearances, and transitions between the model's aspects appropriately. Furthermore, the created models have visual representations that can be used to show the learned aspects and validate the state transitions, and can even visualize occluded parts of objects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
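Since WiSARD underlies both the tracker and the model above, a minimal sketch of one WiSARD discriminator may help fix ideas. The input length and tuple size are illustrative; real systems use retina-sized binary inputs and one discriminator per class or learned aspect:

```python
class Discriminator:
    # Minimal WiSARD discriminator: the binary input is split into
    # fixed tuples; each tuple addresses one RAM node. Training writes
    # the address into the RAM; the response counts how many RAMs
    # recognise the presented pattern's addresses.
    def __init__(self, input_len, tuple_len):
        self.tuple_len = tuple_len
        self.rams = [set() for _ in range(input_len // tuple_len)]

    def _addresses(self, bits):
        t = self.tuple_len
        return [tuple(bits[i * t:(i + 1) * t]) for i in range(len(self.rams))]

    def train(self, bits):
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def response(self, bits):
        return sum(addr in ram for ram, addr in zip(self.rams, self._addresses(bits)))

# Train on one 8-bit pattern split into 2-bit tuples.
d = Discriminator(8, 2)
d.train([1, 0, 1, 1, 0, 0, 0, 1])
```

A trained pattern scores the maximum response (4 of 4 RAMs); a partially matching pattern scores lower, which is the graded similarity the tracker exploits.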
204. Coordination of Robots for Object Transfer from Source to Destination.
- Author
-
Swetha, R., J. K., Yamini, S. M., Vinay Manju, Nagaraj, Pavani, and B., Gayathri Devi
- Subjects
ROBOTS ,RADIO frequency ,MULTICASTING (Computer networks) - Abstract
In this study, a centralized monitoring system orchestrates the deployment and coordination of multiple robots within a supervised environment. The primary goal is to detect and monitor specific entities using robots. The traversal of these robots is executed by a non-differentiable objective function, wherein their next positions are decided by minimizing this function. Communication amongst the robots is achieved through a radio frequency wireless communication protocol. The system utilises Python's OpenCV library for sophisticated image-processing capabilities. The robots must be able to pick up the payloads and place them at the target. The optimal distance and time taken for the process are calculated. [ABSTRACT FROM AUTHOR]
- Published
- 2024
205. Robust Heart Rate Monitoring System: Contactless Approach using Fast Fourier Transform.
- Author
-
Sujatha, E., Divya, M., Nandhini, I., Sakunthala, and Raju, D. Naveen
- Subjects
FAST Fourier transforms ,HEART rate monitoring ,HEART rate monitors ,HEART beat ,SIGNAL processing ,COMPUTER performance - Abstract
With technology becoming increasingly automated and AI-enabled, measuring the heart rate (HR) of people has multiple applications in telemedicine, the Internet of Things (IoT), sports, security, etc. However, classic contact-based methods of measuring HR are sometimes impractical and do not scale well. This project provides a solution that works live and gives the HR of a person using a web camera. Face detection combined with object tracking is used to produce a set of frames, which are sampled in later stages of the pipeline for color variations. The average color in a region of interest (ROI) chosen on the face represents a signal that corresponds to the heart rate. Signal processing techniques are used to extract the frequency from the signals obtained. The Fast Fourier Transform (FFT) is used to transform the signal between the spatial and frequency domains. This method is precise and efficient. The system also supports a second mode in which a video of a person is uploaded and the heart rate is extracted from the recording. By combining object tracking with face detection, this method reduces the processing power needed and allows better scaling. The accuracy is 98% compared to existing real-time systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
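The core of the pipeline above — treat the mean ROI color per frame as a time signal and pick the dominant frequency in the plausible heart-rate band — can be sketched with a plain discrete Fourier transform. The sampling rate, band limits, and synthetic trace below are illustrative assumptions:

```python
import math

def estimate_bpm(signal, fs, lo_hz=0.7, hi_hz=4.0):
    # Estimate heart rate from a mean-ROI color trace via a discrete
    # Fourier transform, searching only the physiologically plausible
    # band (42-240 bpm by default; band limits are assumptions).
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]  # drop the DC component
    best_bpm, best_mag = 0.0, -1.0
    for k in range(1, n // 2):
        f = k * fs / n
        if not (lo_hz <= f <= hi_hz):
            continue
        re = sum(centred[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(centred[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mag = re * re + im * im
        if mag > best_mag:
            best_mag, best_bpm = mag, f * 60.0
    return best_bpm

# Synthetic 72 bpm (1.2 Hz) pulse riding on a constant skin tone,
# sampled at 30 fps for 10 s.
fs, hr_hz = 30.0, 1.2
trace = [0.5 + 0.01 * math.sin(2 * math.pi * hr_hz * t / fs) for t in range(300)]
bpm = estimate_bpm(trace, fs)
```

A production system would use an optimized FFT (e.g. `numpy.fft.rfft`) rather than this O(n²) DFT, but the band-limited peak search is the same.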
206. A machine learning pipeline for extracting decision-support features from traffic scenes.
- Author
-
Fraga, Vitor A., Schreiber, Lincoln V., da Silva, Marco Antonio C., Kunst, Rafael, Barbosa, Jorge L.V., and Ramos, Gabriel de O.
- Subjects
- *
DEEP learning , *MACHINE learning , *TRACKING algorithms , *TRAFFIC signs & signals , *TRAFFIC engineering , *REINFORCEMENT learning - Abstract
Traffic systems play a key role in modern society. However, these systems increasingly suffer from problems such as congestion. A well-known way to efficiently reduce this kind of problem is to perform traffic light control intelligently through reinforcement learning (RL) algorithms. In this context, extracting relevant features from the traffic environment to support decision-making becomes a central concern. Examples of such features include vehicle counts on each queue and identification of vehicles' origins and destinations, among others. Recently, the advent of deep learning has paved the way for efficient methods for extracting some of the aforementioned features. However, the problem of identifying vehicles and their origins and destinations within an intersection has not been fully addressed in the literature, even though such information has been shown to play a role in RL-based traffic signal control. Against this background, in this work we propose a deep learning pipeline for extracting relevant features from intersections based on traffic scenes. Our pipeline comprises three main steps: (i) a YOLO-based object detector fine-tuned on the UAVDT dataset, (ii) a tracking algorithm to keep track of vehicles along their trajectories, and (iii) an origin-destination identification algorithm. Using this pipeline, it is possible to identify vehicles as well as their origins and destinations within a given intersection. In order to assess our pipeline, we evaluated each of its modules separately as well as the pipeline as a whole. The object detector model obtained 98.2% recall and 79.5% precision, on average. The tracking algorithm obtained a MOTA of 72.6% and a MOTP of 74.4%. Finally, the complete pipeline obtained an average error rate of 3.065% in terms of origin and destination counts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
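Step (iii) above — deriving origin-destination counts from completed tracks — can be sketched as follows. The zone geometry and the toy tracks are hypothetical; the abstract does not describe the paper's actual assignment logic:

```python
def od_counts(tracks, zones):
    # Assign each track's first and last positions to approach zones and
    # accumulate origin-destination counts. `zones` maps a name to an
    # axis-aligned box (x0, y0, x1, y1) in image coordinates; both the
    # boxes and the tracks below are illustrative stand-ins.
    def zone_of(pt):
        for name, (x0, y0, x1, y1) in zones.items():
            if x0 <= pt[0] <= x1 and y0 <= pt[1] <= y1:
                return name
        return None

    counts = {}
    for track in tracks:
        o, d = zone_of(track[0]), zone_of(track[-1])
        if o and d and o != d:
            counts[(o, d)] = counts.get((o, d), 0) + 1
    return counts

zones = {"north": (40, 0, 60, 10), "south": (40, 90, 60, 100),
         "west": (0, 40, 10, 60)}
tracks = [[(50, 2), (50, 50), (50, 95)],   # north -> south
          [(5, 50), (50, 50), (50, 95)],   # west  -> south
          [(50, 3), (50, 40), (50, 97)]]   # north -> south
c = od_counts(tracks, zones)
```

The 3.065% error rate reported above would then be measured by comparing such per-pair counts against manually annotated ground truth.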
207. SA-DCPNet: Scale-aware deep convolutional pyramid network for crowd counting.
- Author
-
Tyagi, Bhawana, Nigam, Swati, and Singh, Rajiv
- Subjects
- *
STANDARD deviations , *PYRAMIDS , *DEEP learning , *COMPUTER vision , *VISUAL fields - Abstract
Crowd counting is one of the most complex research topics in the field of computer vision. There are many challenges associated with this task, including severe occlusion, scale variation, and complex backgrounds. Multi-column networks are commonly used for crowd counting, but they suffer from scale variation and feature similarity, which leads to poor analysis of crowd sequences. To address these issues, we propose a scale-aware deep convolutional pyramid network for crowd counting. We introduce a scale-aware deep convolutional pyramid module (SA-DCPM) by integrating message passing and global attention mechanisms into a multi-column network. The proposed network minimizes the problem of scale variation using SA-DCPM and uses a multi-column variance loss function to handle issues with feature similarity. Experiments performed on the ShanghaiTech and UCF-CC-50 datasets demonstrate the better performance of the proposed method in terms of mean absolute error and root mean square error. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
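The two evaluation metrics named above are the standard ones for crowd counting and are easy to state precisely; the sample counts below are made up for illustration:

```python
import math

def mae_rmse(pred, true):
    # Standard crowd-counting metrics over per-image count estimates:
    # mean absolute error and root mean square error. RMSE penalises
    # large per-image misses more heavily than MAE.
    errs = [p - t for p, t in zip(pred, true)]
    mae = sum(abs(e) for e in errs) / len(errs)
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    return mae, rmse

# Hypothetical predicted vs. ground-truth head counts on three images.
mae, rmse = mae_rmse([105, 98, 210], [100, 100, 200])
```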
208. Enhanced Unmanned Aerial Vehicle Localization in Dynamic Environments Using Monocular Simultaneous Localization and Mapping and Object Tracking.
- Author
-
El Gaouti, Youssef, Khenfri, Fouad, Mcharek, Mehdi, and Larouci, Cherif
- Subjects
- *
MONOCULARS , *KALMAN filtering - Abstract
This work proposes an innovative approach to enhance the localization of unmanned aerial vehicles (UAVs) in dynamic environments. The methodology integrates a sophisticated object-tracking algorithm to augment the established simultaneous localization and mapping (ORB-SLAM) framework, utilizing only a monocular camera setup. Moving objects are detected by harnessing the power of YOLOv4, and a specialized Kalman filter is employed for tracking. The algorithm is integrated into the ORB-SLAM framework to improve UAV pose estimation by correcting the impact of moving elements and effectively removing features connected to dynamic elements from the ORB-SLAM process. Finally, the approach is evaluated on the TUM RGB-D dataset. The results demonstrate that the proposed algorithm can effectively enhance the accuracy of pose estimation and exhibits high accuracy and robustness in real dynamic scenes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
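The "specialized Kalman filter" above is not detailed in the abstract; a generic 1-D constant-velocity Kalman filter, written out with scalar arithmetic, illustrates the predict/update cycle that such trackers are built on. The noise levels q and r are illustrative assumptions:

```python
def kf_step(state, z, dt=1.0, q=1e-3, r=0.5):
    # One predict/update cycle of a 1-D constant-velocity Kalman filter.
    # state = (position x, velocity v, pxx, pxv, pvv), where the
    # symmetric 2x2 covariance is [[pxx, pxv], [pxv, pvv]]; z is a
    # position measurement. q (process) and r (measurement) noise are
    # illustrative, not values from the paper.
    x, v, pxx, pxv, pvv = state
    # Predict through the transition matrix F = [[1, dt], [0, 1]].
    x += v * dt
    pxx += dt * (2.0 * pxv + dt * pvv) + q
    pxv += dt * pvv
    pvv += q
    # Update with measurement matrix H = [1, 0].
    s = pxx + r                     # innovation covariance
    kx, kv = pxx / s, pxv / s       # Kalman gain
    y = z - x                       # innovation
    x += kx * y
    v += kv * y
    pvv -= kv * pxv                 # P <- (I - K H) P
    pxv -= kx * pxv
    pxx -= kx * pxx
    return (x, v, pxx, pxv, pvv)

# Track a target moving at 2 px/frame from noiseless measurements:
# the velocity estimate converges to 2 without ever being measured.
state = (0.0, 0.0, 1.0, 0.0, 1.0)
for t in range(1, 51):
    state = kf_step(state, z=2.0 * t)
```

In the paper's setting the same cycle runs per detected moving object (in 2D), and the predicted boxes mark which ORB features to exclude from SLAM.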
209. Object Detection, Tracking and Prediction by Fusing Camera and LiDAR.
- Author
-
黄远宪, 周 剑, 黄 琦, 李必军, 王兰兰, and 朱佳琳
- Subjects
- *
CONVOLUTIONAL neural networks , *OPTICAL radar , *LIDAR , *STANDARD deviations , *POINT cloud - Abstract
Objectives: A real-time and robust 3D dynamic object perception module is a key part of an autonomous driving system. Methods: This paper fuses a monocular camera and light detection and ranging (LiDAR) to detect 3D objects. First, we use a convolutional neural network to detect 2D bounding boxes and generate a 3D frustum region of interest (ROI) according to the geometric projection relation between the camera and LiDAR. Then, we cluster the point cloud in the frustum ROI and fit the 3D bounding box of the objects. After detecting 3D objects, we re-identify the objects between adjacent frames by appearance features and the Hungarian algorithm, and then propose a tracker management model based on a quad-state machine. Finally, a novel prediction model is proposed, which leverages lane lines to constrain vehicle trajectories. Results: The experimental results show that in the object detection stage, the accuracy and recall of the proposed algorithm reach 92.5% and 86.7%, respectively. The root mean square error of the proposed trajectory prediction algorithm is smaller than that of existing algorithms on simulation datasets including straight lines, arcs and spiral curves. The whole algorithm takes only approximately 25 ms, which meets the real-time requirements. Conclusions: The proposed algorithm is effective and efficient, and performs well on different lane lines. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
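The frustum-ROI step above rests on projecting LiDAR points into the image through the camera's geometric model. A minimal pinhole-projection membership test looks like this; the intrinsics are made-up values, and a real system would first apply the extrinsic LiDAR-to-camera transform:

```python
def in_frustum(point_cam, box, fx=700.0, fy=700.0, cx=320.0, cy=240.0):
    # Project a LiDAR point (already expressed in the camera frame)
    # through a pinhole model and test whether it lands inside a 2D
    # detection box (x0, y0, x1, y1). The 2D box then carves out a 3D
    # frustum of points to cluster. Intrinsics are placeholders.
    x, y, z = point_cam
    if z <= 0:  # behind the image plane
        return False
    u = fx * x / z + cx
    v = fy * y / z + cy
    x0, y0, x1, y1 = box
    return x0 <= u <= x1 and y0 <= v <= y1

# A detection box around the principal point of a 640x480 image.
box = (300, 220, 340, 260)
inside = in_frustum((0.0, 0.0, 10.0), box)  # on the optical axis
```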
210. Object Tracking Based on Optical Flow Reconstruction of Motion-Group Parameters.
- Author
-
Karpuzov, Simeon, Petkov, George, Ilieva, Sylvia, Petkov, Alexander, and Kalitzin, Stiliyan
- Subjects
- *
OPTICAL flow , *GROUP velocity , *EPILEPSY , *SEIZURES (Medicine) , *PATIENT safety , *AUTONOMOUS vehicles , *CLINICAL medicine - Abstract
Rationale. Object tracking has significance in many applications ranging from control of unmanned vehicles to autonomous monitoring of specific situations and events, especially when providing safety for patients with certain adverse conditions such as epileptic seizures. Conventional tracking methods face many challenges, such as the need for dedicated attached devices or tags, the influence of high image noise, complex object movements, and intensive computational requirements. We have previously developed computationally efficient algorithms for global optical flow reconstruction of group velocities that provide means for convulsive seizure detection and have potential applications in fall and apnea detection. Here, we address the challenge of using the same calculated group velocities for object tracking in parallel. Methods. We propose a novel optical flow-based method for object tracking. It utilizes real-time image sequences from the camera and directly reconstructs global motion-group parameters of the content. These parameters can steer a rectangular region of interest surrounding the moving object to follow the target. The method applies successfully to multi-spectral data, further improving its effectiveness. Besides serving as a modular extension to clinical alerting applications, the novel technique, compared with other available approaches, may provide real-time computational advantages as well as improved stability against noisy inputs. Results. Experimental results on simulated tests and complex real-world data demonstrate the method's capabilities. The proposed optical flow reconstruction can provide accurate, robust, and faster results compared to current state-of-the-art approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
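In its simplest translational form, the ROI-steering idea above reduces to shifting the ROI by the mean flow measured inside it. The sketch below covers only that zeroth-order motion-group parameter; the paper's reconstruction also recovers higher-order group parameters such as rotation and scaling:

```python
def steer_roi(roi, flow_vectors):
    # Shift a rectangular ROI (x, y, w, h) by the mean of the optical
    # flow vectors measured inside it - the translational component of
    # the motion group. Rotation/scaling terms of the full model are
    # omitted in this illustration.
    if not flow_vectors:
        return roi
    mx = sum(v[0] for v in flow_vectors) / len(flow_vectors)
    my = sum(v[1] for v in flow_vectors) / len(flow_vectors)
    x, y, w, h = roi
    return (x + mx, y + my, w, h)

# Three toy flow vectors with mean (3, 0): the ROI slides 3 px right.
roi = steer_roi((100.0, 80.0, 40.0, 40.0), [(2.0, -1.0), (4.0, 1.0), (3.0, 0.0)])
```

Because only aggregate flow statistics are needed, this kind of update is cheap enough to run alongside the seizure-detection use of the same group velocities.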
211. Towards Reliable Identification and Tracking of Drones Within a Swarm.
- Author
-
Kumari, Nisha, Lee, Kevin, Barca, Jan Carlo, and Ranaweera, Chathurika
- Abstract
Drone swarms consist of multiple drones that can achieve tasks that individual drones cannot, such as search and recovery or surveillance over a large area. A swarm's internal structure typically consists of multiple drones operating autonomously. Reliable detection and tracking of swarms and individual drones allow a greater understanding of the behaviour and movement of a swarm. Increased understanding of drone behaviour allows better coordination, collision avoidance, and performance monitoring of individual drones in the swarm. The research presented in this paper proposes a deep learning-based approach for reliable detection and tracking of individual drones within a swarm using stereo-vision cameras in real time. The proposed solution provides a precise tracking system and accounts for the highly dense and dynamic behaviour of drones. The approach is evaluated in both sparse and dense networks in a variety of configurations. The accuracy and efficiency of the proposed solution have been analysed through a series of comparative experiments that demonstrate reasonable accuracy in detecting and tracking drones within a swarm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
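Stereo-vision cameras, as used above, recover per-drone depth from disparity via the classic relation z = f·B/d. A one-line sketch with illustrative focal length and baseline (the paper's camera parameters are not given):

```python
def stereo_depth(disparity_px, focal_px=800.0, baseline_m=0.12):
    # Classic stereo relation: depth z = f * B / d, with focal length f
    # in pixels, baseline B in metres, and disparity d in pixels.
    # focal_px and baseline_m are illustrative placeholder values.
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# A drone with 8 px of disparity sits 12 m from this hypothetical rig.
z = stereo_depth(8.0)
```

Note how depth resolution degrades quadratically with range: at small disparities a one-pixel error moves the estimate by metres, which is why dense swarms at distance are hard to track precisely.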
212. A Comprehensive Survey on Deep Learning Techniques for Digital Video Forensics.
- Author
-
Vigneshwaran, T. and Velammal, B. L.
- Subjects
DIGITAL forensics ,DIGITAL video ,DEEP learning ,DIGITAL learning ,SOCIAL media ,SOCIAL networks - Abstract
Advancements in connected technologies have made social media and networking a wide-open platform for sharing information via audio, video, text, etc. Since the invention of smartphones, video content is manipulated every day. Videos contain sensitive or personal information that may be forged for personal gratification or for extortion. Identifying video falsification therefore plays a prominent role in digital forensics. This paper provides a comprehensive survey of various problems in video falsification and the deep learning models utilised for detecting forgery. The survey provides a deep understanding of the algorithms implemented by various authors, along with their advantages and limitations, thereby providing insight for future researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
213. Object Tracking Using Computer Vision: A Review.
- Author
-
Kadam, Pushkar, Fang, Gu, and Zou, Ju Jia
- Subjects
COMPUTER vision ,DEEP learning ,OBJECT tracking (Computer vision) ,RESEARCH questions ,IMAGE processing - Abstract
Object tracking is one of the most important problems in computer vision applications such as robotics, autonomous driving, and pedestrian movement. There has been significant development in camera hardware, where researchers are experimenting with the fusion of different sensors and developing image processing algorithms to track objects. Image processing and deep learning methods have progressed significantly in the last few decades. Different data association methods accompanied by image processing and deep learning are becoming crucial in object tracking tasks. The data requirements of deep learning methods have led to different public datasets that allow researchers to benchmark their methods. While there has been improvement in object tracking methods, technology, and the availability of annotated object tracking datasets, there is still scope for improvement. This review contributes by systematically identifying different sensor equipment, datasets, methods, and applications, providing a taxonomy of the literature and the strengths and limitations of different approaches, thereby providing guidelines for selecting equipment, methods, and applications. Research questions and future scope to address the unresolved issues in the object tracking field are also presented with research direction guidelines. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
214. Probabilistic 3D motion model for object tracking in aerial applications.
- Author
-
Mirtajadini, Seyed Hojat, Amiri Atashgah, MohammadAli, and Shahbazi, Mohammad
- Subjects
AERIAL spraying & dusting in agriculture ,OBJECT tracking (Computer vision) ,MONTE Carlo method ,DISTRIBUTION (Probability theory) ,AERIAL surveillance ,MACHINE learning - Abstract
Visual object tracking, crucial in aerial applications such as surveillance, cinematography, and chasing, faces challenges despite AI advancements. Current solutions lack full reliability, leading to common tracking failures in the presence of fast motions or long‐term occlusions of the subject. To tackle this issue, a 3D motion model is proposed that employs camera/vehicle states to locate a subject in inertial coordinates. Next, a probability distribution is generated over future trajectories, which are sampled using a Monte Carlo technique to provide search regions that are fed into an online appearance learning process. This 3D motion model incorporates machine‐learning approaches for direct range estimation from monocular images. The model adapts computationally by adjusting search areas based on tracking confidence. It is integrated into DiMP, an online and deep learning‐based appearance model. The resulting tracker is evaluated on the VIOT dataset with sequences of both images and camera states, achieving 68.9% tracking precision compared to DiMP's 49.7%. This approach demonstrates increased tracking duration, improved recovery after occlusions, and tolerance of faster motions. Additionally, this strategy outperforms random searches by about 3.0%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
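The sample-then-search idea above can be sketched as: propagate a motion guess, draw Monte Carlo samples whose spread grows with the prediction horizon, and bound them to form a search region. The constant-velocity guess and Gaussian noise model are illustrative assumptions, not the paper's learned trajectory distribution:

```python
import random

def search_region(pos, vel, horizon, n=500, sigma=2.0, seed=0):
    # Monte Carlo search region: predict `horizon` steps ahead with a
    # constant-velocity guess, perturb the prediction with Gaussian
    # noise whose spread grows with the horizon (longer occlusion ->
    # larger region), and return the bounding box of the samples.
    rng = random.Random(seed)  # seeded for reproducibility
    cx = pos[0] + vel[0] * horizon
    cy = pos[1] + vel[1] * horizon
    spread = sigma * horizon ** 0.5
    xs = [rng.gauss(cx, spread) for _ in range(n)]
    ys = [rng.gauss(cy, spread) for _ in range(n)]
    return (min(xs), min(ys), max(xs), max(ys))

# Target at (10, 20) moving right at 1.5 px/frame, occluded 4 frames:
# the region is centred near (16, 20) and widens with the horizon.
x0, y0, x1, y1 = search_region((10.0, 20.0), (1.5, 0.0), horizon=4)
```

Feeding only this region to the online appearance model is what buys the reported robustness to occlusions: the matcher never searches the whole frame.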
215. Trans-RGBT: RGBT Object Tracking with Transformer.
- Author
-
LIU Wanjun, LIANG Linlin, and QU Haicheng
- Subjects
DATA mining ,TRACKING algorithms ,FILTERS & filtration ,INFRARED imaging ,DECISION making ,DATA fusion (Statistics) - Abstract
Current object tracking methods mostly fuse different modal information to make localization decisions, and suffer from insufficient information extraction, simplistic fusion methods, and an inability to accurately track targets in low-light scenes. To this end, a Transformer-based multi-modal object tracking algorithm (Trans-RGBT) is proposed. Firstly, features of the visible and infrared images are extracted separately using a pseudo-twin network and fully fused at the feature level. Secondly, the target information from the first frame is modulated into the feature vector of the frame to be tracked to obtain a dedicated tracker. Then, a Transformer is applied to encode and decode the target in the field of view: the spatial position of the target is predicted by a spatial position prediction branch, and interfering targets are filtered out by combining historical information to obtain the accurate position of the target. Finally, the target's bounding rectangle is predicted by a rectangle regression network, achieving accurate target tracking. Extensive experiments are conducted on the recent large-scale datasets VTUAV and RGBT234. Compared with Siamese-based and filter-based algorithms, Trans-RGBT has higher accuracy and better robustness, and achieves a real-time tracking speed of 22 frames per second. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
216. Spatial feature embedding for robust visual object tracking.
- Author
-
Liu, Kang, Liu, Long, Yang, Shangqi, and Fu, Zhihao
- Subjects
TRACKING algorithms ,OBJECT tracking (Computer vision) ,MOTION analysis ,PROBLEM solving ,COMPUTER vision ,IMAGE analysis - Abstract
Recently, the offline‐trained Siamese pipeline has drawn wide attention due to its outstanding tracking performance. However, existing Siamese trackers utilise offline training to extract 'universal' features, which is insufficient to effectively distinguish between the target and fluctuating interference when embedding the information of the two branches, leading to inaccurate classification and localisation. In addition, Siamese trackers employ a pre‐defined scale for cropping the search candidate region based on the previous frame's result, which can easily introduce redundant background noise (clutter, similar objects etc.), affecting the tracker's robustness. To solve these problems, the authors propose two novel sub‐networks for spatial feature embedding for robust object tracking. Specifically, the proposed spatial remapping (SRM) network enhances the feature discrepancy between target and distractor categories by online remapping, and improves the discriminative ability of the tracker in the embedding space. MAML is used to optimise the SRM network to ensure its adaptability to complex tracking scenarios. Moreover, a temporal-information proposal‐guided (TPG) network that utilises a GRU model to dynamically predict the search scale based on temporal motion states to reduce potential background interference is introduced. The proposed networks are integrated into two popular trackers, namely SiamFC++ and TransT, denoted SiamSRMC and SiamSRMT, respectively, which achieve superior performance on six challenging benchmarks: OTB100, VOT2019, UAV123, GOT10K, TrackingNet and LaSOT. Moreover, the proposed trackers obtain competitive performance compared with state‐of‐the‐art trackers on the background-clutter and similar-object attributes, validating the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
218. A Lightweight Object Tracking Algorithm for Loitering Munitions Based on Feature Fusion.
- Author
-
王子康 and 姚文进
- Abstract
Copyright of Journal of Ordnance Equipment Engineering is the property of Chongqing University of Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
219. A Cost-Sensitive and Smoothness-Constrained Structured SVM Object Tracking Method Based on Missile-Borne Images.
- Author
-
孙子文, 钱立志, 杨传栋, 袁广林, and 凌冲
- Abstract
Copyright of Journal of Ordnance Equipment Engineering is the property of Chongqing University of Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
220. Adaptive sparse attention-based compact transformer for object tracking.
- Author
-
Pan, Fei, Zhao, Lianyu, and Wang, Chenglin
- Abstract
Transformer-based Siamese networks have excelled in the field of object tracking. Nevertheless, a notable limitation persists in their reliance on ResNet as the backbone, which lacks the capacity to effectively capture global information and exhibits constraints in feature representation. Furthermore, these trackers struggle to effectively attend to target-relevant information within the search region using multi-head self-attention (MSA). Additionally, they are prone to robustness challenges during online tracking and tend to exhibit significant model complexity. To address these limitations, we propose a novel tracker named ASACTT, which includes a backbone network, a feature fusion network, and a prediction head. First, we improve the Swin-Transformer-Tiny to enhance its global information extraction capabilities. Second, we propose an adaptive sparse attention (ASA) mechanism to focus on target-specific details within the search region. Third, we leverage position encoding and historical candidate data to develop a dynamic template updater (DTU), which preserves the integrity of the initial frame while gracefully adapting to variations in the target's appearance. Finally, we optimize the network model to maintain accuracy while minimizing complexity. Experiments on five benchmark datasets demonstrate that the proposed tracker is highly competitive with other state-of-the-art methods. Notably, in the GOT-10K evaluation, our tracker achieved an outstanding success score of 75.3% at 36 FPS, significantly surpassing other trackers with comparable model parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
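The core of sparse attention — attend only to the highest-scoring positions rather than all of them, as dense MSA does — can be sketched for a single query as below. The fixed k and the toy scores are illustrative; the paper's adaptive variant chooses the sparsity from the data:

```python
import math

def sparse_attention(scores, values, k=2):
    # Keep only the k highest attention scores for one query, softmax
    # over that subset, and mix the corresponding values. Positions
    # outside the top-k get exactly zero weight, which is what lets the
    # tracker ignore background positions in the search region.
    top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
    exps = [math.exp(scores[i]) for i in top]
    z = sum(exps)
    weights = [e / z for e in exps]
    return sum(w * values[i] for w, i in zip(weights, top))

# Four key positions; only the two strongest (indices 0 and 3)
# contribute to the output.
out = sparse_attention([2.0, 0.1, -1.0, 1.0], [10.0, 20.0, 30.0, 40.0], k=2)
```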
221. Spiking SiamFC++: deep spiking neural network for object tracking.
- Author
-
Xiang, Shuiying, Zhang, Tao, Jiang, Shuqing, Han, Yanan, Zhang, Yahui, Guo, Xingxing, Yu, Licun, Shi, Yuechun, and Hao, Yue
- Abstract
Spiking neural networks (SNNs) are biologically plausible models that exhibit the advantages of high computational capability and low power consumption. However, the training of deep SNNs is still an open problem, which limits their real-world applications. Here we propose a deep SNN architecture named Spiking SiamFC++ for object tracking with end-to-end direct training. Specifically, the AlexNet network is extended in the time domain to extract the features, and a surrogate gradient function is adopted to realize direct supervised training of the deep SNN. To examine the performance of the Spiking SiamFC++, several tracking benchmarks, including OTB2013, OTB2015, VOT2015, VOT2016, and UAV123, are considered. It is found that the precision loss is small compared with the original SiamFC++. Compared with the existing SNN-based target tracker, e.g., the SiamSNN, the precision (success) of the proposed Spiking SiamFC++ reaches 0.861 (0.644), which is much higher than the 0.528 (0.443) achieved by the SiamSNN. To the best of our knowledge, the performance of the Spiking SiamFC++ outperforms the existing state-of-the-art approaches in SNN-based object tracking, providing a novel path for SNN applications in the field of target tracking. This work may further promote the development of SNN algorithms and neuromorphic chips. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
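The surrogate-gradient idea in the abstract above can be illustrated with a toy leaky integrate-and-fire (LIF) neuron: the forward pass uses a hard spike threshold, while the backward pass would substitute a smooth (here rectangular) derivative. The time constant, threshold, and window width are illustrative assumptions, not the paper's settings:

```python
def lif_forward(inputs, tau=0.9, v_th=1.0):
    """Leaky integrate-and-fire dynamics unrolled in time (illustrative
    parameters). Returns per-step spikes and membrane potentials."""
    v, spikes, volts = 0.0, [], []
    for x in inputs:
        v = tau * v + x                      # leaky integration
        s = 1.0 if v >= v_th else 0.0        # hard, non-differentiable spike
        spikes.append(s)
        volts.append(v)
        v = v * (1.0 - s)                    # hard reset after a spike
    return spikes, volts

def surrogate_grad(v, v_th=1.0, width=0.5):
    """Rectangular surrogate for the spike function's derivative: 1/width
    inside a window around the threshold, 0 elsewhere."""
    return 1.0 / width if abs(v - v_th) < width / 2 else 0.0
```

Driving the neuron with a constant input of 0.5 produces no spike until the potential crosses the threshold on the third step; the surrogate supplies a nonzero gradient only near the threshold.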
222. AODiMP‐TIR: Anti‐occlusion thermal infrared targets tracker based on SuperDiMP.
- Author
-
Ma, Shaoyang, Yang, Yao, and Chen, Gang
- Subjects
MAP design ,OBJECT tracking (Computer vision) ,INFRARED imaging ,KALMAN filtering ,PEDESTRIANS - Abstract
To address the issue of tracking drift and failures in thermal infrared (TIR) tracking tasks caused by target occlusion, this study proposes an anti‐occlusion TIR target tracker named AODiMP‐TIR. This approach involves an anti‐occlusion strategy that relies on target occlusion status determination and trajectory prediction. This enables the prediction of the target's current position when it is identified as occluded, ensuring swift recapture upon reappearance. A criterion is introduced for occlusion status determination based on the classification response map of SuperDiMP. Additionally, a trajectory mapping module designed to decouple target motion from camera motion is presented, enhancing the precision of trajectory prediction. Comparative experiments with other state‐of‐the‐art trackers are conducted on the large‐scale high‐diversity thermal infrared object tracking benchmark (LSOTB‐TIR), LSOTB‐TIR100, and thermal infrared pedestrian tracking benchmark (PTB‐TIR) datasets. The results indicate that the AODiMP‐TIR performs well across all three datasets, particularly exhibiting outstanding performance in occlusion sequences. Furthermore, ablation study experiments confirm the effectiveness of the anti‐occlusion strategy, occlusion determination criterion and trajectory mapping module. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
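A minimal sketch of the two ingredients the abstract describes: occlusion determination from a classification response map, and trajectory prediction for recapture. The flattened response-map format, the peak threshold, and the constant-velocity model are simplifying assumptions; the paper's criterion and trajectory-mapping module are more elaborate:

```python
def is_occluded(response_map, peak_thresh=0.3):
    """Declare occlusion when the classifier's peak response collapses
    (threshold and flat-list map format are assumptions)."""
    return max(response_map) < peak_thresh

def predict_position(history):
    """Constant-velocity extrapolation from the last two (x, y) positions,
    standing in for the paper's trajectory-prediction module."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return (2 * x1 - x0, 2 * y1 - y0)
```

While `is_occluded` holds, the tracker would report `predict_position(history)` instead of the (unreliable) detector output, enabling swift recapture when the target reappears.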
223. A modular motion compensation pipeline for prospective respiratory motion correction of multi-nuclear MR spectroscopy.
- Author
-
Wampl, Stefan, Körner, Tito, Meyerspeer, Martin, Zaitsev, Maxim, Wolf, Marcos, Trattnig, Siegfried, Wolzt, Michael, Bogner, Wolfgang, and Schmid, Albrecht Ingo
- Subjects
CARDIAC magnetic resonance imaging ,OBJECT tracking (Computer vision) ,MAGNETIC resonance imaging ,NUCLEAR magnetic resonance spectroscopy ,NUCLEAR spectroscopy ,MODULAR construction ,FOUR-dimensional imaging - Abstract
Magnetic resonance (MR) acquisitions of the torso are frequently affected by respiratory motion with detrimental effects on signal quality. The motion of organs inside the body is typically decoupled from surface motion and is best captured using rapid MR imaging (MRI). We propose a pipeline for prospective motion correction of the target organ using MR image navigators providing absolute motion estimates in millimeters. Our method is designed to feature multi-nuclear interleaving for non-proton MR acquisitions and to tolerate local transmit coils with inhomogeneous field and sensitivity distributions. OpenCV object tracking was introduced for rapid estimation of in-plane displacements in 2D MR images. A full three-dimensional translation vector was derived by combining displacements from slices of multiple and arbitrary orientations. The pipeline was implemented on 3 T and 7 T MR scanners and tested in phantoms and volunteers. Fast motion handling was achieved with low-resolution 2D MR image navigators and direct implementation of OpenCV into the MR scanner's reconstruction pipeline. Motion-phantom measurements demonstrate high tracking precision and accuracy with minor processing latency. The feasibility of the pipeline for reliable in-vivo motion extraction was shown on heart and kidney data. Organ motion was manually assessed by independent operators to quantify tracking performance. Object tracking performed convincingly on 7774 navigator images from phantom scans and different organs in volunteers. In particular, the kernelized correlation filter (KCF) achieved accuracy (74%) comparable to the inter-operator score (82%) while processing at a rate of over 100 frames per second. We conclude that fast 2D MR navigator images and computer-vision object tracking can be used for accurate and rapid prospective motion correction. This, together with the modular structure of the pipeline, allows the proposed method to be used in imaging of moving organs and in challenging applications such as cardiac magnetic resonance spectroscopy (MRS) or MRI-guided radiotherapy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
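The combination step above, deriving a 3D translation from 2D in-plane displacements measured on arbitrarily oriented slices, can be posed as a small least-squares problem: each slice contributes two equations d = r·t, where r is an in-plane unit axis expressed in scanner coordinates. A self-contained sketch under that formulation (the pipeline's actual in-scanner implementation is not shown in the abstract, so the data layout here is an assumption):

```python
def solve_translation(slices):
    """Least-squares 3D translation t from per-slice 2D displacements.
    Each element of `slices` is (axes, disp): two in-plane unit 3-vectors
    and the two measured displacement components along them."""
    # Accumulate the normal equations A^T A t = A^T b
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for axes, disp in slices:
        for axis, d in zip(axes, disp):
            for i in range(3):
                atb[i] += axis[i] * d
                for j in range(3):
                    ata[i][j] += axis[i] * axis[j]
    # Gaussian elimination with partial pivoting on the 3x3 system
    m = [row + [atb[i]] for i, row in enumerate(ata)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]
```

Two slices with non-parallel orientations already determine all three translation components; extra slices over-determine the system and average out navigator noise.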
224. Feature-Based Object Detection and Tracking: A Systematic Literature Review.
- Author
-
Husna Fauzi, Nurul Izzatie, Musa, Zalili, and Hujainah, Fadhl
- Subjects
- *
RESEARCH questions , *ARTIFICIAL satellite tracking - Abstract
Correct object detection plays a key role in generating an accurate object tracking result. Feature-based methods have the capability of handling the critical process of extracting features of an object. This paper aims to investigate object tracking using feature-based methods in terms of (1) identifying and analyzing the existing methods; (2) reporting and scrutinizing the performance evaluation metrics and their usage in measuring the effectiveness of object tracking and detection; (3) revealing and investigating the challenges that affect the accuracy of the identified tracking methods; (4) measuring the effectiveness of the identified methods by revealing to what extent the challenges can impact accuracy and precision based on the reported performance evaluation metrics; and (5) presenting potential future directions for improvement. The review process of this research was conducted based on the standard systematic literature review (SLR) guidelines by Kitchenham and Charters. Initially, 157 prospective studies were identified. Through a rigorous study selection strategy, 32 relevant studies were selected to address the listed research questions. Thirty-two methods were identified and analyzed in terms of their aims, introduced improvements, and results achieved, along with presenting a new outlook on the classification of the identified methods based on the feature-based method used in the detection and tracking process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
225. Joint target and background temporal propagation for aerial tracking.
- Author
-
Lei, Xu, Cheng, Wensheng, Xu, Chang, and Yang, Wen
- Subjects
- *
RESCUE work , *ENVIRONMENTAL monitoring , *DISTRACTION , *ARTIFICIAL satellite tracking - Abstract
Tracking objects from aerial imagery is significant in numerous remote sensing-based applications, including environmental monitoring, security surveillance, and search & rescue. However, tracking specific targets in aerial images is still challenging due to target appearance variation and similar object distraction. To address these challenges, we propose a joint target and background temporal propagation approach for aerial tracking, dubbed JTBP. JTBP leverages the temporal coherence present in video sequences and consists of two modules: the target temporal propagation module and the background temporal propagation module. The former adjusts the template by mining target-specific information from the temporal domain. It utilizes a key–value mechanism to weigh the channels of the template based on target-specific features, allowing the template to adapt to variations in target appearance. The latter identifies background objects from the temporal domain and effectively distinguishes similar objects by leveraging the temporal coherence of objects in the background. On four benchmarks, the new tracker JTBP shows consistent improvements over baselines and achieves leading performance compared to advanced trackers. Notably, our approach outperforms the state-of-the-art method by 2.8 points in terms of success rate on UAV112Track_L dataset. The project is available at https://chnleixu.github.io/JTBP-Web/. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
226. Infrared multi‐target detection and tracking in dense urban traffic scenes.
- Author
-
Zha, Chaoneng, Luo, Suyun, and Xu, Xinhao
- Subjects
CITY traffic ,OBJECT recognition (Computer vision) ,TRACKING radar ,FEATURE extraction ,GABOR filters ,INFRARED imaging ,TRAFFIC monitoring ,OBJECT tracking (Computer vision) - Abstract
Infrared object detection and tracking in dense urban traffic remain a challenge due to factors such as low contrast, small intra‐class differences, and frequent false positives and negatives. To overcome these, the authors introduce YOLO‐IR, an algorithm based on the enhanced YOLOv8s, and YOLO‐DeepOC‐IR, a comprehensive infrared multi‐object tracking method for urban traffic, integrating both detection and tracking. During preprocessing, three infrared image enhancement techniques, local contrast multi‐scale enhancement, non‐local means, and contrast limited adaptive histogram equalization, are applied for better reliability in dense scenes. To further improve the performance, the original YOLOv8s backbone is replaced with MobileVITv3 to enhance detection accuracy and robustness. This infrared feature extraction module, incorporated into the detector, combines Canny edge detection, Gabor filtering, and open operation layers, significantly boosting object detection in infrared imagery. The tracker's feature processing capabilities are improved using the learned arrangements of three patch codes (LATCH) descriptor and locality‐sensitive hashing for feature extraction and matching. Experimental results on the FLIR ADAS v2 and InfiRay datasets indicate superior performance of this method, achieving 78.6% mAP and 151.1 FPS in detection, and up to 80.8% moving object tracking accuracy, 78.6% identification F1 score, and 62.1% higher order tracking accuracy in multi‐object tracking. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
227. Developing a dynamic obstacle avoidance system for autonomous mobile robots using Bayesian optimization and object tracking: Implementation and testing.
- Author
-
Wu, Chung-Hsin and Chan, Kuei-Yuan
- Abstract
This study addresses the challenges faced by autonomous mobile robots in dynamic environments where moving obstacles such as people, cars, and other mobile robots are prevalent. Current motion planning methods for mobile robots focus on stationary obstacles, or treat moving ones as static at each control moment, resulting in collisions with moving objects or the need for longer distances to avoid them. To address this issue, we propose a system that incorporates dynamic information about moving obstacles into the motion planning decision-making process. We developed an object tracking technique using 2D LiDARs to gather this information and optimized the relevant parameters using Bayesian optimization. The dynamic information of moving obstacles is saved in a costmap representation at each instant, allowing the robot to yield to or ignore confronting obstacles based on a prediction of who shall pass first. Simulation and real-world testing demonstrate that our approach shortens the paths of mobile robots in dynamic environments and prevents collisions with dynamic obstacles, improving the practicality of mobile robots working alongside moving humans. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
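The "yield or ignore" decision described above can be reduced, for illustration, to comparing predicted arrival times at a shared conflict point. The scalar distance/speed inputs and the safety margin are assumptions standing in for the paper's costmap representation:

```python
def yield_decision(robot_dist, robot_speed, obstacle_dist, obstacle_speed,
                   margin=1.0):
    """Yield if the moving obstacle is predicted to reach the shared
    conflict point first (with a safety margin). Speeds are assumed
    positive; distances are along each agent's path to the conflict point."""
    t_robot = robot_dist / robot_speed
    t_obstacle = obstacle_dist / obstacle_speed
    return t_obstacle < t_robot + margin
```

A robot 10 m from the conflict point yields to a pedestrian 2 m away at the same speed, but ignores one that is still 10 m away when the robot is only 2 m out.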
228. FPGA-based object tracking: integrating deep learning and sensor fusion with Kalman filter.
- Author
-
Harouna Maloum, Abdoul Moumouni, Muchuka, Nicasio Maguu, and Kiruki, Cosmas Raymond Mutugi
- Subjects
KALMAN filtering ,DEEP learning ,OBJECT recognition (Computer vision) ,FIELD programmable gate arrays ,CENTRAL processing units ,FEATURE extraction - Abstract
This research presents an integrated approach for object detection and tracking in autonomous perception systems, combining deep learning techniques for object detection with sensor fusion and a field programmable gate array (FPGA)-based hardware implementation of the Kalman filter. This approach is suitable for applications like autonomous vehicles, robotics, and augmented reality. The study explores the seamless integration of pretrained deep learning models, sensor data from a depth camera (Intel RealSense D435), and FPGA-based Kalman filtering to achieve robust and accurate 3D position and 2D size estimation of tracked objects while maintaining low latency. Object detection and feature extraction are implemented on a central processing unit (CPU), while the Kalman filter sensor fusion with universal asynchronous receiver transmitter (UART) communication is implemented on a Basys 3 FPGA board, which performs 8 times faster than the software approach. The experimental results give a hardware resource utilization of about 29% of look-up tables (LUTs), 6% of LUT RAMs (LUTRAM), 15% of flip-flops, 32% of block RAM, and 38% of DSP blocks, operating at 100 MHz with a 230,400 baud rate for the UART. The whole FPGA design executes in 2.1 milliseconds, the Kalman filter in 240 microseconds, and the UART in 1.86 milliseconds. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
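The Kalman filter that the paper offloads to the FPGA can be sketched in software as a minimal 1D constant-velocity filter fusing noisy position measurements. The noise levels `q` and `r` are illustrative, and the real design fuses multi-sensor 3D state:

```python
def kalman_1d(measurements, dt=1.0, q=0.01, r=1.0):
    """Minimal constant-velocity Kalman filter with position measurements.
    State is [position, velocity]; returns the filtered positions."""
    x = [measurements[0], 0.0]                 # state estimate
    p = [[1.0, 0.0], [0.0, 1.0]]               # state covariance
    out = []
    for z in measurements:
        # predict: x = F x, P = F P F^T + Q, with F = [[1, dt], [0, 1]]
        x = [x[0] + dt * x[1], x[1]]
        p = [[p[0][0] + dt * (p[1][0] + p[0][1]) + dt * dt * p[1][1] + q,
              p[0][1] + dt * p[1][1]],
             [p[1][0] + dt * p[1][1], p[1][1] + q]]
        # update with measurement z (H = [1, 0])
        s = p[0][0] + r
        k = [p[0][0] / s, p[1][0] / s]         # Kalman gain
        y = z - x[0]                           # innovation
        x = [x[0] + k[0] * y, x[1] + k[1] * y]
        p = [[(1 - k[0]) * p[0][0], (1 - k[0]) * p[0][1]],
             [p[1][0] - k[1] * p[0][0], p[1][1] - k[1] * p[0][1]]]
        out.append(x[0])
    return out
```

On a constant signal the estimate stays put; on a linear ramp the velocity state converges and the filtered position tracks the truth with shrinking lag.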
229. Multi-object detection and behavior tracking of sea cucumbers with skin ulceration syndrome based on deep learning.
- Author
-
Fengli Ge, Kui Xuan, Peng Lou, Juan Li, Lingxu Jiang, Jiasheng Wang, and Qi Lin
- Subjects
SEA cucumbers ,DEEP learning ,OBJECT recognition (Computer vision) ,BEHAVIORAL assessment ,INTELLIGENCE levels ,INFECTIOUS disease transmission - Abstract
Skin ulceration syndrome is one of the most serious diseases of sea cucumbers in intensive aquaculture, and detecting the abnormal behavior of sea cucumbers in time and taking corresponding measures is the most effective way of preventing its spread. However, multi-object detection and tracking is a hard problem in sea cucumber behavior analysis. To solve this problem, this paper first proposes a novel one-stage algorithm, SUS-YOLOv5, for multi-object detection and tracking of sea cucumbers. The proposed SUS-YOLOv5 optimizes the non-maximum suppression algorithm in the overlapping region of object detection boxes. Next, the SE-BiFPN feature fusion structure is proposed to enhance the transmission efficiency of feature information between deep and shallow layers of the network. Then, an MO-Tracking algorithm integrated with DeepSORT is proposed to achieve real-time multi-object tracking. Experimental results show that the mAP@0.5 and mAP@0.5:0.95 of the proposed object detector reach 95.40% and 83.80%, respectively, which are 3.30% and 4.10% higher than the original YOLOv5s. Compared with the traditional SSD, YOLOv3, and YOLOv4, the mAP of SUS-YOLOv5 is improved by 5.49%, 1.57%, and 3.76%, respectively. This research realizes multi-object detection and tracking, laying the foundation for the prediction of skin ulceration syndrome in sea cucumbers, and has practical application value for improving the level of intelligence in aquaculture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
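SUS-YOLOv5's contribution includes a modified suppression rule for overlapping detection boxes; as background, the standard greedy non-maximum suppression over IoU that such modifications start from looks like this (the corner box format and 0.5 threshold are conventional assumptions, not the paper's exact settings):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep boxes in score order, dropping any box that
    overlaps an already-kept box by more than `thresh` IoU."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep
```

Heavily overlapping sea cucumbers are exactly where this greedy rule drops true positives, which motivates relaxing the suppression in overlap regions as the paper does.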
230. Study of stability and object tracking of traffic video image for smart cities.
- Author
-
Yongfeng, Xing, Luo, Zhong, and Xian, Zhong
- Subjects
- *
SMART cities , *IMAGE registration , *TRACKING algorithms , *VIDEOS , *PROBLEM solving , *CENTROID - Abstract
To make effective use of image information despite unclear traffic video images and random jitter between image sequences, this paper studies how to stabilize traffic video images and proposes an improved Mean Shift algorithm that performs object centroid registration to compensate for deviations in spatial localization and, on this basis, selects the kernel window width to eliminate errors in scale positioning. The tracking performance and computational cost of the improved Mean Shift are analyzed from an application perspective: the algorithm removes or relieves the impact of motion on imaging, improves the quality of the video image information obtained, automatically adjusts the window size according to scale changes of the moving object in the image, and effectively enhances the stability and real-time performance of object tracking. Experiments show that the proposed algorithm is practical. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
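The core Mean Shift iteration underlying the centroid registration above can be sketched in one dimension with a flat kernel; the bandwidth parameter is exactly the kernel window width whose selection the paper studies. This toy version omits the paper's registration and scale-selection improvements:

```python
def mean_shift_1d(points, start, bandwidth=2.0, iters=20):
    """Iteratively move toward the mean of the points inside the current
    window until the shift vanishes; `bandwidth` is the window width."""
    x = float(start)
    for _ in range(iters):
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            break
        nxt = sum(window) / len(window)
        if abs(nxt - x) < 1e-6:   # converged
            break
        x = nxt
    return x
```

Started near a cluster at 10, the iteration converges to the cluster center and ignores the outlier at 30, illustrating why window width matters: too wide a window would pull the mode toward the outlier.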
231. Multimodal Features Alignment for Vision–Language Object Tracking.
- Author
-
Ye, Ping, Xiao, Gang, and Liu, Jun
- Subjects
- *
SUCCESS , *VISUAL perception , *FORECASTING - Abstract
Vision–language tracking presents a crucial challenge in multimodal object tracking. Integrating language features and visual features can enhance target localization and improve the stability and accuracy of the tracking process. However, most existing fusion models in vision–language trackers simply concatenate visual and linguistic features without considering their semantic relationships. Such methods fail to distinguish the target's appearance features from the background, particularly when the target changes dramatically. To address these limitations, we introduce an innovative technique known as multimodal features alignment (MFA) for vision–language tracking. In contrast to basic concatenation methods, our approach employs a factorized bilinear pooling method that conducts squeezing and expanding operations to create a unified feature representation from visual and linguistic features. Moreover, we integrate the co-attention mechanism twice to derive varied weights for the search region, ensuring that higher weights are placed on the aligned visual and linguistic features. Subsequently, the fused feature map with diverse distributed weights serves as the search region during the tracking phase, facilitating anchor-free grounding to predict the target's location. Extensive experiments are conducted on multiple public datasets, and our proposed tracker obtains a success score of 0.654/0.553/0.447 and a precision score of 0.872/0.556/0.513 on OTB-LANG/LaSOT/TNL2K. These results are satisfying compared with those of recent state-of-the-art vision–language trackers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
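The factorized bilinear pooling fusion described above ("squeezing and expanding" rather than concatenation) can be sketched with tiny dense projections: project both modalities to a higher dimension, multiply elementwise, then sum-pool back down. The weights, sizes, and pooling group `k` are illustrative stand-ins for learned parameters:

```python
def factorized_bilinear_pool(vis, lang, u, v, k=2):
    """Factorized bilinear pooling: project each modality ('expand'),
    multiply elementwise, then sum-pool groups of k entries ('squeeze')."""
    def project(w, x):                        # w: list of weight rows
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    expanded = [a * b for a, b in zip(project(u, vis), project(v, lang))]
    # sum-pool every k consecutive entries down to the fused vector
    return [sum(expanded[i:i + k]) for i in range(0, len(expanded), k)]
```

Unlike concatenation, every output entry mixes a visual and a linguistic component multiplicatively, which is what lets aligned feature pairs receive higher weight.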
232. Real-time object detection and tracking using YOLOv3 network by quadcopter.
- Author
-
Mokhtari, Mir Abolfazl and Taheri, Mehrdad
- Subjects
- *
OBJECT recognition (Computer vision) , *PYTHON programming language , *TRACKING control systems , *ARTIFICIAL intelligence , *ARTIFICIAL satellite tracking - Abstract
In this article, artificial intelligence is applied to real-time object detection on a Tello quadcopter. For this purpose, the widely used deep-learning detector YOLOv3 is employed. The results indicate that the YOLOv3 network can be trained to 99 percent accuracy and can detect the target with above 95 percent accuracy at 15 frames per second under different ambient lighting and background conditions. The YOLOv3 algorithm is trained on a custom dataset and implemented in Python. Images are sent to a computer, where a Python program feeds them to the YOLOv3 algorithm to detect the target. After the target is detected, the errors are calculated and passed to the control system to track the target in real time. Tracking has two objectives: keeping the target in the quadcopter's view and keeping the quadcopter at a certain distance from the target. To this end, the effect of quadcopter movements on the coordinates and area of the target is examined, and four controllers are designed to follow the target and keep the robot at a certain distance from it. The designed controllers follow the target efficiently and prevent the flying robot from losing sight of the target. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
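The error signals fed to the controllers above can be illustrated as simple offsets of the detected target from the frame centre plus an area-based distance error. The variable names and the area-as-distance proxy are assumptions consistent with the abstract's description, not the authors' code:

```python
def tracking_errors(cx, cy, area, frame_w, frame_h, target_area):
    """Turn the detected box's centre (cx, cy) and pixel area into error
    signals: centre offsets keep the target in view, and the area error
    keeps the quadcopter at a set distance (larger box = closer target)."""
    err_x = cx - frame_w / 2          # horizontal offset -> yaw correction
    err_y = cy - frame_h / 2          # vertical offset   -> height correction
    err_dist = target_area - area     # smaller box means target is too far
    return err_x, err_y, err_dist
```

Each error would then drive its own controller (e.g. a PID) toward zero; with the target centred and at the set distance, all three errors vanish.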
233. Analysis of Various Machine Learning Algorithms for Using Drone Images in Livestock Farms.
- Author
-
Gao, Jerry, Bambrah, Charanjit Kaur, Parihar, Nidhi, Kshirsagar, Sharvaree, Mallarapu, Sruthi, Yu, Hailong, Wu, Jane, and Yang, Yunyun
- Subjects
MACHINE learning ,LIVESTOCK farms ,DEEP learning ,AGRICULTURE ,ARTIFICIAL intelligence ,DRONE aircraft ,TRACKING radar - Abstract
With the development of artificial intelligence, intelligent agriculture has become a trend, and intelligent monitoring of agricultural activities is an important part of it. However, due to difficulties in balancing quality and cost, the goal of improving the economic benefits of agricultural activities has not reached the expected level. Farm supervision requires intensive human effort and may not produce satisfactory results. In order to achieve intelligent monitoring of agricultural activities and improve economic benefits, this paper proposes a solution that combines unmanned aerial vehicles (UAVs) with deep learning models. The proposed solution aims to detect and classify objects using UAVs in the agricultural industry, thereby achieving independent agriculture without human intervention. To achieve this, a highly reliable target detection and tracking system is developed using unmanned aerial vehicles. The use of deep learning methods allows the system to effectively solve the target detection and tracking problem. The model utilizes data collected from DJI Mirage 4 unmanned aerial vehicles to detect, track, and classify different types of targets. The performance evaluation of the proposed method shows promising results. By combining UAV technology and deep learning models, this paper provides a cost-effective solution for intelligent monitoring of agricultural activities. The proposed method offers the potential to improve the economic benefits of farming while reducing the need for intensive human effort. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
234. SiamRAAN: Siamese Residual Attentional Aggregation Network for Visual Object Tracking.
- Author
-
Xin, Zhiyi, Yu, Junyang, He, Xin, Song, Yalin, and Li, Han
- Abstract
The Siamese network-based tracker calculates object templates and search images independently, and the template features are not updated online when performing object tracking. Adapting to interference scenarios with performance-guaranteed tracking accuracy when background clutter, illumination variation or partial occlusion occurs in the search area is a challenging task. To effectively address the issue with the abovementioned interference and to improve location accuracy, this paper devises a Siamese residual attentional aggregation network framework for self-adaptive feature implicit updating. First, SiamRAAN introduces Self-RAAN into the backbone network by applying residual self-attention to extract effective objective features. Then, we introduce Cross-RAAN to update the template features online by focusing on the high-relevance parts in the feature extraction process of both the object template and search image. Finally, a multilevel feature fusion module is introduced to fuse the RAAN-enhanced feature information and improve the network’s ability to perceive key features. Extensive experiments conducted on benchmark datasets (GOT-10K, LaSOT, OTB-50, OTB-100 and UAV123) demonstrated that our SiamRAAN delivers excellent performance and runs at 51 FPS in various challenging object tracking tasks. Code is available at . [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
235. Anti-UAV target tracking based on backbone enhancement and feature rearrangement.
- Author
-
郑滨汐, 杨志钢, and 丁钰峰
- Abstract
Copyright of Chinese Journal of Liquid Crystal & Displays is the property of Chinese Journal of Liquid Crystal & Displays and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
236. Image convolution techniques integrated with YOLOv3 algorithm in motion object data filtering and detection.
- Author
-
Cheng, Mai and Liu, Mengyuan
- Subjects
TRACKING algorithms ,FILTERS & filtration ,VIDEO surveillance ,ALGORITHMS ,IMAGE segmentation ,RESEARCH personnel ,JOGGING - Abstract
In order to address the challenges of identifying, detecting, and tracking moving objects in video surveillance, this paper emphasizes image-based dynamic entity detection. It delves into the complexities of numerous moving objects, dense targets, and intricate backgrounds. Leveraging the You Only Look Once (YOLOv3) algorithm framework, this paper proposes improvements in image segmentation and data filtering to address these challenges. These enhancements form a novel multi-object detection algorithm based on an improved YOLOv3 framework, specifically designed for video applications. Experimental validation demonstrates the feasibility of this algorithm, with success rates exceeding 60% for videos such as "jogging", "subway", "video 1", and "video 2". Notably, the detection success rates for "jogging" and "video 1" consistently surpass 80%, indicating outstanding detection performance. Although the accuracy slightly decreases for "Bolt" and "Walking2", success rates still hover around 70%. Comparative analysis with other algorithms reveals that this method's tracking accuracy surpasses that of particle filters, Discriminative Scale Space Tracker (DSST), and Scale Adaptive Multiple Features (SAMF) algorithms, with an accuracy of 0.822. This indicates superior overall performance in target tracking. Therefore, the improved YOLOv3-based multi-object detection and tracking algorithm demonstrates robust filtering and detection capabilities in noise-resistant experiments, making it highly suitable for various detection tasks in practical applications. It can address inherent limitations such as missed detections, false positives, and imprecise localization. These improvements significantly enhance the efficiency and accuracy of target detection, providing valuable insights for researchers in the field of object detection, tracking, and recognition in video surveillance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
237. Multi-modal visual tracking: Review and experimental comparison.
- Author
-
Zhang, Pengyu, Wang, Dong, and Lu, Huchuan
- Subjects
TRACKING algorithms ,COMPUTER vision ,OBJECT tracking (Computer vision) ,RESEARCH personnel ,TAXONOMY - Abstract
Visual object tracking has been drawing increasing attention in recent years, as a fundamental task in computer vision. To extend the range of tracking applications, researchers have been introducing information from multiple modalities to handle specific scenes, with promising research prospects for emerging methods and benchmarks. To provide a thorough review of multi-modal tracking, different aspects of multi-modal tracking algorithms are summarized under a unified taxonomy, with specific focus on visible-depth (RGB-D) and visible-thermal (RGB-T) tracking. Subsequently, a detailed description of the related benchmarks and challenges is provided. Extensive experiments were conducted to analyze the effectiveness of trackers on five datasets: PTB, VOT19-RGBD, GTOT, RGBT234, and VOT19-RGBT. Finally, various future directions, including model design and dataset construction, are discussed from different perspectives for further research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
238. Ant algorithms for finding weighted and unweighted maximum cliques in d-division graphs.
- Author
-
Schiff, Krzysztof
- Subjects
ANT algorithms ,GRAPHIC methods ,MATHEMATICAL optimization ,SWARM intelligence ,ALGORITHMS - Abstract
This article deals with two problems: finding the maximum number of maximum cliques, with the minimum total weight over all these cliques, in a weighted graph that has all edges between vertices from different parts of a d-division of the graph; and finding the maximum number of maximum cliques in an unweighted graph that does not have all edges between vertices from different parts of the d-division. This article presents new ant algorithms with new desire functions for these problems. The algorithms were tested with a range of input parameters; the results were tabulated and discussed, and the best algorithms were indicated. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
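One ant's solution construction for a clique problem can be sketched as repeated probabilistic vertex selection restricted to common neighbours, so the growing set is a clique by construction. The pheromone-only desire function below is a generic placeholder for the article's new desire functions:

```python
import random

def ant_clique(adj, pheromone, alpha=1.0, rng=random):
    """One ant's clique construction: repeatedly add a vertex adjacent to
    every vertex already chosen, sampled with probability proportional to
    pheromone**alpha (generic desire function; `adj` maps vertex -> set
    of neighbours)."""
    clique = []
    candidates = set(adj)
    while candidates:
        ordered = sorted(candidates)
        weights = [pheromone[v] ** alpha for v in ordered]
        total = sum(weights)
        r, acc, pick = rng.random() * total, 0.0, None
        for v, w in zip(ordered, weights):     # roulette-wheel selection
            acc += w
            if r <= acc:
                pick = v
                break
        if pick is None:                        # float round-off fallback
            pick = ordered[-1]
        clique.append(pick)
        # only common neighbours of all chosen vertices remain eligible
        candidates = {v for v in candidates if v != pick and v in adj[pick]}
    return clique
```

In a full algorithm, many ants run per iteration and pheromone is reinforced on the vertices of the best cliques found; the desire function is where edge weights or d-division structure would enter.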
239. Robust object tracking via ensembling semantic‐aware network and redetection
- Author
-
Peiqiang Liu, Qifeng Liang, Zhiyong An, Jingyi Fu, and Yanyan Mao
- Subjects
computer vision ,learning (artificial intelligence) ,object tracking ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Most Siamese‐based trackers use classification and regression to determine the target bounding box, which can be formulated as a linear matching process between the template and search region. However, this only takes into account the similarity of features while ignoring semantic object information, so in some cases the regression box with the highest classification score is not accurate. To address the lack of semantic information, an object tracking approach based on an ensemble semantic‐aware network and redetection (ESART) is proposed. Furthermore, a DarkNet53 network with transfer learning is used as the semantic‐aware model, adapted to the detection task to extract semantic information. In addition, a semantic‐tag redetection method is proposed to re‐evaluate the bounding box and overcome inaccurate scaling. Extensive experiments on OTB2015, UAV123, UAV20L, and GOT‐10k show that our tracker is superior to other state‐of‐the‐art trackers. It is noteworthy that the semantic‐aware ensemble method can be embedded into any tracker with classification and regression tasks.
- Published
- 2024
- Full Text
- View/download PDF
240. IoUNet++: Spatial cross‐layer interaction‐based bounding box regression for visual tracking
- Author
-
Shilei Wang, Yamin Han, Baozhen Sun, and Jifeng Ning
- Subjects
computer vision ,convolutional neural nets ,object tracking ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Accurate target prediction, especially bounding box estimation, is a key problem in visual tracking. Many recently proposed trackers adopt a refinement module called the IoU predictor, designing a high‐level modulation vector to achieve bounding box estimation. However, due to the lack of spatial information that is important for precise box estimation, this simple one‐dimensional modulation vector has limited refinement representation capability. In this study, a novel IoU predictor (IoUNet++) is designed to achieve more accurate bounding box estimation by investigating spatial matching with a spatial cross‐layer interaction model. Rather than using a one‐dimensional modulation vector to generate representations of the candidate bounding box for overlap prediction, this paper first extracts and fuses multi‐level features of the target to generate a template kernel with spatial description capability. Then, when aggregating the features of the template and the search region, depthwise separable convolution correlation is adopted to preserve the spatial matching between the target feature and candidate feature, which gives the IoUNet++ network better template representation and better fusion than the original network. The proposed IoUNet++ method, with a plug‐and‐play style, is applied to a series of strengthened trackers including DiMP++, SuperDiMP++ and SuperDIMP_AR++, which achieve consistent performance gains. Finally, experiments conducted on six popular tracking benchmarks show that these trackers outperform state‐of‐the‐art trackers with significantly fewer training epochs.
- Published
- 2024
- Full Text
- View/download PDF
241. Construction Activity Analysis of Workers Based on Human Posture Estimation Information
- Author
-
Xuhong Zhou, Shuai Li, Jiepeng Liu, Zhou Wu, and Yohchia Frank Chen
- Subjects
Pose estimation ,Activity analysis ,Object tracking ,Construction workers ,Automatic systems ,Engineering (General). Civil engineering (General) ,TA1-2040 - Abstract
Identifying workers’ construction activities or behaviors enables managers to better monitor labor efficiency and construction progress. However, current activity analysis methods for construction workers rely solely on manual observation and recording, which consumes considerable time and incurs high labor costs. Researchers have therefore focused on automatically monitoring workers’ on-site construction activities, but when multiple workers work together, existing methods cannot accurately and automatically identify each worker’s activity. This research proposes a deep learning framework for the automated analysis of the construction activities of multiple workers. In this framework, multiple deep neural network models are designed and used to perform worker key point extraction, worker tracking, and worker construction activity analysis. The framework was tested at an actual construction site, where activity recognition for multiple workers was performed, demonstrating its feasibility for the automated monitoring of work efficiency.
- Published
- 2024
- Full Text
- View/download PDF
242. A Multi-Stream Approach to Mixed-Traffic Accident Recognition Using Deep Learning
- Author
-
Swee Tee Fu, Lau Bee Theng, Brian Loh Chung Shiong, Chris McCarthy, and Mark Tee Kit Tsun
- Subjects
Deep learning ,image classification ,object detection ,object tracking ,road accidents ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Road traffic accidents are a leading cause of injuries and fatalities globally, prompting extensive research into deep learning-based accident recognition models for their superior performance in computer vision tasks. However, most studies focus on non-mixed traffic environments, where detection is simpler due to predictable traffic patterns and uniform vehicle types. In contrast, mixed-traffic scenarios present greater challenges as diverse vehicles, motorcyclists, and pedestrians move unpredictably. Models relying on a single type of perception are effective in structured traffic but struggle to handle the complexities of mixed-traffic environments. This study proposes a novel multi-stream deep learning model called Accident Recognition in Mixed-Traffic Scene (ARMS), which integrates three distinct streams: the first stream analyzes the overall accident scene, the second focuses on mixed-traffic accident features, and the third examines vehicle motion abnormalities through object detection and tracking. The model aims to improve road accident recognition accuracy in mixed-traffic environments at intersections, and is trained and evaluated using datasets from CADP, UA-DETRAC, and supplementary online sources. The results demonstrate that the ARMS model achieves an accuracy of 93.3%, with performance improving significantly through the fusion of the individual streams. Additionally, the ARMS model was evaluated using two publicly available standard datasets, further highlighting its improved performance in recognizing mixed-traffic accidents compared to existing studies.
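The abstract reports that fusing the three streams improves performance but does not publish the fusion code. As a hedged sketch under the common late-fusion assumption, the per-stream class scores could be combined by averaging their probabilities (names and weights below are illustrative):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def fuse_streams(stream_logits, weights=None):
    """Late fusion: convert each stream's logits to class probabilities,
    then take a (weighted) average. stream_logits: list of 1-D arrays,
    e.g. [scene_stream, traffic_feature_stream, motion_stream]."""
    probs = np.stack([softmax(z) for z in stream_logits])
    if weights is None:
        weights = np.full(len(stream_logits), 1.0 / len(stream_logits))
    return np.asarray(weights) @ probs
```

With equal weights, a stream that is unsure (e.g. the motion stream on a partly hidden vehicle) is outvoted by two confident streams, which is one plausible mechanism behind the reported gain from fusion.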
- Published
- 2024
- Full Text
- View/download PDF
243. Multi-modal visual tracking: Review and experimental comparison
- Author
-
Pengyu Zhang, Dong Wang, and Huchuan Lu
- Subjects
visual tracking ,object tracking ,multi-modal fusion ,RGB-T tracking ,RGB-D tracking ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Abstract Visual object tracking has been drawing increasing attention in recent years, as a fundamental task in computer vision. To extend the range of tracking applications, researchers have been introducing information from multiple modalities to handle specific scenes, with promising research prospects for emerging methods and benchmarks. To provide a thorough review of multi-modal tracking, different aspects of multi-modal tracking algorithms are summarized under a unified taxonomy, with specific focus on visible-depth (RGB-D) and visible-thermal (RGB-T) tracking. Subsequently, a detailed description of the related benchmarks and challenges is provided. Extensive experiments were conducted to analyze the effectiveness of trackers on five datasets: PTB, VOT19-RGBD, GTOT, RGBT234, and VOT19-RGBT. Finally, various future directions, including model design and dataset construction, are discussed from different perspectives for further research.
- Published
- 2024
- Full Text
- View/download PDF
244. Multi-layer features template update object tracking algorithm based on SiamFC++
- Author
-
Xiaofeng Lu, Xuan Wang, Zhengyang Wang, and Xinhong Hei
- Subjects
Object tracking ,Fully convolutional Siamese networks ,Template update ,Mutual information ,FPN ,Electronics ,TK7800-8360 - Abstract
Abstract SiamFC++ extracts the object feature of only the first frame as the tracking template, and uses only the highest-level feature maps in both the classification branch and the regression branch, so the respective characteristics of the two branches are not fully exploited. In view of this, the present paper proposes an object tracking algorithm based on SiamFC++ that uses the multi-layer features of the Siamese network to update the template. First, an FPN is used to extract feature maps from different layers of the backbone for the classification and regression branches. Second, 3D convolution is used to update the tracking template. Next, a template update judgment condition based on mutual information is proposed. Finally, AlexNet is used as the backbone and GOT-10k as the training set. Compared with SiamFC++, our algorithm obtains improved results on the OTB100, VOT2016, VOT2018 and GOT-10k datasets, and the tracking process is real time.
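The mutual-information update condition the abstract mentions can be sketched from first principles: estimate MI between the stored template patch and the current candidate from their joint intensity histogram, and only update when MI is high enough. This is a generic MI estimator, not the paper's exact criterion:

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Mutual information between two same-size patches, estimated from
    their joint intensity histogram. A high MI between the stored
    template and the current candidate suggests the candidate still
    depicts the same object, so a template update is likely safe."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal of a
    py = p.sum(axis=0, keepdims=True)   # marginal of b
    nz = p > 0                          # avoid log(0)
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))
```

A tracker would compare this value against a threshold before overwriting the template; the threshold itself would have to be tuned, as the abstract gives no number.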
- Published
- 2024
- Full Text
- View/download PDF
245. Indirect Vaccine Box Localization in Small to Medium Obstructed Cold Storages via Worker Tracking With VCS-YOLOv5
- Author
-
Chen Liang, Wei Yang, Longlong Pang, Zhuozhang Zou, and Quangao Liu
- Subjects
YOLOv5 ,object tracking ,vaccine box location ,small to medium obstructed cold storage ,behavior recognition ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Within the realm of public health, the end-to-end traceability and monitoring of vaccines play an indispensable role in ensuring vaccine safety and efficacy, especially the precise localization of vaccines in cold storage. However, challenges such as limited space, dense stacking of boxes, and frequent obstructions in vaccine cold storage, particularly in Small to Medium Cold Storage (SMCS), pose significant obstacles to effective localization. Existing vaccine box localization methods, such as manual localization, Radio Frequency Identification (RFID) technology, and traditional visual localization, struggle with obstructions and inefficiencies, leading to limited accuracy and real-time update capability. This paper introduces a computer-vision-based solution for vaccine box localization in obstructed SMCS environments. Specifically, to address the challenge of accurately locating vaccine boxes in densely stacked and heavily obstructed SMCS, this paper exploits the strong correlation between vaccine boxes and workers during the storage process: the vaccine box is located indirectly by tracking the less numerous and less obstructed cold storage workers. Furthermore, to enhance worker tracking accuracy, the YOLOv5 model was modified into the Vaccine Cold Storage YOLOv5 (VCS-YOLOv5) model tailored for obstructed SMCS environments. Additionally, the final location of the vaccine box is determined by a behavior recognition model that identifies instances where the workers’ hands are no longer in contact with the vaccine box. Extensive experiments confirm that VCS-YOLOv5 sets a new benchmark in vaccine box localization and worker tracking, significantly surpassing standard models in accuracy and real-time effectiveness.
- Published
- 2024
- Full Text
- View/download PDF
246. SiamMFF: UAV Object Tracking Algorithm Based on Multi-Scale Feature Fusion
- Author
-
Yanli Hou, Xilin Gai, Xintao Wang, and Yongqiang Zhang
- Subjects
Siamese network ,object tracking ,unmanned aerial vehicle(UAV) ,deformable convolution ,multi-scale feature fusion ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
UAVs have entered many areas of daily life, and object tracking is one of the key technologies for UAV applications. In practice, however, there are various challenges, such as scale changes in video images, motion blur, and high shooting angles that make the tracked objects very small, all of which degrade tracking accuracy. To address the poor tracking of small targets caused by the limited effective information output by deep residual networks, a SiamMFF tracking method introducing an efficient multi-scale feature fusion strategy is proposed. The method aggregates features at different scales and replaces ordinary convolution with deformable convolution to enlarge the receptive field of the convolution operation and enhance the feature extraction capability. The experimental results show that the proposed algorithm improves the success rate and accuracy of small target tracking.
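The multi-scale aggregation the abstract describes can be reduced to its simplest form: bring a coarse (deep) feature map to the resolution of a fine (shallow) one and blend them. This is a minimal stand-in for the paper's fusion strategy; the upsampling method and blend weight are assumptions:

```python
import numpy as np

def fuse_multiscale(shallow, deep, w=0.5):
    """Blend a coarse (deep) feature map with a fine (shallow) one:
    nearest-neighbour upsample the deep map to the shallow resolution
    (assumes shapes are integer multiples), then take a weighted sum."""
    fh = shallow.shape[0] // deep.shape[0]
    fw = shallow.shape[1] // deep.shape[1]
    up = np.kron(deep, np.ones((fh, fw)))  # repeat each coarse cell
    return w * shallow + (1.0 - w) * up
```

For small targets the shallow map contributes fine spatial detail while the upsampled deep map contributes semantic context, which is the intuition behind fusing scales at all.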
- Published
- 2024
- Full Text
- View/download PDF
247. An Occlusion-Aware Tracker With Local-Global Features Modeling in UAV Videos
- Author
-
Qiuyu Jin, Yuqi Han, Wenzheng Wang, Linbo Tang, Jianan Li, and Chenwei Deng
- Subjects
Local-global feature modeling ,object tracking ,occlusion awareness ,UAV ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
Recently, tracking with unmanned aerial vehicle (UAV) platforms has played a significant role in Earth observation tasks. However, target occlusion remains a challenging factor during continuous tracking. In particular, incomplete local appearance features can mislead the tracking network into producing inaccurate size and position estimates when the target is occluded. Furthermore, the tracking network lacks sufficient occlusion supervision information, which may lead to template degradation during template updating. To address these challenges, this article designs an occlusion-aware tracker with local-global feature modeling, containing two key components: the feature intrinsic association module (FIAM) and the feature verification module (FVM). Specifically, the FIAM divides the local features into blocks and utilizes a transformer network to explore the relative relationships among the subblocks, which supplements damaged local target features and assists the modeling of global target features. In addition, the FVM establishes a correlation measurement network between the target and the template. To precisely evaluate the occlusion status, masked samples with occlusion exceeding 50% are selected as negative samples for independent training, which ensures the purity of the target template. Qualitative and quantitative experiments conducted on publicly available datasets, including UAV20L, UAV123, and LaSOT, demonstrate the effectiveness of the proposed tracking algorithm over other state-of-the-art trackers in occlusion scenarios.
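The 50% occlusion criterion used to select negative training samples implies measuring how much of the target box an occluder covers. A plain-Python sketch of that measurement (box convention and function name are assumptions, not from the paper):

```python
def occlusion_ratio(target, occluder):
    """Fraction of the target box's area covered by an occluder box.
    Boxes are (x1, y1, x2, y2). Under the paper's criterion, a sample
    whose ratio exceeds 0.5 would be treated as a negative sample."""
    ix1 = max(target[0], occluder[0])
    iy1 = max(target[1], occluder[1])
    ix2 = min(target[2], occluder[2])
    iy2 = min(target[3], occluder[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = (target[2] - target[0]) * (target[3] - target[1])
    return inter / area if area > 0 else 0.0
```

Note this is intersection over the *target* area, not IoU: a small occluder fully inside a large target should count only for the fraction it actually hides.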
- Published
- 2024
- Full Text
- View/download PDF
248. Synchronizing Object Detection: Applications, Advancements and Existing Challenges
- Author
-
Md. Tanzib Hosain, Asif Zaman, Mushfiqur Rahman Abir, Shanjida Akter, Sawon Mursalin, and Shadman Sakeeb Khan
- Subjects
Object detection ,image recognition ,object segmentation ,semantic detection ,image classification ,object tracking ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
From pivotal roles in autonomous vehicles, healthcare diagnostics, and surveillance systems to seamlessly integrating with augmented reality, object detection algorithms stand as the cornerstone in unraveling the complexities of the visual world. Tracing the trajectory from conventional region-based methods to the latest neural network architectures reveals a technological renaissance where algorithms metamorphose into digital artisans. However, this journey is not without hurdles, prompting researchers to grapple with real-time detection, robustness in varied environments, and interpretability amidst the intricacies of deep learning. The allure of addressing issues such as occlusions, scale variations, and fine-grained categorization propels exploration into uncharted territories, beckoning the scholarly community to contribute to an ongoing saga of innovation and discovery. This research offers a comprehensive panorama, encapsulating the applications reshaping our digital reality, the advancements pushing the boundaries of perception, and the open issues extending an invitation to the next generation of visionaries to explore uncharted frontiers within object detection.
- Published
- 2024
- Full Text
- View/download PDF
249. Enhancing Image Annotation With Object Tracking and Image Retrieval: A Systematic Review
- Author
-
Rodrigo Fernandes, Alexandre Pessoa, Marta Salgado, Anselmo De Paiva, Ishak Pacal, and Antonio Cunha
- Subjects
Image annotation ,object tracking ,image retrieval ,deep learning ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Effective image and video annotation is a fundamental pillar of computer vision and artificial intelligence, crucial for the development of accurate machine learning models. Object tracking and image retrieval techniques are essential in this process, significantly improving the efficiency and accuracy of automatic annotation. This paper systematically reviews object tracking and image retrieval techniques, exploring how these technologies can collectively enhance the efficiency and accuracy of annotation processes for image and video datasets. Object tracking is examined for its role in automating annotations by tracking objects across video sequences, while image retrieval is evaluated for its ability to suggest annotations for new images based on existing data. The review encompasses diverse methodologies, including advanced neural networks and machine learning techniques, highlighting their effectiveness in contexts such as medical analysis and urban monitoring. Despite notable advances, challenges such as algorithm robustness and effective human-AI collaboration are identified. This review provides valuable insights into the current state and future potential of these technologies for improving image annotation, and surveys existing applications of the techniques individually and in combination.
- Published
- 2024
- Full Text
- View/download PDF
250. Satellite Videos Object Tracking Based on Enhanced Correlation Filter With Motion Prediction Network
- Author
-
Puhua Chen, Lu Wang, Lei Guo, Xu Liu, Xiangrong Zhang, Licheng Jiao, and Fang Liu
- Subjects
Correlation filter (CF) ,feature enhancement ,motion prediction ,object tracking ,satellite video ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
With the maturity of satellite imaging technology, satellite video has attracted increasing attention because of its high spatial and temporal resolution. Object tracking, as a main application task of satellite videos, has thus become a popular research topic. Compared to natural videos, the difficulties of satellite video object tracking are mainly caused by the feature deficiency of small objects, cluttered surroundings, occlusion (OCC), etc. In this article, based on a dual correlation filter (DCF) tracking framework, a new object tracking method for satellite videos is proposed to address these problems. For the feature-deficiency problem, the proposed super-resolution feature enhancement module improves the discriminative ability of object features using prior information learned by a super-resolution network. For the OCC problem, a multilayer perceptron motion prediction network with an online training strategy is designed to predict the position of objects when occlusion occurs. In addition, KNN background subtraction is introduced to reduce the interference of the surroundings. Finally, these components are combined on the DCF tracking framework for better tracking results. To verify the performance of the proposed method, extensive experiments were conducted on satellite video datasets. The experimental results show that the proposed tracking method is effective and has a clear advantage over several state-of-the-art methods.
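The paper's enhanced filter adds learned features and a motion prediction network on top of the correlation-filter core. The core step common to all correlation-filter trackers, locating the target as the peak of a Fourier-domain correlation response, can be sketched as follows (a textbook illustration, not the paper's DCF implementation):

```python
import numpy as np

def cf_response(template, search):
    """Correlation-filter style response map: circular cross-correlation
    of a template patch with a same-size search patch, computed in the
    Fourier domain. The peak location gives the estimated translation."""
    T = np.fft.fft2(template)
    S = np.fft.fft2(search)
    # Correlation theorem: corr(t, s) = ifft(conj(fft(t)) * fft(s))
    return np.real(np.fft.ifft2(np.conj(T) * S))
```

Computing the response in the frequency domain is what makes correlation filters fast enough for frame-rate tracking: the cost is a few FFTs rather than a dense spatial search.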
- Published
- 2024
- Full Text
- View/download PDF