Search Results
6 results for "Fiaz, Mustansar"
2. Adaptive Feature Selection Siamese Networks for Visual Tracking
- Author
-
Fiaz, Mustansar; Rahman, Md. Maklachur; Mahmood, Arif; Farooq, Sehar Shahzad; Baek, Ki Yeol; Jung, Soon Ki. Ohyama, Wataru (editor); Jung, Soon Ki (editor)
- Published
- 2020
- Full Text
- View/download PDF
3. Guided-attention and gated-aggregation network for medical image segmentation.
- Author
-
Fiaz, Mustansar, Noman, Mubashir, Cholakkal, Hisham, Anwer, Rao Muhammad, Hanna, Jacob, and Khan, Fahad Shahbaz
- Subjects
-
Convolutional neural networks; Image segmentation; Cardiac magnetic resonance imaging; Transformer models; Multiple myeloma
- Abstract
Recently, transformers have been widely used in medical image segmentation to capture long-range and global dependencies using self-attention. However, they often struggle to learn local details, which limits their ability to capture the irregular shapes and sizes of tissues and the indistinct boundaries between them, both critical for accurate segmentation. To alleviate this issue, we propose a network named GA2Net, which comprises an encoder, a bottleneck, and a decoder. The encoder computes multi-scale features. In the bottleneck, we propose a hierarchical-gated feature aggregation (HGFA) module, which introduces a novel spatial gating mechanism to enrich the multi-scale features. To effectively learn the shapes and sizes of the tissues, we apply deep supervision in the bottleneck. Within the decoder, GA2Net uses adaptive aggregation (AA) to adjust the receptive field at each location in the feature map, replacing the traditional concatenation/summation operations in the skip connections of U-Net-like architectures. Furthermore, we propose mask-guided feature attention (MGFA) modules within the decoder, which learn salient features from foreground priors to adequately grasp the intricate structural and contour information of the tissues. We also apply intermediate supervision at each stage of the decoder, which further improves the model's ability to locate tissue boundaries. Our extensive experimental results show that GA2Net significantly outperforms existing state-of-the-art methods on eight medical image segmentation datasets: five polyp datasets, a skin lesion dataset, a multiple myeloma cell segmentation dataset, and a cardiac MRI dataset. We then perform an extensive ablation study to validate the capabilities of our method. Code is available at https://github.com/mustansarfiaz/ga2net.
• We propose GA2Net to capture complex tissue shapes for better segmentation.
• Our HGFA enhances the most relevant feature information for pixel-precise segmentation using deep supervision.
• Adaptive aggregation adjusts the receptive fields of each stage's features.
• Our MGFA modules in the decoder are crucial to obtaining accurate tissue boundaries.
• Experiments on eight medical segmentation benchmarks demonstrate the merits of our contributions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
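The spatial gating idea behind HGFA in the abstract above can be illustrated with a toy sketch. This is not the authors' implementation (which operates on convolutional feature tensors); it is a minimal plain-Python analogue, assuming single-channel 2-D feature maps as nested lists and externally supplied gate logits in place of a learned gating layer.

```python
import math

def sigmoid(x):
    # Squash a gate logit into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def spatial_gate(features, gate_logits):
    """Element-wise spatial gating: each spatial position of `features`
    is re-weighted by a gate in (0, 1). In the paper the gate logits
    would come from a learned layer; here they are simply given."""
    return [
        [f * sigmoid(g) for f, g in zip(frow, grow)]
        for frow, grow in zip(features, gate_logits)
    ]

def hierarchical_gated_aggregation(scales, gate_logits_per_scale):
    """Gate each scale's feature map, then sum the gated maps, so
    positions suppressed by their gates contribute little to the
    aggregate (a toy stand-in for multi-scale feature enrichment)."""
    h, w = len(scales[0]), len(scales[0][0])
    out = [[0.0] * w for _ in range(h)]
    for feat, gates in zip(scales, gate_logits_per_scale):
        gated = spatial_gate(feat, gates)
        for i in range(h):
            for j in range(w):
                out[i][j] += gated[i][j]
    return out
```

In the paper the scales come from the encoder at different resolutions; the sketch assumes all maps share one resolution to keep the aggregation a plain sum.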
4. 4G-VOS: Video Object Segmentation using guided context embedding.
- Author
-
Fiaz, Mustansar; Zaheer, Muhammad Zaigham; Mahmood, Arif; Lee, Seung-Ik; Jung, Soon Ki
- Subjects
-
Algorithms; Application software; Convolutional neural networks
- Abstract
Video Object Segmentation (VOS) is a fundamental task in many high-level real-world computer vision applications. VOS is challenging due to the presence of background distractors as well as object appearance variations. Many existing VOS approaches use online model updates to capture appearance variations, which incurs high computational cost. Template-matching and propagation-based VOS methods, although cost-effective, suffer from performance degradation in challenging scenarios such as occlusion and background clutter. To tackle these challenges, we propose a network architecture dubbed 4G-VOS that encodes video context for improved VOS performance. To preserve long-term semantic information, we propose a guided transfer embedding module. We employ a global instance matching module to generate similarity maps from the initial image and its mask. In addition, we use a generative directional appearance module to estimate and dynamically update the foreground/background class probabilities in a spherical embedding space. Moreover, existing approaches may lose contextual information during feature refinement; we therefore propose a guided pooled decoder to exploit global and local contextual information at this stage. The proposed framework is an end-to-end learning architecture trained in an offline fashion. Evaluations on three VOS benchmark datasets, DAVIS2016, DAVIS2017, and YouTube-VOS, demonstrate outstanding performance of the proposed algorithm compared to 40 existing state-of-the-art methods.
- Published
- 2021
- Full Text
- View/download PDF
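The global instance matching described in the 4G-VOS abstract, generating similarity maps between the first frame and a search frame, can be sketched in miniature. This is an illustrative stand-in, not the paper's module: it assumes a single template embedding vector and a small grid of per-position search embeddings, with cosine similarity as the matching score.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similarity_map(template_vec, search_grid):
    """Compare one template embedding against every spatial position of
    a search-frame embedding grid, producing a 2-D similarity map.
    High values mark positions resembling the first-frame target."""
    return [[cosine(template_vec, cell) for cell in row] for row in search_grid]
```

In the actual method the template embedding is derived from the initial image and its mask, and matching runs over dense convolutional features rather than a hand-built grid.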
5. Learning Soft Mask Based Feature Fusion with Channel and Spatial Attention for Robust Visual Object Tracking.
- Author
-
Fiaz, Mustansar; Mahmood, Arif; Jung, Soon Ki
- Subjects
-
Object tracking (Computer vision); Convolutional neural networks; Tracking algorithms; Artificial neural networks; Machine learning
- Abstract
We propose to improve visual object tracking by introducing a soft-mask-based low-level feature fusion technique, further strengthened by integrating channel and spatial attention mechanisms. The proposed approach is integrated within a Siamese framework to demonstrate its effectiveness for visual object tracking. The soft mask gives more importance to target regions than to other regions, enabling effective target feature representation and increasing discriminative power. The low-level feature fusion improves the tracker's robustness against distractors. Channel attention identifies more discriminative channels for better target representation, while spatial attention complements the soft-mask-based approach to better localize target objects in challenging tracking scenarios. We evaluated the proposed approach on five publicly available benchmark datasets and performed extensive comparisons with 39 state-of-the-art tracking algorithms. The proposed tracker demonstrates excellent performance compared to existing state-of-the-art trackers.
- Published
- 2020
- Full Text
- View/download PDF
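The soft-mask fusion and channel attention ideas in this abstract can be sketched with toy operations. This is a minimal plain-Python analogue under assumed simplifications, not the tracker's implementation: fusion is a masked sum of two single-channel maps, and channel attention is a squeeze-and-excitation-style rescaling driven by each channel's mean activation instead of a learned gating network.

```python
def soft_mask_fuse(feat_a, feat_b, mask):
    """Fuse two low-level feature maps, weighting each position by a
    soft mask in [0, 1] so target regions dominate the fused result."""
    return [
        [m * (a + b) for a, b, m in zip(ra, rb, rm)]
        for ra, rb, rm in zip(feat_a, feat_b, mask)
    ]

def channel_attention(channels):
    """Toy channel attention: score each channel by its mean activation,
    normalise the scores across channels, and rescale the channels so
    more active (assumed more discriminative) channels are emphasised."""
    means = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
             for ch in channels]
    total = sum(means) or 1.0
    weights = [m / total for m in means]
    return [
        [[w * v for v in row] for row in ch]
        for ch, w in zip(channels, weights)
    ]
```

In the paper both the mask and the channel weights are predicted by learned layers; the hand-computed weights here only illustrate the re-weighting effect.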
6. Improving Object Tracking by Added Noise and Channel Attention.
- Author
-
Fiaz, Mustansar; Mahmood, Arif; Baek, Ki Yeol; Farooq, Sehar Shahzad; Jung, Soon Ki
- Subjects
-
Object tracking (Computer vision); Artificial neural networks; Convolutional neural networks; Noise; Fault-tolerant computing
- Abstract
CNN-based trackers, especially those based on Siamese networks, have recently attracted considerable attention because of their relatively good performance and low computational cost. For many Siamese trackers, learning a generic object model from a large-scale dataset remains challenging. In this study, we introduce input noise as a regularizer in the training data to improve the generalization of the learned model. We propose an Input-Regularized Channel Attentional Siamese (IRCA-Siam) tracker, which exhibits improved generalization compared to current state-of-the-art trackers. In particular, we exploit offline learning by introducing additive noise for input data augmentation to mitigate overfitting. We propose fusing features from noisy and clean input channels, which improves target localization. Channel attention integrated into our framework helps find more useful target features, resulting in further performance improvement. The proposed IRCA-Siam enhances target/background discrimination and improves fault tolerance and generalization. An extensive experimental evaluation on six benchmark datasets, including OTB2013, OTB2015, TC128, UAV123, VOT2016 and VOT2017, demonstrates superior performance of the proposed IRCA-Siam tracker compared to 30 existing state-of-the-art trackers.
- Published
- 2020
- Full Text
- View/download PDF
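The additive-noise input regularization described in the IRCA-Siam abstract can be sketched simply. This is an illustrative sketch, not the paper's pipeline: it assumes an image is a nested list of pixel intensities and uses Gaussian noise, pairing each noisy copy with the untouched clean copy to mirror the abstract's fusion of noisy and clean input channels.

```python
import random

def add_gaussian_noise(image, sigma=0.05, seed=None):
    """Additive Gaussian noise as input regularization: each pixel is
    perturbed by a draw from N(0, sigma^2), producing a noisy training
    copy while leaving the original image untouched."""
    rng = random.Random(seed)
    return [[px + rng.gauss(0.0, sigma) for px in row] for row in image]

def noisy_clean_pair(image, sigma=0.05, seed=0):
    """Return (noisy, clean) views of one training sample, echoing the
    abstract's parallel noisy and clean input channels whose features
    are later fused."""
    return add_gaussian_noise(image, sigma, seed), image
```

A fixed seed makes the augmentation reproducible for debugging; during actual training the noise would be freshly sampled per batch.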
Discovery Service for Jio Institute Digital Library