13 results for "Zhao, Jiaqi"
Search Results
2. Incremental learning with neural networks for computer vision: a survey
- Author
- Liu, Hao, Zhou, Yong, Liu, Bing, Zhao, Jiaqi, Yao, Rui, and Shao, Zhiwen
- Published
- 2023
- Full Text
- View/download PDF
3. Spatial-Temporal Based Multihead Self-Attention for Remote Sensing Image Change Detection.
- Author
- Zhou, Yong, Wang, Fengkai, Zhao, Jiaqi, Yao, Rui, Chen, Silin, and Ma, Heping
- Subjects
- REMOTE sensing, LEARNING modules, COMPUTER vision, FEATURE extraction, DEEP learning
- Abstract
Neural-network-based remote sensing image change detection faces a large amount of imaging interference and severe class imbalance under high-resolution conditions, which pose new challenges to detection accuracy. In this work, first, to address the imaging interference caused by different imaging angles and times, a Siamese strategy and a multi-head self-attention mechanism are used to reduce the imaging differences between the dual-temporal images and to fully exploit inter-temporal information. Second, a learnable multi-part feature learning module adaptively exploits features at different scales to obtain more comprehensive representations. Finally, a mixed loss function strategy ensures that the network converges effectively and excludes the adverse interference of the large number of negative samples. Extensive experiments show that our method outperforms numerous methods on the LEVIR-CD, WHU, and DSIFN datasets. [ABSTRACT FROM AUTHOR] (A hedged sketch of the bi-temporal attention step follows this record.)
- Published
- 2022
- Full Text
- View/download PDF
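
A minimal sketch of the bi-temporal attention idea this abstract describes: a shared (Siamese) encoder produces features for both dates, and multi-head self-attention runs over the concatenated token sequence so each location can attend across both times. The module names, channel sizes, and the simple change head are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class SiameseAttentionFusion(nn.Module):
    def __init__(self, channels=64, heads=4):
        super().__init__()
        # Shared (Siamese) encoder applied to both acquisition dates.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Self-attention over the joint bi-temporal token sequence lets each
        # location attend across both times, reducing imaging differences.
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.head = nn.Conv2d(channels * 2, 1, 1)  # change / no-change logit

    def forward(self, img_t1, img_t2):
        f1, f2 = self.encoder(img_t1), self.encoder(img_t2)
        b, c, h, w = f1.shape
        t1 = f1.flatten(2).transpose(1, 2)           # (B, HW, C)
        t2 = f2.flatten(2).transpose(1, 2)
        tokens = torch.cat([t1, t2], dim=1)          # (B, 2*HW, C) joint sequence
        fused, _ = self.attn(tokens, tokens, tokens)
        g1 = fused[:, :h * w].transpose(1, 2).reshape(b, c, h, w)
        g2 = fused[:, h * w:].transpose(1, 2).reshape(b, c, h, w)
        return self.head(torch.cat([g1, g2], dim=1))

x1, x2 = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
print(SiameseAttentionFusion()(x1, x2).shape)  # torch.Size([2, 1, 32, 32])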
4. ARET-IQA: An Aspect-Ratio-Embedded Transformer for Image Quality Assessment.
- Author
- Zhu, Hancheng, Zhou, Yong, Shao, Zhiwen, Du, Wen-Liang, Zhao, Jiaqi, and Yao, Rui
- Subjects
- SELF-adaptive software, ASPECT ratio (Images), COMPUTER vision, IMAGE processing
- Abstract
Image quality assessment (IQA) aims to automatically evaluate image perceptual quality by simulating the human visual system, and it is an important research topic in image processing and computer vision. Although existing deep-learning-based IQA models have achieved significant success, they usually require input images of a fixed size, which alters the perceptual quality of the images. To this end, this paper proposes an aspect-ratio-embedded Transformer-based image quality assessment method that implants the adaptive aspect ratios of input images into the multi-head self-attention module of the Swin Transformer. In this way, the proposed IQA model can not only mitigate the perceptual-quality variation caused by resizing the input images but also leverage more global content correlations to infer image perceptual quality. Furthermore, to comprehensively capture the impact of low-level and high-level features on image quality, the proposed IQA model combines the output features of multi-stage Transformer blocks to jointly infer image quality. Experimental results on multiple IQA databases show that the proposed method is superior to state-of-the-art methods for assessing both technical and aesthetic image quality. [ABSTRACT FROM AUTHOR] (A hedged sketch of aspect-ratio embedding follows this record.)
- Published
- 2022
- Full Text
- View/download PDF
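
A minimal sketch of embedding an input's aspect ratio into a Transformer-style block, loosely following the idea above. Projecting the scalar ratio with a small MLP and adding it to every token is an assumed scheme; the paper embeds ratios inside Swin Transformer attention, which this sketch does not reproduce.

import torch
import torch.nn as nn

class AspectRatioBlock(nn.Module):
    def __init__(self, dim=96, heads=3):
        super().__init__()
        # Small MLP that lifts the scalar aspect ratio into the token space.
        self.ratio_embed = nn.Sequential(nn.Linear(1, dim), nn.GELU(), nn.Linear(dim, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, tokens, height, width):
        # tokens: (B, N, dim) patch embeddings of a resized input image.
        ratio = torch.tensor([[width / height]], dtype=tokens.dtype, device=tokens.device)
        tokens = tokens + self.ratio_embed(ratio)    # broadcast over batch and tokens
        out, _ = self.attn(tokens, tokens, tokens)
        return self.norm(tokens + out)               # residual + normalization

block = AspectRatioBlock()
feats = torch.randn(2, 49, 96)
print(block(feats, height=384, width=512).shape)  # torch.Size([2, 49, 96])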
5. Survey for person re-identification based on coarse-to-fine feature learning.
- Author
- Liu, Minjie, Zhao, Jiaqi, Zhou, Yong, Zhu, Hancheng, Yao, Rui, and Chen, Ying
- Subjects
- PATTERN recognition systems, COMPUTER vision, DEEP learning, VIDEO surveillance
- Abstract
Person re-identification (Re-ID), which aims to retrieve people of interest across multiple non-overlapping cameras, has attracted growing attention in the pattern recognition and computer vision communities in recent years. With the continued advance of deep learning, research on person Re-ID has become increasingly extensive. In this paper, we conduct a comprehensive review of advanced methods and divide them into three categories, from coarse to fine: (1) global-based methods, which derive discriminative features from whole images; (2) part-based methods, which focus on image regions to extract detailed information; and (3) multiple-granularity-based methods, which combine the advantages of the first two categories. We further subdivide each category according to the research tools it commonly uses. We then evaluate typical models on a set of benchmark datasets and compare them in detail, and we also introduce some widely used training tricks. The methods covered in this paper were published between 2011 and 2021. By discussing their advantages and limitations, we provide a reference for future work. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
6. Structural similarity preserving GAN for infrared and visible image fusion
- Author
- Zhao, Jiaqi, Zhou, Yong, Zhang, Di, Yao, Rui, and Zhou, Ziyuan
- Subjects
- Image fusion, Structural similarity, Infrared, Computer science, Applied Mathematics, Multiple sensors, Signal Processing, Computer vision, Artificial intelligence, Generative adversarial network, Information Systems
- Abstract
Compared with a single image, image fusion in a complex environment can exploit the complementary information provided by multiple sensors to significantly improve image clarity and information content, giving more accurate, reliable, and comprehensive access to target and scene information. It is widely used in military and civil applications such as remote sensing, medicine, and security. In this paper, we propose an end-to-end fusion framework based on a structural similarity preserving GAN (SSP-GAN) to learn a mapping for the fusion of visible and infrared images. Specifically, on the one hand, to make the fused image natural and consistent with visual habits, structural similarity is introduced to guide the generator network to produce abundant texture and structure information. On the other hand, to take full advantage of shallow detail information and deep semantic information and achieve feature reuse, we carefully redesign the network architecture for multi-modal image fusion. Finally, a wide range of experiments on the real infrared-visible TNO and RoadScene datasets proves the superior performance of the proposed approach in terms of accuracy and visual quality. In particular, compared with the best results of seven other algorithms, our model improves entropy, the edge information transfer factor, and multi-scale structural similarity by 3.05%, 2.4%, and 0.7%, respectively, on the TNO dataset, and by 0.7%, 2.82%, and 1.1% on the RoadScene dataset. (A hedged sketch of the structural-similarity loss follows this record.)
- Published
- 2020
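
A minimal sketch of a structural-similarity-preserving generator loss in the spirit of SSP-GAN: an adversarial term plus an SSIM term measured against both source images. The single-scale, uniform-window SSIM and the equal 0.5/0.5 weighting of the two sources are assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2, win=11):
    # Single-scale SSIM with uniform windows (Gaussian windows are more usual).
    mu_x = F.avg_pool2d(x, win, 1, win // 2)
    mu_y = F.avg_pool2d(y, win, 1, win // 2)
    var_x = F.avg_pool2d(x * x, win, 1, win // 2) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, 1, win // 2) - mu_y ** 2
    cov = F.avg_pool2d(x * y, win, 1, win // 2) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return (num / den).mean()

def generator_loss(fused, infrared, visible, adv_logits, w_ssim=10.0):
    # Adversarial term plus structure preservation against both inputs.
    adv = F.binary_cross_entropy_with_logits(adv_logits, torch.ones_like(adv_logits))
    structure = 1 - 0.5 * (ssim(fused, infrared) + ssim(fused, visible))
    return adv + w_ssim * structure

f = torch.rand(2, 1, 64, 64)
print(generator_loss(f, torch.rand_like(f), torch.rand_like(f), torch.randn(2, 1)))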
7. A survey of semi- and weakly supervised semantic segmentation of images.
- Author
- Zhang, Man, Zhou, Yong, Zhao, Jiaqi, Man, Yiyun, Liu, Bing, and Yao, Rui
- Subjects
- CONVOLUTIONAL neural networks, IMAGE segmentation, COMPUTER vision, ARTIFICIAL neural networks, VISUAL fields, SUPERVISED learning, DEEP learning
- Abstract
Image semantic segmentation is one of the most important tasks in computer vision, and it has made great progress in many applications. Many fully supervised deep learning models have been designed to carry out complex semantic segmentation tasks, with remarkable experimental results. However, the acquisition of pixel-level labels for fully supervised learning is time consuming and laborious, so semi-supervised and weakly supervised learning are gradually replacing fully supervised learning, achieving good results at a lower cost. Building on commonly used models such as convolutional neural networks, fully convolutional networks, and generative adversarial networks, this paper focuses on the core methods and reviews the semi- and weakly supervised semantic segmentation models of recent years. Existing evaluation metrics and datasets are then summarized in detail, and the experimental results are analyzed per dataset. The paper closes with an objective summary that also points out promising research directions and suggestions for future work. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
8. Edge-aware and spectral–spatial information aggregation network for multispectral image semantic segmentation.
- Author
- Zhang, Di, Zhao, Jiaqi, Chen, Jingyang, Zhou, Yong, Shi, Boyu, and Yao, Rui
- Subjects
- MULTISPECTRAL imaging, IMAGE segmentation, INFORMATION networks, REMOTE sensing, IMAGE analysis, COMPUTER vision, MARKOV random fields, FUZZY algorithms
- Abstract
Semantic segmentation is a fundamental task in remote sensing image interpretation and computer vision. Multispectral remote sensing images have attracted more and more researchers' attention because they can accurately describe different types of reflection spectra. However, inaccurate multispectral feature description leads to edge semantic ambiguity and misclassification of small objects. In this article, we propose a novel network, the edge-aware and spectral–spatial information aggregation network (ESSANet), to capture both high-level semantic features and low-level edge details for semantic segmentation of remote sensing images. Specifically, on the one hand, to improve the representation ability of discriminative features, we design a two-stream spectral–spatial feature extraction network built from 3D hybrid convolutions and a multi-level aggregation network. On the other hand, to eliminate the effect of edge semantic ambiguity, we develop a Siamese edge-aware structure and a multi-stage edge loss function. Experimental results show that our method achieves mean intersection over union (mIoU) improvements of 3.5% and 4.09% and Kappa improvements of 2.59% and 3.32% over a competitive baseline algorithm on the SEN12MS and US3D datasets, respectively. In addition, the proposed method achieves a better trade-off between speed and accuracy. [ABSTRACT FROM AUTHOR] (A hedged sketch of a multi-stage edge loss follows this record.)
- Published
- 2022
- Full Text
- View/download PDF
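
A minimal sketch of a multi-stage edge loss of the kind this abstract mentions: edge targets are derived from label boundaries and supervised at several decoder resolutions. Deriving edge targets with a max-pool morphological gradient and weighting all stages equally are assumptions, not the ESSANet formulation.

import torch
import torch.nn.functional as F

def edge_target(labels, k=3):
    # 1 on class boundaries, 0 elsewhere; labels: (B, 1, H, W) integer map.
    x = labels.float()
    dilated = F.max_pool2d(x, k, 1, k // 2)
    eroded = -F.max_pool2d(-x, k, 1, k // 2)
    return (dilated != eroded).float()

def multi_stage_edge_loss(edge_logits, labels):
    # edge_logits: list of (B, 1, h_i, w_i) predictions from several stages.
    target = edge_target(labels)
    loss = 0.0
    for logits in edge_logits:
        t = F.interpolate(target, size=logits.shape[-2:], mode="nearest")
        loss = loss + F.binary_cross_entropy_with_logits(logits, t)
    return loss / len(edge_logits)

labels = torch.randint(0, 5, (2, 1, 64, 64))
stages = [torch.randn(2, 1, s, s) for s in (16, 32, 64)]
print(multi_stage_edge_loss(stages, labels))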
9. Using a vision cognitive algorithm to schedule virtual machines.
- Author
- Zhao, Jiaqi, Mhedheb, Yousri, Tao, Jie, Jrad, Foued, Liu, Qinghuai, and Streit, Achim
- Subjects
- VIRTUAL machine systems, COMPUTER vision, COMPUTER algorithms, COGNITIVE learning, QUALITY of service, NP-hard problems, ENERGY consumption
- Abstract
Scheduling virtual machines is a major research topic in cloud computing because it directly influences performance, operating cost, and quality of service. A large cloud center is normally equipped with several hundred thousand physical machines, and the scheduler's mission is to select the best one to host a virtual machine. This is an NP-hard global optimization problem that poses grand challenges for researchers. This work studies the virtual machine (VM) scheduling problem on the cloud. Our primary concern in VM scheduling is energy consumption, because the largest part of a cloud center's operating cost goes to the kilowatts used. We designed a scheduling algorithm that allocates an incoming virtual machine instance to the host machine that results in the lowest energy consumption for the entire system. More specifically, we developed a new algorithm, called vision cognition, to solve the global optimization problem; it is inspired by the observation that human eyes pick out the smallest or largest item directly, without comparing items pairwise. We theoretically proved that the algorithm works correctly and converges fast. Practically, we validated the novel algorithm, together with the scheduling concept, using a simulation approach. The adopted cloud simulator models different cloud infrastructures with various properties and detailed runtime information that usually cannot be acquired from real clouds. The experimental results demonstrate the benefit of our approach in reducing cloud center energy consumption. [ABSTRACT FROM AUTHOR] (A hedged sketch of the energy-aware placement objective follows this record.)
- Published
- 2014
- Full Text
- View/download PDF
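
A minimal sketch of the scheduling objective described above: place an incoming VM on the host that minimizes the projected energy of the entire system. The linear utilization-to-power model and its coefficients are assumptions, and the paper's vision-cognition selection mechanism itself is not reproduced; this greedy argmin only illustrates the objective being optimized.

from dataclasses import dataclass

@dataclass
class Host:
    name: str
    cpu_capacity: float    # total CPU units
    cpu_used: float        # currently allocated CPU units
    p_idle: float = 100.0  # watts when idle (assumed)
    p_peak: float = 250.0  # watts at full load (assumed)

    def power(self, extra_cpu=0.0):
        # Linear utilization-to-power model, a common simplifying assumption.
        util = min(1.0, (self.cpu_used + extra_cpu) / self.cpu_capacity)
        return self.p_idle + (self.p_peak - self.p_idle) * util

def place_vm(hosts, vm_cpu):
    feasible = [h for h in hosts if h.cpu_used + vm_cpu <= h.cpu_capacity]
    if not feasible:
        raise RuntimeError("no host can fit the VM")
    # Minimize total projected system power, not just the chosen host's power.
    best = min(feasible, key=lambda h: sum(
        (x.power(vm_cpu) if x is h else x.power()) for x in hosts))
    best.cpu_used += vm_cpu
    return best

hosts = [Host("h1", 32, 30), Host("h2", 32, 4), Host("h3", 16, 2)]
print(place_vm(hosts, vm_cpu=8).name)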
10. ResT-ReID: Transformer block-based residual learning for person re-identification.
- Author
- Chen, Ying, Xia, Shixiong, Zhao, Jiaqi, Zhou, Yong, Niu, Qiang, Yao, Rui, Zhu, Dongjun, and Liu, Dongjingdian
- Subjects
- FEATURE extraction, COMPUTER vision, COMPUTATIONAL complexity, OVERHEAD costs, SPINE
- Abstract
• We propose an efficient hybrid ReID backbone for discriminative feature extraction.
• We formulate an efficient PW-MSA block to free positions from the fixed sequence length.
• We put forward a novel attention-guided GCN model to encode person attributes and body parts into embedding representations.
The Transformer has been applied to computer vision to explore long-range dependencies with the multi-head self-attention strategy, and numerous Transformer-based methods for person re-identification (ReID) have accordingly been designed to extract effective and robust representations. However, the memory and computational complexity of scaled dot-product attention in the Transformer incur substantial overhead. To overcome these limitations, this paper presents the ResT-ReID method, which designs a hybrid Res-Transformer backbone based on ResNet-50 and Transformer blocks to capture effective identity information. Specifically, we use global self-attention in place of depth-wise convolution in the residual bottleneck of ResNet-50's fourth layer. To fully exploit the entire knowledge of the person, we devise attention-guided Graph Convolutional Networks (GCNs) with side information embedding (SIE-AGCN), which places an attention layer between two GCN layers. Quantitative experiments on two large-scale ReID benchmarks demonstrate that the proposed ResT-ReID achieves competitive results compared with several state-of-the-art approaches. [ABSTRACT FROM AUTHOR] (A hedged sketch of an attention-based bottleneck follows this record.)
- Published
- 2022
- Full Text
- View/download PDF
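
A minimal sketch of the backbone modification this abstract describes: a residual bottleneck whose spatial convolution is replaced by global multi-head self-attention. The channel sizes and the absence of positional encodings are simplifying assumptions; this is not the authors' Res-Transformer code.

import torch
import torch.nn as nn

class AttentionBottleneck(nn.Module):
    def __init__(self, channels=256, inner=64, heads=4):
        super().__init__()
        self.reduce = nn.Conv2d(channels, inner, 1)   # 1x1 channel reduction
        self.attn = nn.MultiheadAttention(inner, heads, batch_first=True)
        self.expand = nn.Conv2d(inner, channels, 1)   # 1x1 channel expansion
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        b, _, h, w = x.shape
        y = self.reduce(x)
        tokens = y.flatten(2).transpose(1, 2)          # (B, HW, inner)
        y, _ = self.attn(tokens, tokens, tokens)       # global spatial attention
        y = y.transpose(1, 2).reshape(b, -1, h, w)
        return torch.relu(x + self.bn(self.expand(y))) # residual connection

x = torch.randn(2, 256, 14, 14)
print(AttentionBottleneck()(x).shape)  # torch.Size([2, 256, 14, 14])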
11. Multiobjective ResNet pruning by means of EMOAs for remote sensing scene classification.
- Author
- Liu, Xuning, Zhou, Yong, Zhao, Jiaqi, Yao, Rui, Liu, Bing, Ma, Ding, and Zheng, Yi
- Subjects
- REMOTE sensing, ARTIFICIAL neural networks, PRUNING, COMPUTER vision, VISUAL fields, PROCESS optimization, OPTICAL remote sensing
- Abstract
Convolutional neural networks have achieved remarkable success in computer vision. However, because of their high storage requirements and expensive computation, much recent work has focused on reducing the complexity of convolutional neural networks. In this work, we propose a random filter pruning method based on evolutionary multiobjective optimization algorithms to accelerate a Siamese ResNet-50 for remote sensing scene classification. We conducted experiments on the NWPU-RESISC45, UC Merced Land-Use, and SIRI-WHU datasets to evaluate the performance of the proposed method. The experimental results demonstrate that the classification performance of our pruned model improves while a certain degree of model sparsity is maintained. [ABSTRACT FROM AUTHOR] (A hedged sketch of the two-objective pruning setup follows this record.)
- Published
- 2020
- Full Text
- View/download PDF
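
A minimal sketch of the two-objective pruning formulation behind evolutionary filter pruning: each candidate is a keep/prune mask over filters, scored by (validation error, fraction of filters kept), and nondominated candidates survive. The random evaluation stub stands in for fine-tuning and validating an actual pruned Siamese ResNet-50, and the mutation-only loop is far simpler than a full EMOA such as NSGA-II.

import random

def evaluate(mask, validate):
    # Objective 1: validation error of the pruned network (lower is better).
    # Objective 2: fraction of filters kept, i.e. model density (lower is better).
    return validate(mask), sum(mask) / len(mask)

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_front(population, scores):
    return [p for p, s in zip(population, scores)
            if not any(dominates(t, s) for t in scores)]

def evolve(n_filters=64, pop_size=20, generations=10, validate=None):
    validate = validate or (lambda m: random.random())  # stub for a real eval
    pop = [[random.randint(0, 1) for _ in range(n_filters)] for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for parent in pop:
            child = parent[:]
            child[random.randrange(n_filters)] ^= 1  # bit-flip mutation on the mask
            children.append(child)
        union = pop + children
        scores = [evaluate(m, validate) for m in union]
        pop = (pareto_front(union, scores) + union)[:pop_size]  # nondominated first
    return pop

print(len(evolve()))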
12. Semi-supervised blockwisely architecture search for efficient lightweight generative adversarial network.
- Author
- Zhang, Man, Zhou, Yong, Zhao, Jiaqi, Xia, Shixiong, Wang, Jiaqi, and Huang, Zizheng
- Subjects
- GENERATIVE adversarial networks, SUPERVISED learning, COMPUTER vision, VISUAL fields, MOBILE operating systems, COMPUTER network architectures
- Abstract
• Semi-supervised learning combined with block-based architecture search greatly reduces the required level of supervision.
• A part of the picture is randomly occluded, and the picture is generated according to the semantics around the occluded block.
• The optimal architecture is constructed by flexibly stacking blocks, realizing the image classification task with high efficiency.
• A balance is achieved between lightweight design and performance, so the network can be applied well to mobile platforms.
In the field of computer vision, methods that use fully supervised learning and fixed deep network structures need improvement. Many current studies are devoted to neural architecture search methods that use neural networks in a more flexible way, but most of them rely on fully supervised learning at the cost of extraordinary GPU training time. In view of these problems, we propose a semi-supervised generative adversarial network and search the network architecture on a block basis. Real images and generated images, with their corresponding real labels and pseudo labels, are used for training to achieve semi-supervised learning. Architecture search is realized by treating each layer's hyperparameters as variables and flexibly stacking block structures. The proposed method realizes image generation and extends to image classification. In the experimental results in Section 4, the training time is greatly reduced and the model performance is improved, illustrating the efficiency of our method. The code can be found at https://github.com/AICV-CUMT/STASGAN. [ABSTRACT FROM AUTHOR] (A hedged sketch of block stacking follows this record.)
- Published
- 2021
- Full Text
- View/download PDF
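
A minimal sketch of the "flexibly stacking blocks" idea: a small block vocabulary, a builder that stacks blocks named by a searched architecture list, and a toy search loop. The block set, the parameter-count score, and the random search are placeholders for the paper's semi-supervised, GAN-based blockwise search procedure.

import random
import torch
import torch.nn as nn

# Assumed block vocabulary; each entry maps a name to a block constructor.
BLOCKS = {
    "conv3": lambda c: nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.ReLU()),
    "conv5": lambda c: nn.Sequential(nn.Conv2d(c, c, 5, padding=2), nn.ReLU()),
    "identity": lambda c: nn.Identity(),
}

def build(arch, channels=32):
    # arch is a list of block names, e.g. ["conv3", "identity", "conv5"].
    return nn.Sequential(*(BLOCKS[name](channels) for name in arch))

def random_search(n_layers=4, trials=5, score=None):
    # Toy score: prefer fewer parameters; a real search would score accuracy.
    score = score or (lambda net: -sum(p.numel() for p in net.parameters()))
    best, best_arch = None, None
    for _ in range(trials):
        arch = [random.choice(list(BLOCKS)) for _ in range(n_layers)]
        s = score(build(arch))
        if best is None or s > best:
            best, best_arch = s, arch
    return best_arch

arch = random_search()
print(arch, build(arch)(torch.randn(1, 32, 8, 8)).shape)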
13. GAN-based person search via deep complementary classifier with center-constrained Triplet loss.
- Author
- Yao, Rui, Gao, Cunyuan, Xia, Shixiong, Zhao, Jiaqi, Zhou, Yong, and Hu, Fuyuan
- Subjects
- PEDESTRIANS, COMPUTER vision, MEDLINE, COMPUTER engineering, OBJECT recognition (Computer vision), BIG data, MYOELECTRIC prosthesis
- Abstract
• We propose a framework for person search that utilizes GAN-generated images in a novel deep network, addressing the typical problem of few samples in big data and adapting to images of various types of pedestrians.
• We propose a deep complementary classifier for pedestrian detection that leverages complementary object regions for pedestrian/non-pedestrian classification, improving the overall performance of the person search model.
• We propose a new loss function, the center-constrained triplet loss, that combines the advantages of center loss and triplet loss to minimize intra-person variations and maximize inter-person variations, while avoiding the triplet loss's careful selection of training triplets.
• We conduct experiments on the large-scale CUHK-SYSU and PRW datasets and find that images newly synthesized from the original images help improve model performance; our proposed loss achieves significant improvements over the compared approach in both mAP and top-1 evaluation protocols.
This paper addresses the person search task, a computer vision technology that finds the location of a pedestrian and retrieves it in video taken by a single camera or multiple cameras. This task is much more challenging than the conventional settings of person re-identification or pedestrian detection, since the search is susceptible to factors such as different resolutions, similar pedestrians, lighting, viewing angles, and occlusion. Moreover, the person search task is a typical big-data, small-sample problem because each pedestrian has only a few images, making it difficult for a model to learn discriminative pedestrian features from so little data. This paper proposes a person search framework that uses the original training set without collecting extra data, employing a generative adversarial network (GAN) to generate unlabeled samples. We propose a deep complementary classifier for pedestrian detection that leverages complementary object regions for pedestrian/non-pedestrian classification. In the re-identification part, we propose a center-constrained triplet loss that avoids the complicated triplet selection of the triplet loss and simultaneously pushes away the distances of all rather similar negative centers from the positive center. Experiments show that the GAN-generated data effectively help improve the discriminating ability of the CNN model. On the two large-scale datasets, CUHK-SYSU and PRW, we achieve a performance improvement over the baseline CNN. Applying the proposed center-constrained triplet loss and complementary classifiers in the training model, we achieve mAP improvements over the original method of +1.9% on CUHK-SYSU and +2.5% on PRW. [ABSTRACT FROM AUTHOR] (A hedged sketch of such a loss follows this record.)
- Published
- 2020
- Full Text
- View/download PDF
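
A minimal sketch of a center-constrained triplet loss combining the two ideas named above: a center-loss pull toward the sample's own class center and a triplet-style margin against the nearest other center, which removes the need for explicit per-sample triplet mining. The exact margin value, the squared-distance choice, and the center-update scheme are assumptions, not the paper's definitive formulation.

import torch
import torch.nn.functional as F

def center_constrained_triplet(features, labels, centers, margin=0.3):
    # Pull each embedding toward its own class center (center-loss term) ...
    pos_dist = (features - centers[labels]).pow(2).sum(dim=1)
    # ... and push it at least `margin` farther from every other center,
    # using the hardest (nearest) negative center instead of mined triplets.
    all_dist = torch.cdist(features, centers).pow(2)  # (B, num_classes)
    neg = all_dist.clone()
    neg[torch.arange(len(labels)), labels] = float("inf")
    neg_dist = neg.min(dim=1).values                   # hardest negative center
    return (pos_dist + F.relu(pos_dist - neg_dist + margin)).mean()

feats = F.normalize(torch.randn(8, 128), dim=1)
centers = F.normalize(torch.randn(10, 128), dim=1)
labels = torch.randint(0, 10, (8,))
print(center_constrained_triplet(feats, labels, centers))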