Representation Learning Based on Vision Transformer.
- Author
Ran, Ruisheng, Gao, Tianyu, Hu, Qianwei, Zhang, Wenfeng, Peng, Shunshun, and Fang, Bin
- Subjects
TRANSFORMER models, INFORMATION technology, IMAGE reconstruction, CHANNEL coding, DATA visualization
- Abstract
In recent years, with the rapid development of information technology, the volume of image data has grown exponentially. However, these datasets typically contain a large amount of redundant information. To extract effective features and reduce redundancy in images, a representation learning method based on the Vision Transformer (ViT) is proposed; to the best of our knowledge, this is the first application of the Transformer to zero-shot learning (ZSL). The method adopts a symmetric encoder–decoder structure, in which the encoder incorporates the Multi-Head Self-Attention (MSA) mechanism of ViT to reduce the dimensionality of image features, eliminate redundant information, and decrease the computational burden, thereby extracting features effectively, while the decoder reconstructs the image data. We evaluated the representation learning capability of the proposed method on a variety of tasks, including data visualization, image reconstruction, face recognition, and ZSL. Comparisons with state-of-the-art representation learning methods validate the effectiveness of the method in the field of representation learning. [ABSTRACT FROM AUTHOR]
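The symmetric encoder–decoder design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the module names, dimensions, and the choice of PyTorch's built-in Transformer layers are illustrative assumptions. It shows the general idea: patch tokens pass through an MSA-based encoder, a linear bottleneck reduces the per-token dimensionality (discarding redundancy), and a decoder maps the low-dimensional codes back to pixels.

```python
# Hypothetical sketch of a symmetric ViT-style autoencoder (illustrative, not the paper's code).
import torch
import torch.nn as nn

class ViTAutoencoder(nn.Module):
    def __init__(self, img_size=32, patch_size=8, in_ch=3,
                 embed_dim=64, latent_dim=16, heads=4, depth=2):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        patch_dim = in_ch * patch_size ** 2
        # Patch embedding: split the image into patches and project each to embed_dim
        self.patchify = nn.Conv2d(in_ch, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.pos = nn.Parameter(torch.zeros(1, self.num_patches, embed_dim))
        enc_layer = nn.TransformerEncoderLayer(embed_dim, heads, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, depth)   # MSA-based encoder
        # Bottleneck: reduce per-token dimensionality (the low-dimensional representation)
        self.to_latent = nn.Linear(embed_dim, latent_dim)
        self.from_latent = nn.Linear(latent_dim, embed_dim)
        dec_layer = nn.TransformerEncoderLayer(embed_dim, heads, dim_feedforward=128, batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, depth)   # symmetric decoder
        self.to_pixels = nn.Linear(embed_dim, patch_dim)
        self.patch_size, self.img_size, self.in_ch = patch_size, img_size, in_ch

    def forward(self, x):
        b = x.shape[0]
        tokens = self.patchify(x).flatten(2).transpose(1, 2) + self.pos  # (B, N, D)
        z = self.to_latent(self.encoder(tokens))                         # low-dim codes
        h = self.decoder(self.from_latent(z))
        patches = self.to_pixels(h)                                      # (B, N, patch_dim)
        # Fold the per-patch pixel predictions back into an image
        p, s = self.patch_size, self.img_size // self.patch_size
        img = patches.view(b, s, s, self.in_ch, p, p).permute(0, 3, 1, 4, 2, 5)
        return z, img.reshape(b, self.in_ch, self.img_size, self.img_size)

model = ViTAutoencoder()
x = torch.randn(2, 3, 32, 32)
z, recon = model(x)
loss = nn.functional.mse_loss(recon, x)  # reconstruction objective
```

Training such a model with the reconstruction loss alone yields the encoder's bottleneck output `z` as a compact representation that can then feed downstream tasks such as visualization, face recognition, or ZSL.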
- Published
- 2024