Author: "Zhang, Huaxiang" / Database: Academic Search Index - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zhang, Huaxiang"' showing total 92 results

1. Joint-Modal Graph Convolutional Hashing for unsupervised cross-modal retrieval.

Author: Meng, Hui, Zhang, Huaxiang, Liu, Li, Liu, Dongmei, Lu, Xu, and Guo, Xinru
Subjects: *INFORMATION retrieval, *LEARNING modules, *COMPUTER programming education
Abstract: Cross-modal hashing retrieval has garnered significant attention for its exceptional retrieval efficiency and low storage consumption, especially in large-scale data retrieval. However, due to the difference in modality and semantic gap, the existing methods fail to fuse multi-modal information effectively or adjust weight adaptively, which further damages the discriminative ability of the generated hash code. In this paper, we propose an innovative approach called the Joint-Modal Graph Convolutional Hashing (JMGCH) method via adaptive weight assignment for unsupervised cross-modal retrieval. JMGCH consists of a Feature Encoding Module (FEM), a Joint-Modal Graph Convolutional Module (JMGCM), an Adaptive Weight Allocation Fusion Module (AWAFM), and a Hash Code Learning Module (HCLM). After the image and text have been encoded, we use the graph convolutional network to further explore the semantic structure. To consider both the intra-modal and inter-modal semantic relationships, JMGCM is proposed to capture the correlations of different modalities, and then fuse the features from uni-modality and cross-modality by designed AWAFM. Finally, in order to obtain the hash code with greater expressive capacity, the features of one modality are used to reconstruct the features of another one, so as to reduce the gap between different modalities. We conduct extensive experiments on three widely used cross-modal retrieval datasets, and the results demonstrate that our proposed framework achieves satisfactory retrieval performance. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

2. Hypergraph clustering based multi-label cross-modal retrieval.

Author: Guo, Shengtang, Zhang, Huaxiang, Liu, Li, Liu, Dongmei, Lu, Xu, and Li, Liujian
Subjects: *HYPERGRAPHS, *GRAPH theory, *SEMANTICS, *IMAGE retrieval, *MULTIMEDIA systems
Abstract: Most existing cross-modal retrieval methods face challenges in establishing semantic connections between different modalities due to inherent heterogeneity among them. To establish semantic connections between different modalities and align relevant semantic features across modalities, so as to fully capture important information within the same modality, this paper considers the superiority of hypergraphs in representing higher-order relationships, and proposes an image-text retrieval method based on hypergraph clustering. Specifically, we construct hypergraphs to capture feature relationships within image and text modalities, as well as between image and text. This allows us to effectively model complex relationships between features of different modalities and explore the semantic connectivity within and across modalities. To compensate for potential semantic feature loss during the construction of the hypergraph neural network, we design a weight-adaptive coarse and fine-grained feature fusion module for semantic supplementation. Comprehensive experimental results on three common datasets demonstrate the effectiveness of the proposed method. • A hypergraph cluster module is proposed to model modal relationships. • A fusion module that dynamically learns weights is proposed. • The experiments conducted on three datasets prove the effectiveness of our method. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

3. Hierarchical Feature Aggregation Based on Transformer for Image-Text Matching.

Author: Dong, Xinfeng, Zhang, Huaxiang, Zhu, Lei, Nie, Liqiang, and Liu, Li
Subjects: *SEMANTICS, *MODAL logic, *GRAPH algorithms, *IMAGE reconstruction, *FEATURE extraction, *PARTICLES
Abstract: In order to carry out more accurate retrieval across image-text modalities, some scholars use fine-grained feature to align image and text. Most of them directly use attention mechanism to align image regions and words in the sentence, and ignore the fact that semantics related to an object is abstract and cannot be accurately expressed by object information alone. To overcome this weakness, we propose a hierarchical feature aggregation algorithm based on graph convolutional networks (GCN) to facilitate object semantic integrity by integrating attributes of an object and relations between objects hierarchically in both image and text modalities. In order to eliminate the semantic gap between modalities, we propose a cross-modal feature fusion method based on transformer to generate modal-specific feature representations by integrating both the object feature and global feature from the other modality. Then we map the fusion feature into a common space. Experiment results on the most frequently-used datasets MSCOCO and Flickr30K show the effectiveness of the proposed model compared with the latest methods. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

4. Self-similarity guided probabilistic embedding matching based on transformer for occluded person re-identification.

Author: Pang, Yunxiao, Zhang, Huaxiang, Zhu, Lei, Liu, Dongmei, and Liu, Li
Subjects: *TRANSFORMER models, *IMAGE registration, *DISTRIBUTION (Probability theory), *VIRTUAL networks, *PEDESTRIANS, *PROBLEM solving
Abstract: In real scenes, the occluded situation caused by different factors is the key to the person re-identification (Re-ID) problem. Many methods retrieve the corresponding images in the gallery by employing feature matching. However, the occluded region is still not well addressed, and the noise generated inevitably increases the difficulty of Re-ID. Therefore, we propose a Self-Similarity guided Probabilistic Embedding Matching (SSPEM) method to solve the problem of occluded person Re-ID. Specifically, we first design a Feature Similarity Enhancement (FSE) module to perform self-similarity calculation based on the extracted features of the same pedestrians by Vision Transformer (ViT) and complete feature enhancement by suppressing irrelevant information. Then, we design a probabilistic embedding matching (PEM) module to represent the features as probability distributions in a common embedding space, which is able to learn more feature structures. The uncertainty for the occluded region can maximize the metric computation to obtain features with better matching. Extensive experiments on five challenging datasets for occluded and holistic person Re-ID tasks are conducted, and the results show the effectiveness of our proposed SSPEM method. • A novel method SSPEM is proposed to solve occluded person Re-ID problem. • Proposing a FSE module to follow the self-similarity of the same pedestrian. • Designing a PEM module to achieve probabilistic embedding matching. • Experimental results show the superiority of the SSPEM method. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. Cross-modal dual subspace learning with adversarial network.

Author: Shang, Fei, Zhang, Huaxiang, Sun, Jiande, Nie, Liqiang, and Liu, Li
Subjects: *QUADRUPLETS, *STATISTICAL sampling, *MULTIMODAL user interfaces, *INTRA-aortic balloon counterpulsation, *BISTATIC radar
Abstract: Cross-modal retrieval has recently attracted much interest along with the rapid development of multimodal data, and effectively utilizing the complementary relationship of different modal data and eliminating the heterogeneous gap as much as possible are the two key challenges. In this paper, we present a novel network model termed cross-modal Dual Subspace learning with Adversarial Network (DSAN). The main contributions are as follows: (1) Dual subspaces (visual subspace and textual subspace) are proposed, which can better mine the underlying structure information of different modalities as well as modality-specific information. (2) An improved quadruplet loss is proposed, which takes into account the relative distance and absolute distance between positive and negative samples, together with the introduction of the idea of hard sample mining. (3) Intra-modal constrained loss is proposed to maximize the distance of the most similar cross-modal negative samples and their corresponding cross-modal positive samples. In particular, feature preserving and modality classification act as two antagonists. DSAN tries to narrow the heterogeneous gap between different modalities, and distinguish the original modality of random samples in dual subspaces. Comprehensive experimental results demonstrate that DSAN significantly outperforms 9 state-of-the-art methods on four cross-modal datasets. • Dual parallel subspaces are proposed, which can better mine the underlying structure information of different modalities as well as modality-specific information. • An improved quadruplet loss is proposed, which integrates relative distance, absolute distance and hard sample mining. On the one hand, it pushes forward the boundaries of positive and negative samples to a certain extent. The introduction of the idea of hard sample mining reduces the complexity of the model and further improves its performance. • An intra-modal constrained loss is proposed to maximize the distance of the most similar cross-modal negative sample and its corresponding cross-modal positive samples. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

6. Adversarial cross-modal retrieval based on dictionary learning.

Author: Shang, Fei, Zhang, Huaxiang, Zhu, Lei, and Sun, Jiande
Subjects: *MULTIMODAL user interfaces, *STATISTICAL learning, *MACHINE learning, *REINFORCEMENT learning, *ARTIFICIAL neural networks
Abstract: • We utilize dictionary learning to reconstruct strongly relevant and discriminative information so as to maintain the specificity of each sample. • Cross-modal adversarial mechanism is established to maintain statistical characteristics of different modalities. • The proposed approach can be easily applied to other modalities. And experimental results demonstrate its effectiveness. Existing cross-modal approaches focus on learning a subspace or using classical neural networks for similarity measurement of different modalities, which ignore the complex statistical properties of multimodal data. To settle the above problems, we propose a novel framework termed Adversarial Cross-Modal Retrieval Based on Dictionary Learning Algorithm (DLA-CMR). The dictionary learning serves as feature reconstructor to reconstruct discriminative features, while adversarial learning mines the statistical characteristics for each modality. Firstly, using all of the training (testing) samples to reconstruct each training (testing) sample, the specificity of each sample is maintained to some extent. Secondly, the weight of important features increases while that of secondary features decreases. This also makes the dimension of transformed visual modality approximate to textual modality. In addition, the adversarial learning guarantees that the transformed features maintain the inherent statistical characteristics of original features for each modality, and it requires transformed features to be statistically indistinguishable in common space. The transformed features must be maximally correlated to eliminate the heterogeneous gap. Comprehensive experimental results compared with 7 state-of-the-art methods on 4 widely-used datasets verify the effectiveness of our DLA-CMR method. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

7. Semantic consistency cross-modal dictionary learning with rank constraint.

Author: Shang, Fei, Zhang, Huaxiang, Sun, Jiande, and Liu, Li
Subjects: *CONSISTENCY models (Computers), *SEMANTIC computing, *MULTIMODAL user interfaces, *ELECTRONIC dictionaries, *MACHINE learning, *NATURAL language processing, *UNDIRECTED graphs
Abstract: • SCDC is a novel cross-modal dictionary learning algorithm which can preserve the semantic consistency of different modalities in a latent space. Specifically, we construct an undirected graph to learn an orthogonal space instead of utilizing the original semantic space. • SCDC takes into consideration ℓ2,1-norm regularization and rank constraint simultaneously. It not only selects the discriminative features but also improves the correlation of different modalities. These two constraints are rarely seen together in the field of dictionary learning. • The proposed approach can be easily applied to other modalities, and experimental results demonstrate its effectiveness. Cross-modal retrieval develops rapidly due to the growth and widespread applications of multimodal data. How to reduce the heterogeneous gap and impose effective constraints on different modalities are two basic problems. In this paper, we propose a novel Semantic Consistency cross-modal Dictionary learning algorithm with rank Constraint (SCDC) to solve these problems. An orthogonal space learned by spectral regression is introduced, in which different modalities can be measured directly. Specifically, images and texts are encoded by their dictionaries to obtain corresponding reconstruction coefficients. An ℓ2,1-norm term is imposed on these coefficients in order to select discriminative features and avoid over-fitting simultaneously. In the meantime, a rank constraint is imposed on the transformed features so as to improve the correlation of different modalities. Experimental results on three popular datasets demonstrate that SCDC is significantly superior to several state-of-the-art methods. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

8. Unsupervised cross-modal retrieval via Multi-modal graph regularized Smooth Matrix Factorization Hashing.

Author: Fang, Yixian, Zhang, Huaxiang, and Ren, Yuwei
Subjects: *MATRIX decomposition, *SIMILARITY (Geometry)
Abstract: The existing cross-modal hashing methods often encounter quantization loss which is caused by relaxing discrete hash codes in the process of cross-modal retrieval. To counter this problem, a Multi-modal graph regularized Smooth matrix Factorization Hashing (MSFH) approach is presented for unsupervised cross-modal retrieval. In the proposed framework, a smooth matrix generated by a control parameter is introduced into the matrix decomposition model, which can guarantee the sparsity of the dictionaries learned and the extracted common features at the same time, thus reducing the quantization loss in the hashing process. Furthermore, to preserve the topology of the original data, a multi-modal graph regularization term is drawn into the model, which consists of two parts. One is the intra-modal similarity graph which is used to preserve the geometric structure of each modality. The other is the inter-modal similarity graph reconstructed by the symmetric nonnegative matrix factorization, which is employed to soften the structure difference between modalities. The goal of MSFH is to learn unified hash codes for multi-modal data in a shared latent semantic space in which the similarity of different modalities can be estimated effectively. The corresponding experimental results on three benchmark data sets demonstrate the superiority of the proposed approach over several state-of-the-art cross-modality hashing approaches. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

9. Detecting the latent associations hidden in multi-source information for better group recommendation.

Author: Feng, Shanshan, Zhang, Huaxiang, Wang, Lei, Liu, Li, and Xu, Yuchang
Subjects: *RECOMMENDER systems, *HUMAN beings, *GROUPS
Abstract: Nowadays, most recommendation approaches are used to suggest appropriate items for individual users. However, due to the social nature of human beings, group activities have become an integral part of our daily life, thus motivating the study on group recommendation. Unfortunately, most existing approaches used in group recommender systems make recommendations through aggregating individual preferences or individual predictive results rather than comprehensively investigating the social features that govern user choices made within a group. As a result, such approaches often fail to detect many latent factors that could potentially improve the performance of the group recommender systems. Therefore, we propose a new approach, random walks based on a topic model (RTM), for group recommendations through combining an integrated probabilistic topic model, a User Topic Model (UTM), with the Random Walk with Restart (RWR) method. The goal of the work in this paper is to better identify group preferences by comprehensively detecting the latent associations among group members, in order to alleviate the data sparsity problem and improve the performance of group recommender systems. The UTM provides a latent framework of users, groups, and items by exploiting both the users' preference profiles and the items' content information, which together can describe group interests and item features in a more complete manner. This latent framework is then combined with RWR to predict the preference degrees of groups to unrated items. In particular, we develop two different recommendation strategies based on the proposed approach, and design a special random walk path for each developed recommendation strategy to comprehensively detect various latent associations. Finally, we conduct experiments to evaluate our approach and compare it with other state-of-the-art approaches using the real-world CAMRa2011 dataset. The results demonstrate the advantage of our approach over comparative ones. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

10. Weighted locality collaborative representation based on sparse subspace.

Author: Dong, Xiao, Zhang, Huaxiang, Zhu, Lei, Wan, Wenbo, Wang, Zhenhua, Wang, Qiang, Guo, Peilian, Ji, Hui, and Sun, Jiande
Subjects: *SUBSPACES (Mathematics), *SPARSE approximations, *ALGORITHMS, *REGRESSION analysis, *HUMAN facial recognition software, *COMPRESSED sensing
Abstract: • A novel algorithm based on the sparse subspace is proposed. • The approach first learns a subset of the original training data to build a highly correlated dictionary, and learns the reconstruction coefficients for each unlabeled datum while considering the influence of its local neighbors. • The approach takes advantage of linear regression techniques together with weighted collaborative representation techniques to learn more discriminative representation coefficients for unlabeled data. • Experimental results demonstrate its effectiveness. This paper takes into account both unlabeled data and their local neighbors to learn their sparse representations, and proposes a face recognition approach named Weighted Locality Collaborative Representation Classifier based on sparse subspace (WLCRC). WLCRC first learns a subset of the original training data to build a highly correlated dictionary, and then combines linear regression techniques with weighted collaborative representation techniques to optimize the linear reconstruction of unlabeled data. It uses the newly built dictionary to learn the reconstruction coefficients for each unlabeled datum while considering the influence of its local neighbors. Classifications are performed according to the reconstruction residuals, and experimental results on benchmark datasets demonstrate that WLCRC is effective. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

11. Supervised graph regularization based cross media retrieval with intra and inter-class correlation.

Author: Zhang, Meijia, Zhang, Huaxiang, Li, Junzheng, Wang, Li, Fang, Yixian, and Sun, Jiande
Subjects: *GRAPHIC methods, *INTRACLASS correlation, *SUBSPACES (Mathematics), *DATA mining, *INFORMATION retrieval, *MULTIMEDIA communications
Abstract: • We project heterogeneous media data to a subspace to increase data discriminative capability. • We learn two couples of projections for different retrieval tasks. • The proposed approach is semi-supervised and can be easily applied to other modalities. • Experimental results demonstrate its effectiveness. With the rapid development of internet technology, accurately mining and retrieving information from the internet has become an urgent problem, and cross media retrieval has become a hot spot of current research. This paper proposes a cross media retrieval approach, which learns two couples of projections based on different retrieval tasks. We first learn a common subspace to project heterogeneous media data to the isomorphic subspace, to measure the similarity of the heterogeneous media data in the isomorphic subspace. Second, we build isomorphic and heterogeneous adjacent graphs to preserve the correlations of the cross media data. Then we combine the two processes together to learn a common subspace. We also consider intra-class and inter-class similarity of images or texts in the unified framework. Third, the L2 norm is used to perform feature selection for different media data. Experimental results on three datasets demonstrate the effectiveness of the proposed approach. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

12. Multi-label adversarial fine-grained cross-modal retrieval.

Author: Sun, Chunpu, Zhang, Huaxiang, Liu, Li, Liu, Dongmei, and Wang, Lin
Subjects: *GRAPH labelings, *LABEL design, *LEARNING modules
Abstract: Most supervised cross-modal approaches transform features into a common representation space in which semantic similarity can be measured directly. However, there exist modal specific features in the common semantic space and most methods cannot fully eliminate them. In order to bridge the semantic gap and eliminate modal specific features, we propose a novel Multi-label Adversarial Fine-grained Cross-modal Retrieval Based on Transformer (MLAT). MLAT constructs a semantic consistency enhanced module (SCE) which includes the semantic mask attention module and a fine-grained feature generator based on transformer. It learns fine-grained semantic information to preserve the high-level semantic relevance and eliminate modal specific features. In order to narrow the distance between common representations and further eliminate modal specific features, we construct a multi-stage adversarial learning module to optimize feature representations. Furthermore, we design a label graph network based on graph attention network (GAT) to better explore the semantic correlations between labels and learn a classifier. Three benchmark datasets are synthesized to demonstrate the superiority of the MLAT method. • We design a semantic consistency enhanced module to eliminate modal specific features and use transformer to extract the fine-grained features for better cross modal alignment. • We construct a label graph network based on GAT to explore the semantic correlations between labels and learn a classifier. • The experiments conducted on three retrieval datasets prove the effectiveness of our method compared with the state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

13. Multi-level adversarial attention cross-modal hashing.

Author: Wang, Benhui, Zhang, Huaxiang, Zhu, Lei, Nie, Liqiang, and Liu, Li
Subjects: *DEEP learning, *PROBLEM solving
Abstract: Deep cross-modal hashing has made great progress in recent years due to the development of deep learning and efficient hashing algorithms. However, most of the existing methods only focus on the feature distribution between modalities, and ignore the fine-grained information in each modality. To solve this problem, we propose a multi-level adversarial attention cross-modal hashing (MAAH) method. First, we design a modality-attention module to find the fine-grained information of each modality. Specifically, we use the channel attention mechanism to divide modality information into relevant and irrelevant representation, in which the irrelevant representation is the fine-grained information of the modality. Then, we design a modality-adversary module to supplement the fine-grained information of each modality. In this module, intra-modal adversarial learning can supplement the relevant representation of modalities, and inter-modal adversarial learning can make the distribution of the relevant representation of each modality more uniform. Experimental results on three widely used datasets demonstrate the superiority of the proposed method. • We design a modality-attention module to separate the relevant and irrelevant representations. • We design a modality-adversary module to supplement the relevant representation information. • The experimental results of our method show superiority on three widely used datasets. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

14. Locally controllable network based on visual–linguistic relation alignment for text-to-image generation.

Author: Li, Zaike, Liu, Li, Zhang, Huaxiang, Liu, Dongmei, Song, Yu, and Li, Boqun
Abstract: Since locally controllable text-to-image generation cannot achieve satisfactory results in detail, a novel locally controllable text-to-image generation network based on visual–linguistic relation alignment is proposed. The goal of the method is to complete image processing and generation semantically through text guidance. The proposed method explores the relationship between text and image to achieve local control of text-to-image generation. The visual–linguistic matching learns the similarity weights between image and text through semantic features to achieve the fine-grained correspondence between local images and words. The instance-level optimization function is introduced into the generation process to accurately control the weight with low similarity and combine with text features to generate new visual attributes. In addition, a local control loss is proposed to preserve the details of the text and local regions of the image. Extensive experiments demonstrate the superior performance of the proposed method and enable more accurate control of the original image. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

15. Discriminative correlation hashing for supervised cross-modal retrieval.

Author: Lu, Xu, Zhang, Huaxiang, Sun, Jiande, Wang, Zhenhua, Guo, Peilian, and Wan, Wenbo
Subjects: *IMAGE retrieval, *HASHING, *HAMMING codes, *IMAGE quality in imaging systems, *IMAGE reconstruction algorithms
Abstract: Due to their storage and calculational efficiency, hashing techniques have been used for cross-modal retrieval on large-scale multi-modal data. Cross-modal hashing methods retrieve relevant items of one modality for the query of the other modality by mapping heterogeneous data of different modalities into a common Hamming space, where the binary codes are generated. However, the existing cross-modal hashing methods pay little attention to the discriminative property of the binary codes. In this paper, we propose a novel supervised cross-modal hashing method, named Discriminative Correlation Hashing (DCH), which integrates discriminative property into the hashing learning procedure. DCH introduces the Linear Discriminant Analysis (LDA) to preserve the discriminative property of textual modality and transfers it to the corresponding image modality by the learned unified binary code, thus making data in the common Hamming space much more discriminative. Extensive experimental results demonstrate that DCH outperforms state-of-the-art cross-modal hashing methods. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

16. Semantic-embedding Guided Graph Network for cross-modal retrieval.

Author: Yuan, Mengru, Zhang, Huaxiang, Liu, Dongmei, Wang, Lin, and Liu, Li
Subjects: *IMAGE processing, *SEMANTICS, *SEMANTIC integration (Computer systems), *FUSION (Phase transformation), *SIGNAL processing
Abstract: Many methods focus on aligning image regions with the corresponding text fragments, and ignore that images contain fragments that cannot be expressed by texts. To fully express the information of images and prevent the performance degradation caused by fine-grained information deviating from the core meaning of images, we propose Semantic-embedding Guided Graph Network (SGGN) for cross-modal retrieval. It learns the detail representations of each modality, with an integrated semantic information, by guiding local fragments to capture the internal correlation of cross-modal data and effectively convey the information. To further bridge the semantic gap between different modalities, SGGN uses an adversarial network to play a game, and uses a graph aggregation network to absorb complementary information of neighbor samples. We evaluate our approach on two datasets. Our method (based on R@10) achieves 97.2% on the Flickr30k dataset. On the MS-COCO dataset, it reaches 99.2% using the 1K test set and 92.0% using the 5K test set. • A Semantic-Embedding Guided Graph Network is proposed, which captures global and local semantic information based on semantic embedding guided network. • The network uses global information to guide fine-grained information for effectively preventing deviation from the global semantic representation. • Adversarial network and aggregation network based on graph structure are beneficial to the fusion of context information and reduce the modal gap. • Experiments demonstrate the superiority of the proposed method. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

17. Generative adversarial text-to-image generation with style image constraint.

Author: Wang, Zekang, Liu, Li, Zhang, Huaxiang, Liu, Dongmei, and Song, Yu
Subjects: *GENERATIVE adversarial networks
Abstract: Most text-to-image generation works focus on the semantic consistency and neglect the style of the generated image. In this paper, a novel text-to-image generation method is proposed to generate image with style image constraint. In order to provide more comprehensive information by mining long–short-range information dependencies, the multi-group attention module is introduced to capture the multi-scale dependency information in the semantic feature. The adaptive multi-scale attention normalization is adopted to pay the multi-scale style feature attention in the style fusion process. The style information related to semantic feature is filtered out by the style feature attention. This selected style information is transferred to the generated results by aligning the mean and variance of the semantic feature and the style feature. Experiments conducted on common datasets show the validity of the proposed approach. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

18. Semi‐tensor product method to a class of event‐triggered control for finite evolutionary networked games.

Author: Guo, Peilian, Zhang, Huaxiang, Alsaadi, Fuad E., and Hayat, Tasawar
Abstract: Using the approach of semi‐tensor product of matrices, this study investigates a class of event‐triggered control for finite evolutionary networked games, where the control only works at certain individual states. First, by identifying 'control does not work' as a new specific control strategy, the controlled game dynamics is converted into an algebraic form. Second, to make the game converge globally, two necessary and sufficient conditions for the existence of event‐triggered control are obtained. Meanwhile, a constructive procedure is proposed to design a state feedback control strategy and an adjustment method is presented to minimise the control times. Finally, the developed theoretical results are illustrated by a numerical method. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

19. A Weighted Sparse Neighbourhood-Preserving Projections for Face Recognition.

Author: Wang, Yongxin, Zhang, Huaxiang, and Yang, Feng
Subjects: *HUMAN facial recognition software, *DIMENSION reduction (Statistics), *ELECTRONIC data processing, *DISCRIMINANT analysis, *IMAGE reconstruction
Abstract: Dimensionality reduction algorithms are widely applied to high-dimensional data pre-processing, especially for face images. In this paper, we propose an unsupervised sparse subspace learning approach called weighted sparse neighbourhood-preserving projections (WSNPP) for face recognition. Unlike many existing approaches such as sparsity-preserving projections (SPP), where the constructive weights are computed by the classical sparse representation (SR), WSNPP utilizes a weighted SR model to represent samples. The obtained projections can contain more local discriminant information than classical sparse subspace learning methods. Moreover, WSNPP puts a constraint on the number of nonzero reconstruction coefficients and hence is more robust to global noises and time saving. Experiments on AR, Yale-B and ORL image datasets demonstrate its effectiveness. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

20. A two-stage learning approach to face recognition.

Author: Dong, Xiao, Zhang, Huaxiang, Sun, Jiande, and Wan, Wenbo
Subjects: *HUMAN facial recognition software, *MACHINE learning, *SAMPLE size (Statistics), *SUBSPACES (Mathematics), *DATA dictionaries
Abstract: This paper introduces Collaborative Representation (CR) techniques to small sample size conditions, and proposes a Two-Stage learning approach to face recognition based on Collaborative Representation (TSCR). Based on the assumption that samples of the same class should lie in the same subspace, we first use the unlabeled samples as dictionary atoms to reconstruct each labeled sample, and obtain the collaborative coefficients by CR. The unlabeled sample with the largest collaborative coefficient is assigned the same class label as the reconstructed labeled sample, and is added to the labeled dataset. This process is repeated until about half of the unlabeled samples are labeled and added to the labeled dataset. After that, we employ the original CR approach to classify the remaining unlabeled samples based on the newly labeled dataset. Experimental results demonstrate that the proposed TSCR is effective for face recognition. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

21. Feature generation based on relation learning and image partition for occluded person re-identification.

Author: Pang, Yunxiao, Zhang, Huaxiang, Zhu, Lei, Liu, Dongmei, and Liu, Li
Subjects: *ARTIFICIAL neural networks, *SEMANTICS, *PEDESTRIANS, *COMPARATIVE method, *ARTIFICIAL intelligence
Abstract: In order to solve the challenging task of person re-identification (Re-ID) in occluded scenarios, we propose a novel approach which divides local units by forming high-level semantic information of pedestrians and generates features of occluded parts. The approach uses a CNN and pose estimation to extract the feature map and key points, and a graph convolutional network to learn the relation of key points. Specifically, we design a Generating Local Part (GLP) module to divide the feature map into different units. Based on different occluded conditions, the partition mode of GLP has high flexibility and variability. The features of the non-occluded parts are clustered into an intermediate node, and then the spatially correlated features of the occluded parts are generated according to the de-clustering operation. We conduct experiments on both the occluded and the holistic datasets to demonstrate its effectiveness. • A generating local part (GLP) module is constructed to block the feature map and avoid the deficiency of uniform partition. • A graph convolutional network is utilized to improve the attention of non-occluded parts and create a prerequisite for flexible partition of the GLP module. • Experimental results show its performance is better than the comparative methods on the occluded dataset. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

22. Application of ethyl acetoacetate bifocal additive for achieving high-performance perovskite solar cells.

Author: Ma, Haibin, Zhang, Huaxiang, Wang, Lintong, and Song, Mingjun
Subjects: *ETHYL acetoacetate, *SOLAR cells, *PEROVSKITE, *OPEN-circuit voltage, *CONTACT angle
Abstract: The perovskite layer in perovskite solar cells (PSCs) is usually a polycrystalline structure with relatively small crystal size, many grain boundaries and a large number of defects, which is one of the main reasons that limit the improvement of their photovoltaic performance and stability. Hence, improving the film quality and reducing the defect density of the perovskite layer are key to preparing high-performance PSCs. Herein, ethyl acetoacetate (EAA), containing two functional groups of C=O and –COO, is introduced into the perovskite precursor solution as a novel bifocal additive, which not only promotes the growth of perovskite but also effectively deactivates defects. The results show that EAA can increase the crystallinity and average grain size, effectively reduce the defects and significantly restrain the defect-assisted non-radiative recombination, owing to the strong interaction between EAA and Pb2+ in perovskite. As a result, the perovskite film with EAA shows better film quality, reduced defects, and suppressed defect-assisted non-radiative recombination. Hence, the EAA device achieves a high PCE of 22.08% with a markedly enhanced open-circuit voltage of 1.15 V. More importantly, the EAA perovskite exhibits excellent humidity stability. The perovskite devices with a high water contact angle of 84.4° can retain around 80% of their initial PCEs under 40% RH for 4000 h. The results afford an easy but effective method to obtain high-quality perovskite films with reduced defect density for highly efficient and stable PSCs. The ethyl acetoacetate (EAA) additive containing two functional groups of C=O and –COO was introduced into perovskite precursor solution as a novel bifocal additive. The optimal device with negligible hysteresis yielded a remarkably higher PCE of 22.08% and excellent stability. • The ethyl acetoacetate (EAA) was introduced into perovskite precursor solution as a novel bifocal additive. • The sample showed better film quality, reduced defects, and suppressed non-radiative recombination. • The optimal perovskite solar cells exhibit power conversion efficiency over 22% with suppressed hysteresis. • The devices with EAA additive exhibited excellent humidity stability. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

23. HCFN: Hierarchical cross-modal shared feature network for visible-infrared person re-identification.

Author: Li, Yueying, Zhang, Huaxiang, and Liu, Li
Subjects: *INFRARED imaging, *HIERARCHICAL Bayes model, *GRAPH algorithms, *IMAGING systems, *LEARNING
Abstract: Compared with traditional visible–visible person re-identification, the modality discrepancy between visible and infrared images makes person re-identification more challenging. Existing methods rely on learning efficient transformation mechanisms in paired images to reduce the modality gap, which inevitably introduces noise. To get rid of these limitations, we propose a Hierarchical Cross-modal shared Feature Network (HCFN) to mine modality-shared and modality-specific information. Since infrared images lack color and other information, we construct an Intra-modal Feature Extraction Module (IFEM) to learn the content information and reduce the difference between visible and infrared images. In order to reduce the heterogeneous division, we apply a Cross-modal Graph Interaction Module (CGIM) to align and narrow the set-level distance of the inter-modal images. By jointly learning two modules, our method can achieve 66.44% Rank-1 on SYSU-MM01 dataset and 74.81% Rank-1 on RegDB datasets, respectively, which is superior compared with the state-of-the-art methods. In addition, ablation experiments demonstrate that HCFN is at least 4.9% better than the baseline network. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

24. A novel image retrieval method based on multi-trend structure descriptor.

Author: Zhao, Meng, Zhang, Huaxiang, and Sun, Jiande
Subjects: *VISUAL communication, *DIGITAL techniques for visual communication, *IMAGE representation, *COMPUTER graphics research
Abstract: This paper proposes an image feature representation method, namely Multi-Trend Structure Descriptor (MTSD), which is built based on the local and multi-trend structures. The local structures can be regarded as the basic units for image analysis, and the multi-trend structures are introduced to explore the correlation among pixels in local structures according to the information change of pixels. The visual information such as color, edge orientation and intensity map are considered and quantized, and with the local structure as a bridge, we use multi-trend to detect color, edge orientation and intensity map respectively for feature extraction. MTSD can characterize not only the low-level features, such as color, shape and texture, but also the local spatial structure information. We evaluate the performance of the proposed algorithm on Corel and Caltech datasets, and experimental results demonstrate that, MTSD significantly outperforms texton co-occurrence matrix, multi-texton histogram, micro-structure descriptor and saliency structure histogram. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

25. RWO-Sampling: A random walk over-sampling approach to imbalanced data classification.

Author: Zhang, Huaxiang and Li, Mingfang
Subjects: *RANDOM walks, *STATISTICAL sampling, *DATA analysis, *APPROXIMATION theory, *SCALABILITY, *NETWORK performance
Abstract: Highlights: • A random walk over-sampling approach is proposed to generate instances. • The generated data widen the classification border. • The generated data and the original data approximately obey similar distributions. • The classifier learned is unbiased and has high scalability. • A broad experimental evaluation is performed. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

26. Coarse-to-fine dual-level attention for video-text cross modal retrieval.

Author: Jin, Ming, Zhang, Huaxiang, Zhu, Lei, Sun, Jiande, and Liu, Li
Subjects: *MACHINE learning, *LEARNING modules
Abstract: The effective representation of video features plays an important role in video-text cross-modal retrieval, and many researchers either use a single modal feature of the video or simply combine multi-modal features of the video. This makes the learned video features less robust. To enhance the robustness of video feature representation, we use a coarse- and fine-grained parallel attention model and a feature fusion module to learn a more effective video feature representation. Among them, coarse-grained attention learns the relationship between different feature blocks in the same modality feature, and fine-grained attention applies attention to global features and strengthens the connection between points. Coarse-grained attention and fine-grained attention complement each other. We integrate a multi-head attention network into the model to expand the receptive field for features, and use the feature fusion module to further reduce the semantic gap between different video modalities. Our proposed model architecture not only strengthens the relationship between global features and local features, but also compensates for the differences between different modality features in the video. Evaluation on three widely used datasets, ActivityNet-Captions, MSRVTT and LSMDC, demonstrates its effectiveness. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

27. A spectral clustering based ensemble pruning approach.

Author: Zhang, Huaxiang and Cao, Linlin
Subjects: *CLUSTER analysis (Statistics), *STATISTICAL ensembles, *BOOTSTRAP aggregation (Algorithms), *CLASSIFICATION algorithms, *ARTIFICIAL neural networks, *ARTIFICIAL intelligence
Abstract: This paper introduces a novel bagging ensemble classifier pruning approach. Most investigated pruning approaches employ heuristic functions to rank classifiers in the ensemble, and select part of them from the ranked ensemble, so redundancy may exist in the selected classifiers. Based on the idea that the selected classifiers should be accurate and diverse, we define classifier similarity according to the predictive accuracy and the diversity, and introduce a Spectral Clustering based classifier selection approach (SC). SC groups the classifiers into two clusters based on the classifier similarity, and retains one cluster of classifiers in the ensemble. Experimental results show that SC is competitive in terms of classification accuracy. [Copyright © Elsevier]
Published: 2014
Full Text: View/download PDF

28. A locality correlation preserving support vector machine.

Author: Zhang, Huaxiang, Cao, Linlin, and Gao, Shuang
Subjects: *SUPPORT vector machines, *STATISTICAL correlation, *MAXIMUM entropy method, *DATA analysis, *CANONICAL correlation (Statistics), *INFORMATION theory
Abstract: This paper proposes a locality correlation preserving based support vector machine (LCPSVM) by combining the idea of margin maximization between classes and local correlation preservation of class data. It is a Support Vector Machine (SVM) like algorithm, which explicitly considers the locality correlation within each class in the margin and the penalty term of the optimization function. Canonical correlation analysis (CCA) is used to reveal the hidden correlations between two datasets, and a variant of the correlation analysis model which implements locality preserving has been proposed by integrating local information into the objective function of CCA. Inspired by the idea used in canonical correlation analysis, we propose a locality correlation preserving within-class scatter matrix to replace the within-class scatter matrix in the minimum class variance support vector machine (MCVSVM). This substitution has the property of keeping the locality correlation of data, and inherits the properties of SVM and other similar modified classes of support vector machines. LCPSVM is discussed under linearly separable, small sample size and nonlinearly separable conditions, and experimental results on benchmark datasets demonstrate its effectiveness. [Copyright © Elsevier]
Published: 2014
Full Text: View/download PDF

29. Creating ensembles of classifiers via fuzzy clustering and deflection

Author: Zhang, Huaxiang and Lu, Jing
Subjects: *FUZZY sets, *DISCRIMINANT analysis, *CLUSTER analysis (Statistics), *PATTERN perception, *STATISTICAL sampling, *ENTROPY (Information theory)
Abstract: Ensembles of classifiers can increase the performance of pattern recognition, and have become a hot research topic. High classification accuracy and diversity of the component classifiers are essential to obtain good generalization capability of an ensemble. We review the methods used to learn diverse classifiers, employ fuzzy clustering with deflection to learn the distribution characteristics of the training data, and propose a novel sampling approach to generate training data sets for the component classifiers. Our approach increases the classification accuracy and diversity of the component classifiers. The approach is evaluated using the base classifier C4.5, and the experimental results show that it outperforms Bagging and AdaBoost on almost all of the 20 randomly selected benchmark UCI data sets. [Copyright © Elsevier]
Published: 2010
Full Text: View/download PDF

30. Semi-supervised fuzzy clustering: A kernel-based approach

Author: Zhang, Huaxiang and Lu, Jing
Subjects: *COMPUTER algorithms, *DOCUMENT clustering, *FUZZY systems, *CLASSIFICATION, *PARAMETER estimation, *PERFORMANCE evaluation, *MATHEMATICAL optimization, *ERROR analysis in mathematics
Abstract: Semi-supervised clustering algorithms aim to improve the clustering accuracy under the supervision of a limited amount of labeled data. Since kernel-based approaches, such as the kernel-based fuzzy c-means algorithm (KFCM), have been successfully used in classification and clustering problems, in this paper we propose a novel semi-supervised clustering approach using the kernel-based method based on KFCM and denote it the semi-supervised kernel fuzzy c-means algorithm (SSKFCM). The objective function of SSKFCM is defined by adding classification errors of both the labeled and the unlabeled data, and its global optimum has been obtained through repeatedly updating the fuzzy memberships and the optimized kernel parameter. The objective function may have more than one local optimum, so we employ a function transformation technique to reformulate the objective function after a local minimum has been obtained, and select the best optimum as the solution to the objective function. Experimental results on both artificial and several real data sets show that SSKFCM performs better than its conventional counterparts and that it achieves the most accurate clustering results when the parameter is optimized. [Copyright © Elsevier]
Published: 2009
Full Text: View/download PDF

31. An adaptive policy gradient in learning Nash equilibria

Author: Zhang, Huaxiang and Fan, Ying
Subjects: *NASH equilibrium, *ALGORITHMS, *GAME theory, *ITERATIVE methods (Mathematics), *LEARNING, *COMPUTER networks
Abstract: A novel Nash equilibria (NE) learning algorithm for finite strategic games is presented in this paper. Based on the assumption that each player tries to maximize his own payoff, the algorithm explores the policies of the players in the policy profile space to increase the payoffs in each learning iteration. This paper investigates the effectiveness of the algorithm and shows experimentally that the proposed algorithm accelerates the policy learning process. This algorithm learns faster than other proposed intelligent learning approaches, and can learn almost all the existing NE for a finite strategic game. [Copyright © Elsevier]
Published: 2008
Full Text: View/download PDF

32. Label projection online hashing for balanced similarity.

Author: Fang, Yuzhi, Zhang, Huaxiang, and Liu, Li
Subjects: *HASHING, *CLOUD storage, *INFORMATION retrieval, *BINARY codes, *VECTOR analysis
Abstract: Since online hashing has the advantages of low storage and fast calculation, it attracts the attention of many scholars. However, the learning of new data streams separates the similarity between new data and existing data in many online hashing methods, which leads to poor retrieval performance. In addition, the similarity measure ignores the expression of different similarity. In this paper, we propose a novel supervised method, namely Label Projection Online Hashing for Balanced Similarity (LPOH). Compared with existing online hashing methods, LPOH aims to effectively establish the projection between the label vector and the binary code, and to describe the different similarity between data with the same label. Specifically, LPOH overcomes the problem of similarity deviation caused by data imbalance by establishing a mapping matrix to derive a relationship between the data label vector and the binary code. Furthermore, the error between the binary code and the hash function concerning data streams is described. Extensive experiments on three widely used benchmark datasets demonstrate that LPOH outperforms the state-of-the-art online hashing methods. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

33. Convergent gradient ascent with momentum in general-sum games

Author: Zhang, Huaxiang and Huang, Shangteng
Subjects: *ALGORITHMS, *ITERATIVE methods (Mathematics), *NUMERICAL analysis, *GAME theory
Abstract: We discuss the recent work in policy gradient learning in general-sum games, and address the drawbacks in policy convergence of algorithms such as IGA and WOLF-IGA. We propose a novel learning algorithm M-IGA by adding momentum terms to policy iterations. M-IGA is guaranteed to converge to Nash equilibrium policies against an M-IGA learner. [Copyright © Elsevier]
Published: 2004
Full Text: View/download PDF

34. Iterative graph attention memory network for cross-modal retrieval.

Author: Dong, Xinfeng, Zhang, Huaxiang, Dong, Xiao, and Lu, Xu
Subjects: *REPRESENTATIONS of graphs, *ARTIFICIAL neural networks, *MEMORY, *SEMANTICS, *MODAL logic
Abstract: How to eliminate the semantic gap between multi-modal data and effectively fuse multi-modal data is the key problem of cross-modal retrieval. The abstractness of semantics makes semantic representation one-sided. In order to obtain complementary semantic information for samples with the same semantics, we construct a local graph for each instance and utilize a graph feature extractor (GFE) to reconstruct the sample representation based on the adjacency relationship between the sample itself and its neighbors. Owing to the problem that some cross-modal methods only focus on the learning of paired samples and cannot integrate more cross-modal information from the other modalities, we propose a cross-modal graph attention strategy to generate the graph attention representation for each sample from the local graph of its corresponding paired sample. In order to eliminate heterogeneous gap between modalities, we fuse the features of the two modalities using a recurrent gated memory network to choose prominent features from other modalities and filter out unimportant information to obtain a more discriminative feature representation in the common latent space. Experiments on four benchmark datasets demonstrate the superiority of our proposed model compared with state-of-the-art cross-modal methods. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

35. Adaptive evolutionary programming based on reinforcement learning

Author: Zhang, Huaxiang and Lu, Jing
Subjects: *RESEARCH, *ALGORITHMS, *EVOLUTIONARY computation, *REINFORCEMENT learning
Abstract: This paper studies evolutionary programming and adopts reinforcement learning theory to learn individual mutation operators. A novel algorithm named RLEP (Evolutionary Programming based on Reinforcement Learning) is proposed. In this algorithm, each individual learns its optimal mutation operator based on the immediate and delayed performance of mutation operators. Mutation operator selection is mapped into a reinforcement learning problem. Reinforcement learning methods are used to learn optimal policies by maximizing the accumulated rewards. According to the calculated Q function value of each candidate mutation operator, an optimal mutation operator can be selected to maximize the learned Q function value. Four different mutation operators have been employed as the basic candidate operators in RLEP and one is selected for each individual in different generations. Our simulation shows the performance of RLEP is the same as or better than the best of the four basic mutation operators. [Copyright © Elsevier]
Published: 2008
Full Text: View/download PDF
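
The operator-selection idea in the abstract of item 35 (keep a Q value per candidate mutation operator and pick the one with the highest value) can be sketched with a generic tabular Q-learning update. This is an illustration only, not RLEP itself: the single-state formulation, the helper names `select_operator` and `update_q`, and the values of `alpha`, `gamma` and the reward are all assumptions.

```python
def select_operator(q_values):
    """Return the index of the operator with the highest Q value (greedy choice)."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def update_q(q_values, op, reward, alpha=0.5, gamma=0.9):
    """Apply one standard Q-learning update, treating the problem as single-state.

    `reward` stands in for the immediate fitness gain; the discounted max term
    stands in for the delayed performance mentioned in the abstract.
    """
    target = reward + gamma * max(q_values)
    q_values[op] += alpha * (target - q_values[op])
    return q_values


# Four candidate mutation operators, all starting with Q = 0 (assumed setup).
q = [0.0, 0.0, 0.0, 0.0]
update_q(q, op=2, reward=1.0)  # hypothetical: operator 2 improved the offspring
best = select_operator(q)
```

In a full evolutionary loop each individual would hold its own Q table and repeat this update every generation; an exploration rule (for example epsilon-greedy) would normally be added so that untried operators are still sampled.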

36. The Resource Utilization of Poplar Leaves for CO2 Adsorption.

Author: Wang, Xia, Kong, Fanyuan, Zeng, Wulan, Zhang, Huaxiang, Xin, Chunling, and Kong, Xiangjun
Subjects: *CARBON dioxide, *POROSITY, *POPLARS, *ADSORPTION (Chemistry), *ADSORPTION capacity, *POLYPYRROLE
Abstract: Every late autumn, fluttering poplar leaves scatter throughout the campus and city streets. In this work, poplar leaves were used as the raw material, while H3PO4 and KOH were used as activators and urea was used as the nitrogen source to prepare biomass-based activated carbons (ACs) to capture CO2. The pore structures, functional groups and morphology, and desorption performance of the prepared ACs were characterized; the CO2 adsorption, regeneration, and kinetics were also evaluated. The results showed that H3PO4 and urea obviously promoted the development of pore structures and pyrrole nitrogen (N–5), while KOH and urea were more conducive to the formation of hydroxyl (–OH) and ether (C–O) functional groups. At optimal operating conditions, the CO2 adsorption capacity of H3PO4– and KOH–activated poplar leaves after urea treatment reached 4.07 and 3.85 mmol/g, respectively, at room temperature; both showed stable regenerative behaviour after ten adsorption–desorption cycles. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. Level set method with Retinex‐corrected saliency embedded for image segmentation.

Author: Liu, Dongmei, Chang, Faliang, Zhang, Huaxiang, and Liu, Li
Subjects: *IMAGE segmentation, *DIGITAL image processing, *IMAGE analysis, *COMPUTER vision, *LEVEL set methods
Abstract: It can be a very challenging task when using level set method segmenting natural images with high intensity inhomogeneity and complex background scenes. A new synthesis level set method for robust image segmentation based on the combination of Retinex‐corrected saliency region information and edge information is proposed in this work. First, the Retinex theory is introduced to correct the saliency information extraction. Second, the Retinex‐corrected saliency information is embedded into the level set method due to its advantageous quality which makes a foreground object stand out relative to the backgrounds. Combined with the edge information, the boundary of segmentation will be more precise and smooth. Experiments indicate that the proposed segmentation algorithm is efficient, fast, reliable, and robust. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

38. Deep semantic cross modal hashing with correlation alignment.

Author: Zhang, Meijia, Li, Junzheng, Zhang, Huaxiang, and Liu, Li
Subjects: *HASHING, *DATA distribution, *INFORMATION retrieval
Abstract: • We construct a new similarity for the multi-label data, which can well exploit the semantic information and improve the retrieval accuracy. • The inter-modal similarity of heterogeneous data features is preserved, which can exploit semantic correlation, and the distributions of heterogeneous data are aligned to mine the inter-modal correlation well. • The semantic label information is embedded in the hash layer of text network, which can make the learned hash matrix more stable and make hash codes more discriminative. • The results on MIRFLICKR-25K, NUS-WIDE-10, and NUS-WIDE-21 datasets demonstrate that our proposed method outperforms the state-of-the-art methods. Hashing has been extensively applied to cross modal retrieval due to its low storage and high efficiency. Deep hashing which can well extract features of multi-modal data has received increasing research attention recently. However, most of deep hashing for cross modal retrieval methods do not make full use of the semantic label information and do not fully mine correlation of heterogeneous data. In this paper, we propose a Deep Semantic cross modal hashing with Correlation Alignment (DSCA) method. In DSCA, we design two deep neural networks for image and text modality separately, and learn two hash functions. Firstly, we construct a new similarity for the multi-label data, which can well exploit the semantic information and improve the retrieval accuracy. Simultaneously, we preserve the inter-modal similarity of heterogeneous data features, which can exploit semantic correlation. Secondly, the distributions of heterogeneous data are aligned so as to mine the inter-modal correlation well. Thirdly, the semantic label information is embedded in the hash layer of the text network, which can make the learned hash matrix more stable and make the hash codes more discriminative. Experimental results demonstrate that DSCA outperforms the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

39. Semantic convex matrix factorisation for cross‐media retrieval.

Author: Fang, Yixian, Ren, Yuwei, and Zhang, Huaxiang
Abstract: When utilising matrix factorisation to extract latent features for cross‐media retrieval, semantic information may be lost in the process of factorisation. In addition, many presented approaches directly mapped different modalities into an isomorphic semantic space to conduct the similarity measurement of different modalities, which also resulted in the loss of crucial information. To address these problems, a semantic convex matrix factorisation subspace learning approach is proposed for cross‐media retrieval between image and text. The proposed method can extract an intermediate‐level feature representation for the high dimensional image modality in order to weaken the loss of information, in the meantime, learn a semantic feature representation with semantic information for the lower dimension text modality to strengthen the discriminated capability. After that, the intermediate‐level feature representation of image is mapped into a latent semantic space by a projection matrix. Then the similarity of different modalities can be estimated in terms of uniform dimensional latent feature representations. Experimental results on three benchmark datasets demonstrate the superiority of the proposed approach over several state‐of‐the‐art approaches. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

40. Indirect relation based individual metabolic network for identification of mild cognitive impairment.

Author: Li, Ying, Yao, Zhijun, Zhang, Huaxiang, and Hu, Bin
Subjects: *MILD cognitive impairment, *BRAIN stimulation, *ALZHEIMER'S disease, *KERNEL functions, *GAUSSIAN distribution
Abstract: Highlights: • The features of network with indirect relation achieve better performance than those of network with direct relation in MCI identification. • In MCI diagnosis and conversion prediction, performance is significantly improved by combining indirect relation based network with ADAS-cog scores. • Correct biomarkers are identified by features of network with indirect relation for diagnosing MCI and predicting its conversion. Background: Optimized abnormalities of individual brain network may allow earlier detection of mild cognitive impairment (MCI) and accurate prediction of its conversion to Alzheimer's disease (AD). Currently, most studies constructed individual networks based on region-to-region correlation without employing multi-region information. In order to develop the potential discriminative power of network and provide supportive evidence for feasibility of individual metabolic network study, we propose a new approach to extract features from network with indirect relation based on 18F-FDG PET data. New Method: Direct relation based individual network is first constructed using Gaussian kernel function. After that, the lattice-close-degree in fuzzy mathematics is applied to reflect region-to-region indirect relation using the direct relations of regions and their common neighbors. The proposed approach has been evaluated on 199 MCI subjects and 166 normal controls (NC) using SVM classifier. Results: The indirect relation based network features significantly promote classification performance in separating MCI from normal controls (NC) as well as MCI converters from non-converters. In particular, further improvements can be obtained by combining indirect relation features with ADAS-cog scores. Moreover, the discriminative regions we found are consistent with previous studies, indicating the efficacy of our constructed network in identifying correct biomarkers for diagnosing MCI and predicting its conversion. Comparison with Existing Method(s): More accurate MCI identification of PET data can be achieved by features of network with indirect relation. Conclusions: This work provides a new way to investigate brain network from metabolic perspective for accurate identification of MCI. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

41. CanBiPT: Cancelable biometrics with physical template.

Author: Liu, Hao, Gao, Youjun, Liu, Chengcheng, Sun, Jiande, Guo, Xin, Zhang, Huaxiang, and Wan, Wenbo
Subjects: *BIOMETRY, *DIGITAL technology, *BIOMETRIC identification, *PROBLEM solving, *STICKERS, *FACE
Abstract: • A method of cancelable biometrics with physical-template was proposed. • Two block-based methods to select the key region of the face were proposed. • A new dataset named CanBiPT dataset for testing our method in the physical world was built. • The experiments demonstrate that our method is feasible and reliable in both physical and digital world. Cancelable biometrics are designed to solve the problem where biological characteristics cannot be cancelled and reissued. At present, most of the cancelable biometrics are carried out in the digital domain, therefore we call them cancelable biometrics with soft-template. In this paper, we propose a method called Cancelable Biometrics with Physical-Template (CanBiPT). The physical-template is a printed sticker that can be worn on the selected region of the face. First, in the face image, we select a specific region referring to the entropy between the faces with and without the physical-template, i.e., the sticker. Second, we generate an image with individual features and this image is called the physical-template which can be printed as a sticker. And the sticker is worn on the selected region of the face in the first step and the face with the sticker is used for authentication, recognition and so on. Different physical-templates can be formed by changing the regions (forehead, nose or mouth) or appearance according to the parameters in the algorithm. At the same time, the image printed as the sticker can also be used for the soft-template. The experiments on the public datasets, i.e., LFW and MS-Celeb-1M datasets and our own dataset, i.e., CanBiPT dataset, demonstrate the feasibility and effectiveness of the proposed method with both physical-templates and soft-templates. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

42. Attribute-aware style adaptation for person re-identification.

Author: Qu, Xiaofeng, Liu, Li, Zhu, Lei, and Zhang, Huaxiang
Subjects: *COGNITIVE styles, *PROBLEM solving, *PEDESTRIANS, *REINFORCEMENT learning, *DEEP learning, *BODY-weight-supported treadmill training
Abstract: Person re-identification (re-ID) aims to address a unique challenge in cross-camera pedestrian retrieval, especially in the case of incomplete attribute annotation. In recent years, a robust algorithm based on a generative model has been proposed that can achieve rapid convergence by extending the training data. However, these pipelines are developed separately from re-ID learning and ignore the fine-grained extension to adapt the camera style. To solve this problem, a joint learning framework is proposed in this work to implement end-to-end optimization and ultimately achieve high-quality images and impressive performance for person re-ID. In this work, an attribute-aware style adaptation based on CamStyle, called AA-CamStyle, is designed to combine fine-grained style adaptation and discriminative person re-ID. The AA-CamStyle model integrates the critical attributes into the generative learning to smooth the differences in camera style while maintaining the fine-grained information through joint representation learning of multiple styles, including attribute-aware and camera-aware. Attribute-aware (AA) strategy is applied to recommend the transmission of appropriate attributes of each pedestrian, resulting in AA-CamStyle's tremendous quality of translated images compared to existing models. We empirically demonstrate the effectiveness of the proposed approach on person re-ID tasks. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

43. Cloudde: A Heterogeneous Differential Evolution Algorithm and Its Distributed Cloud Version.

Author: Zhan, Zhi-Hui, Liu, Xiao-Fang, Zhang, Huaxiang, Yu, Zhengtao, Weng, Jian, Li, Yun, Gu, Tianlong, and Zhang, Jun
Subjects: *DIFFERENTIAL evolution, *COMPUTER algorithms, *CLOUD computing, *EVOLUTIONARY algorithms, *HETEROGENEOUS computing
Abstract: Existing differential evolution (DE) algorithms often face two challenges. The first is that the optimization performance is significantly affected by the ad hoc configurations of operators and parameters for different problems. The second is the long runtime for real-world problems whose fitness evaluations are often expensive. Aiming at solving these two problems, this paper develops a novel double-layered heterogeneous DE algorithm and realizes it in cloud computing distributed environment. In the first layer, different populations with various parameters and/or operators run concurrently and adaptively migrate to deliver robust solutions by making the best use of performance differences among multiple populations. In the second layer, a set of cloud virtual machines run in parallel to evaluate fitness of corresponding populations, reducing computational costs as offered by cloud. Experimental results on a set of benchmark problems with different search requirements and a case study with expensive design evaluations have shown that the proposed algorithm offers generally improved performance and reduced computational time, compared with not only conventional and a number of state-of-the-art DE variants, but also a number of other distributed DE and high-performing evolutionary algorithms. The speedup is significant especially on expensive problems, offering high potential in a broad range of real-world applications. [ABSTRACT FROM PUBLISHER]
Published: 2017
Full Text: View/download PDF

44. Multi-strategy differential evolution algorithm based on adaptive hash clustering and its application in wireless sensor networks.

Author: Bu, Xianglong, Zhang, Qingke, Gao, Hao, and Zhang, Huaxiang
Subjects: *DIFFERENTIAL evolution, *WIRELESS sensor networks, *ALGORITHMS, *GLOBAL optimization
Abstract: Population-based algorithms aim to explore the entire solution space in global numerical optimization problems. However, it is important to acknowledge that the solution spaces of different problems possess distinct characteristics, and even distinct regions within the same solution space can vary significantly. Efficient exploration of these diverse regions necessitates the utilization of distinct search models. To address this challenge, this study proposes a new variant of the Differential Evolution (DE) algorithm called MHDE. The MHDE algorithm introduces hash clustering technology combined with adaptive mutation strategies. The integration of hash clustering technology enables fast population clustering, significantly enhancing clustering efficiency. Additionally, a method for evaluating the population state is designed so that the clustering quantity adapts to the population state. Furthermore, a novel method is devised to calculate individual improvements considering the varying levels of difficulty that individuals face in achieving improvements within a population. This method is combined with a parameter adaptive mechanism, resulting in a weighted parameter adaptive mechanism. Experiments are conducted on the CEC2017 benchmark suite to evaluate the performance of the MHDE algorithm. The experimental results demonstrate that MHDE exhibits competitive performance compared with other efficient DE variants. Moreover, the MHDE algorithm is applied to the node deployment problem in wireless sensor networks (WSNs). The MHDE algorithm demonstrates efficient performance in node deployment problems through simulation experiments in different scenarios. Highlights: • Introducing a BLSH technique for population clustering. • Designing a new method to evaluate population states. • Presenting a novel approach for calculating individual contributions. • Utilizing an external population to accelerate algorithm convergence. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

45. Causality-aware Enhanced Model for Multi-hop Question Answering over Knowledge Graphs.

Author: Sui, Yuan, Feng, Shanshan, Zhang, Huaxiang, Cao, Jian, Hu, Liang, and Zhu, Nengjun
Subjects: *KNOWLEDGE graphs, *QUESTION answering systems, *LOGIC, *LATENT variables, *CAUSAL models, *CHARTS, diagrams, etc.
Abstract: To improve the performance of knowledge graph-based question answering (KGQA) systems, several approaches have been developed to construct a semantic parser based on entity linking, relation identification and logical/numerical structure identification. However, existing methods arrive at answers only by maximizing the data likelihood on the sparse or imbalanced explicit relations, ignoring the potentially large number of latent relations. This makes KGQA suffer from a high level of spurious entity relations and from the missing-link challenge. In this paper, we propose a causal filter (CF) model for KGQA (CF-KGQA), which performs causal intervention on the relation representation space to reduce the spurious relation representation in a data-driven manner, i.e., the goal of this work is to comprehensively discover disentangled latent factors to alleviate the spurious correlation problem in KGQA. The model comprises a causal pairwise aggregator (A_P) and a disentangled latent factor aggregator (A_C). The former filters out most spurious entity relations inconsistent with their dense groups' neighborhood, and generates a causal pairwise matrix among all the candidate relations. The latter learns the latent relation representation via an encoder–decoder on the causal pairwise matrix. It disconnects the latent factor and the causal confounder beneath the knowledge embedding space by causal intervention. To prove the effectiveness and efficiency of the proposed approach, we test CF-KGQA and other state-of-the-art methods on four public real-world datasets. The experiments indicate that our approach outperforms the recent methods and is also less sensitive to the spurious correlation problem, thus demonstrating the robustness of CF-KGQA. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

46. Dual-path image pair joint discrimination for visible–infrared person re-identification.

Author: Wang, Zhongjie, Liu, Li, and Zhang, Huaxiang
Subjects: *INFRARED imaging, *IDENTIFICATION
Abstract: Because the imaging spectra of infrared images and visible light images are different, there is a huge modal difference between visible light images and infrared ones. Existing methods use image conversion to solve the problem of modal difference between two images, but these methods usually fail to focus on the complete information of images, which makes the results of cross-modal person re-identification unstable. To solve this problem, we propose a new visible–infrared person re-identification method, called dual-path image pair joint discriminant model (DPJD), which simultaneously optimizes the distance within and between classes, and supervises the network learning to identify feature representations. We generate images with different modalities for the samples, and separately compose the same modality image pair and different modality image pair so as to overcome the inconsistent alignment issues. In addition, we also propose a discriminant module based on dual-path (DMDP) to improve the generation quality and discrimination accuracy of image pairs. Experiments on two benchmark datasets, SYSU-MM01 and RegDB, demonstrate its effectiveness. Highlights: • We propose an image pair generation module, which can separate the modal information and attribute information of the image and realize arbitrary modal conversion. • We design an effective dual-path discrimination module to make both modal features and attribute features be fully used for image recognition. • Experimental results on two commonly used benchmark datasets show that our method is more efficient and works better than the compared methods. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

47. Deep Discrete Cross-Modal Hashing with Multiple Supervision.

Author: Yu, En, Ma, Jianhua, Sun, Jiande, Chang, Xiaojun, Zhang, Huaxiang, and Hauptmann, Alexander G.
Subjects: *MODAL logic, *SUPERVISION, *AUTOMATED storage retrieval systems, *INFORMATION retrieval
Abstract: Deep hashing has been widely used for large-scale cross-modal retrieval, benefiting from its low storage cost and fast search speed. However, most existing deep supervised methods only preserve the instance-pairwise relationship supervised by the semantic similarity matrix, which is often insufficient to capture heterogeneous correlation. Thus, we propose the Deep Discrete Cross-Modal Hashing with Multiple Supervision (DDCHms) to further enhance the semantic consistency of heterogeneous modalities. It improves the performance of semantic information retrieval with the joint supervision of instance-pairwise, instance-labeled and class-wise similarities. Specifically, we firstly utilize the instance-pairwise similarity matrix to supervise the learning process of heterogeneous networks and it keeps the pairwise correlation from the perspective of instance-instance. Specially, we design a semantic network to fully exploit the semantic information implicated in labels, which is also used to supervise multi-modal networks on instance-label level. Furthermore, we propose the class-wise hash codes to cooperate with the intrinsic label matrix as the prototypes, and it guides the hash learning and further ensures the precision and compactness of the learned hash codes. In addition, we design different discrete optimization strategies to optimize the class-wise hash codes and unified hash codes, respectively. That avoids the optimization errors and ensures the high quality of learned hash codes. Experiments on three popular datasets indicate that our method outperforms other state-of-the-art methods in terms of cross-modal retrieval. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

48. An operator pre-selection strategy for multiobjective evolutionary algorithm based on decomposition.

Author: Yan, Zeyuan, Tan, Yanyan, Chen, Hongling, Meng, Lili, and Zhang, Huaxiang
Subjects: *EVOLUTIONARY algorithms, *PSYCHOLOGICAL feedback, *COEVOLUTION
Abstract: Evolutionary algorithms (EAs) are a population-based optimization method that adopts survival-of-the-fittest rules. The performance of EAs can be greatly affected by offspring quality. Many researchers hope to adjust the parameters of evolutionary operators to improve offspring quality, but most current methods are regulated by the feedback model, which does not recognize offspring quality. Instead, it adjusts an operator's search structure dynamically based on its historical performance. In fact, there is a co-evolutionary effect between different operators. An operator will not always produce offspring with the best quality, so the feedback model only selects the best operator in history to generate offspring, which ignores the capacity for co-evolution. In this paper, we propose an operator pre-selection strategy (OPS). First, to ensure operators' capacity for co-evolution, we construct an operator pool using five operators. Second, we select the best offspring based on the classification and feedback models. Finally, we evaluate the offspring and include it in the evolution. Compared with the traditional feedback model, OPS directly identifies higher-quality offspring. Even if the historical performance of an operator is poor, as long as it can generate excellent offspring, it is selected. Experimental results demonstrate the effectiveness of OPS. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

50. Adversarial Graph Convolutional Network for Cross-Modal Retrieval.

Author: Dong, Xinfeng, Liu, Li, Zhu, Lei, Nie, Liqiang, and Zhang, Huaxiang
Subjects: *MODAL logic, *REPRESENTATIONS of graphs, *GENERATIVE adversarial networks
Abstract: The completeness of semantic expression plays an important role in cross-modal retrieval tasks, as it helps align the cross-modal data and thus narrow the modality gap. However, due to the abstractness of semantics, the same topic may need several aspects to be well described, so expressing semantics with only one sample may be incomplete. In order to obtain semantic complementary information and strengthen similar information for samples with the same semantics, we utilize a graph convolutional network (GCN) to reconstruct the sample representation based on the adjacency relationship between the sample itself and its neighborhoods. We construct a local graph for each instance, and propose a novel Graph Feature Generator based on GCN and a fully-connected network to reconstruct node features based on local graph and map the features of two modalities into a common space. The Graph Feature Generator and Graph Feature Discriminator adopt a minimax game strategy to generate modality-invariant graph feature representations. Experiments on three benchmark datasets demonstrate the superiority of our proposed model compared with several state-of-the-art methods. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF
