12,161 results
Search Results
52. Prior knowledge-infused Self-Supervised Learning and explainable AI for Fault Detection and Isolation in PEM electrolyzers
- Author
- Dash, Balyogi Mohan, Bouamama, Belkacem Ould, Pekpe, Komi Midzodzi, and Boukerdja, Mahdi
- Published
- 2024
53. Open set domain adaptation with latent structure discovery and kernelized classifier learning.
- Author
- Tang, Yongqiang, Tian, Lei, and Zhang, Wensheng
- Subjects
- *VISUAL accommodation, *LEARNING strategies, *KNOWLEDGE transfer, *PHYSIOLOGICAL adaptation, *LATENT variables
- Abstract
Numerous visual domain adaptation methods have been proposed for transferring knowledge from a well-labeled source domain to an unlabeled but related target domain. Most existing works are geared only to closed set domain adaptation, where an identical label space is shared between the two domains. In this paper, we focus on a more realistic but challenging scenario, open set domain adaptation, where the target domain contains unknown classes that do not appear in the label space of the source domain. The main task of open set domain adaptation is to correctly recognize the target images of known classes and those of unknown classes simultaneously. To achieve this goal, we propose a novel open set domain adaptation method, which consists of two parts: latent structure discovery and kernelized classifier learning. In the first part, we employ an adaptive discriminative graph learning strategy to capture the intrinsic manifold structure of the source and target domain data in the latent feature space, such that the boundaries among all classes are delineated more clearly. In the second part, the samples from the latent feature space are mapped into a high-dimensional kernel space to make them linearly separable, and a linear classifier is learned by jointly separating unknown target samples, matching known samples, and preserving local structure. As the optimization problem is not jointly convex in all variables, we devise an efficient iterative algorithm to solve it. Extensive experimental results on five image datasets confirm the superiority of the proposed method compared with state-of-the-art traditional and deep competitors. [ABSTRACT FROM AUTHOR]
- Published
- 2023
54. W-NetPan: Double-U network for inter-sensor self-supervised pan-sharpening.
- Author
- Fernandez-Beltran, Ruben, Fernandez, Rafael, Kang, Jian, and Pla, Filiberto
- Subjects
- *DEEP learning, *REMOTE sensing, *MULTISENSOR data fusion, *CONVOLUTIONAL neural networks
- Abstract
The increasing availability of remote sensing data makes it possible to address spatial-spectral limitations by means of pan-sharpening methods. However, fusing inter-sensor data poses important challenges in terms of resolution differences, sensor-dependent deformations and ground-truth data availability, which demand more accurate pan-sharpening solutions. In response, this paper proposes a novel deep learning-based pan-sharpening model termed the double-U network for self-supervised pan-sharpening (W-NetPan). In more detail, the proposed architecture adopts an innovative W-shape that integrates two U-Net segments which work sequentially to spatially match and fuse inter-sensor multi-modal data. In this way, a synergic effect is produced where the first segment resolves inter-sensor deviations while stimulating the second one to achieve a more accurate data fusion. Additionally, a joint loss formulation is proposed for effectively training the proposed model without external data supervision. The experimental comparison, conducted over four coupled Sentinel-2 and Sentinel-3 datasets, reveals the advantages of W-NetPan with respect to several of the most important state-of-the-art pan-sharpening methods available in the literature. The code related to this paper will be available at https://github.com/rufernan/WNetPan. [ABSTRACT FROM AUTHOR]
- Published
- 2023
55. Efficient data-driven behavior identification based on vision transformers for human activity understanding.
- Author
- Yang, Jiachen, Zhang, Zhuo, Xiao, Shuai, Ma, Shukun, Li, Yang, Lu, Wen, and Gao, Xinbo
- Subjects
- *HUMAN activity recognition, *HUMAN behavior, *COMPUTER vision, *PHYSICAL activity, *HUMAN beings, *ENTROPY (Information theory)
- Abstract
• We focus on the data dilemma encountered in the field of human activity understanding, solve practical application problems from a new perspective, and use the proposed method to reduce the model's dependence on data.
• We construct Human SA-10, a human physical activity dataset containing 10 categories, for use in human activity understanding research.
• A Core-Weight Entropy data information evaluation method that can be applied to human behavior recognition tasks is proposed. On Human SA-10, our method can reduce data usage by 50%. Compared to other methods, this method achieved state-of-the-art performance.
• In addition, we propose a new module for removing redundant information from unlabeled data, which effectively avoids introducing similar data into the training set.
With the development of computer vision, research on human activity understanding has advanced greatly. Recognition algorithms based on the vision transformer have made progress on many computer vision tasks, but they still need to be driven by a large amount of data. Removing this dependence on large amounts of data is crucial for human behavior recognition based on the vision transformer. This paper focuses on this data dilemma and tries to achieve a high-performance model with a small amount of highly informative human activity data. By studying the feature distribution, we propose a core weight entropy data information evaluation method for obtaining highly informative data, and through a redundant information elimination strategy we avoid introducing similar data. A large number of experimental results show the effectiveness of the proposed method. Compared with existing methods, our method reduces the data consumption by 5% to 30%, and with only 50% of the data it can achieve the performance obtained with 100% of the data. More importantly, the data selected by our method has no redundancy, which other methods do not offer. In addition, we carried out a large number of ablation experiments to validate the method. This work addresses the challenge of relying on a large amount of data when using the vision transformer to recognize human behavior, which is of practical significance for efficient human activity understanding research with little data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
56. MG-MVSNet: Multiple granularities feature fusion network for multi-view stereo.
- Author
- Zhang, Xuedian, Yang, Fanzhou, Chang, Min, and Qin, Xiaofei
- Subjects
- *DEEP learning, *POINT cloud
- Abstract
• The dense feature adaptive connection module improves the quality of the depth map.
• The distributed 3D convolution reduces the computational cost and memory usage.
• The joint loss function makes the network sensitive to small depth structures.
The goal of Multi-View Stereo is to reconstruct a 3D point cloud model from multiple views. With the development of deep learning, more and more learning-based research has achieved remarkable results. However, existing methods ignore the fine-grained features of the bottom layer, which leads to poor reconstruction quality, especially in terms of completeness. Besides, current methods still consume a large amount of memory because they apply 3D convolution. To this end, this paper proposes a Multiple Granularities Feature Fusion Network for Multi-View Stereo, an end-to-end depth estimation network combining global and local features, which is characterized by fine-granularity multi-feature fusion. Firstly, we propose a dense feature adaptive connection module, which can adaptively fuse the global and local features in the scene, provide a more complete and effective feature map for inferring a more detailed depth map, and make the final model more complete. Secondly, in order to further improve the accuracy and completeness of the reconstructed point cloud, we introduce normal and edge losses instead of using only depth loss functions as in existing methods, which makes the network more sensitive to small depth structures. Finally, we propose distributed 3D convolution instead of traditional 3D convolution, which reduces memory consumption. The experimental results on the DTU and Tanks & Temples datasets demonstrate that the proposed method achieves state-of-the-art performance, which shows the accuracy and effectiveness of MG-MVSNet. [ABSTRACT FROM AUTHOR]
- Published
- 2023
57. Distributed quadratic optimization with terminal consensus iterative learning strategy.
- Author
- Luo, Zijian, Xiong, Wenjun, Huang, Tingwen, and Duan, Jiang
- Subjects
- *ITERATIVE learning control, *LEARNING strategies, *INFORMATION networks, *PROBLEM solving, *MULTIAGENT systems
- Abstract
This paper applies a terminal learning strategy to study distributed quadratic optimization problems. Since the optimal state is unknown in advance, the tracking error information is generally unavailable. To achieve the optimal state without the tracking error information, the terminal consensus iterative learning scheme is used to solve the problem. And the terminal consensus state is obtained without the global information of network. On this basis, the optimal target is also achieved by choosing the proper initial state and learning parameters. And the optimization problem is studied with the constraints of state and control input. Results show that our approach is effective. Compared with existing distributed optimization methods, the learning strategy in this paper provides another effective analysis scheme. Last, a numerical example is presented to show the effective aspects of the method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
58. Stochastic synchronization for semi-Markovian complex dynamic networks with partly unknown transition rates.
- Author
- Zhang, Yue and Zheng, Cheng-De
- Subjects
- *SYNCHRONIZATION, *INTEGRAL inequalities, *MATRIX inequalities, *JENSEN'S inequality, *TIME-varying networks, *FUZZY neural networks
- Abstract
This paper investigates the synchronization of complex dynamic networks with time-varying delay and general semi-Markovian jump. The general transition rates include completely unknown rates and uncertain but bounded rates as two special cases. First, by introducing auxiliary vectors with a few nonorthogonal polynomials, two free-matrix-based integral inequalities are developed, which encompass some existing ones as special cases. Second, an integral-based delay-product-type Lyapunov-Krasovskii functional is constructed, which fully considers the information of the time delay. By utilizing a delay-dependent controller, two sufficient conditions are derived to realize global stochastic mean-square synchronization by employing the established inequalities to evaluate the infinitesimal generator of the functional. This paper takes all possibilities into consideration and divides the general transition rates into five cases, which has not been investigated before. Finally, a numerical example is given to show the effectiveness and practicality of the presented method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
59. A systematic review and analysis of deep learning-based underwater object detection.
- Author
- Xu, Shubo, Zhang, Minghua, Song, Wei, Mei, Haibin, He, Qi, and Liotta, Antonio
- Subjects
- *DEEP learning, *OBJECT recognition (Computer vision), *COMPUTER vision, *IMAGE intensifiers
- Abstract
• Providing an in-depth and comprehensive review related to underwater object detection.
• Extensively exploring the correlation between underwater image enhancement and underwater object detection.
• Analyzing the impact and augmentation forms of underwater image enhancement in underwater object detection.
• Discussing the challenges, future trends and applications of underwater object detection.
Underwater object detection is one of the most challenging research topics in computer vision. The complex underwater environment causes underwater images to suffer from high noise, low visibility, blurred edges, low contrast and color deviation, which brings significant challenges to underwater object detection tasks. In these tasks, traditional object detection methods often perform poorly in terms of accuracy and generalization capabilities. Underwater object detection requires accurate, stable, generalizable, real-time and lightweight detection models, for which many researchers have proposed various underwater object detection techniques based on deep learning. Although many outstanding results have been achieved on underwater object detection over the years, the research status of underwater object detection techniques still lacks a unified summary, and some existing problems need to be further examined from an up-to-date perspective. In addition, previous reviews lack analysis of the relationship between underwater image enhancement and object detection. Therefore, this paper provides a comprehensive review of the current research challenges, future development trends, and potential applications of underwater object detection techniques. More importantly, this paper explores the internal relationship between underwater image enhancement and object detection, and analyzes the possible ways of applying underwater image enhancement in the object detection task in order to further enhance its benefits. The experiments show the performance of current underwater image enhancement and state-of-the-art object detection algorithms, point out their limitations, and indicate that there is not a strict positive correlation between underwater image enhancement and the accuracy improvement of object detection. The domain shift caused by underwater image enhancement cannot be ignored. This paper can be regarded as a guide for future work on underwater object detection. [ABSTRACT FROM AUTHOR]
- Published
- 2023
60. FuzzyGAN: Fuzzy generative adversarial networks for regression tasks.
- Author
- Nguyen, Ryan, Singh, Shubhendu Kumar, and Rai, Rahul
- Subjects
- *GENERATIVE adversarial networks, *ARTIFICIAL neural networks, *CONVOLUTIONAL neural networks, *DIFFERENTIABLE dynamical systems, *FUZZY logic, *FUZZY systems
- Abstract
Generative Adversarial Networks (GANs) are well-known tools for data generation and semi-supervised classification. GANs, with less labeled data, outperform Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) in classification tasks. The success of GANs in classification tasks motivates the development of GAN-based techniques for semi-supervised regression tasks. However, developing GANs for regression introduces two major challenges: (1) inherent instability in the GAN formulation and (2) performing regression and achieving stability simultaneously. This paper introduces techniques that show improvement in the GANs' regression capability. We bake a differentiable fuzzy logic system at multiple locations in a GAN. The fuzzy logic takes the output of either the generator or the discriminator to predict the output, y , and evaluate the generator's performance. We outline the results of applying the fuzzy logic system across multiple GANs and summarize each approach's efficacy. This paper shows that adding a fuzzy logic layer can enhance GAN's ability to perform regression; the most desirable injection location is problem-specific, and we show this through experiments over various datasets. Besides, we demonstrate empirically that the fuzzy-infused GANs are competitive with the DNNs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
61. Joint coupled representation and homogeneous reconstruction for multi-resolution small sample face recognition.
- Author
- Fan, Xiaojin, Liao, Mengmeng, Xue, Jingfeng, Wu, Hao, Jin, Lei, Zhao, Jian, and Zhu, Liehuang
- Subjects
- *MACHINE learning, *FRACTIONAL programming, *FACE perception, *SHOOTING equipment, *LEARNING
- Abstract
• This paper proposes a novel multivariate dictionary learning framework. • A coherence enhancement term to improve the coherent representing of the coding coefficients under different resolutions. • A multivariate dictionary optimization method to solve dictionaries involving the calculation of fractional norm. • The proposed method achieves the state-of-the-art performance on several benchmark datasets. Off-the-shelf dictionary learning algorithms have achieved satisfactory results in small sample face recognition applications. However, the achieved results depend on the facial images obtained at a single resolution. In practice, the resolution of the images captured on the same target is different because of the different shooting equipment and different shooting distances. These images of the same category at different resolutions will pose a great challenge to these algorithms. In this paper, we propose a Joint Coupled Representation and Homogeneous Reconstruction (JCRHR) for multi-resolution small sample face recognition. In JCRHR, an analysis dictionary is introduced and combined with the synthetic dictionary for coupled representation learning, which better reveals the relationship between coding coefficients and samples. In addition, a coherence enhancement term is proposed to improve the coherent representation of the coding coefficients at different resolutions, which facilitates the reconstruction of the sample by its homogeneous atoms. Moreover, each sample at different resolutions is assigned a different coding coefficient in the multi-dictionary learning process, so that the learned dictionary is more in line with the actual situation. Furthermore, a regularization term based on the fractional norm is drawn into the dictionary coupled learning to remove the redundant information in the dictionary, which can reduce the negative impacts of the redundant information. 
Comprehensive results demonstrate that the proposed JCRHR method achieves better results than the state-of-the-art methods, on several small sample face databases. [ABSTRACT FROM AUTHOR]
- Published
- 2023
62. Adaptive fusion network for RGB-D salient object detection.
- Author
- Chen, Tianyou, Xiao, Jin, Hu, Xiaoguang, Zhang, Guofeng, and Wang, Shaojie
- Subjects
- *OBJECT recognition (Computer vision), *SOURCE code, *PROBLEM solving, *DEEP learning
- Abstract
• A high-performance method is proposed for RGB-D salient object detection. • RGB and depth data are adaptively fused to boost the performance. • Features of all levels are refined in an iterative manner. • The proposed method outperforms other state-of-the-art models on six datasets. Existing state-of-the-art RGB-D saliency detection models mainly utilize the depth information as complementary cues to enhance the RGB information. However, depth maps can be easily influenced by environment and hence are full of noises. Thus, indiscriminately integrating multi-modality (i.e., RGB and depth) features may induce noise-degraded saliency maps. In this paper, we propose a novel Adaptive Fusion Network (AFNet) to solve this problem. Specifically, we design a triplet encoder network consisting of three subnetworks to process RGB, depth, and fused features, respectively. The three subnetworks are interlinked and form a grid net to facilitate mutual refinement of these multi-modality features. Moreover, we propose a Multi-modality Feature Interaction (MFI) module to exploit complementary cues between depth and RGB modalities and adaptively fuse the multi-modality features. Finally, we design the Cascaded Feature Interweaved Decoder (CFID) to exploit complementary information between multi-level features and refine them iteratively to achieve accurate saliency detection. Experimental results on six commonly used benchmark datasets verify that the proposed AFNet outperforms 20 state-of-the-art counterparts in terms of six widely adopted evaluation metrics. Source code will be publicly available at https://github.com/clelouch/AFNet upon paper acceptance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
63. Deep mutual learning for brain tumor segmentation with the fusion network.
- Author
- Gao, Huan, Miao, Qiguang, Ma, Daikai, and Liu, Ruyi
- Subjects
- *BRAIN tumors, *DEEP learning, *LOGITS, *LEARNING strategies, *COGNITIVE training, *DECODING algorithms
- Abstract
• This paper introduces a mutual learning strategy to train the brain tumor segmentation network, using the shallowest feature map to supervise the subsequent feature maps of the network and the deepest logits to supervise the logits of the preceding shallower layers. The shallow feature map and deep logits supervise each other and improve the accuracy of tumor sub-region segmentation.
• This paper introduces deep supervision to train this network, using the prediction of each up-sampling layer to supervise the training process, which enlarges the receptive field and improves the overall segmentation accuracy.
• A large number of experiments on the BraTS dataset show that our method can effectively improve the accuracy of brain tumor segmentation and achieve state-of-the-art (SOTA) performance.
Deep learning methods have been successfully applied to brain tumor segmentation. However, extreme data imbalance exists among the different sub-regions of a tumor, so training deep learning methods on such data reduces segmentation accuracy. We introduce a deep mutual learning strategy to address this challenge; the proposed network integrates transformer layers in both the encoder and decoder of a U-Net architecture. In the network, the prediction of each up-sampling layer deeply supervises the training process to enlarge the receptive field for feature extraction, and the feature map of the shallowest layer supervises the feature maps of subsequent layers to retain more edge information and guide sub-region segmentation. The classification logits of the deepest layer supervise the logits of the preceding layers to obtain more semantic information for distinguishing tumor sub-regions. Furthermore, the feature map and the classification logits supervise each other to improve the overall segmentation accuracy. The experimental results on the benchmark dataset show that our method achieves a significant performance gain over existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
64. TJU-DNN: A trajectory-unified framework for training deep neural networks and its applications.
- Author
- Lv, Xian-Long, Chiang, Hsiao-Dong, Wang, Bin, and Zhang, Yong-Feng
- Subjects
- *ARTIFICIAL neural networks, *ELECTRIC lines
- Abstract
The training method for deep neural networks mainly adopts the gradient descent (GD) method. These methods, however, are very sensitive to initialization and hyperparameters. In this paper, an enhanced gradient descent method guided by the trajectory-based method for training deep neural networks, termed the Trajectory Unified Framework (TJU) method, is presented. From a theoretical viewpoint, the robustness of the TJU-based method is supported by an analytical basis presented in the paper. From a computational viewpoint, a TJU methodology consisting of a Block-Diagonal-Pseudo-Transient-Continuation method and a gradient descent method, termed the TJU-GD method, for training deep neural networks is added to obtain high-quality results. Furthermore, to resolve the issue of imbalanced classification, a TJU-Focal-GD method is developed and evaluated. Experimental numerical evaluation of the proposed TJU-GD on various public datasets reveals that the proposed method can achieve great improvements over baseline methods. Specifically, the proposed TJU-Focal-GD also possesses several advantages over other methods for a class of imbalanced datasets from the homemade power line inspection dataset (PLID). [ABSTRACT FROM AUTHOR]
- Published
- 2023
65. A noise-suppressing discrete-time neural dynamics model for solving time-dependent multi-linear [formula omitted]-tensor equation.
- Author
- Liu, Mei, Wu, Huanmei, and Shang, Mingsheng
- Subjects
- *EQUATIONS
- Abstract
Neural dynamics plays an important role in handling various complex problems related to matrices or even tensors, e.g., the multi-linear M-tensor equation investigated in this paper. However, the existing methods for computing the time-dependent multi-linear M-tensor equation have the following weaknesses: 1) all of them rely on the short-time invariant hypothesis, thereby generating considerable residual errors for time-dependent equations; 2) most of them are formulated in continuous-time form, which cannot be directly implemented on digital equipment; and 3) all of them consider only noise-free conditions, lacking robustness to the truncation errors and round-off errors that are common in digital equipment. This paper remedies these three weaknesses by proposing a noise-suppressing discrete-time neural dynamics (NSDTND) model for the time-dependent multi-linear M-tensor equation. Additionally, analyses of convergence and robustness demonstrate that the proposed NSDTND model is globally convergent and has superior immunity to noise. Then, numerical experiments and an application to particle movement are provided to show the superiority and effectiveness of the proposed NSDTND model for solving the time-dependent multi-linear M-tensor equation with noise considered. [ABSTRACT FROM AUTHOR]
- Published
- 2023
66. TransPose: 6D object pose estimation with geometry-aware Transformer
- Author
- Lin, Xiao, Wang, Deming, Zhou, Guangliang, Liu, Chengju, and Chen, Qijun
- Published
- 2024
67. Formal verification of robustness and resilience of learning-enabled state estimation systems
- Author
- Huang, Wei, Zhou, Yifan, Jin, Gaojie, Sun, Youcheng, Meng, Jie, Zhang, Fan, and Huang, Xiaowei
- Published
- 2024
68. A survey for solving mixed integer programming via machine learning.
- Author
- Zhang, Jiayi, Liu, Chang, Li, Xijun, Zhen, Hui-Ling, Yuan, Mingxuan, Li, Yawen, and Yan, Junchi
- Subjects
- *MACHINE learning, *INTEGER programming, *COMBINATORIAL optimization, *HEURISTIC algorithms, *NP-hard problems, *MACHINE theory, *PROBLEM solving
- Abstract
Machine learning (ML) has been recently introduced to solving optimization problems, especially for combinatorial optimization (CO) tasks. In this paper, we survey the trend of leveraging ML to solve the mixed-integer programming problem (MIP). Theoretically, MIP is an NP-hard problem, and most CO problems can be formulated as MIP. Like other CO problems, the human-designed heuristic algorithms for MIP rely on good initial solutions and cost a lot of computational resources. Therefore, researchers consider applying machine learning methods to solve MIP since ML-enhanced approaches can provide the solution based on the typical patterns from the training data. Specifically, we first introduce the formulation and preliminaries of MIP and representative traditional solvers. Then, we show the integration of machine learning and MIP with detailed discussions on related learning-based methods, which can be further classified into exact and heuristic algorithms. Finally, we propose the outlook for learning-based MIP solvers, the direction toward more combinatorial optimization problems beyond MIP, and the mutual embrace of traditional solvers and ML components. We maintain a list of papers that utilize machine learning technologies to solve combinatorial optimization problems, which is available at https://github.com/Thinklab-SJTU/awesome-ml4co. [ABSTRACT FROM AUTHOR]
- Published
- 2023
69. Research on emotional semantic retrieval of attention mechanism oriented to audio-visual synesthesia.
- Author
- Wang, Weixing, Li, Qianqian, Xie, Jingwen, Hu, Ningfeng, Wang, Ziao, and Zhang, Ning
- Subjects
- *SYNESTHESIA, *DIGITAL video, *MUSIC videos, *ENVIRONMENTAL music, *DIGITAL media, *SELF-expression
- Abstract
Digital video is widely used to record people's daily lives and share people's moods, but few researchers have conducted research on the consistency of emotional expression between short videos and music. In order to be able to match the appropriate background music to the short video image autonomously and efficiently, the paper analyzed the emotional connection between the two from the audio-visual synesthesia. First, emotional semantics was used as a bridge to connect video data and music data, and a video-music synesthesia data set based on semantic words was constructed. Then, an attention mechanism was incorporated to better extract key features in video images. In the extraction of music features, an improved lenet5 network was used, and the optimal network parameters were determined through experiments. Finally, the two types of features were fused and the mutual retrieval between video and music was performed. In order to compare the performance of different models, different CNN models were calculated in the processing of video images, including VGG16, VGG19, AlexNet and GoogleNet, and the attention mechanism was added to each network for calculation to compare its retrieval accuracy. In the processing of music data, different CNN algorithms were also used for comparative experiments, and networks with different layers were used to determine the optimal results. The experimental results show that the audiovisual synesthesia retrieval model based on emotion can effectively measure the emotional similarity between video images and music, and the method of the paper can produce a good match between them. The research method of the paper is the exploration of computer synesthetic intelligence, which can stimulate the creative inspiration of image and music creative designers. While enhancing the emotional experience of digital products, it also improves the efficiency and quality of development. [ABSTRACT FROM AUTHOR]
- Published
- 2023
70. Gaussian-type activation function with learnable parameters in complex-valued convolutional neural network and its application for PolSAR classification.
- Author
- Zhang, Yun, Hua, Qinglong, Wang, Haotian, Ji, Zhenyuan, and Wang, Yong
- Subjects
- *CONVOLUTIONAL neural networks, *RECURRENT neural networks, *SYNTHETIC aperture radar, *IMAGE recognition (Computer vision), *GAUSSIAN function
- Abstract
• Processing complex-valued PolSAR data using a complex-valued convolutional neural network (CV-CNN).
• Uses a Gaussian-type activation function (GTAF) that preserves the integrity of complex-valued operations.
• Introduces learnable Gaussian parameters for GTAF, and designs two multi-channel activation methods.
• The classification accuracy is better than that of existing state-of-the-art methods on three datasets.
To process complex-valued information such as SAR signals conveniently, the complex-valued convolutional neural network (CV-CNN) has been proposed in recent years, and it has achieved great success in SAR image recognition. This paper proposes an activation function with learnable parameters based on the Gaussian-type activation function (GTAF) in CV-CNN to improve the utilization of information in the real and imaginary parts of the neuron. For the multi-channel input of the feature map, this paper discusses two ways to set the parameters of the Gaussian-type activation function. One is that all channels share the same parameters, called the channel-sharing Gaussian-type activation function (CSGTAF). The other is that each channel has its own independent parameters, called the channel-exclusive Gaussian-type activation function (CEGTAF). In addition, this paper derives the backpropagation formulas of both CSGTAF and CEGTAF in detail for the training process of CV-CNN. This paper performs experimental analysis on three L-band standard PolSAR datasets. The experimental results show that, compared with the traditional method and the Gaussian activation function with fixed parameters, both CSGTAF and CEGTAF achieve higher recognition accuracy, and the difference in recognition performance across different targets in the same dataset is small. Both show good recognition performance and have good stability and versatility. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
71. A practical tutorial on solving optimization problems via PlatEMO.
- Author
-
Tian, Ye, Zhu, Weijian, Zhang, Xingyi, and Jin, Yaochu
- Subjects
- *
SWARM intelligence , *PROBLEM solving , *EVOLUTIONARY algorithms , *COMPUTATIONAL intelligence , *METAHEURISTIC algorithms , *INTELLIGENCE service , *EVOLUTIONARY computation - Abstract
• This paper presents a practical tutorial on solving optimization problems via PlatEMO, by means of abundant examples and source codes. • This paper is the first tutorial for the newest version PlatEMO v4.0. • This paper does not go deep into the technical details of algorithms, but aims to enable beginners to use PlatEMO at a low cost, which is much easier to be understood than the user manual of PlatEMO. • This paper is written according to many questions raised by users in the last five years. PlatEMO is an open-source platform for solving complex optimization problems, which provides a variety of metaheuristics including evolutionary algorithms, swarm intelligence algorithms, multi-objective optimization algorithms, surrogate-assisted optimization algorithms, and many others. Due to the problem-independent nature of most metaheuristics, they are versatile for solving problems with various difficulties such as multimodal landscapes, discrete search spaces, multiple objectives, strict constraints, and expensive evaluations, regardless of the fields the problems belong to. Since PlatEMO was published in 2017, it has been used by many researchers from both academia and industry in the computational intelligence community. However, the basic terms and concepts about optimization may confuse practitioners and junior researchers new to metaheuristics. Hence, this paper presents a practical introduction to the use of PlatEMO 4.0, focusing on the procedures of defining problems, selecting suitable metaheuristics, and collecting results. Note, however, that a description of the technical details of metaheuristics is beyond the scope of this paper and interested readers may refer to the cited references. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
72. Classification of natural images inspired by the human visual system.
- Author
-
Davoodi, Paria, Ezoji, Mehdi, and Sadeghnejad, Naser
- Subjects
- *
ARTIFICIAL neural networks , *VISUAL perception , *FILTER banks , *RETINA , *VISUAL cortex , *CONVOLUTIONAL neural networks , *INFORMATION modeling - Abstract
In this paper, a three-step model based on the integration of Deep Neural Networks (DNN) and Decision Models is introduced for image classification, inspired by the human visual system. To make a decision about an object, many actions must be carried out in a hierarchical process in the brain. First, the retina receives visual stimuli and transfers them to the visual cortex in the brain. The information extracted in the visual cortex is accumulated over time to select an appropriate response. Many of the current decision-making models do not show how each image is converted into useful information for the decision model. Some models have used neural networks to convert each image into the information needed in the decision-making model; however, the role of the retina is ignored in these models. In this paper, a combination of retina-inspired filters, CNN-based description and an accumulator-based decision model is used to classify images. The model's structure resembles the human brain because it uses a DoG filter bank as the retina-inspired filter in its first stage. This model shows a significant improvement in accuracy in comparison to other models; furthermore, its performance is acceptable even with a small training set. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
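The abstract of record 72 names a DoG (difference-of-Gaussians) filter bank as its retina-inspired first stage but gives no parameters. As a rough illustration only, the sketch below builds such a bank in plain Python. The kernel size, the sigma pairs, and the function names (`gaussian_kernel`, `dog_kernel`, `dog_filter_bank`) are assumptions made for this example and are not taken from the paper.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalised to sum to 1 (size must be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def dog_kernel(size, sigma_center, sigma_surround):
    """Difference of two normalised Gaussians: narrow centre minus wide surround."""
    center = gaussian_kernel(size, sigma_center)
    surround = gaussian_kernel(size, sigma_surround)
    return [
        [c - s for c, s in zip(row_c, row_s)]
        for row_c, row_s in zip(center, surround)
    ]


def dog_filter_bank(size, sigma_pairs):
    """Build one DoG kernel per (sigma_center, sigma_surround) pair."""
    return [dog_kernel(size, sc, ss) for sc, ss in sigma_pairs]


# Hypothetical scales chosen only for illustration.
bank = dog_filter_bank(5, [(1.0, 2.0), (0.5, 1.0)])
```

Because both Gaussians are normalised before subtraction, each kernel sums to approximately zero, so it responds to local contrast rather than to uniform brightness.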
73. A top-k POI recommendation approach based on LBSN and multi-graph fusion.
- Author
-
Fang, Jinfeng, Meng, Xiangfu, and Qi, Xueyue
- Subjects
- *
MULTIGRAPH , *SOCIAL networks - Abstract
POI (Point of Interest) recommendation is a basic and ongoing issue in LBSN (Location-based Social Network) services. In this paper, a novel POI recommendation approach based on LBSN and multi-graph fusion is proposed. First, we take advantage of the graph neural network to construct a user-POI interaction graph based on the rating data of users and a user social graph based on the user social networks. First-order friends and high-order friends are considered simultaneously in the user social graph. Then, we present a spectral cluster-based algorithm to obtain the latent vector of the POI in location space. After this, the graph neural network is used to learn the information above. Lastly, we predict the score based on the aforementioned information and pick out the top-k POIs with the highest scores to form a recommendation list. Extensive experiments conducted on real datasets demonstrate that the method proposed in this paper can effectively generate the embedding vectors of users and POIs, and can achieve high recommendation accuracy as well. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
74. Decentralized event-triggered adaptive neural network control for nonstrict-feedback nonlinear interconnected systems with external disturbances against intermittent DoS attacks.
- Author
-
Cui, Yahui, Sun, Haibin, and Hou, Linlin
- Subjects
- *
ADAPTIVE control systems , *DENIAL of service attacks , *NONLINEAR systems , *LINEAR matrix inequalities , *LYAPUNOV stability , *TANGENT function , *PSYCHOLOGICAL feedback - Abstract
• This paper aims to construct a NN DETAC scheme for interconnected nonlinear systems against DoS attacks and external disturbances. • A switching-type adaptive state observer with a disturbance estimation value is proposed and an anti-disturbance decentralized event-triggered adaptive control scheme is developed. The proposed method can enhance the system's anti-disturbance ability. • Different from the sampled-data control scheme in the related literature, the NN DETAC scheme is developed, which can efficiently save communication resources. • By employing the properties of the hyperbolic tangent function, the interconnection terms no longer need to meet the global Lipschitz conditions, which relaxes the constraint condition. This paper discusses the issue of decentralized event-triggered adaptive neural network (NN) control for nonstrict-feedback nonlinear interconnected systems with external disturbances and intermittent denial-of-service (DoS) attacks. In the presence of DoS attacks, not all state variables can be used to design a feedback controller via the standard backstepping method. To solve this problem, a novel switching-type adaptive state observer with disturbance compensation is constructed, where the disturbance compensation is obtained by constructing a disturbance observer. A decentralized event-triggered adaptive controller is designed by using the backstepping method to weaken the influence of DoS attacks and the waste of communication resources, where a first-order sliding mode differentiator is introduced to prevent the "calculation explosion". By using linear matrix inequality techniques, some solvable sufficient conditions are attained to derive the observer gain. The closed-loop system is proved to be stable via the improved average dwell time method and the piecewise Lyapunov stability theories. This control scheme ensures that all closed-loop signals remain bounded.
Finally, simulation results are utilized to demonstrate the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
75. A survey of recent advances on stability analysis, state estimation and synchronization control for neural networks.
- Author
-
Chen, Yonggang, Zhang, Nannan, and Yang, Juanjuan
- Subjects
- *
SYNCHRONIZATION , *IMAGE processing , *SIGNAL processing - Abstract
Nowadays, neural networks have been widely applied in many fields such as pattern recognition, signal and image processing and control theory. Over the past two decades or so, the analysis and synthesis for neural networks have received significant research attention. This paper provides a survey on the analysis and synthesis for neural networks, which is mainly concerned with the recent advances on stability analysis, state estimation and synchronization control for neural networks. First of all, the paper summarizes the recent results on the stability analysis for delayed neural networks, especially for neural networks with multiple discrete delays, neural networks with distributed delays, and discrete-time delayed neural networks. Then, the paper reviews the recent advances regarding the state estimation for neural networks with the emphasis on the network-based state estimation. Subsequently, the paper provides an overview on the synchronization control for neural networks. Finally, the conclusions and further research directions are given. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
76. Links synchronization control for the complex dynamical network.
- Author
-
Gao, Peitao, Wang, Yinhe, Zhao, Juanxia, Zhang, LiLi, and Peng, Yi
- Subjects
- *
SYNCHRONIZATION , *ADAPTIVE control systems - Abstract
For the complex dynamical network (CDN) with uncertainties, this paper defines the links synchronization and synthesizes the adaptive control scheme to realize it. Generally speaking, a CDN can be considered as a composite system with two coupled dynamic subsystems: the nodes subsystem (NS) and the links subsystem (LS). We observed that there are many results in the existing literature on the NS synchronization, which implies that the state at each node tends to be the same when the NS synchronization happens. However, the LS synchronization is rarely discussed in the existing literature because its meaning and engineering background are unclear. Inspired by the ego-networks, this paper employs the time-varying outgoing links at each node to geometrically describe the changes in network topological structure and regards the weighted values of outgoing links as the state variables of LS, by which the LS synchronization is introduced. The corresponding control scheme for its implementation is synthesized under the condition that the state of NS is available and the state of LS is unavailable. The control scheme includes the adaptive controller for NS and the coupling strategy for LS. Finally, the effectiveness of the control scheme proposed in this paper is verified by a numerical example. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
77. UFKT: Unimportant filters knowledge transfer for CNN pruning.
- Author
-
CH, Sarvani, Dubey, Shiv Ram, and Ghorai, Mrinmoy
- Subjects
- *
KNOWLEDGE transfer , *DEEP learning , *CONVOLUTIONAL neural networks - Abstract
• This paper proposes a filter pruning approach for CNN model compression by transferring the knowledge of unimportant filters to the filters of higher importance. • Before pruning unimportant filters, a custom regularizer is utilized for knowledge transfer, which increases the gap between L1-norms of important and unimportant filters. • The effect of the penalty imposed in the custom regularizer is analyzed to justify the need for knowledge transfer before pruning. • In order to validate the robustness of the proposed framework across different CNN architectures, we experiment with five popular CNNs, namely, LeNet-5, VGG-16, ResNet-56, ResNet-110, and ResNet-50. • Experiments are performed on three benchmark datasets MNIST, CIFAR-10, and ImageNet. • An improvement over the baseline in terms of accuracy is observed even after removing 95.15%, 62.28%, and 62.39% of the Floating Point OPerations (FLOPs) from architectures LeNet-5, ResNet-56, and ResNet-110, respectively. As the deep learning models have been widely used in recent years, there is a high demand for reducing the model size in terms of memory and computation without much compromise in the model performance. Filter pruning is a very widely adopted strategy for model compression. The existing filter pruning methods identify the unimportant filters and prune them without worrying about information loss. They try to recover the same by fine-tuning the remaining filters, limiting their performance. In this paper, we tackle this problem by utilizing the knowledge from unimportant filters before pruning to minimize information loss. First, the proposed method identifies the unimportant and important filters by exploiting the lower and higher importance, respectively, using the L1-norm of filters. Next, the proposed custom UFKT-Reg regularizer (R_ufkt) transfers the knowledge from unimportant filters before pruning to remaining filters, notably to a fixed number of important filters.
Hence, the proposed method minimizes information loss due to the removal of unimportant filters. The experiments are conducted using the three benchmark datasets, including MNIST, CIFAR-10, and ImageNet. The proposed filter pruning method outperforms many recent state-of-the-art filter pruning methods. An improvement over the baseline in terms of accuracy is observed even after removing 95.15%, 62.28%, and 62.39% of the Floating Point OPerations (FLOPs) from architectures LeNet-5, ResNet-56, and ResNet-110, respectively. After pruning 53.25% of FLOPS from ResNet-50, only 1.02% and 0.47% of drops are observed in top-1 and top-5 accuracies, respectively. The code used in this paper will be publicly available at (https://github.com/sarvanichinthapalli/UFKT). [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
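The abstract of record 77 says that filters are ranked as unimportant or important by their L1-norm before the UFKT-Reg regularizer is applied. The sketch below illustrates only that ranking step, in plain Python on flattened weight lists. It does not reproduce the regularizer or the authors' released code, and the function names and the `num_unimportant` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def l1_norm(filter_weights: Sequence[float]) -> float:
    """Sum of absolute values of one filter's (flattened) weights."""
    return sum(abs(w) for w in filter_weights)


def split_filters_by_l1(
    filters: Sequence[Sequence[float]], num_unimportant: int
) -> Tuple[List[int], List[int]]:
    """Rank filters by ascending L1-norm.

    Returns (unimportant_indices, important_indices), where the first list
    holds the num_unimportant filters with the smallest norms.
    """
    order = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]))
    return order[:num_unimportant], order[num_unimportant:]


# Toy layer with three flattened filters (values are made up).
layer = [[0.1, -0.2], [1.0, -1.0], [0.0, 0.05]]
unimportant, important = split_filters_by_l1(layer, num_unimportant=1)
```

In this toy layer the third filter has the smallest norm (0.05), so it alone is marked unimportant; how many filters to mark per layer is a design choice that the abstract does not specify.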
78. A new hybrid optimizer for stochastic optimization acceleration of deep neural networks: Dynamical system perspective.
- Author
-
Xie, Wenjing, Tang, Weishan, and Kuang, Yujia
- Subjects
- *
ARTIFICIAL neural networks , *DYNAMICAL systems , *HYBRID systems , *SYSTEMS theory - Abstract
Stochastic optimization acceleration is extremely significant and challenging for deep neural networks (DNNs). In recent years, several novel proportional-integral–differential-based (PID-based) optimizers have been proposed to speed up the optimization by alleviating the oscillation behavior of stochastic gradient descent with momentum (SGD-M), yet lacked theoretical analysis. Along this line of research, this paper adopts dynamical system theory to design a new hybrid optimizer and present theoretical analysis. Firstly, it is found that DNN optimization is equivalent to a discrete time dynamical system. Building upon the equivalence, high order augmented dynamical system viewpoint is utilized to design a PI-like optimizer for ensuring high accuracy, which is more stable than SGD-M. Then, hybrid dynamical system viewpoint is employed to improve the PI-like optimizer as a new hybrid form for suppressing oscillation and accelerating optimization. Lyapunov method, Taylor series, matrix theory and equilibrium are combined to theoretically investigate the convergence and the oscillation of loss function, showing that the proposed hybrid optimizer can alleviate oscillation, boost optimization speed, and maintain high accuracy. In theoretical analyses, explicit conditions of hyper-parameters that guarantee training stability are calculated and presented, practically guiding the adjustment of hyper-parameters and promoting the application of hybrid optimizer. Experiments are presented on three commonly used benchmark datasets, i.e., MNIST, CIFAR10 and CIFAR100, demonstrating that the hybrid optimizer obtains up to 42% acceleration with competitive accuracy relative to state-of-the-art optimizers. In short, this paper not only presents a new hybrid optimizer for accelerating optimization, but also provides a novel, theoretical and systematic perspective to find and analyze new optimizer for DNNs. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
79. Person identification from fingernails and knuckles images using deep learning features and the Bray-Curtis similarity measure.
- Author
-
Alghamdi, Mona, Angelov, Plamen, and Alvaro, Lopez Pellicer
- Subjects
- *
FINGERNAILS , *ARTIFICIAL neural networks , *FINGERS , *IMAGE registration , *FEATURE extraction , *AUTOMATIC identification - Abstract
In this paper, an approach that makes use of knuckle creases and fingernails for person identification is presented. It introduces a framework for automatic person identification that includes localisation of the region of interest (ROI) of many components within hand images, recognition and segmentation of the detected components using bounding boxes, and similarity matching between two different sets of segmented images. The following hand components are considered: i) the metacarpophalangeal (MCP) joint, commonly known as the base knuckle; ii) the proximal interphalangeal (PIP) joint, commonly known as the major knuckle; iii) the distal interphalangeal (DIP) joint, commonly known as the minor knuckle; iv) the interphalangeal (IP) joint, commonly known as the thumb knuckle, and v) the fingernails. Crucial elements of the proposed framework are the feature extraction and similarity matching. This paper exploits different deep learning neural networks (DLNNs), which are essential in extracting discriminative high-level abstract features. We further use various similarity measures for the matching process. We validate the proposed approach on well-known benchmarks, including the 11k Hands dataset and the Hong Kong Polytechnic University Contactless Hand Dorsal Images known as PolyU. The results indicate that knuckle patterns and fingernails play a significant role in the person identification framework. The 11K Hands dataset results indicate that the left-hand results are better than the right-hand results and the fingernails produce consistently higher identification results than other hand components, with a rank-1 score of 100 %. In addition, the PolyU dataset attains 100 % in the fingernail of the thumb finger. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
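Record 79 matches deep feature vectors with the Bray-Curtis similarity measure. For reference, the standard Bray-Curtis dissimilarity between vectors u and v is sum(|u_i - v_i|) / sum(|u_i + v_i|), and a similarity can be taken as one minus that value. The sketch below assumes this textbook definition; how the paper applies it to its features is not stated in the abstract, and the function names are hypothetical.

```python
from typing import Sequence


def bray_curtis_dissimilarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Standard Bray-Curtis dissimilarity: sum|u_i - v_i| / sum|u_i + v_i|."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    if denominator == 0.0:
        # Both vectors are all zeros (or cancel exactly); treat them as identical.
        return 0.0
    return numerator / denominator


def bray_curtis_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Similarity defined here as 1 - dissimilarity (an assumption of this sketch)."""
    return 1.0 - bray_curtis_dissimilarity(u, v)
```

For non-negative feature vectors the dissimilarity lies between 0 and 1, so the similarity is 1 for identical vectors and 0 for vectors with no overlapping mass.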
80. SemRegionNet: Region ensemble 3D semantic instance segmentation network with semantic spatial aware discriminative loss.
- Author
-
Zhang, Guanghui, Zhu, Dongchen, Shi, Wenjun, Li, Jiamao, and Zhang, Xiaolin
- Subjects
- *
POINT cloud , *RANDOM sets , *SUPERVISED learning , *STATISTICAL sampling , *SEMANTICS - Abstract
[Display omitted] The figure below is an overview of the proposed framework in this paper. Point clouds with xyz coordinates and RGB attributes are fed into the network, and the output includes semantic labels and instance labels. The proposed region ensemble structure includes Random Sampling-based Set Abstraction (RS-SA), Adaptive Regional Feature Complementary (ARFC), and Affinity-based Regional Relation Reasoning (AR3) modules. RS-SA denotes the random sampling-based set abstraction. The down-sampling in RS-SA sequentially samples the input point clouds. The ARFC module aims to adaptively complement low-level features to account for the information loss caused by the random sampling and inherent non-uniformity. The AR3 module focuses on reasoning about relationships among high-level regional features. The semantic and spatial aware discriminative loss is proposed to supervise the instance embedding learning leveraging the semantic and spatial knowledge. • Random sampling is introduced to improve set abstraction efficiency. • A region ensemble structure is developed to enhance point cloud comprehension. • A semantic spatial awareness is proposed to mitigate instance confusion problem. • Experimental results demonstrate that the proposed method achieves promising boosts. The semantic instance segmentation task on 3D data has made great progress. However, for unstructured 3D point cloud data, the mining of regional knowledge and explicit assistance of semantic for the instance segmentation task are still rarely explored. In this paper, we propose a region ensemble structure including random sampling-based set abstraction (RS-SA), an adaptive regional feature complementary (ARFC) module, and an affinity-based regional relational reasoning (AR3) module in feature encoding to enhance the point cloud comprehension. Random sampling is introduced into the set abstraction of feature encoding to improve computational efficiency. 
The ARFC module aims to complement low-level features to adaptively compensate for the information loss caused by random sampling and the inherent non-uniformity of point clouds, and the AR3 module emphasizes mining the potential reasoning relationships among high-level features based on affinity. Furthermore, a novel semantic spatial aware discriminative loss is proposed to improve the discrimination of instance embedding. The proposed region ensemble structure and semantic spatial aware discriminative loss are shown to deliver promising boosts on the 3D point cloud semantic instance segmentation task, and the framework achieves state-of-the-art performance on the S3DIS, ScanNet-v2, and ShapeNet datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
81. Abstractive text summarization: State of the art, challenges, and improvements.
- Author
-
Shakil, Hassan, Farooq, Ahmad, and Kalita, Jugal
- Subjects
- *
TEXT summarization , *LANGUAGE models , *AUTOMATIC summarization , *KNOWLEDGE representation (Information theory) , *RESEARCH personnel , *REINFORCEMENT learning - Abstract
Specifically focusing on the landscape of abstractive text summarization, as opposed to extractive techniques, this survey presents a comprehensive overview, delving into state-of-the-art techniques, prevailing challenges, and prospective research directions. We categorize the techniques into traditional sequence-to-sequence models, pre-trained large language models, reinforcement learning, hierarchical methods, and multi-modal summarization. Unlike prior works that did not examine complexities, scalability and comparisons of techniques in detail, this review takes a comprehensive approach encompassing state-of-the-art methods, challenges, solutions, comparisons, limitations and charts out future improvements, providing researchers an extensive overview to advance abstractive summarization research. We provide vital comparison tables across the categorized techniques, offering insights into model complexity, scalability and appropriate applications. The paper highlights challenges such as inadequate meaning representation, factual consistency, controllable text summarization, cross-lingual summarization, and evaluation metrics, among others. Solutions leveraging knowledge incorporation and other innovative strategies are proposed to address these challenges. The paper concludes by highlighting emerging research areas like factual inconsistency, domain-specific, cross-lingual, multilingual, and long-document summarization, as well as handling noisy data. Our objective is to provide researchers and practitioners with a structured overview of the domain, enabling them to better understand the current landscape and identify potential areas for further research and improvement. • Overview of state-of-the-art techniques in abstractive text summarization. • Comparative analysis of models in abstractive summarization. • Identification of challenges and potential improvements in the field. • Exploration of future research directions and emerging frontiers.
• Holistic survey of abstractive text summarization. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
82. Large-scale multi-view subspace clustering via embedding space and partition matrix.
- Author
-
Cheng, Tianhang, Peng, Jinjia, Li, Hui, and Wang, Huibing
- Subjects
- *
NONNEGATIVE matrices , *PROBLEM solving , *INFORMATION processing , *MATRICES (Mathematics) , *DATA mapping - Abstract
Multi-view subspace clustering (MVSC) has attracted increasing attention because it can extract information from multiple views and explore the underlying structure. In general, most of the existing anchor strategies solve the problem of excessive complexity, but information is lost when the affinity graph is passed to spectral clustering. In addition, noise in the original data means that the learned anchor graph does not represent the data features adequately. To solve the above problems, this paper proposes a Large-Scale Multi-View Subspace Clustering via Embedding Space and Partition Matrix (LMVSC-EPM) algorithm that preserves the distribution of the original data by embedding matrix mapping. The algorithm utilizes the centroid matrix and the clustering assignment matrix to derive the clustering results directly. Specifically, LMVSC-EPM employs embedding matrices to map the raw data into the embedding space and adaptively learns anchors using a view-sharing anchor strategy. Moreover, the non-negative orthogonal matrix is adopted to assign the results to the clustering assignment matrix, which avoids the loss of the affinity matrix in the passing process. Furthermore, an alternating minimization optimization method is designed in this paper to solve the optimization problem. Experimental results on seven underlying datasets demonstrate the efficiency and superiority of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
83. Event-triggered impulsive tracking control for uncertain strict-feedback nonlinear systems via the neural-network-based backstepping technique.
- Author
-
Pan, Weihao, Fan, Debao, Li, Hanfeng, and Zhang, Xianfu
- Subjects
- *
BACKSTEPPING control method , *NONLINEAR systems , *CHEMICAL reactors , *CLOSED loop systems , *CHEMICAL systems , *ADAPTIVE control systems - Abstract
This paper studies the problem of event-triggered impulsive tracking control (ETITC) for uncertain strict-feedback nonlinear systems (USFNSs). In contrast to existing impulsive control schemes, this paper incorporates the neural-network (NN)-based backstepping technique into impulsive control design, such that stronger nonlinearities and uncertainties are allowed to be included in the concerned systems. The proposed state-feedback ETITC scheme guarantees that all the signals of the closed-loop system are bounded and the tracking error ultimately converges to an adjustable bounded region, while also avoiding the Zeno behavior. In addition, by constructing a NN-based observer, this paper further develops an output-feedback ETITC scheme to extend its investigation to the scenario where full states are not available for impulsive control design. Finally, an illustrative example involving a chemical reactor system is presented to demonstrate the effectiveness of our control schemes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
84. Fixed-time stability analysis of general impulsive systems and application to synchronization of complex networks with hybrid impulses.
- Author
-
Wang, Qihang and Abdurahman, Abdujelil
- Subjects
- *
STABILITY of nonlinear systems , *DISCONTINUOUS functions , *SYNCHRONIZATION , *INTEGRALS - Abstract
In this paper, the fixed-time stability of general impulsive systems and the fixed-time synchronization issue of impulsive complex networks are investigated. First, the fixed-time stability of a class of nonlinear systems with general impulsive effects is analyzed by means of inequality method and using some special functions. Then, the developed fixed-time stability results are used to study the fixed-time synchronization of impulsive complex networks under the saturation controller. Compared with some early published works, in this paper, the estimation accuracy of settling time is improved by calculating the value of improper integral more precisely, and the saturation controller effectively suppresses chattering phenomenon caused by discontinuous signum function. Lastly, some numerical examples are given to verify the feasibility of our theoretical results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
85. Position-aware Interactive Attention Network for multi-intent spoken language understanding.
- Author
-
Sun, Pengfei, Cao, Han, Yu, Hongli, Cui, Yachao, and Wang, Lei
- Subjects
- *
ORAL communication , *NATURAL languages - Abstract
Spoken Language Understanding (SLU) is a crucial component of task-oriented dialog systems. In recent years, there has been increasing attention on multi-intent SLU due to its relevance to complex real-world applications. Most existing joint models only utilize multi-intent information to guide the slot filling, with only a small number of models achieving bidirectional interaction. Additionally, unlike traditional single-intent SLU, multi-intent SLU is scope-dependent, where each intent in a sentence has its specific dependency range. However, current bidirectional joint models often employ a pipeline approach to implement interaction between the two sub-tasks, which fails to fully leverage the semantic information between them and can lead to error propagation. Moreover, attention-based multi-intent joint models do not adequately model the positional relationships between words, resulting in suboptimal overall performance. In this paper, we propose a novel multi-intent SLU model called Position-aware Interactive Attention Network (PIAN), which consists of interactive attention and rotary position embedding. The aim of PIAN is to fully exploit the correlation between the two sub-tasks and capture the positional dependencies between words. This facilitates mutual guidance and enhancement between the two sub-tasks. Additionally, to address the uncoordinated slot problem generated by traditional slot filling decoders, we employ an MCRF slot filling decoder to constrain the slot labels. We evaluate our model on two public multi-intent SLU datasets, and the experimental results demonstrate that our model achieves state-of-the-art performance on key metrics. Code for this paper is publicly available at https://github.com/puhahahahahaha/SLU_with_Co_PRoE.
• To achieve sufficient interaction and bidirectional promotion between the two sub-tasks, we propose an Interactive Attention Framework to achieve explicit bidirectional interaction between multi-intent and slot information. • To establish word dependencies within sentences, we use rotary position embedding, which captures more local features on the basis of global feature attention. For slot filling, MCRF replaces the traditional decoder, resolving decoding issues with uncoordinated slots. • Experimental results show that our model performs better than the current SOTA models on key metrics in two public datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
86. A review of privacy-preserving research on federated graph neural networks.
- Author
-
Ge, Lina, Li, YanKun, Li, Haiao, Tian, Lei, and Wang, Zhe
- Subjects
- *
GRAPH neural networks , *ARTIFICIAL intelligence , *INFORMATION sharing , *DATA modeling , *PRIVACY - Abstract
Graph neural networks are widely employed in diverse domains; however, they confront the peril of privacy infringement. To address this concern, federated learning emerges as a privacy-preserving approach that avoids sharing data for model training, effectively resolving the issue of privacy leakage in graph neural networks. The rapid advancement of federated neural networks has spurred the demand for more potent tools to enhance model performance owing to the concealed correlation information amongst federated learning participants. However, the structural attributes of federated graph neural networks render them vulnerable to inference attacks, reconstruction attacks, inversion attacks, and the like, potentially endangering privacy. This study delves into the intricacies of privacy-preserving within federated graph neural networks. Firstly, it introduces the architecture and variants of federated graph neural networks, analyzes the privacy risks encountered by these networks from four perspectives, and elucidates three primary attack methods. In accordance with the privacy-preserving mechanism of federated graph neural networks, it summarizes the privacy-preserving techniques and synthesizes the existing strategies from four perspectives: encryption methods, perturbation methods, anonymization, and hybrid methods. Furthermore, it summarily presents the prevailing framework for preserving privacy in neural networks. Ultimately, this paper examines the challenges and outlines future research directions pertaining to federated graph neural network technology. • This study investigates the privacy leakage risks of FedGNNs, summarizing and analyzing these risks and attack methods to provide a multifaceted investigation for privacy-preserving FedGNNs. • It collates cutting-edge research results and classifies protection mechanisms into encryption-based, perturbation-based, anonymization-based, and hybrid techniques, detailing their advantages and shortcomings. 
• The paper proposes future research directions for building robust, interpretable, efficient, fair, inductive, and comprehensive FedGNNs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
87. Adaptive fixed-time neural consensus control for a class of uncertain nonlinear multi-agent systems with full state constraints.
- Author
-
Shang, Yun, Cheng, Zunshui, Xin, Youming, and Lin, Xue
- Subjects
- *
RADIAL basis functions , *MULTIAGENT systems , *HYPERBOLIC functions , *TANGENT function , *LYAPUNOV functions , *ADAPTIVE control systems - Abstract
This paper is concerned with the fixed-time consensus control problem for non-strict feedback multi-agent systems with asymmetric output constraints and full state constraints. Considering the feasibility of controlling execution, a novel practical virtual control signal is developed utilizing both saturation function and hyperbolic tangent function to ensure that this signal can remain within the same restricted range as the corresponding state variable throughout entire operation process. In backstepping steps, the design of ideal virtual control signal also adopts a different form of piecewise function than before, introducing high-order polynomial functions to avoid singularity problems in the derivation process. In addition, function approximation ability of radial basis function neural networks technique is applied to estimate uncertainties derived from the system functions and controller design procedure. Moreover, universal barrier Lyapunov function approach is improved for constructing an adaptive constrained synchronization control scheme. By fixed-time stability theory, it is shown that the tracking errors of the MAS converge to an adjustable region around the origin in a fixed time and the state variables always obey their constraints. And the upper bound of the settling time is merely dependent on design parameters, which is not affected by the initial states of MAS. The effectiveness of the proposed control strategy is shown by a numerical simulation example at last. Two scenarios are provided to demonstrate the advantages of the control protocol proposed in this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2024
88. Transformer-based cross-modality interaction guidance network for RGB-T salient object detection.
- Author
-
Luo, Jincheng, Li, Yongjun, Li, Bo, Zhang, Xinru, Li, Chaoyue, Chenjin, Zhimin, He, Jingyi, and Liang, Yifei
- Subjects
- *
TRANSFORMER models , *IMAGE fusion , *RECOMMENDER systems , *INFORMATION filtering , *THERMOGRAPHY - Abstract
Exploring more effective multimodal fusion strategies is still challenging for RGB-T salient object detection (SOD). Most RGB-T SOD methods tend to focus on the strategy of acquiring modal complementary features by utilizing foreground information while ignoring the importance of background information for salient object localization. In addition, feature fusion without information filtering may introduce more noise. To solve these problems, this paper proposes a new cross-modal interaction guidance network (CIGNet) for RGB-T salient object detection. Specifically, we construct a transformer-based dual-stream encoder to extract multimodal features. In the decoder, we propose an attention mechanism-based modal information complementary module (MICM) for capturing cross-modal complementary information for global comparison and salient object localization. Based on the MICM features, we design a multi-scale adaptive fusion module (MAFM) to find the optimal salient region of the multi-scale fusion process and reduce redundant features. In order to enhance the completeness of salient features after multi-scale feature fusion, this paper proposes the saliency region mining module (SRMM), which corrects the features in the boundary neighborhood by exploiting the differences between foreground and background pixels and the boundary. In comparisons with other state-of-the-art methods on three RGB-T datasets and five RGB-D datasets, the experimental results demonstrate the superiority and extensiveness of the proposed CIGNet. • The significance of thermal images in salient object detection was investigated. • Using attentional mechanisms to capture multimodal complementary features. • Importance of background pixels for correcting salient object boundary features. [ABSTRACT FROM AUTHOR]
- Published
- 2024
89. Lifelong reinforcement learning tracking control of nonlinear strict-feedback systems using multilayer neural networks with constraints.
- Author
-
Ganie, Irfan and Jagannathan, S.
- Subjects
- *
REINFORCEMENT learning , *NONLINEAR systems , *SYSTEM dynamics , *UNCERTAIN systems , *CLOSED loop systems - Abstract
This paper presents a novel safe integral reinforcement learning (IRL)-based optimal trajectory tracking scheme for nonlinear systems with uncertain dynamics that is subject to constraints. We leverage multilayer neural networks (MNNs) for actor-critic MNNs along with an NN identifier in the backstepping process for minimizing a discounted value function. A time-varying barrier Lyapunov function (TVBLF) is utilized for handling constraints and to provide safety assurances. Online weight update laws for the actor and critic MNNs are derived that are driven by Bellman error and control input error. We introduce an online lifelong learning (LL) method in the critic NN, utilizing the Bellman error in MNNs to address catastrophic forgetting. The method's effectiveness is demonstrated through simulations on mobile robot multitask tracking. The paper concludes with a stability analysis of the closed-loop system. [ABSTRACT FROM AUTHOR]
- Published
- 2024
90. Neural-based fixed-time composite learning control for multiagent systems with intermittent faults.
- Author
-
Zheng, Xiaohong, Ren, Hongru, Zhou, Qi, and Wang, Xinzhong
- Subjects
- *
MACHINE learning , *FAULT-tolerant control systems , *CLOSED loop systems , *NONLINEAR functions , *NONLINEAR systems - Abstract
In this paper, a distributed fixed-time composite learning control problem is addressed for nonlinear multiagent systems (MASs) subject to intermittent actuator faults. First, a distributed estimator is constructed for followers that are unable to communicate directly with the leader. Then, instead of using the traditional adaptive neural network (NN) algorithm, a predictor-based composite learning technique is proposed, which incorporates the prediction error into the NN update law to enhance the estimation accuracy of the unknown nonlinearity. Furthermore, an adaptive fault-tolerant control compensation mechanism is developed for intermittent faults that may occur indefinitely and frequently. To guarantee that all signals of the closed-loop system are bounded in fixed time, a nonsingular fixed-time fault-tolerant controller in the form of quadratic function is established. Finally, simulation results confirm the effectiveness of the presented algorithm. • This paper presents a singularity-free fixed-time NN algorithm for nonlinear MASs, and a composite learning algorithm is established to improve the approximation accuracy of nonlinear functions by introducing a prediction error into NN update law. • For followers without access to the leader, a local estimator is utilized to estimate the leader information. Therefore, the present control method avoids the emergence of coupling terms between agents during the controller design. • This paper considers intermittent actuator faults that may occur indefinitely and frequently, posing significant challenges. [ABSTRACT FROM AUTHOR]
- Published
- 2024
91. A review of research on reinforcement learning algorithms for multi-agents.
- Author
-
Hu, Kai, Li, Mingyang, Song, Zhiqiang, Xu, Keer, Xia, Qingfeng, Sun, Ning, Zhou, Peng, and Xia, Min
- Subjects
- *
MACHINE learning , *REWARD (Psychology) , *ARTIFICIAL intelligence , *LITERATURE reviews , *MULTIAGENT systems , *REINFORCEMENT learning - Abstract
In recent years, multi-agent reinforcement learning techniques have been widely used and evolved in the field of artificial intelligence. However, traditional reinforcement learning methods have limitations such as long training time, large sample data requirements, and highly delayed rewards. Therefore, this paper systematically and specifically studies the MARL algorithm. Firstly, this paper uses Citespace software to visually analyze the existing literature on multi-agent reinforcement learning and briefly indicates the research hotspots and key research directions in this field. Secondly, the applications of traditional reinforcement learning algorithms under two task objects, namely single-agent and multi-agent systems, are described in detail. Then, the paper highlights the diverse applications, challenges, and corresponding solutions of MARL algorithmic techniques in the field of MAS. Finally, the paper points out future research directions based on the existing limitations of the algorithm. Through this paper, readers will gain a systematic and in-depth understanding of MARL algorithms and how they can be utilized to better address the various challenges posed by MAS. [ABSTRACT FROM AUTHOR]
- Published
- 2024
92. Hybrid-order distributed SGD: Balancing communication overhead, computational complexity, and convergence rate for distributed learning.
- Author
-
Omidvar, Naeimeh, Hosseini, Seyed Mohammad, and Maddah-Ali, Mohammad Ali
- Subjects
- *
OPTIMIZATION algorithms , *COMPUTATIONAL complexity , *GENERALIZATION , *PRIOR learning , *SCALABILITY - Abstract
Communication overhead, computation load, and convergence speed are three major challenges in the scalability of distributed stochastic optimization algorithms in training large neural networks. In this paper, we propose the approach of hybrid-order distributed stochastic gradient descent (HO-SGD) which strikes a better balance between these three than the previous methods, for a general class of non-convex stochastic optimization problems. In particular, we advocate that by properly interleaving zeroth-order and first-order gradient updates, it is possible to significantly reduce the communication and computation overheads while guaranteeing a fast convergence. The proposed method guarantees the same order of convergence rate as in the fastest distributed methods (i.e., fully synchronous SGD) while having significantly less computational complexity and communication overhead per iteration, and the same order of communication overhead as in the state-of-the-art communication-efficient methods, with order-wisely less computational complexity. Moreover, it order-wisely improves the convergence rate of zeroth-order SGD methods. Finally and remarkably, empirical studies demonstrate that the proposed hybrid-order approach provides significantly higher test accuracies and superior generalization than all the baselines, owing to its novel exploration mechanism. • This paper proposes the novel approach of hybrid-order optimization and learning, which strikes a better balance between communication overhead, computational complexity, and convergence rate for distributed optimization and learning than the previous methods. • The proposed method can solve a general class of non-convex stochastic optimization problems with guaranteed convergence to the stationary point.
• The proposed method guarantees the same order of convergence rate (in terms of the numbers of iterations and worker nodes) as in the fastest distributed methods (i.e., fully synchronous SGD), while having significantly less computational complexity and communication overhead per iteration. • The proposed method guarantees the same order of communication overhead as in the state-of-the-art communication-efficient methods, with order-wisely less computational complexity. • The proposed method order-wisely improves the convergence rate of zeroth-order SGD methods. • Finally and remarkably, empirical studies demonstrate that the proposed hybrid-order approach provides significantly higher test accuracies and superior generalization than all the baselines, owing to its novel exploration mechanism. • The paper proposes a novel exploration mechanism resulting in better generalization. [ABSTRACT FROM AUTHOR]
- Published
- 2024
93. Exploiting indirect linear correlation for label distribution learning.
- Author
-
Yu, Peiqiu and Jia, Xiuyi
- Subjects
- *
MACHINE learning , *MATRIX decomposition , *LABEL design , *ALGORITHMS , *HYPOTHESIS - Abstract
Label distribution learning represents the relevance of labels to samples using description degree, which can provide richer semantic information, thus finding wider applications. Exploiting label correlations is an effective approach to narrow down the hypothesis space of label distribution learning models. In existing works that utilize low-rank assumptions or label linear dependence to mine correlations, it is assumed that a label can be linearly expressed by other labels. However, this assumption can only be satisfied when there are linear dependency relationships between labels, thus the label correlation obtained by such methods is subject to certain distortion. To address this issue, this paper assumes that labels can be linearly represented by the same set of bases. The correlation between labels is represented by sharing common bases. Specifically, the paper employs matrix factorization to extract bases that can be used to represent all labels. It then designs a label distribution learning algorithm based on the property that the ground truth label distribution and the predicted label distribution share the same set of bases. The effectiveness of the algorithm is verified through experimental validation. Generally speaking, the algorithm presented in this paper achieves optimal performance in 73.15% of the cases, with the best average ranking. In the two-tailed t-test, the algorithm in this paper exhibits statistical superiority compared to all comparison algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
94. 3D facial modeling, animation, and rendering for digital humans: A survey.
- Author
-
Zhang, Yiwei, Su, Renbin, Yu, Jun, and Li, Rui
- Subjects
- *
ONLINE education , *TECHNICAL literature , *RESEARCH & development , *THREE-dimensional modeling , *EVERYDAY life , *HUMAN beings - Abstract
With the continuous advancement of 3D and human–computer interaction technologies, digital human systems have been widely implemented in our daily lives, such as 3D games, online education, and virtual assistants. This paper provides a comprehensive literature survey of the technologies vitally important to digital humans in 3D facial modeling, animation, and rendering. Firstly, we combed through the research development of 3D face modeling technology and summarized the 3D facial models. Then, we summed up various methods for animating the 3D face based on data from images, videos, and the like. Lastly, we delved deeply into the techniques for 3D face rendering, comparing traditional rendering with neural network rendering. In conclusion, we provided a forward-looking perspective on the current challenges and future development of 3D digital humans. • This paper provides a comprehensive literature survey of the technologies that contribute to the development of digital humans. • The paper illustrates the hotspots in the 3D digital humans which guides a broader audience both within and outside these areas. • The paper describes and compares the key technologies (3D facial modeling, animation generation, graphical rendering) that have been widely used in 3D digital humans. • The paper provides a forward-looking perspective on the current challenges and future development of 3D digital humans. [ABSTRACT FROM AUTHOR]
- Published
- 2024
95. Development and challenges of object detection: A survey.
- Author
-
Li, Zonghui, Dong, Yongsheng, Shen, Longchao, Liu, Yafeng, Pei, Yuanhua, Yang, Haotian, Zheng, Lintao, and Ma, Jinwen
- Subjects
- *
OBJECT recognition (Computer vision) , *ALGORITHMS , *EVERYDAY life , *DEEP learning , *SPEED - Abstract
Object detection is a basic vision task that accompanies people's daily lives all the time. The development of object detection technology has experienced an evolution from traditional-based algorithms to deep learning-based algorithms, which has made a qualitative leap in both detection accuracy and detection speed. With the advancement of deep learning, object detection techniques are increasingly becoming a part of everyday life, with the YOLO series of algorithms being extensively applied in various industries. In this paper, we initially present the frequently utilized datasets and evaluation criteria for object detection. Subsequently, we delve into the evolution of traditional object detection algorithms, highlighting two-stage and one-stage approaches through illustrative examples of classical methods. We also conduct a comprehensive summary and analysis of the detection results obtained by these methods. In addition, we introduce object detection applications in daily life, as well as the importance and some difficulties of these applications. Finally, we analyze and summarize the difficulties and challenges facing the task of object detection, and we look forward to the future development direction of object detection. • This survey is an extended version of our paper in ICIC2023. • We review the development and challenges of object detection. • We present the application of object detection and make prospects for the future. [ABSTRACT FROM AUTHOR]
- Published
- 2024
96. Lp- and risk consistency of localized SVMs.
- Author
-
Köhler, Hannes
- Subjects
- *
SUPPORT vector machines , *BIG data - Abstract
Kernel-based regularized risk minimizers, also called support vector machines (SVMs), are known to possess many desirable properties but suffer from their super-linear computational requirements when dealing with large data sets. This problem can be tackled by using localized SVMs instead, which also offer the additional advantage of being able to apply different hyperparameters to different regions of the input space. In this paper, localized SVMs are analyzed with regard to their consistency. It is proven that they inherit Lp- as well as risk consistency from global SVMs under very weak conditions. Though there already exist results on the latter of these two properties, this paper significantly generalizes them, notably also allowing the regions that underlie the localized SVMs to change as the size of the training data set increases, which is a situation also typically occurring in practice. [ABSTRACT FROM AUTHOR]
- Published
- 2024
97. A review of deep learning based malware detection techniques.
- Author
-
Wang, Huijuan, Cui, Boyan, Yuan, Quanbo, Shi, Ruonan, and Huang, Mengying
- Subjects
- *
ARTIFICIAL intelligence , *COMPUTER engineering , *PROFESSIONAL identity , *FEATURE extraction , *COMPUTER network security , *DEEP learning - Abstract
With the popularization of computer technology, the number of malware has increased dramatically in recent years. Some malware can threaten the network security of users by downloading and installing, and even spreading widely on the Internet, causing consequences such as private data leakage in the operating system, extortion, and network paralysis. In order to deal with these threats, researchers analyze malicious samples through various analysis techniques, which are usually divided into static and dynamic analysis based on the principle of whether the code needs to be executed or not. This paper analyzes in detail several classical methods of feature extraction in malware detection techniques. With the technological development of artificial intelligence, deep learning is gradually being introduced into malware detection, which does not require the identification of professional security personnel and greatly improves the generalization ability of detection. In the paper, text-based detection methods, image visualization-based detection, and graph structure-based detection techniques are reviewed according to different feature extraction methods. In addition, the paper compares 26 datasets that have been commonly used in recent years applied in the research field and explains the main contents and specifications of the datasets. Finally, a summary and outlook of the malware research field is given. [ABSTRACT FROM AUTHOR]
- Published
- 2024
98. Recent advances on federated learning: A systematic survey.
- Author
-
Liu, Bingyan, Lv, Nuoyan, Guo, Yuanchun, and Li, Yawen
- Subjects
- *
FEDERATED learning , *ARTIFICIAL intelligence - Abstract
Federated learning has emerged as an effective paradigm to achieve privacy-preserving collaborative learning among different parties. Compared to traditional centralized learning that requires collecting data from each party, in federated learning, only the locally trained models or computed gradients are exchanged, without exposing any data information. As a result, it is able to protect privacy to some extent. In recent years, federated learning has become more and more prevalent and there have been many surveys for summarizing related methods in this hot research topic. However, most of them focus on a specific perspective or lack the latest research progress. In this paper, we provide a systematic survey on federated learning, aiming to review the recent advanced federated methods and applications from different aspects. Specifically, this paper includes four major contributions. First, we present a new taxonomy of federated learning in terms of the pipeline and challenges in federated scenarios. Second, we summarize federated learning methods into several categories and briefly introduce the state-of-the-art methods under these categories. Third, we overview some prevalent federated learning frameworks and introduce their features. Finally, some potential deficiencies of current methods and several future directions are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
99. Revisiting vision-based violence detection in videos: A critical analysis.
- Author
-
Kaur, Gurmeet and Singh, Sarbjeet
- Subjects
- *
CRITICAL analysis , *EVIDENCE gaps , *VIOLENCE , *VIDEO surveillance , *COMPUTER vision , *STIMULUS generalization - Abstract
An ever-increasing installation of surveillance cameras at different places for ensuring public safety, security and asset protection has triggered the need for intelligent video surveillance to monitor the people and their behavior. Violence detection is a prominent application of intelligent surveillance as it plays a vital role in public safety, behavior monitoring, and law enforcement. It deals with identifying whether the violent event or behavior occurred in video sequence or not. Various researchers have developed different techniques and features for the detection of violence in recent years. The purpose of this paper is to provide an expository study of various state-of-the-art approaches for detecting violence in videos. It is evident from the ongoing effort in this field and with the advancement in computer vision technology, previous approaches get surpassed and new methods and features always keep on developing. Therefore, in order to complement ongoing research, it is also imperative to conduct a comprehensive analysis of different works from time to time. In this survey, each work has been critically analyzed, along with its pros and cons. The violence detection techniques have been divided under three categories: handcrafted features based, deep learning and hybrid violence detection approaches that have been extensively reviewed in various sub-categories. The major challenges faced by researchers and the steps involved in the process of violence detection are also discussed. Moreover, unlike typical practices of comparing the performance of models within same datasets, we examined a few models for cross-dataset assessment and revealed their limited generalization. This emphasizes the crucial need for robust model generalization to ensure efficacy across varied real-world scenarios. 
Finally, the paper carried out a discussion of various research gaps in the current approaches and the possible solutions to be taken to resolve them, laying a solid foundation for future work in this area. [ABSTRACT FROM AUTHOR]
- Published
- 2024
100. AHCL-TC: Adaptive Hypergraph Contrastive Learning Networks for Text Classification.
- Author
-
Zhang, Zhen, Ni, Hao, Jia, Xiyuan, Su, Fangfang, Liu, Mengqiu, Yun, Wenhao, and Wu, Guohua
- Subjects
- *
CLASSIFICATION , *NATURAL language processing - Abstract
Text classification is an essential and classic problem in natural language processing. In recent years, Graph Convolutional Networks (GCNs) have been widely applied to text classification tasks. However, there are still three critical challenges in practical applications: (1) Limited explanatory ability for domain-specific terminology; (2) Imbalanced sample distribution leading to a decline in model performance; (3) Excessive calculation consumption. To address these issues, this paper proposes an Adaptive Hypergraph Contrastive Learning Network (AHCL-TC) for text classification, which combines graph contrastive learning and hypergraph neural networks to better capture the internal relationship structure of domain-specific terminology and achieve superior performance with imbalanced sample distribution. AHCL-TC designs a neural network structure based on hypergraphs, using the high-order relationships of hypergraphs to model complex structures in text data. This structure allows the model to better understand multiple relationships between terms, thereby improving classification performance. Graph contrastive learning is used as the training framework of the model. By learning the intrinsic characteristics and structure of graph data, and using data enhancement algorithms to expand the data set, the model's robustness to data imbalance is improved. Additionally, the paper presents a hypergraph adaptive augmentation algorithm designed for hypergraph structures to address the data imbalance problem. The proposed model is evaluated on multiple benchmark datasets, and the experimental results demonstrate its effectiveness for text classification tasks, outperforming baseline models even with reduced training data percentage. Furthermore, a comprehensive comparison of computational efficiency was conducted. The outcomes reveal that the computational consumption of our model is notably lower than that of other models. 
• The model can capture comprehensive semantic information and expand sample data. • This study introduces Hypergraph Attention Networks, leveraging hypergraph structures. • Comparative analysis shows both superior performance and computational efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2024