1. Learning sparse reparameterization with layer-wise continuous sparsification.
- Authors
- Wang, Xiaodong; Huang, Yaxiang; Zeng, Xianxian; Guo, Jianlan; and Chen, Yuqiang
- Subjects
- *ARTIFICIAL neural networks, *MACHINE learning, *HTTP (Computer network protocol), *WEIGHT training, *LOTTERY tickets, *PARAMETERIZATION
- Abstract
Sparse reparameterization in Deep Neural Networks (DNNs) aims to achieve a better tradeoff between the network parameter count and performance. Recently, the lottery ticket hypothesis posited that excellent sub-networks ("winning tickets") exist in dense, randomly initialized networks. These sparse sub-networks, trained from scratch, can match the performance of their dense counterparts. Compared with Iterative Magnitude Pruning, which relies on hand-crafted pruning strategies, the Continuous Sparsification algorithm learns the "winning tickets" with gradient-based methods and achieves better performance. In this paper, we propose a Layer-wise Continuous Sparsification (LCS) scheme for finding sparse sub-networks, in which the parameterized relaxation of the step functions used to remove network parameters in each layer is integrated into the DNN loss as an optimization objective. LCS utilizes a family of sigmoid functions to asynchronously filter important per-layer weights throughout training, yielding sparser and better sub-networks. Experiments show that our method surpasses state-of-the-art methods for sparse reparameterization. Additionally, the proposed method can be used as a regularization technique to further improve the accuracy of dense networks. Our code is publicly available at https://github.com/RiyaoDong/LCS. [ABSTRACT FROM AUTHOR]
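The abstract describes gating each layer's weights with a sigmoid relaxation of a step function and folding that relaxation into the training loss. The PyTorch sketch below is a minimal illustration of this general continuous-sparsification idea, not the authors' released LCS implementation: it assumes a learnable per-weight mask logit, a per-layer temperature `beta` that is annealed so the sigmoid gate approaches a hard step, and an L1-style sparsity term weighted by a hypothetical `lambda_sparse`; the names `MaskedLinear`, `soft_mask`, and `sparsity_penalty` are placeholders.

```python
# Minimal sketch of sigmoid-relaxed, layer-wise weight masking (assumed design,
# not the paper's code). Each layer carries its own temperature, so layers can
# sparsify asynchronously as in the abstract's description.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    """Linear layer whose weights are gated by a learnable soft mask.

    sigmoid(beta * mask_logits) relaxes a per-weight step function; as beta
    grows, the gate approaches {0, 1} and weights with negative logits vanish.
    """
    def __init__(self, in_features, out_features, mask_init=0.1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Per-weight mask logits, learned jointly with the weights.
        self.mask_logits = nn.Parameter(torch.full((out_features, in_features), mask_init))
        # Per-layer temperature (not a parameter; annealed by a schedule).
        self.register_buffer("beta", torch.tensor(1.0))

    def soft_mask(self):
        return torch.sigmoid(self.beta * self.mask_logits)

    def forward(self, x):
        return F.linear(x, self.weight * self.soft_mask(), self.bias)

def sparsity_penalty(model):
    # Sum of soft-mask values: a differentiable proxy for the number of kept weights.
    return sum(m.soft_mask().sum() for m in model.modules() if isinstance(m, MaskedLinear))

# Toy training loop: task loss + lambda * sparsity term, then anneal beta per layer.
model = nn.Sequential(MaskedLinear(20, 64), nn.ReLU(), MaskedLinear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_sparse = 1e-4  # hypothetical trade-off coefficient

x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))
for step in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(model(x), y) + lambda_sparse * sparsity_penalty(model)
    loss.backward()
    opt.step()
    for m in model.modules():              # per-layer temperature schedule
        if isinstance(m, MaskedLinear):
            m.beta *= 1.05                 # sharpen the gate toward a hard step
```

After training, weights whose gate falls below a threshold (e.g. 0.5) would be pruned to obtain the sparse sub-network; the exact gating family, schedules, and pruning rule in LCS are specified in the paper and the linked repository.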
- Published
- 2023