910 results on '"learning systems"'
Search Results
2. SQL-Net: Semantic Query Learning for Point-Supervised Temporal Action Localization.
- Author
- Wang, Yu, Zhao, Shengjie, and Chen, Shiwei
- Published
- 2025
- Full Text
- View/download PDF
3. Information Losses in Neural Classifiers From Sampling
- Author
- Foggo, Brandon, Yu, Nanpeng, Shi, Jie, and Gao, Yuanqi
- Subjects
- Information and Computing Sciences, Machine Learning, Neural networks, Machine learning, Training, Random variables, Training data, Probability distribution, Learning systems, Deep learning, information theory, large deviations theory, mutual information, statistical learning theory, cs.LG, stat.ML, Artificial Intelligence & Image Processing, Artificial intelligence
- Abstract
This article considers the subject of information losses arising from the finite data sets used in the training of neural classifiers. It proves a relationship that expresses such losses as the product of the expected total variation of the estimated neural model and the information about the feature space contained in the hidden representation of that model. It then bounds this expected total variation as a function of the size of randomly sampled data sets in a fairly general setting, and without bringing in any additional dependence on model complexity. It ultimately obtains bounds on information losses that are less sensitive to input compression and in general much smaller than existing bounds. This article then uses these bounds to explain some recent experimental findings of information compression in neural networks that cannot be explained by previous work. Finally, this article shows that not only are these bounds much smaller than existing ones, but they also correspond well with experiments.
- Published
- 2020
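The bound structure this abstract describes can be sketched compactly. The following is a hedged paraphrase with notation of my own choosing (not the paper's): the information loss is controlled by the product of an expected total-variation term, which shrinks with the sample size and carries no explicit model-complexity dependence, and the mutual information held in the hidden representation.

```latex
% Hedged sketch only; symbols are assumptions, not the paper's notation.
% L_n : information loss from training on n samples;  \hat{p}, p : estimated and true models;
% X : input features;  T : hidden representation.
\mathcal{L}_n \;\lesssim\; \mathbb{E}\big[\mathrm{TV}(\hat{p},\,p)\big] \,\cdot\, I(X;T),
\qquad \mathbb{E}\big[\mathrm{TV}(\hat{p},\,p)\big] \to 0 \ \text{as } n \to \infty .
```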
4. Feature Interaction Learning Network for Cross-Spectral Image Patch Matching.
- Author
- Yu, Chuang, Liu, Yunpeng, Zhao, Jinmiao, Wu, Shuhang, and Hu, Zhuhua
- Subjects
- FEATURE extraction, TASK analysis, MATHEMATICAL optimization, LEARNING modules, INSTRUCTIONAL systems
- Abstract
Recently, feature relation learning has attracted extensive attention in cross-spectral image patch matching. However, most feature relation learning methods can only extract shallow feature relations and are accompanied by the loss of useful discriminative features or the introduction of disturbing features. Although the latest multi-branch feature difference learning network can relatively sufficiently extract useful discriminative features, the multi-branch network structure it adopts has a large number of parameters. Therefore, we propose a novel two-branch feature interaction learning network (FIL-Net). Specifically, a novel feature interaction learning idea for cross-spectral image patch matching is proposed, and a new feature interaction learning module is constructed, which can effectively mine common and private features between cross-spectral image patches, and extract richer and deeper feature relations with invariance and discriminability. At the same time, we re-explore the feature extraction network for the cross-spectral image patch matching task, and a new two-branch residual feature extraction network with stronger feature extraction capabilities is constructed. In addition, we propose a new multi-loss strong-constrained optimization strategy, which can facilitate reasonable network optimization and efficient extraction of invariant and discriminative features. Furthermore, a public VIS-LWIR patch dataset and a public SEN1-2 patch dataset are constructed. At the same time, the corresponding experimental benchmarks are established, which facilitate future research while addressing the scarcity of existing cross-spectral image patch matching datasets. Extensive experiments show that the proposed FIL-Net achieves state-of-the-art performance in three different cross-spectral image patch matching scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
5. Robust Cross-Domain Pseudo-Labeling and Contrastive Learning for Unsupervised Domain Adaptation NIR-VIS Face Recognition.
- Author
- Yang, Yiming, Hu, Weipeng, Lin, Haiqi, and Hu, Haifeng
- Subjects
- LARGE scale systems, FEATURE extraction, TASK analysis, NETWORK performance, INSTRUCTIONAL systems, HUMAN facial recognition software
- Abstract
Near-infrared and visible face recognition (NIR-VIS) is attracting increasing attention because of the need to achieve face recognition in low-light conditions to enable 24-hour secure retrieval. However, annotating identity labels for a large number of heterogeneous face images is time-consuming and expensive, which limits the application of the NIR-VIS face recognition system to larger scale real-world scenarios. In this paper, we attempt to achieve NIR-VIS face recognition in an unsupervised domain adaptation manner. To get rid of the reliance on manual annotations, we propose a novel Robust cross-domain Pseudo-labeling and Contrastive learning (RPC) network which consists of three key components, i.e., NIR cluster-based Pseudo labels Sharing (NPS), Domain-specific cluster Contrastive Learning (DCL) and Inter-domain cluster Contrastive Learning (ICL). Firstly, NPS is presented to generate pseudo labels by exploring robust NIR clusters and sharing reliable label knowledge with the VIS domain. Secondly, DCL is designed to learn intra-domain compact yet discriminative representations. Finally, ICL dynamically combines and refines intrinsic identity relationships to guide the instance-level features to learn robust and domain-independent representations. Extensive experiments are conducted to verify an accuracy of over 99% in pseudo-label assignment and the strong performance of the RPC network on four mainstream NIR-VIS datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
6. Incomplete Multi-View Learning Under Label Shift.
- Author
- Fan, Ruidong, Ouyang, Xiao, Luo, Tingjin, Hu, Dewen, and Hou, Chenping
- Subjects
- MISSING data (Statistics), IMAGE processing, PEOPLE with schizophrenia, LUNG diseases, SATISFACTION
- Abstract
In image processing, images are usually composed of partial views due to the uncertainty of collection; how to efficiently process these images, known as incomplete multi-view learning, has attracted widespread attention. The incompleteness and diversity of multi-view data increases the difficulty of annotation, resulting in the divergence of label distribution between the training and testing data, termed label shift. However, existing incomplete multi-view methods generally assume that the label distribution is consistent and rarely consider the label shift scenario. To address this new but important challenge, we propose a novel framework termed Incomplete Multi-view Learning under Label Shift (IMLLS). In this framework, we first give the formal definitions of IMLLS and the bidirectional complete representation which describes the intrinsic and common structure. Then, a multilayer perceptron which combines the reconstruction and classification loss is employed to learn the latent representation, whose existence, consistency and universality are proved with the theoretical satisfaction of the label shift assumption. After that, to align the label distribution, the learned representation and trained source classifier are used to estimate the importance weight by designing a new estimation scheme which balances the error generated by finite samples in theory. Finally, the trained classifier reweighted by the estimated weight is fine-tuned to reduce the gap between the source and target representations. Extensive experimental results validate the effectiveness of our algorithm over existing state-of-the-art methods in various aspects, together with its effectiveness in discriminating schizophrenic patients from healthy controls. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
7. Meta Auxiliary Learning for Facial Action Unit Detection.
- Author
- Li, Yong and Shan, Shiguang
- Abstract
Despite the success of deep neural networks on facial action unit (AU) detection, better performance depends on a large number of training images with accurate AU annotations. However, labeling AU is time-consuming, expensive, and error-prone. Considering AU detection and facial expression recognition (FER) are two highly correlated tasks, and facial expression (FE) is relatively easy to annotate, we consider learning AU detection and FER in a multi-task manner. However, the performance of the AU detection task cannot always be enhanced due to the negative transfer in the multi-task scenario. To alleviate this issue, we propose a Meta Auxiliary Learning method (MAL) that automatically selects highly related FE samples by learning adaptive weights for the training FE samples in a meta learning manner. The learned sample weights alleviate the negative transfer from two aspects: 1) balance the loss of each task automatically, and 2) suppress the weights of FE samples that have large uncertainties. Experimental results on several popular AU datasets demonstrate MAL consistently improves the AU detection performance compared with the state-of-the-art multi-task and auxiliary learning methods. MAL automatically estimates adaptive weights for the auxiliary FE samples according to their semantic relevance with the primary AU detection task. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
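A minimal sketch of the weighted primary-plus-auxiliary objective described in this entry, assuming the per-sample FE weights come from a separate meta-learned weighting module (the names and details below are mine, not the paper's):

```python
import torch
import torch.nn.functional as F

def mal_style_loss(au_logits, au_labels, fe_logits, fe_labels, fe_sample_weights):
    """Primary AU-detection loss plus a per-sample weighted auxiliary FER loss.

    fe_sample_weights: (B_fe,) scalars, assumed to be produced by a small meta-learned
    weighting network (not shown); small weights suppress unrelated or uncertain FE samples.
    """
    au_loss = F.binary_cross_entropy_with_logits(au_logits, au_labels)   # multi-label AU task
    fe_loss = F.cross_entropy(fe_logits, fe_labels, reduction="none")    # per-sample FER loss
    aux_loss = (fe_sample_weights * fe_loss).mean()
    return au_loss + aux_loss
```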
8. Graph Lifelong Learning: A Survey.
- Author
- Febrinanto, Falih Gozi, Xia, Feng, Moore, Kristen, Thapa, Chandra, and Aggarwal, Charu
- Abstract
Graph learning is a popular approach for performing machine learning on graph-structured data. It has revolutionized the machine learning ability to model graph data to address downstream tasks. Its application is wide due to the availability of graph data ranging from all types of networks to information systems. Most graph learning methods assume that the graph is static and its complete structure is known during training. This limits their applicability since they cannot be applied to problems where the underlying graph grows over time and/or new tasks emerge incrementally. Such applications require a lifelong learning approach that can learn the graph continuously and accommodate new information whilst retaining previously learned knowledge. Lifelong learning methods that enable continuous learning in regular domains like images and text cannot be directly applied to continuously evolving graph data, due to its irregular structure. As a result, graph lifelong learning is gaining attention from the research community. This survey paper provides a comprehensive overview of recent advancements in graph lifelong learning, including the categorization of existing methods and discussions of potential applications and open research problems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
9. Stacked One-Class Broad Learning System for Intrusion Detection in Industry 4.0.
- Author
- Yang, Kaixiang, Shi, Yifan, Yu, Zhiwen, Yang, Qinmin, Sangaiah, Arun Kumar, and Zeng, Huanqiang
- Abstract
With the vigorous development of Industry 4.0, industrial Big Data has turned into the core element of the Industrial Internet of Things. As one of the most fundamental and indispensable components in industrial cyber-physical systems (CPS), intelligent anomaly detection is still an essential and challenging issue. However, with the development of the network, there may exist unknown types of attacks, which are difficult to collect. For the one-class industrial intrusion detection scenario, in which the collected training data includes only the normal state, the one-class broad learning system (OCBLS) and the stacked OCBLS (ST-OCBLS) algorithms are developed. Benefiting from the characteristics of BLS, our proposed approaches retain the advantage of an efficient training process. Moreover, the high-level hidden features of the network traffic data can be learned through the progressive encoding and decoding mechanism in ST-OCBLS. Extensive comparative experiments on several real-world intrusion detection tasks are carried out to demonstrate that our proposed methods have competitive performance and high efficiency in the face of complex network data and diversified types of intrusions. Overall, this article provides a new alternative solution for network intrusion detection in Industry 4.0. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. Deep Dense Network-Based Curriculum Reinforcement Learning for High-Speed Overtaking.
- Author
- Liu, Jia, Li, Huiyun, Yang, Zhiheng, Dang, Shaobo, and Huang, Zhejun
- Abstract
Reinforcement learning methods have promising applications in autonomous vehicles, such as lane keeping, adaptive cruise control, and overtaking. However, long-time sequential decision making in complicated scenarios, such as continuous high-speed overtaking, remains challenging. One such challenge is that current neural networks in reinforcement learning employ a shallow neural network, which has limitations in feature extraction. To address this, we propose a deep actor–critic network framework using multilayer dense-connection networks to obtain efficient extracted features. To be specific, we employ a deep dense architecture in the framework of soft actor–critic, which has multiple hidden layers serving as feature extractors. To tackle the vanishing or exploding gradient problem, we reuse features and design shortcut connections, as well as feature-concatenating operations between hidden layers. Meanwhile, when learning multitask high-speed overtaking as one whole progression, the agent has to learn driving, lane changing, and avoiding other vehicles at the same time. Therefore, another challenge is that sparse rewards slow convergence. We design a task-level sequencing curriculum reinforcement learning method to tackle the reward sparseness. It breaks down a complex task into two successive subtasks: the first aims to drive as fast as possible in a single-vehicle model; and the second aims to overtake other vehicles without any collisions as soon as possible according to the knowledge gained from the first task in a multivehicle model. Finally, we evaluate our method using a racing car simulator with different tracks. In a comparison with the standard actor–critic algorithm in three different multivehicle scenarios, the best median results show that our deep actor–critic network framework reduces overtaking distance and overtaking time by up to 4.9% and 8.7%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
11. Robust Traffic Prediction From Spatial–Temporal Data Based on Conditional Distribution Learning.
- Author
- Zeng, Zeng, Zhao, Wei, Qian, Peisheng, Zhou, Yingjie, Zhao, Ziyuan, Chen, Cen, and Guan, Cuntai
- Abstract
Traffic prediction based on massive speed data collected from traffic sensors plays an important role in traffic management. However, it is still challenging to obtain satisfactory performance due to the complex and dynamic spatial–temporal correlations among the data. Recently, many research works have demonstrated the effectiveness of graph neural networks (GNNs) for spatial–temporal modeling. However, such models are restricted by conditional distribution during training, and may not perform well when the target is outside the primary region of interest in the distribution. In this article, we address this problem with a stagewise learning mechanism, in which we redefine speed prediction as a conditional distribution learning followed by speed regression. We first perform a conditional distribution learning for each observed speed class, and then obtain speed prediction by optimizing regression learning, based on the learned conditional distribution. To effectively learn the conditional distribution, we introduce a mean–residue loss, consisting of two parts: 1) a mean loss, which penalizes the differences between the mean of the estimated conditional distribution and the ground truth, and 2) a residue loss, which penalizes residue errors of the long tails in the distribution. To optimize the subsequent regression based on distribution information, we combine the mean absolute error (MAE) as another part of the loss function. We also incorporate a GNN-based architecture with our proposed learning mechanism. Mean–residue loss is employed to supervise the hidden speed representation in the network at each time interval, followed by a shared layer to recalibrate the hidden temporal dependencies in the conditional distribution. The experimental results based on three public traffic datasets demonstrate that the proposed method outperforms state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
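A rough sketch of how the two-part mean–residue loss could be implemented over a discretised speed distribution. The abstract does not pin down the residue term, so the tail-mass penalty below, together with all names and shapes, is an illustrative assumption rather than the paper's definition:

```python
import torch

def mean_residue_loss(pred_dist, target_speed, bin_centers, tail_k=5, alpha=1.0):
    """pred_dist: (B, C) softmax over C speed bins; bin_centers: (C,) bin speeds; target_speed: (B,)."""
    # Mean term: gap between the distribution's expected speed and the ground-truth speed.
    expected_speed = (pred_dist * bin_centers).sum(dim=1)
    mean_loss = (expected_speed - target_speed).abs().mean()
    # Residue term (assumed form): probability mass stranded in the k least likely bins (long tails).
    tail_mass = pred_dist.topk(tail_k, dim=1, largest=False).values.sum(dim=1)
    residue_loss = tail_mass.mean()
    return mean_loss + alpha * residue_loss
```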
12. Incremental Weighted Ensemble Broad Learning System for Imbalanced Data.
- Author
- Yang, Kaixiang, Yu, Zhiwen, Chen, C. L. Philip, Cao, Wenming, You, Jane, and Wong, Hau-San
- Subjects
- INSTRUCTIONAL systems, WEIGHT training, DATA distribution
- Abstract
Broad learning system (BLS) is a novel and efficient model, which facilitates representation learning and classification by concatenating feature nodes and enhancement nodes. In spite of the efficient properties, BLS is still suboptimal when facing the imbalance problem. Besides, outliers and noise in imbalanced data remain a challenge for BLS. To address the above issues, in this paper we first propose a weighted BLS, which assigns a weight to each training sample, and adopt a general weighting scheme, which augments the weight of samples from the minority class. To further explore the prior distribution of original data, we design a density-based weight generation mechanism to guide the specific weight matrix generation and propose the adaptive weighted broad learning system (AWBLS). This mechanism considers the inter-class and intra-class distance simultaneously in the density calculation. Finally, we propose the incremental weighted ensemble broad learning system (IWEB) by utilizing a progressive mechanism to further improve the stability and robustness of AWBLS. Extensive comparative experiments on 38 real-world data sets verify that IWEB outperforms most of the imbalance ensemble classification methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
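The readout of a weighted BLS is essentially a sample-weighted ridge regression; a generic sketch follows (the paper's density-based weight generation is richer than the inverse-frequency toy shown here):

```python
import numpy as np

def weighted_bls_output_weights(A, Y, sample_weights, reg=1e-3):
    """A: (n, m) stacked feature/enhancement-node outputs; Y: (n, c) one-hot labels;
    sample_weights: (n,), larger for minority-class samples."""
    Aw = A * sample_weights[:, None]                       # row-weighted design matrix (D @ A)
    return np.linalg.solve(A.T @ Aw + reg * np.eye(A.shape[1]), Aw.T @ Y)

def inverse_frequency_weights(y):
    """Toy weighting scheme: boost rare classes by inverse class frequency."""
    classes, counts = np.unique(y, return_counts=True)
    freq = dict(zip(classes, counts))
    return np.array([len(y) / (len(classes) * freq[label]) for label in y])
```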
13. Improving Deep Metric Learning by Divide and Conquer.
- Author
- Sanakoyeu, Artsiom, Ma, Pingchuan, Tschernezki, Vadim, and Ommer, Bjorn
- Subjects
- DEEP learning, COMPUTER vision, IMAGE retrieval, APPLICATION software, SUBSPACES (Mathematics), PHENYLKETONURIA, VISUAL cryptography
- Abstract
Deep metric learning (DML) is a cornerstone of many computer vision applications. It aims at learning a mapping from the input domain to an embedding space, where semantically similar objects are located nearby and dissimilar objects far from one another. The target similarity on the training data is defined by the user in the form of ground-truth class labels. However, while the embedding space learns to mimic the user-provided similarity on the training data, it should also generalize to novel categories not seen during training. Besides user-provided ground-truth training labels, a lot of additional visual factors (such as viewpoint changes or shape peculiarities) exist and imply different notions of similarity between objects, affecting the generalization on the images unseen during training. However, existing approaches usually directly learn a single embedding space on all available training data, struggling to encode all different types of relationships, and do not generalize well. We propose to build a more expressive representation by jointly splitting the embedding space and the data hierarchically into smaller sub-parts. We successively focus on smaller subsets of the training data, reducing its variance and learning a different embedding subspace for each data subset. Moreover, the subspaces are learned jointly to cover not only the intricacies, but the breadth of the data as well. Only after that, we build the final embedding from the subspaces in the conquering stage. The proposed algorithm acts as a transparent wrapper that can be placed around arbitrary existing DML methods. Our approach significantly improves upon the state-of-the-art on image retrieval, clustering, and re-identification tasks evaluated using CUB200-2011, CARS196, Stanford Online Products, In-shop Clothes, and PKU VehicleID datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
14. Progressive Tandem Learning for Pattern Recognition With Deep Spiking Neural Networks.
- Author
- Wu, Jibin, Xu, Chenglin, Han, Xiao, Zhou, Daquan, Zhang, Malu, Li, Haizhou, and Tan, Kay Chen
- Subjects
- ARTIFICIAL neural networks, PATTERN recognition systems, IMAGE reconstruction, APPROXIMATION error
- Abstract
Spiking neural networks (SNNs) have shown clear advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency, due to their event-driven nature and sparse communication. However, the training of deep SNNs is not straightforward. In this paper, we propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition, which is referred to as progressive tandem learning. By studying the equivalence between ANNs and SNNs in the discrete representation space, a primitive network conversion method is introduced that takes full advantage of spike count to approximate the activation value of ANN neurons. To compensate for the approximation errors arising from the primitive network conversion, we further introduce a layer-wise learning method with an adaptive training scheduler to fine-tune the network weights. The progressive tandem learning framework also allows hardware constraints, such as limited weight precision and fan-in connections, to be progressively imposed during training. The SNNs thus trained have demonstrated remarkable classification and regression capabilities on large-scale object recognition, image reconstruction, and speech separation tasks, while requiring at least an order of magnitude less inference time and fewer synaptic operations than other state-of-the-art SNN implementations. It, therefore, opens up a myriad of opportunities for pervasive mobile and embedded devices with a limited power budget. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
15. Emulating AC OPF Solvers With Neural Networks.
- Author
- Baker, Kyri
- Subjects
- ELECTRICAL load, ARTIFICIAL neural networks, MATHEMATICAL optimization, ACROMIOCLAVICULAR joint
- Abstract
Using machine learning to obtain solutions to AC optimal power flow has recently been a very active area of research due to the astounding speedups that result from bypassing traditional optimization techniques. However, generally ensuring feasibility of the resulting predictions while maintaining these speedups is a challenging, unsolved problem. In this letter, we train a neural network to emulate an iterative solver in order to cheaply and approximately iterate towards the optimum. Once we are close to convergence, we then solve a power flow to obtain an overall AC-feasible solution. Results shown for networks up to 1,354 buses indicate the proposed method is capable of finding feasible, near-optimal solutions to AC OPF in milliseconds on a laptop computer. In addition, it is shown that the proposed method can find “difficult” AC OPF solutions that cause flat-start or DC-warm started algorithms to diverge. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
16. Addi-Reg: A Better Generalization-Optimization Tradeoff Regularization Method for Convolutional Neural Networks.
- Author
- Lu, Yao, Zhang, Zheng, Lu, Guangming, Zhou, Yicong, Li, Jinxing, and Zhang, David
- Abstract
In convolutional neural networks (CNNs), generating noise for the intermediate feature is a hot research topic in improving generalization. The existing methods usually regularize the CNNs by producing multiplicative noise (regularization weights), called multiplicative regularization (Multi-Reg). However, Multi-Reg methods usually focus on improving generalization but fail to jointly consider optimization, leading to unstable learning with slow convergence. Moreover, Multi-Reg methods are not flexible enough since the regularization weights are generated from a fixed, manually designed distribution. Besides, most popular methods are not universal enough, because these methods are designed only for residual networks. In this article, we, for the first time, experimentally and theoretically explore the nature of generating noise in the intermediate features for popular CNNs. We demonstrate that injecting noise in the feature space can be transformed into generating noise in the input space, and these methods regularize the networks in a Mini-batch in Mini-batch (MiM) sampling manner. Based on these observations, this article further discovers that generating multiplicative noise can easily degrade optimization due to its high dependence on the intermediate features. Based on these studies, we propose a novel additional regularization (Addi-Reg) method, which can adaptively produce additional noise with low dependence on the intermediate features in CNNs by employing a series of mechanisms. Particularly, these well-designed mechanisms can stabilize the learning process in training, and our Addi-Reg method can learn tailored noise distributions for every layer in CNNs. Extensive experiments demonstrate that the proposed Addi-Reg method is more flexible and universal, while achieving better generalization performance and faster convergence than state-of-the-art Multi-Reg methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
17. Plug-and-Play Image Restoration With Deep Denoiser Prior.
- Author
- Zhang, Kai, Li, Yawei, Zuo, Wangmeng, Zhang, Lei, Van Gool, Luc, and Timofte, Radu
- Subjects
- IMAGE reconstruction, ARTIFICIAL neural networks, CONVOLUTIONAL neural networks
- Abstract
Recent works on plug-and-play image restoration have shown that a denoiser can implicitly serve as the image prior for model-based methods to solve many inverse problems. Such a property induces considerable advantages for plug-and-play image restoration (e.g., integrating the flexibility of model-based methods and the effectiveness of learning-based methods) when the denoiser is discriminatively learned via a deep convolutional neural network (CNN) with large modeling capacity. However, while deeper and larger CNN models are rapidly gaining popularity, the performance of existing plug-and-play image restoration is hindered by the lack of a suitable denoiser prior. In order to push the limits of plug-and-play image restoration, we set up a benchmark deep denoiser prior by training a highly flexible and effective CNN denoiser. We then plug the deep denoiser prior as a modular part into a half quadratic splitting based iterative algorithm to solve various image restoration problems. We, meanwhile, provide a thorough analysis of parameter setting, intermediate results and empirical convergence to better understand the working mechanism. Experimental results on three representative image restoration tasks, including deblurring, super-resolution and demosaicing, demonstrate that the proposed plug-and-play image restoration with deep denoiser prior not only significantly outperforms other state-of-the-art model-based methods but also achieves competitive or even superior performance against state-of-the-art learning-based methods. The source code is available at https://github.com/cszn/DPIR. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
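For orientation, the plug-and-play scheme above alternates a data-fidelity step with a call to the CNN denoiser. A generic half-quadratic-splitting form is shown below in my own notation; DPIR's exact parameter schedule and sub-problem solvers may differ.

```latex
% Generic plug-and-play HQS for  \min_x \tfrac{1}{2}\|y - Hx\|^2 + \lambda\,\Phi(x):
x_k \;=\; \arg\min_x \tfrac{1}{2}\|y - Hx\|^2 + \tfrac{\mu}{2}\|x - z_{k-1}\|^2
      \;=\; \big(H^\top H + \mu I\big)^{-1}\big(H^\top y + \mu\, z_{k-1}\big),
\qquad
z_k \;=\; \mathrm{Denoiser}\!\big(x_k,\ \sqrt{\lambda/\mu}\,\big).
```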
18. A Concise Yet Effective Model for Non-Aligned Incomplete Multi-View and Missing Multi-Label Learning.
- Author
- Li, Xiang and Chen, Songcan
- Subjects
- VIDEO surveillance
- Abstract
In reality, learning from multi-view multi-label data inevitably confronts three challenges: missing labels, incomplete views, and non-aligned views. Existing methods mainly concern the first two and commonly need multiple assumptions to attack them, making even state-of-the-art methods involve at least two explicit hyper-parameters, such that model selection is quite difficult. Worse, they fail to handle the third challenge, let alone address the three jointly. In this paper, we aim to meet these challenges under the fewest assumptions by building a concise yet effective model with just one hyper-parameter. To ease the insufficiency of available labels, we exploit not only the consensus of multiple views but also the global and local structures hidden among multiple labels. Specifically, we introduce an indicator matrix to tackle the first two challenges in a regression form while aligning the same individual labels and all labels of different views in a common label space to battle the third challenge. In aligning, we characterize the global and local structures of multiple labels to be high-rank and low-rank, respectively. Subsequently, an efficient algorithm with linear time complexity in the number of samples is established. Finally, even without view-alignment, our method substantially outperforms state-of-the-art methods with view-alignment on five real datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
19. Joint Progressive and Coarse-to-Fine Registration of Brain MRI via Deformation Field Integration and Non-Rigid Feature Fusion.
- Author
- Lv, Jinxin, Wang, Zhiwei, Shi, Hongkuan, Zhang, Haobo, Wang, Sheng, Wang, Yilang, and Li, Qiang
- Subjects
- MAGNETIC resonance imaging, RECORDING & registration
- Abstract
Registration of brain MRI images requires solving a deformation field, which is extremely difficult when aligning intricate brain tissues, e.g., subcortical nuclei. Existing efforts resort to decomposing the target deformation field into intermediate sub-fields with either tiny motions, i.e., progressive registration stage by stage, or lower resolutions, i.e., coarse-to-fine estimation of the full-size deformation field. In this paper, we argue that those efforts are not mutually exclusive, and propose a unified framework for robust brain MRI registration in both progressive and coarse-to-fine manners simultaneously. Specifically, building on a dual-encoder U-Net, the fixed-moving MRI pair is encoded and decoded into multi-scale sub-fields from coarse to fine. Each decoding block contains two proposed novel modules: i) in Deformation Field Integration (DFI), a single integrated deformation sub-field is calculated, warping by which is equivalent to warping progressively by sub-fields from all previous decoding blocks, and ii) in Non-rigid Feature Fusion (NFF), features of the fixed-moving pair are aligned by the DFI-integrated deformation field, and then fused to predict a finer sub-field. Leveraging both DFI and NFF, the target deformation field is factorized into multi-scale sub-fields, where the coarser fields alleviate the estimate of a finer one and the finer field learns to make up for the misalignments that previous coarser ones cannot resolve. The extensive and comprehensive experimental results on both private and two public datasets demonstrate a superior registration performance of brain MRI images over progressive registration only and coarse-to-fine estimation only, with an increase of up to 8% in the average Dice score. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
20. Adaptive Neighborhood Metric Learning.
- Author
- Song, Kun, Han, Junwei, Cheng, Gong, Lu, Jiwen, and Nie, Feiping
- Subjects
- MACHINE learning, NEIGHBORHOODS, DEEP learning, DISTANCE education
- Abstract
In this paper, we reveal that metric learning suffers from a serious inseparability problem without informative sample mining. Since the inseparable samples are often mixed with hard samples, current informative sample mining strategies used to deal with the inseparability problem may bring side effects, such as instability of the objective function. To alleviate this problem, we propose a novel distance metric learning algorithm, named adaptive neighborhood metric learning (ANML). In ANML, we design two thresholds to adaptively identify the inseparable similar and dissimilar samples in the training procedure, so inseparable-sample removal and metric parameter learning are implemented in the same procedure. Due to the non-continuity of the proposed ANML, we develop an ingenious function, named the log-exp mean function, to construct a continuous surrogate formulation, which can be efficiently solved by the gradient descent method. Similar to Triplet loss, ANML can be used to learn both the linear and deep embeddings. By analyzing the proposed method, we find it has some interesting properties. For example, when ANML is used to learn the linear embedding, current famous metric learning algorithms such as the large margin nearest neighbor (LMNN) and neighbourhood components analysis (NCA) are special cases of the proposed ANML when the parameters are set to different values. When it is used to learn deep features, the state-of-the-art deep metric learning algorithms such as Triplet loss, Lifted structure loss, and Multi-similarity loss become special cases of ANML. Furthermore, the log-exp mean function proposed in our method gives a new perspective to review the deep metric learning methods such as Prox-NCA and N-pairs loss. Finally, promising experimental results demonstrate the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
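The "log-exp mean function" mentioned in this abstract is presumably a smooth surrogate of the following general shape (my rendering; the paper's exact parameterization may differ). It interpolates between the arithmetic mean and the maximum, which is what makes it useful for smoothing non-continuous sample-selection rules.

```latex
% A log-exp (smoothed) mean of a_1,\dots,a_n with temperature \beta > 0:
\mathrm{LEM}_{\beta}(a_1,\dots,a_n) \;=\; \frac{1}{\beta}\,\log\!\Big(\frac{1}{n}\sum_{i=1}^{n} e^{\beta a_i}\Big),
\qquad
\lim_{\beta \to 0^{+}} \mathrm{LEM}_{\beta} \;=\; \frac{1}{n}\sum_{i=1}^{n} a_i,
\qquad
\lim_{\beta \to \infty} \mathrm{LEM}_{\beta} \;=\; \max_i a_i .
```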
21. Progressive Ensemble Kernel-Based Broad Learning System for Noisy Data Classification.
- Author
- Yu, Zhiwen, Lan, Kankan, Liu, Zhulin, and Han, Guoqiang
- Abstract
The broad learning system (BLS) is an algorithm that facilitates feature representation learning and data classification. Although the weights of BLS are obtained by analytical computation, which brings better generalization and higher efficiency, BLS suffers from two drawbacks: 1) the performance depends on the number of hidden nodes, which requires manual tuning, and 2) double random mappings bring about uncertainty, which leads to poor resistance to noisy data, as well as unpredictable effects on performance. To address these issues, a kernel-based BLS (KBLS) method is proposed by projecting feature nodes obtained from the first random mapping into kernel space. This manipulation reduces the uncertainty, which contributes to performance improvements with a fixed number of hidden nodes, and means that manual tuning is no longer needed. Moreover, to further improve the stability and noise resistance of KBLS, a progressive ensemble framework is proposed, in which the residual of the previous base classifiers is used to train the following base classifier. We conduct comparative experiments against the existing state-of-the-art hierarchical learning methods on multiple noisy real-world datasets. The experimental results indicate our approaches achieve the best or at least comparable performance in terms of accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
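A minimal sketch of the progressive (residual-fitting) ensemble idea from this entry. `train_fn` stands in for a KBLS base classifier and is an assumption, not the paper's API; the kernel mapping itself is not reproduced.

```python
import numpy as np

def progressive_residual_ensemble(train_fn, X, Y, n_learners=5):
    """train_fn(X, target) -> model with .predict(X) returning (n, c) scores; Y: (n, c) one-hot."""
    models, residual = [], Y.astype(float).copy()
    for _ in range(n_learners):
        model = train_fn(X, residual)      # each base learner fits what the previous ones left over
        residual = residual - model.predict(X)
        models.append(model)
    return models

def ensemble_predict(models, X):
    """Sum the base learners' scores and take the arg-max class."""
    return np.argmax(sum(m.predict(X) for m in models), axis=1)
```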
22. Research Review for Broad Learning System: Algorithms, Theory, and Applications.
- Author
- Gong, Xinrong, Zhang, Tong, Chen, C. L. Philip, and Liu, Zhulin
- Abstract
In recent years, the broad learning system (BLS) has emerged, poised to revolutionize conventional artificial intelligence methods. It represents a step toward building more efficient and effective machine-learning methods that can be extended to a broader range of research fields. In this survey, we provide a comprehensive overview of the BLS in data mining and neural networks for the first time, focusing on summarizing various BLS methods from the aspects of its algorithms, theories, applications, and future open research questions. First, we introduce the basic pattern of BLS manifestation, the universal approximation capability, and its essence from a theoretical perspective. Furthermore, we focus on BLS’s various improvements based on the current state of the theoretical research, which further improves its flexibility, stability, and accuracy under general or specific conditions, including classification, regression, semisupervised, and unsupervised tasks. Due to its remarkable efficiency, impressive generalization performance, and easy extendibility, BLS has been applied in different domains. Next, we illustrate BLS’s practical advances in areas such as computer vision, biomedical engineering, control, and natural language processing. Finally, the future open research problems and promising directions for BLSs are pointed out. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
23. Fuzzy Broad Learning System Based on Accelerating Amount.
- Author
- Zou, Weidong, Xia, Yuanqing, and Dai, Li
- Subjects
- INSTRUCTIONAL systems, FUZZY systems
- Abstract
To remove the adjustment process of the sparse auto-encoder in the broad learning system, Feng et al. proposed the fuzzy broad learning system, which replaces the feature nodes of the broad learning system with Takagi–Sugeno fuzzy systems. In the fuzzy broad learning system, manual parameter selection for ridge regression may decrease testing accuracy. To overcome this shortcoming, this article builds a novel fuzzy broad learning system based on an accelerating amount. A theoretical result on the universal approximation property of the fuzzy broad learning system based on the accelerating amount is presented. Three experimental studies on regression problems from the UCI, Fashion-MNIST, and Medical MNIST datasets are performed to show the improvement in testing accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
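For context, the ridge-regression readout whose regularization parameter the paper seeks to avoid hand-picking looks like this in a standard (fuzzy) BLS setup; the accelerating-amount iteration that replaces the manual choice is not reproduced here.

```python
import numpy as np

def ridge_output_weights(H, Y, lam):
    """H: (n, m) stacked fuzzy-subsystem / enhancement-node outputs; Y: (n, c) targets.
    `lam` is the ridge parameter that would otherwise be tuned by hand."""
    return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ Y)
```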
24. End-to-End Hierarchical Reinforcement Learning With Integrated Subgoal Discovery.
- Author
- Pateria, Shubham, Subagdja, Budhitama, Tan, Ah-Hwee, and Quek, Chai
- Subjects
- EDUCATIONAL attainment, MARKOV processes
- Abstract
Hierarchical reinforcement learning (HRL) is a promising approach to perform long-horizon goal-reaching tasks by decomposing the goals into subgoals. In a holistic HRL paradigm, an agent must autonomously discover such subgoals and also learn a hierarchy of policies that uses them to reach the goals. Recently introduced end-to-end HRL methods accomplish this by using the higher-level policy in the hierarchy to directly search the useful subgoals in a continuous subgoal space. However, learning such a policy may be challenging when the subgoal space is large. We propose integrated discovery of salient subgoals (LIDOSS), an end-to-end HRL method with an integrated subgoal discovery heuristic that reduces the search space of the higher-level policy, by explicitly focusing on the subgoals that have a greater probability of occurrence on various state-transition trajectories leading to the goal. We evaluate LIDOSS on a set of continuous control tasks in the MuJoCo domain against hierarchical actor critic (HAC), a state-of-the-art end-to-end HRL method. The results show that LIDOSS attains better goal achievement rates than HAC in most of the tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
25. Subarchitecture Ensemble Pruning in Neural Architecture Search.
- Author
- Bian, Yijun, Song, Qingquan, Du, Mengnan, Yao, Jun, Chen, Huanhuan, and Hu, Xia
- Subjects
- COMPUTER architecture, NETWORK-attached storage
- Abstract
Neural architecture search (NAS) has been gaining increasing attention in recent years because of its flexibility and remarkable capability to reduce the burden of neural network design. However, to achieve better performance, the search process usually requires massive computation that might not be affordable for researchers and practitioners. Although recent attempts have employed ensemble learning methods to mitigate the enormous computational cost, they neglect a key property of ensemble methods, namely diversity, which leads to collecting similar subarchitectures with potential redundancy in the final design. To tackle this problem, we propose a pruning method for NAS ensembles called “subarchitecture ensemble pruning in neural architecture search” (SAEP). It aims to leverage diversity and to obtain smaller subensemble architectures with performance comparable to ensemble architectures that are not pruned. Three possible solutions are proposed to decide which subarchitectures to prune during the search process. Experimental results exhibit the effectiveness of the proposed method by greatly reducing the number of subarchitectures without degrading performance. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
26. Topology and Content Co-Alignment Graph Convolutional Learning.
- Author
- Shi, Min, Tang, Yufei, and Zhu, Xingquan
- Subjects
- TOPOLOGY, ARTIFICIAL neural networks
- Abstract
In traditional graph neural networks (GNNs), graph convolutional learning is carried out through topology-driven recursive node content aggregation for network representation learning. In reality, network topology and node content each provide unique and important information, and they are not always consistent because of noise, irrelevance, or missing links between nodes. A pure topology-driven feature aggregation approach between unaligned neighborhoods may deteriorate learning from nodes with poor structure-content consistency, due to the propagation of incorrect messages over the whole network. Alternatively, in this brief, we advocate a co-alignment graph convolutional learning (CoGL) paradigm, by aligning topology and content networks to maximize consistency. Our theme is to enforce the learning from the topology network to be consistent with the content network while simultaneously optimizing the content network to comply with the topology for optimized representation learning. Given a network, CoGL first reconstructs a content network from node features then co-aligns the content network and the original network through a unified optimization goal with: 1) minimized content loss; 2) minimized classification loss; and 3) minimized adversarial loss. Experiments on six benchmarks demonstrate that CoGL achieves comparable and even better performance compared with existing state-of-the-art GNN models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
27. Hyperspectral Image Super-Resolution via Deep Spatiospectral Attention Convolutional Neural Networks.
- Author
- Hu, Jin-Fan, Huang, Ting-Zhu, Deng, Liang-Jian, Jiang, Tai-Xiang, Vivone, Gemine, and Chanussot, Jocelyn
- Subjects
- CONVOLUTIONAL neural networks, HIGH resolution imaging, DEEP learning, MULTISPECTRAL imaging, SPATIAL resolution, ERROR functions
- Abstract
Hyperspectral images (HSIs) are of crucial importance in order to better understand features from a large number of spectral channels. Restricted by its inner imaging mechanism, the spatial resolution is often limited for HSIs. To alleviate this issue, in this work, we propose a simple and efficient architecture of deep convolutional neural networks to fuse a low-resolution HSI (LR-HSI) and a high-resolution multispectral image (HR-MSI), yielding a high-resolution HSI (HR-HSI). The network is designed to preserve both spatial and spectral information thanks to a new architecture based on: 1) the use of the LR-HSI at the HR-MSI’s scale to get an output with satisfied spectral preservation and 2) the application of the attention and pixelShuffle modules to extract information, aiming to output high-quality spatial details. Finally, a plain mean squared error loss function is used to measure the performance during the training. Extensive experiments demonstrate that the proposed network architecture achieves the best performance (both qualitatively and quantitatively) compared with recent state-of-the-art HSI super-resolution approaches. Moreover, other significant advantages can be pointed out by the use of the proposed approach, such as a better network generalization ability, a limited computational burden, and the robustness with respect to the number of training samples. Please find the source code and pretrained models from https://liangjiandeng.github.io/Projects_Res/HSRnet_2021tnnls.html. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
28. Reinforcement Online Active Learning Ensemble for Drifting Imbalanced Data Streams.
- Author
- Zhang, Hang, Liu, Weike, and Liu, Qingbao
- Subjects
- ACTIVE learning, ONLINE education, CLASSIFICATION algorithms, RECEIVER operating characteristic curves, HEURISTIC algorithms
- Abstract
Applications challenged by the joint problem of concept drift and class imbalance are attracting increasing research interest. This paper proposes a novel Reinforcement Online Active Learning Ensemble for Drifting Imbalanced data stream (ROALE-DI). The ensemble classifier has a long-term stable classifier and a dynamic classifier group; a reinforcement mechanism increases the weight of dynamic classifiers that perform better on the minority class and decreases the weight of those that do not. When the data stream is class-imbalanced, the classifiers lack training samples of the minority class. To supply training samples when creating a new classifier, a labeled-instance buffer is used to provide instances of the minority class. Then, a hybrid labeling strategy that combines an uncertainty strategy and an imbalance strategy is proposed to decide whether to obtain the true label of an instance. An experimental evaluation compares the classification performance of the proposed method with semi-supervised and supervised algorithms on both real-world and synthetic data streams. The results show that the ROALE-DI achieves higher Area Under the ROC Curve (AUC) and accuracy values with even fewer real labels, and the labeling cost dynamically adjusts according to the concept drift and class imbalance ratio. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
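A toy version of a hybrid query rule in the spirit of the labeling strategy described above, combining an uncertainty criterion with an imbalance criterion; the thresholds and exact form are assumptions, not the paper's.

```python
import numpy as np

def should_query_label(proba, minority_class, margin_threshold=0.15, minority_threshold=0.4):
    """proba: predicted class probabilities for one incoming instance.
    Ask for the true label when the prediction is uncertain (small margin between the top two
    classes) or when the instance plausibly belongs to the minority class."""
    top2 = np.sort(proba)[::-1][:2]
    uncertain = (top2[0] - top2[1]) < margin_threshold             # uncertainty strategy
    likely_minority = proba[minority_class] > minority_threshold   # imbalance strategy
    return uncertain or likely_minority
```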
29. Customer Order Behavior Classification Via Convolutional Neural Networks in the Semiconductor Industry.
- Author
- Ratusny, Marco, Schiffer, Maximilian, and Ehm, Hans
- Subjects
- CONVOLUTIONAL neural networks, SEMICONDUCTOR industry, SUPPLY chain management, CLASSIFICATION, DATA mining
- Abstract
In the operational processes of demand planning and order management, it is crucial to understand customer order behavior to provide insights for supply chain management processes. Here, advances in the semiconductor industry have emerged through the extraction of important information from vast amounts of data. This new data and information availability paves the way for the development of improved methods to analyze and classify customer order behavior (COB). To this end, we develop a novel, sophisticated yet intuitive image-based representation for COBs using two-dimensional heat maps. This heat map representation contributes significantly to the development of a novel COB classification framework. In this framework, we utilize data enrichment via synthetic training samples to train a CNN model that performs the classification task. Integrating synthetically generated data into the training phase allows us to strengthen the inclusion of rare pattern variants that we identified during initial analysis. Moreover, we show how this framework is used in practice at Infineon. We finally use actual customer data to benchmark the performance of our framework and show that the baseline CNN approach outperforms all available state-of-the-art benchmark models. Additionally, our results highlight the benefit of synthetic data enrichment. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
30. Spectral–Spatial Classification of Few Shot Hyperspectral Image With Deep 3-D Convolutional Random Fourier Features Network.
- Author
- Wang, Tingting, Liu, Huanyu, and Li, Junbao
- Subjects
- DEEP learning, THREE-dimensional imaging, REMOTE sensing, FEATURE extraction, LAND cover, CLASSIFICATION algorithms, SUPPORT vector machines
- Abstract
Remote sensing hyperspectral images are very useful for land cover classification because of their rich spatial and spectral information. However, hyperspectral image acquisition and pixel labeling are laborious and time-consuming, so few-shot learning methods are considered to solve this problem. Deep learning has gradually been used for few-shot hyperspectral classification, but there are some problems. The feature extraction network based on deep learning requires too many parameters to be trained, resulting in a huge network model, which is not conducive to deployment on remote sensing data acquisition equipment. Moreover, due to the lack of labeled samples, the algorithm based on deep learning is more prone to overfitting. To solve the above problems, considering the advanced characteristics of the kernel method in dealing with nonlinear, small-sample, and high-dimensional data, we propose a small-scale, high-precision network, the 3-D convolution random Fourier features (3-DCRFF) network, based on random Fourier feature (RFF) kernel approximation. First, we combine 3-D convolution with RFF as the basic structure of the network to extract the spatial and spectral features of HSI cubes. Second, we use a classifier based on an attention mechanism to classify feature vectors to obtain recognition probabilities. Finally, the network parameters are solved from the perspective of Bayesian optimization, and a synthetic gradient optimization method is designed and implemented to realize fast learning of the network. A large number of HSI classification experiments were performed on the University of Pavia (UP), Pavia Center (PC), Indian Pines (IP), and Salinas standard remote sensing datasets; the results show that our algorithm outperforms most state-of-the-art algorithms on few-shot classification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
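The random Fourier feature map underpinning the 3-DCRFF design is the standard RBF-kernel approximation; a plain (non-convolutional) version is sketched below, leaving out the paper's 3-D convolutional and Bayesian components.

```python
import numpy as np

def random_fourier_features(X, n_features=512, gamma=1.0, seed=0):
    """Approximate an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2) with explicit features.

    X: (n, d) data matrix. Returns Z of shape (n, n_features) with Z @ Z.T ~= the kernel matrix.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))  # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)                # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```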
31. Canonical Correlation Analysis With Low-Rank Learning for Image Representation.
- Author
- Lu, Yuwu, Wang, Wenjing, Zeng, Biqing, Lai, Zhihui, Shen, Linlin, and Li, Xuelong
- Subjects
- IMAGE representation, STATISTICAL correlation, PATTERN recognition systems, MATRICES (Mathematics), EUCLIDEAN metric, EUCLIDEAN distance
- Abstract
As a multivariate data analysis tool, canonical correlation analysis (CCA) has been widely used in computer vision and pattern recognition. However, CCA uses Euclidean distance as a metric, which is sensitive to noise or outliers in the data. Furthermore, CCA demands that the two training sets have the same number of training samples, which limits the performance of CCA-based methods. To overcome these limitations of CCA, two novel canonical correlation learning methods based on low-rank learning are proposed in this paper for image representation, named robust canonical correlation analysis (robust-CCA) and low-rank representation canonical correlation analysis (LRR-CCA). By introducing two regular matrices, the training sample numbers of the two training datasets can be set to any values without limitation in the two proposed methods. Specifically, robust-CCA uses low-rank learning to remove the noise in the data and extracts maximally correlated features from the two learned clean data matrices. The nuclear norm and $L_1$-norm are used as constraints for the learned clean matrices and noise matrices, respectively. LRR-CCA introduces low-rank representation into CCA to ensure that the correlative features can be obtained in low-rank representation. To verify the performance of the proposed methods, five public image databases are used to conduct extensive experiments. The experimental results demonstrate that the proposed methods outperform state-of-the-art CCA-based and low-rank learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
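One plausible way to write down the decomposition this abstract describes, with nuclear-norm terms on the clean parts and $L_1$ terms on the noise; this is my reading of the idea, not the paper's exact objective.

```latex
% Plausible robust-CCA form (my notation): each view X_v is split into a low-rank clean part Z_v
% and a sparse noise part E_v, and correlation is maximised between projections of the clean parts:
\max_{W_1, W_2}\ \mathrm{corr}\big(W_1^{\top} Z_1,\ W_2^{\top} Z_2\big)
\quad\text{where}\quad
(Z_v, E_v) \in \arg\min_{Z, E}\ \|Z\|_{*} + \alpha\|E\|_{1}
\ \ \text{s.t.}\ \ X_v = Z + E,\quad v \in \{1, 2\}.
```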
32. HQ2CL: A High-Quality Class Center Learning System for Deep Face Recognition.
- Author
- Lv, Xianwei, Yu, Chen, Jin, Hai, and Liu, Kai
- Subjects
- DEEP learning, FACE perception, INSTRUCTIONAL systems
- Abstract
Benefiting from the proposal of margin-based loss functions, face recognition has achieved significant improvements in recent years. Those losses aim to increase the margin between different identities to enhance discriminability. Ideally, the class centers of different identities are far from each other, and face samples are compact around their corresponding class center. Hence, it is vital to produce high-quality class centers. However, the distribution of the training set determines the class center. With low-quality samples in the majority, the class center would be close to samples with little identity information. As a result, it would impair the discriminability of the learned model on unseen samples. In this work, we propose a High-Quality Class Center Learning system (HQ2CL), an effective system that guides the class center to approach high-quality samples to preserve discriminability. Specifically, HQ2CL introduces a quality-aware scale and margin layer for the identification loss and constructs a new high-quality center loss. We implement the proposed system without additional overhead and present an experimental evaluation on different face benchmarks. The experimental results show the superiority of our proposed HQ2CL over state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
33. Learning Calibrated Class Centers for Few-Shot Classification by Pair-Wise Similarity.
- Author
- Guo, Yurong, Du, Ruoyi, Li, Xiaoxu, Xie, Jiyang, Ma, Zhanyu, and Dong, Yuan
- Subjects
- IMAGE recognition (Computer vision), APPROXIMATION error, CLUSTER sampling, FEATURE extraction, CLASSIFICATION, NAIVE Bayes classification
- Abstract
Metric-based methods achieve promising performance on few-shot classification by learning clusters on support samples and generating shared decision boundaries for query samples. However, existing methods ignore the inaccurate class center approximation introduced by the limited number of support samples, which consequently leads to biased inference. Therefore, in this paper, we propose to reduce the approximation error by class center calibration. Specifically, we introduce the so-called Pair-wise Similarity Module (PSM) to generate calibrated class centers adapted to the query sample by capturing the semantic correlations between the support and the query samples, as well as enhancing the discriminative regions on support representation. It is worth noting that the proposed PSM is a simple plug-and-play module and can be inserted into most metric-based few-shot learning models. Through extensive experiments in metric-based models, we demonstrate that the module significantly improves the performance of conventional few-shot classification methods on four few-shot image classification benchmark datasets. Codes are available at: https://github.com/PRIS-CV/Pair-wise-Similarity-module. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
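A toy rendering of query-adapted class-center calibration via pair-wise similarity, in the spirit of the PSM described above; the real module operates on richer feature maps, so the softmax weighting and all names below are assumptions.

```python
import torch

def calibrated_class_centers(support, support_labels, query, n_classes, tau=0.1):
    """support: (S, d) support features; support_labels: (S,) ints; query: (d,) one query feature.
    Each class center becomes a query-similarity-weighted average of that class's support features
    instead of a plain mean, reducing the bias of centers estimated from few samples."""
    sims = torch.softmax((support @ query) / tau, dim=0)               # (S,) pair-wise similarities
    centers = []
    for c in range(n_classes):
        mask = (support_labels == c).float()
        w = sims * mask
        centers.append((w.unsqueeze(1) * support).sum(dim=0) / (w.sum() + 1e-8))
    return torch.stack(centers)                                        # (n_classes, d)
```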
34. Unsupervised Meta Learning With Multiview Constraints for Hyperspectral Image Small Sample set Classification.
- Author
- Gao, Kuiliang, Liu, Bing, Yu, Xuchu, and Yu, Anzhu
- Subjects
- DEEP learning, SUPERVISED learning, MACHINE learning, CLASSIFICATION
- Abstract
The difficulty of obtaining sufficient labeled samples has always been one of the factors hindering deep learning models from achieving high accuracy in hyperspectral image (HSI) classification. To reduce the dependence of deep learning models on training samples, meta learning methods have been introduced, effectively improving the classification accuracy in small sample set scenarios. However, the existing methods based on meta learning still need to construct a labeled source data set with several pre-collected HSIs, and must utilize a large number of labeled samples for meta-training, which is actually time-consuming and labor-intensive. To solve this problem, this paper proposes a novel unsupervised meta learning method with multiview constraints for HSI small sample set classification. Specifically, the proposed method first builds an unlabeled source data set using unlabeled HSIs. Then, multiple spatial-spectral multiview features of each unlabeled sample are generated to construct tasks for unsupervised meta learning. Finally, the designed residual relation network is used for meta-training and small sample set classification based on a voting strategy. Compared with existing supervised meta learning methods for HSI classification, our method can use only unlabeled HSIs for unsupervised meta learning, which significantly reduces the number of labeled samples required in the whole classification process. To verify the effectiveness of the proposed method, extensive experiments are carried out on 8 public HSIs in the cross-domain and in-domain classification scenarios. The statistical results demonstrate that, compared with existing supervised meta learning methods and other advanced classification models, the proposed method can achieve competitive or better classification performance in small sample set scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
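The following sketch illustrates, under stated assumptions, how an unsupervised meta-learning episode can be built from unlabeled hyperspectral patches in the spirit of the entry above: each sampled patch becomes a pseudo-class and its spatial-spectral views form the task. The augmentations `spectral_view` and `spatial_view` are hypothetical stand-ins for the paper's multiview feature generation.

```python
import torch

def spectral_view(patch, keep=0.8):
    """Randomly drop spectral bands of a (bands, h, w) patch to create one view."""
    mask = (torch.rand(patch.shape[0]) < keep).float().view(-1, 1, 1)
    return patch * mask

def spatial_view(patch, size=5):
    """Keep a random spatial window and zero out the rest of the patch."""
    _, h, w = patch.shape
    y, x = torch.randint(0, h - size + 1, (2,))
    out = torch.zeros_like(patch)
    out[:, y:y + size, x:x + size] = patch[:, y:y + size, x:x + size]
    return out

def make_episode(unlabeled_patches, n_way=5, n_views=4):
    """Each sampled unlabeled patch becomes one pseudo-class; its views form the task."""
    idx = torch.randperm(len(unlabeled_patches))[:n_way]
    views, labels = [], []
    for pseudo_label, i in enumerate(idx):
        for _ in range(n_views):
            views.append(spatial_view(spectral_view(unlabeled_patches[int(i)])))
            labels.append(pseudo_label)
    return torch.stack(views), torch.tensor(labels)

patches = [torch.randn(100, 9, 9) for _ in range(50)]  # 50 unlabeled 100-band 9x9 patches
x, y = make_episode(patches)
print(x.shape, y.shape)  # torch.Size([20, 100, 9, 9]) torch.Size([20])
```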
35. Occlusion-Aware Unsupervised Learning of Depth From 4-D Light Fields.
- Author
-
Jin, Jing and Hou, Junhui
- Subjects
- *
KERNEL (Mathematics) , *GRAPHICS processing units , *COHERENCE (Physics) , *OCCLUSION (Chemistry) - Abstract
Depth estimation is a fundamental issue in 4-D light field processing and analysis. Although recent supervised learning-based light field depth estimation methods have significantly improved the accuracy and efficiency of traditional optimization-based ones, these methods rely on training over light field data with ground-truth depth maps, which are challenging to obtain or even unavailable for real-world light field data. Besides, due to the inevitable gap (or domain difference) between real-world and synthetic data, they may suffer from serious performance degradation when generalizing models trained with synthetic data to real-world data. By contrast, we propose an unsupervised learning-based method, which does not require ground-truth depth as supervision during training. Specifically, based on the unique geometric structure of light field data, we present an occlusion-aware strategy to improve accuracy in occlusion areas, in which we explore the angular coherence among subsets of the light field views to estimate initial depth maps and utilize a constrained unsupervised loss to learn their corresponding reliability for final depth prediction. Additionally, we adopt a multi-scale network with a weighted smoothness loss to handle textureless areas. Experimental results on synthetic data show that our method can significantly shrink the performance gap between the previous unsupervised method and supervised ones, and produce depth maps with accuracy comparable to traditional methods at a clearly reduced computational cost. Moreover, experiments on real-world datasets show that our method can avoid the domain shift problem present in supervised methods, demonstrating its great potential. The code will be publicly available at https://github.com/jingjin25/LFDE-OccUnNet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
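As a rough illustration of the fusion step described in the entry above (initial depth maps from angular subsets combined by learned reliability, plus a weighted smoothness term), here is a minimal sketch; the `ReliabilityFusion` module and the edge-aware smoothness loss are assumptions for illustration, not the paper's network.

```python
import torch
import torch.nn as nn

class ReliabilityFusion(nn.Module):
    """Fuses per-subset depth maps with per-pixel reliability weights."""
    def __init__(self, n_subsets):
        super().__init__()
        self.net = nn.Conv2d(n_subsets, n_subsets, kernel_size=3, padding=1)

    def forward(self, subset_depths):                    # (B, S, H, W) initial depth maps
        weights = torch.softmax(self.net(subset_depths), dim=1)
        return (weights * subset_depths).sum(dim=1, keepdim=True), weights

def smoothness_loss(depth, image):
    """Edge-aware smoothness: depth gradients are penalized less at image edges."""
    dx_d = (depth[..., :, 1:] - depth[..., :, :-1]).abs()
    dy_d = (depth[..., 1:, :] - depth[..., :-1, :]).abs()
    dx_i = (image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True)
    dy_i = (image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True)
    return (dx_d * torch.exp(-dx_i)).mean() + (dy_d * torch.exp(-dy_i)).mean()

B, S, H, W = 2, 4, 64, 64
fused, w = ReliabilityFusion(S)(torch.rand(B, S, H, W))
loss = smoothness_loss(fused, torch.rand(B, 3, H, W))   # center-view image as guidance
print(fused.shape, float(loss))
```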
36. Diverse Complementary Part Mining for Weakly Supervised Object Localization.
- Author
-
Meng, Meng, Zhang, Tianzhu, Yang, Wenfei, Zhao, Jian, Zhang, Yongdong, and Wu, Feng
- Subjects
- *
PRODUCT management software , *SCALABILITY - Abstract
Weakly Supervised Object Localization (WSOL) aims to localize objects with only image-level labels, which has better scalability and practicability than fully supervised methods in actual deployment. However, a common limitation of available techniques based on classification networks is that they only highlight the most discriminative part of the object, not the entire object. To alleviate this problem, we propose a novel end-to-end part discovery model (PDM) to learn multiple discriminative object parts in a unified network for accurate object localization and classification. The proposed PDM enjoys several merits. First, to the best of our knowledge, it is the first work to directly model diverse and robust object parts by jointly exploiting part diversity, compactness, and importance for WSOL. Second, three effective mechanisms, namely diversity, compactness, and importance learning mechanisms, are designed to learn robust object parts. Therefore, our model can exploit complementary spatial information and local details from the learned object parts, which helps to produce precise bounding boxes and discriminate different object categories. Extensive experiments on two standard benchmarks demonstrate that our PDM performs favorably against state-of-the-art WSOL approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
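The diversity and compactness mechanisms mentioned in the entry above can be illustrated with simple losses over part attention maps; the formulations below (pairwise cosine overlap for diversity, soft-argmax spread for compactness) are my own assumptions rather than the paper's exact objectives.

```python
import torch
import torch.nn.functional as F

def diversity_loss(parts):                               # parts: (B, P, H, W) attention maps
    b, p, _, _ = parts.shape
    flat = F.normalize(parts.reshape(b, p, -1), dim=-1)
    sim = flat @ flat.transpose(1, 2)                     # (B, P, P) pairwise cosine overlap
    off_diag = sim - torch.eye(p, device=parts.device)
    return off_diag.clamp(min=0).sum() / (b * p * (p - 1))

def compactness_loss(parts):
    b, p, h, w = parts.shape
    probs = parts.reshape(b, p, -1).softmax(dim=-1).reshape(b, p, h, w)
    ys = torch.arange(h, device=parts.device).view(1, 1, h, 1).float()
    xs = torch.arange(w, device=parts.device).view(1, 1, 1, w).float()
    cy = (probs * ys).sum(dim=(2, 3), keepdim=True)       # soft-argmax part centers
    cx = (probs * xs).sum(dim=(2, 3), keepdim=True)
    spread = probs * ((ys - cy) ** 2 + (xs - cx) ** 2)    # mass far from the center is penalized
    return spread.sum(dim=(2, 3)).mean()

parts = torch.rand(2, 4, 14, 14, requires_grad=True)
loss = diversity_loss(parts) + 0.1 * compactness_loss(parts)
loss.backward()
print(float(loss))
```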
37. Information Symmetry Matters: A Modal-Alternating Propagation Network for Few-Shot Learning.
- Author
-
Ji, Zhong, Hou, Zhishen, Liu, Xiyao, Pang, Yanwei, and Han, Jungong
- Subjects
- *
SYMMETRY , *PETRI nets , *INFORMATION asymmetry , *DATABASES , *SEMANTICS , *KNOWLEDGE transfer - Abstract
Semantic information provides intra-class consistency and inter-class discriminability beyond visual concepts, and has been employed in Few-Shot Learning (FSL) to achieve further gains. However, semantic information is only available for labeled samples and absent for unlabeled samples, so the embeddings are rectified unilaterally: only the few labeled samples are guided by semantics. This inevitably introduces a cross-modal bias between semantic-guided and non-semantic-guided samples, which results in an information asymmetry problem. To address this problem, we propose a Modal-Alternating Propagation Network (MAP-Net) to supplement the absent semantic information of unlabeled samples, which builds information symmetry among all samples in both the visual and semantic modalities. Specifically, MAP-Net transfers neighbor information by graph propagation to generate pseudo-semantics for unlabeled samples, guided by the completed visual relationships, and rectifies the feature embeddings. In addition, due to the large discrepancy between the visual and semantic modalities, we design a Relation Guidance (RG) strategy to guide the visual relation vectors via semantics so that the propagated information is more beneficial. Extensive experimental results on three semantic-labeled datasets, i.e., Caltech-UCSD Birds 200-2011, the SUN Attribute Database and Oxford 102 Flower, demonstrate that our proposed method achieves promising performance and outperforms the state-of-the-art approaches, which indicates the necessity of information symmetry. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
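A minimal sketch of the pseudo-semantics idea in the entry above: semantic vectors of labeled samples are propagated to unlabeled samples over a visual-similarity graph. The cosine affinity, temperature, and re-anchoring scheme are assumptions, not the MAP-Net definition.

```python
import torch
import torch.nn.functional as F

def propagate_semantics(visual, semantics, labeled_mask, alpha=0.5, steps=10):
    """visual: (N, Dv); semantics: (N, Ds) with zero rows for unlabeled samples;
    labeled_mask: (N,) bool. Returns a completed (N, Ds) semantic matrix."""
    v = F.normalize(visual, dim=-1)
    w = torch.softmax((v @ v.T) / 0.1, dim=-1)            # row-normalized similarity graph
    s = semantics.clone()
    for _ in range(steps):
        s = alpha * (w @ s) + (1 - alpha) * semantics      # propagate and re-anchor
        s[labeled_mask] = semantics[labeled_mask]          # keep true semantics fixed
    return s

N, Dv, Ds, n_labeled = 30, 128, 50, 10
visual = torch.randn(N, Dv)
semantics = torch.zeros(N, Ds)
semantics[:n_labeled] = torch.randn(n_labeled, Ds)         # only labeled rows are known
mask = torch.zeros(N, dtype=torch.bool)
mask[:n_labeled] = True
print(propagate_semantics(visual, semantics, mask).shape)  # torch.Size([30, 50])
```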
38. A Continual Learning Survey: Defying Forgetting in Classification Tasks.
- Author
-
De Lange, Matthias, Aljundi, Rahaf, Masana, Marc, Parisot, Sarah, Jia, Xu, Leonardis, Ales, Slabaugh, Gregory, and Tuytelaars, Tinne
- Subjects
- *
TASKS , *ARTIFICIAL neural networks - Abstract
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through a generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, and endeavours to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task-incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern: (1) a taxonomy and extensive overview of the state-of-the-art; (2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner; (3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods; and (4) baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny ImageNet, the large-scale unbalanced iNaturalist dataset, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
39. Global-Local Interplay in Semantic Alignment for Few-Shot Learning.
- Author
-
Hao, Fusheng, He, Fengxiang, Cheng, Jun, and Tao, Dacheng
- Subjects
- *
INFORMATION design , *SEMANTICS - Abstract
Few-shot learning aims to recognize novel classes from only a few labeled training examples. Aligning semantically relevant local regions has shown promise in effectively comparing a query image with support images. However, global information is usually overlooked in the existing approaches, resulting in a higher possibility of learning semantics unrelated to the global information. To address this issue, we propose a Global-Local Interplay Metric Learning (GLIML) framework to employ the interplay between global features and local features to guide semantic alignment. We first design a Global-Local Information Concurrent Learning (GLICL) module to extract both global features and local features and perform global-local interplay. We then design a Global-Local Information Cross-Covariance Estimator (GLICCE) to learn the similarity on the global-local interplay, in contrast to the current practice where only local features are considered. Visualizations show that the global-local interplay decreases (1) the weights placed on the semantics that are irrelevant to the global information and (2) the variability of the learned features within every class in the feature space. Quantitative experiments on three benchmark datasets demonstrate that GLIML achieves state-of-the-art performance while maintaining high efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
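To give the global-local interplay in the entry above a concrete flavor, the sketch below gates local region-to-region matching between a query and a support feature map by their global agreement; this is an assumed simplification, not the paper's cross-covariance estimator.

```python
import torch
import torch.nn.functional as F

def global_local_score(q_feat, s_feat):
    """q_feat, s_feat: (C, H, W) feature maps of one query / one support image."""
    c, _, _ = q_feat.shape
    q_global = F.normalize(q_feat.mean(dim=(1, 2)), dim=0)
    s_global = F.normalize(s_feat.mean(dim=(1, 2)), dim=0)
    gate = (q_global @ s_global).clamp(min=0)              # global agreement in [0, 1]
    q_local = F.normalize(q_feat.reshape(c, -1), dim=0)
    s_local = F.normalize(s_feat.reshape(c, -1), dim=0)
    local_sim = q_local.T @ s_local                        # (HW, HW) region-to-region similarity
    best = local_sim.max(dim=1).values.mean()              # best-match local evidence
    return gate * best                                     # local evidence gated by global context

q, s = torch.randn(64, 10, 10), torch.randn(64, 10, 10)
print(float(global_local_score(q, s)))
```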
40. Improving Video Temporal Consistency via Broad Learning System.
- Author
-
Sheng, Bin, Li, Ping, Ali, Riaz, and Chen, C. L. Philip
- Abstract
Applying image-based processing methods to original videos on a framewise level breaks the temporal consistency between consecutive frames. Traditional video temporal consistency methods reconstruct an original frame containing flickers from corresponding nonflickering frames, but the inaccurate correspondence provided by optical flow restricts their practical use. In this article, we propose a temporally broad learning system (TBLS), an approach that enforces temporal consistency between frames. We establish the TBLS as a flat network whose input data (an original frame of the input video, the corresponding frame of the temporally inconsistent video produced by the image-based technique, and the output for the preceding original frame) are mapped into feature nodes as mapped features. Then, we refine the extracted features by expanding the mapped features into enhancement nodes with randomly generated weights. We then connect all extracted features to the output layer through a target weight vector, which is learned to minimize the temporal information loss between consecutive frames and the fidelity loss of the output video. Finally, we remove the temporal inconsistency in the processed video and output a temporally consistent video. In addition, we propose an incremental learning algorithm that improves learning accuracy through broad expansion, i.e., by incrementing the mapped feature nodes, enhancement nodes, or input data. We demonstrate the superiority of the proposed TBLS by conducting extensive experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
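The broad-learning-system building blocks referred to in the entry above (mapped feature nodes, enhancement nodes, and a closed-form solution for the output weights) can be sketched compactly; the node counts, activations, and ridge regularizer below are assumptions, and the video-specific temporal and fidelity losses are not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

def broad_learning_fit(X, Y, n_map=40, n_enh=100, lam=1e-3):
    """X: (N, D) flattened inputs; Y: (N, T) targets."""
    Wf = rng.standard_normal((X.shape[1], n_map))
    Z = np.tanh(X @ Wf)                                    # mapped feature nodes
    We = rng.standard_normal((n_map, n_enh))
    H = np.tanh(Z @ We)                                    # enhancement nodes
    A = np.hstack([Z, H])                                  # broad expansion
    # ridge-regularized least squares for the target weight vector
    Wo = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
    return Wf, We, Wo

def broad_learning_predict(X, Wf, We, Wo):
    Z = np.tanh(X @ Wf)
    H = np.tanh(Z @ We)
    return np.hstack([Z, H]) @ Wo

X = rng.standard_normal((200, 32))                         # e.g. flattened frame patches
Y = rng.standard_normal((200, 8))                          # e.g. temporally consistent targets
params = broad_learning_fit(X, Y)
print(broad_learning_predict(X, *params).shape)            # (200, 8)
```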
41. Neighborhood Preserving and Weighted Subspace Learning Method for Drift Compensation in Gas Sensor.
- Author
-
Yi, Zhengkun, Shang, Wanfeng, Xu, Tiantian, and Wu, Xinyu
- Subjects
- *
GAS detectors , *NEIGHBORHOODS , *GAUSSIAN distribution , *LEARNING ability - Abstract
This article presents a novel discriminative subspace-learning-based unsupervised domain adaptation (DA) method for the gas sensor drift problem. Many existing subspace learning approaches assume that the gas sensor data follow a certain distribution, such as a Gaussian, which often does not hold in real-world applications. In this article, we address this issue by proposing a novel discriminative subspace learning method for DA with neighborhood preserving (DANP). We introduce two novel terms, an intraclass graph term and an interclass graph term, to embed the graphs into DA. Moreover, most existing methods ignore the influence of subspace learning on classifier design. To tackle this issue, we present a novel classifier design method (DANP+) that incorporates the DA ability of the subspace into the learning of the classifier. A weighting function is introduced to assign different weights to different dimensions of the subspace. We have verified the effectiveness of the proposed methods by conducting experiments on two public gas sensor datasets in comparison with state-of-the-art DA methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
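The following is an assumed sketch of the kind of graph-embedding subspace learning the entry above builds on: an intraclass neighborhood graph and an interclass graph are constructed, and a projection is sought that keeps intraclass neighbors close while separating interclass neighbors. The full drift-compensation DA objective and the DANP+ classifier are omitted.

```python
import numpy as np
from scipy.linalg import eigh

def knn_graph(X, labels, k, same_class):
    """Binary k-NN affinity restricted to same-class or different-class neighbors."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        match = (labels == labels[i]) if same_class else (labels != labels[i])
        match[i] = False
        W[i, np.argsort(np.where(match, d2[i], np.inf))[:k]] = 1.0
    return np.maximum(W, W.T)                              # symmetrize

def subspace(X, labels, k=5, dim=2):
    Ww, Wb = knn_graph(X, labels, k, True), knn_graph(X, labels, k, False)
    Lw = np.diag(Ww.sum(1)) - Ww                           # intraclass graph Laplacian
    Lb = np.diag(Wb.sum(1)) - Wb                           # interclass graph Laplacian
    # keep intraclass neighbors close while pushing interclass neighbors apart
    _, vecs = eigh(X.T @ Lb @ X, X.T @ Lw @ X + 1e-6 * np.eye(X.shape[1]))
    return vecs[:, -dim:]                                  # top generalized eigenvectors

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 1.0, size=(20, 6)) for c in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 20)
print((X @ subspace(X, y)).shape)                          # (60, 2)
```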
42. DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition.
- Author
-
Fu, Chaoyou, Wu, Xiang, Hu, Yibo, Huang, Huaibo, and He, Ran
- Subjects
- *
DEPERSONALIZATION , *FACE , *IDENTITIES (Mathematics) , *GALLIUM nitride , *PUBLIC safety , *DATA distribution , *FACE perception - Abstract
Heterogeneous face recognition (HFR) refers to matching cross-domain faces and plays a crucial role in public security. Nevertheless, HFR is confronted with challenges from large domain discrepancy and insufficient heterogeneous data. In this paper, we formulate HFR as a dual generation problem and tackle it via a novel dual variational generation (DVG-Face) framework. Specifically, a dual variational generator is elaborately designed to learn the joint distribution of paired heterogeneous images. However, the small-scale paired heterogeneous training data may limit the identity diversity of sampling. To overcome this limitation, we propose to integrate the abundant identity information of large-scale visible data into the joint distribution. Furthermore, a pairwise identity preserving loss is imposed on the generated paired heterogeneous images to ensure their identity consistency. As a consequence, massive new diverse paired heterogeneous images with the same identity can be generated from noise. The identity consistency and identity diversity properties allow us to employ these generated images to train the HFR network via a contrastive learning mechanism, yielding both domain-invariant and discriminative embedding features. Concretely, the generated paired heterogeneous images are regarded as positive pairs, and the images obtained from different samplings are considered negative pairs. Our method achieves superior performance over state-of-the-art methods on seven challenging databases belonging to five HFR tasks, including NIR-VIS, Sketch-Photo, Profile-Frontal Photo, Thermal-VIS, and ID-Camera. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
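A hedged sketch of the downstream training idea in the entry above: embeddings of generated heterogeneous images from the same sampling act as positive pairs in a contrastive loss, while different samplings act as negatives. The margin formulation is an assumption, and the dual variational generator itself is mocked by random embeddings.

```python
import torch
import torch.nn.functional as F

def contrastive_pair_loss(emb_a, emb_b, margin=0.5):
    """emb_a, emb_b: (N, D) embeddings of the two domains; row i of each shares an identity."""
    emb_a, emb_b = F.normalize(emb_a, dim=-1), F.normalize(emb_b, dim=-1)
    sim = emb_a @ emb_b.T                                  # (N, N) cross-domain similarity
    pos = sim.diag()                                       # same generated identity
    neg = sim - 2.0 * torch.eye(len(sim))                  # mask out the positives
    hardest_neg = neg.max(dim=1).values
    return F.relu(margin + hardest_neg - pos).mean()

# mocked embeddings of generated paired heterogeneous images for 8 sampled identities
vis = torch.randn(8, 128, requires_grad=True)
nir = torch.randn(8, 128, requires_grad=True)
loss = contrastive_pair_loss(vis, nir)
loss.backward()
print(float(loss))
```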
43. Byzantine-Resilient Decentralized Stochastic Gradient Descent.
- Author
-
Guo, Shangwei, Zhang, Tianwei, Yu, Han, Xie, Xiaofei, Ma, Lei, Xiang, Tao, and Liu, Yang
- Subjects
- *
FAULT tolerance (Engineering) , *DEEP learning , *INSTRUCTIONAL systems - Abstract
Decentralized learning has gained great popularity as a way to improve learning efficiency and preserve data privacy. Each computing node makes an equal contribution to collaboratively learning a deep learning model. The elimination of centralized parameter servers (PS) can effectively address many issues such as privacy, performance bottlenecks and single points of failure. However, how to achieve Byzantine fault tolerance in decentralized learning systems is rarely explored, although this problem has been extensively studied in centralized systems. In this paper, we present an in-depth study of the Byzantine resilience of decentralized learning systems with two contributions. First, from the adversarial perspective, we theoretically illustrate that Byzantine attacks are more dangerous and feasible in decentralized learning systems: even one malicious participant can arbitrarily alter the models of other participants by sending carefully crafted updates to its neighbors. Second, from the defense perspective, we propose UBAR, a novel algorithm to enhance decentralized learning with Byzantine fault tolerance. Specifically, UBAR provides a Uniform Byzantine-resilient Aggregation Rule for benign nodes to select the useful parameter updates and filter out the malicious ones in each training iteration. It guarantees that each benign node in a decentralized system can train a correct model under very strong Byzantine attacks with an arbitrary number of faulty nodes. We conduct extensive experiments on standard image classification tasks, and the results indicate that UBAR can effectively defeat both simple and sophisticated Byzantine attacks with higher performance efficiency than existing solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
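The filtering idea behind Byzantine-resilient aggregation at a single benign node can be sketched as below: keep only the neighbor models closest to the local model and average the survivors. This simplified rule is an assumption for illustration and is not the paper's exact UBAR rule, which additionally uses a loss-based performance check.

```python
import torch

def resilient_aggregate(local, neighbors, n_byzantine):
    """local: (D,) local parameters; neighbors: list of (D,) neighbor parameter vectors."""
    dists = torch.stack([torch.norm(n - local) for n in neighbors])
    keep = max(1, len(neighbors) - n_byzantine)            # assume at most n_byzantine are faulty
    idx = torch.argsort(dists)[:keep]                      # trust the closest neighbors
    selected = torch.stack([neighbors[int(i)] for i in idx])
    return (selected.sum(dim=0) + local) / (keep + 1)

local = torch.zeros(10)
honest = [torch.randn(10) * 0.1 for _ in range(5)]
malicious = [torch.full((10,), 100.0)]                     # an arbitrarily crafted update
agg = resilient_aggregate(local, honest + malicious, n_byzantine=1)
print(bool(agg.norm() < 1.0))                              # True: the crafted update is filtered out
```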
44. UNFusion: A Unified Multi-Scale Densely Connected Network for Infrared and Visible Image Fusion.
- Author
-
Wang, Zhishe, Wang, Junyao, Wu, Yuanyuan, Xu, Jiawei, and Zhang, Xiaoqin
- Subjects
- *
IMAGE fusion , *INFRARED imaging , *VISUAL perception , *DEEP learning , *IMAGE reconstruction , *FEATURE extraction - Abstract
Infrared images retain typical thermal targets while visible images preserve rich texture details; image fusion aims to reconstruct a synthesized image containing both prominent targets and abundant texture details. Most deep learning-based methods mainly focus on convolution operations to extract local features, but do not fully consider their multi-scale characteristics and global dependencies, which may cause loss of target regions and texture details in the fused image. Towards this goal, we present a unified multi-scale densely connected fusion network in this paper, named UNFusion. We carefully design a multi-scale encoder-decoder architecture that can efficiently extract and reconstruct multi-scale deep features. Dense skip connections are employed in both the encoder and decoder sub-networks to reuse all the intermediate features of different layers and scales for fusion tasks. In the fusion layer, $L_{p}$ normalized attention models, covering three different norms, are proposed to highlight and combine these deep features along the spatial and channel dimensions, and the combined spatial and channel attention maps are used to reconstruct the final fused image. We conduct extensive experiments on the public TNO and Roadscene datasets, and the results demonstrate that UNFusion can simultaneously preserve the high brightness of typical thermal targets and abundant texture details to obtain superior scene representation and better visual perception. Besides, UNFusion achieves better fusion performance and surpasses other state-of-the-art methods in terms of qualitative and quantitative comparisons. Our code is available at https://github.com/Zhishe-Wang/UNFusion. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
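As a small illustration of the $L_{p}$ normalized attention mentioned in the entry above, the sketch below weights infrared and visible deep features by their per-pixel $L_{p}$ activity; the single-scale setting and the omission of the channel-attention branch are simplifying assumptions.

```python
import torch

def lp_spatial_fusion(feat_ir, feat_vis, p=1):
    """feat_ir, feat_vis: (B, C, H, W) encoder features of the infrared and visible inputs."""
    a_ir = feat_ir.norm(p=p, dim=1, keepdim=True)          # (B, 1, H, W) per-pixel activity
    a_vis = feat_vis.norm(p=p, dim=1, keepdim=True)
    w = torch.softmax(torch.cat([a_ir, a_vis], dim=1), dim=1)
    return w[:, :1] * feat_ir + w[:, 1:] * feat_vis        # attention-weighted fused features

ir, vis = torch.rand(1, 64, 32, 32), torch.rand(1, 64, 32, 32)
print(lp_spatial_fusion(ir, vis).shape)                    # torch.Size([1, 64, 32, 32])
```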
45. Topology-Aware Differential Privacy for Decentralized Image Classification.
- Author
-
Guo, Shangwei, Zhang, Tianwei, Xu, Guowen, Yu, Han, Xiang, Tao, and Liu, Yang
- Subjects
- *
PRIVACY , *NOISE control , *ARTIFICIAL intelligence , *FAULT tolerance (Engineering) , *DEEP learning , *QUEUING theory - Abstract
Image classification is a fundamental artificial intelligence task that labels images into one of a set of predefined classes. However, training complex image classification models requires a large amount of computation resources and data in order to reach state-of-the-art performance. This demand drives the growth of distributed deep learning, where multiple agents cooperatively train global models with their individual datasets. Among such learning systems, decentralized learning is particularly attractive, as it can improve efficiency and fault tolerance by eliminating the centralized parameter server, which could be a single point of failure or a performance bottleneck. Although the agents do not need to disclose their training image samples, they exchange parameters with each other at each iteration, which can put them at risk of data privacy leakage. Past works demonstrated the possibility of recovering training images from the exchanged parameters. One common defense direction is to adopt Differential Privacy (DP) to secure optimization algorithms such as Stochastic Gradient Descent (SGD). Those DP-based methods mainly focus on standalone systems or centralized distributed learning. How to enforce and optimize DP protection in decentralized learning systems is unknown and challenging, due to their complex communication topologies and distinct learning characteristics. In this paper, we design TOP-DP, a novel solution to optimize the differential privacy protection of decentralized image classification systems. The key insight of our solution is to leverage the unique features of decentralized communication topologies to reduce the noise scale and improve the model usability. (1) We enhance the DP-SGD algorithm with this topology-aware noise reduction strategy, and integrate a time-aware noise decay technique. (2) We design two novel learning protocols (synchronous and asynchronous) to protect systems with different network connectivities and topologies. We formally analyze and prove the DP requirement of our proposed solutions. Experimental evaluations demonstrate that our solution achieves a better trade-off between usability and privacy than prior works. To the best of our knowledge, this is the first DP optimization work from the perspective of network topologies. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
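The DP-SGD mechanism that the entry above builds on (per-sample gradient clipping followed by Gaussian noise) can be sketched as follows; the topology-aware noise reduction and time-aware decay described in the abstract are summarized only by the `noise_scale` argument, which is an assumption.

```python
import torch

def privatize_gradients(per_sample_grads, clip_norm=1.0, noise_scale=1.0):
    """per_sample_grads: (B, D), one flattened gradient per training sample."""
    norms = per_sample_grads.norm(dim=1, keepdim=True)
    clipped = per_sample_grads * torch.clamp(clip_norm / (norms + 1e-12), max=1.0)
    mean_grad = clipped.mean(dim=0)
    noise = torch.randn_like(mean_grad) * noise_scale * clip_norm / len(per_sample_grads)
    return mean_grad + noise                               # noisy gradient shared with neighbors

grads = torch.randn(32, 1000)                              # mocked per-sample gradients
print(privatize_gradients(grads, noise_scale=0.5).shape)   # torch.Size([1000])
```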
46. Deep Continual Learning for Emerging Emotion Recognition.
- Author
-
Thuseethan, Selvarajah, Rajasegarar, Sutharshan, and Yearwood, John
- Abstract
Recognizing an unknown facial emotion that emerges in the future has significant impact in various domains. Since the vocabulary of emotional states grows over time, new emotional states need to be accommodated while the existing knowledge of known emotional states is preserved. While human beings perform this task spontaneously, the challenge is how to devise a deep learning technique that can effectively recognize an unknown emotion category in the future. Although deep convolutional neural networks have shown excellent emotion recognition performance in the past, they are conventionally predefined multi-way classifiers that show little resilience towards adding a new emotion class. Considering this challenge, in this paper we propose a generic deep convolutional neural network-based architecture that constantly absorbs upcoming emotion categories and recognizes them effectively. We further propose an indicator loss, which is associated with the distillation mechanism that preserves the existing knowledge. To demonstrate the feasibility of our proposed method, we evaluated our model using benchmark emotion datasets. The results confirm that the proposed approach is superior in recognizing unknown emotional states compared to continual learning benchmarks. Further, our proposed method demonstrates higher accuracy compared to the transfer learning baselines. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
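A minimal sketch of the distillation component implied by the entry above: when new emotion classes are added, the updated model's logits for the old classes are pulled toward those of the frozen previous model. The temperature and KL formulation are standard assumptions; the paper's indicator loss is not reproduced.

```python
import torch
import torch.nn.functional as F

def distillation_loss(new_logits, old_logits, n_old_classes, T=2.0):
    """new_logits: (B, n_old + n_new); old_logits: (B, n_old) from the frozen previous model."""
    p_old = F.softmax(old_logits / T, dim=1)
    log_p_new = F.log_softmax(new_logits[:, :n_old_classes] / T, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * T * T

old = torch.randn(16, 5)                      # logits over 5 previously learned emotion classes
new = torch.randn(16, 7, requires_grad=True)  # the updated head adds 2 new classes
loss = distillation_loss(new, old, n_old_classes=5)
loss.backward()
print(float(loss))
```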
47. An Attention Encoder-Decoder Network Based on Generative Adversarial Network for Remote Sensing Image Dehazing.
- Author
-
Zhao, Liquan, Zhang, Yupeng, and Cui, Ying
- Abstract
Remote sensing image dehazing is a difficult problem due to the complex characteristics of such images. It can be regarded as a preprocessing step for high-level remote sensing image tasks. To remove haze from hazy remote sensing images, an encoder-decoder network based on a generative adversarial network is proposed. It first learns the low-frequency information of the image and then learns the high-frequency information. Skip connections are also added to the network to avoid losing information. To further improve the ability to learn useful information, a multi-scale attention module is proposed. Meanwhile, a CBlock module is designed to extract more feature information and capture receptive fields of different sizes. To reduce the computational burden of the network, a distillation module is used. Inspired by multi-scale networks, an enhancement module is designed and introduced at the end of the network to further improve its dehazing ability by integrating multi-scale context information. We compare our proposed method with five existing methods on the RICE dataset. Experimental results show that our method achieves the best results, both qualitatively and quantitatively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
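For a concrete, if toy, picture of the encoder-decoder-with-skip-connection design in the entry above, see the sketch below; the attention, CBlock, distillation and enhancement modules of the paper are deliberately left out, so this is only an assumed skeleton.

```python
import torch
import torch.nn as nn

class TinyDehazeNet(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU())
        self.out = nn.Conv2d(ch + 3, 3, 3, padding=1)      # skip connection concatenates the input

    def forward(self, hazy):
        x = self.dec(self.enc(hazy))
        return torch.sigmoid(self.out(torch.cat([x, hazy], dim=1)))

print(TinyDehazeNet()(torch.rand(1, 3, 64, 64)).shape)     # torch.Size([1, 3, 64, 64])
```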
48. A Learning-Based AoA Estimation Method for Device-Free Localization.
- Author
-
Hong, Ke, Wang, Tianyu, Liu, Junchen, Wang, Yu, and Shen, Yuan
- Abstract
Device-free localization (DFL), an important aspect of integrated sensing and communication, can be achieved by exploiting multipath components in ultra-wide bandwidth systems. However, incorrect identification of multipath components in the channel impulse responses will lead to large angle-of-arrival (AoA) estimation errors and subsequently poor localization performance. This letter proposes a learning-based AoA estimation method to improve DFL accuracy. In the proposed method, we first design a classifier to identify the multipath components and then exploit the phase-difference-of-arrival to mitigate the AoA estimation error through a multilayer perceptron. Our learning-based method is validated on datasets collected by ultra-wide bandwidth arrays and significantly outperforms conventional methods in terms of AoA estimation and localization performance. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
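The two-stage pipeline in the entry above (classify candidate multipath components, then regress the AoA from phase-difference features with a multilayer perceptron) might look roughly like the sketch below; the feature dimensions and network sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TapClassifier(nn.Module):
    """Decides whether a candidate channel tap is a usable multipath component."""
    def __init__(self, n_feat=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_feat, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, tap_features):
        return torch.sigmoid(self.net(tap_features))

class AoARefiner(nn.Module):
    """Maps phase-difference-of-arrival features to an AoA estimate."""
    def __init__(self, n_feat=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_feat, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, pdoa_features):
        return self.net(pdoa_features)

taps = torch.randn(10, 8)                        # mocked per-tap features from the CIR
probs = TapClassifier()(taps).squeeze(-1)        # probability each tap is a useful component
aoa = AoARefiner()(torch.randn(10, 3))           # AoA estimate per candidate
print(probs.shape, aoa.shape)                    # torch.Size([10]) torch.Size([10, 1])
```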
49. Incremental Learning With Open-Set Recognition for Remote Sensing Image Scene Classification.
- Author
-
Liu, Weiwei, Nie, Xiangli, Zhang, Bo, and Sun, Xian
- Subjects
- *
MACHINE learning , *REMOTE sensing , *IMAGE recognition (Computer vision) , *TEXT recognition , *MNEMONICS - Abstract
Image scene classification, which aims to assign a specific semantic label to each image, is vitally important for applications of remote sensing (RS) data. In the real world, since the observation environment is open and dynamic, RS images are collected sequentially and the numbers of images and classes grow rapidly over time. Most existing scene classification methods are offline learning algorithms, which are inefficient and unscalable in this scenario. In this article, an incremental learning with open-set recognition (ILOSR) framework is proposed for RS image scene classification in open and dynamic environments, which can identify unknown classes in a stream of data and learn these new classes incrementally. Specifically, a controllable convex hull-based exemplar selection strategy is designed to address the catastrophic forgetting issue in incremental learning, which can effectively reduce training time and memory footprint. In addition, a new loss function based on prototype learning and uncertainty measurement is proposed for OSR to enhance the interclass discrimination and intraclass compactness of the learned deep features. Experimental results on real RS datasets demonstrate that the proposed method can not only outperform state-of-the-art approaches on the offline classification, incremental learning, and OSR problems separately, but also achieve better and more stable performance in the ILOSR experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
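An illustrative prototype-based open-set decision rule in the spirit of the entry above: a sample is assigned to its nearest class prototype unless it is far from every prototype, in which case it is flagged as unknown. The fixed distance threshold is a plain assumption standing in for the paper's uncertainty measure.

```python
import torch

def open_set_predict(features, prototypes, threshold):
    """features: (N, D); prototypes: (C, D). Returns labels with -1 for unknown classes."""
    d = torch.cdist(features, prototypes)        # (N, C) distances to class prototypes
    min_d, labels = d.min(dim=1)
    labels[min_d > threshold] = -1               # reject samples far from every known class
    return labels

protos = torch.eye(4) * 5.0                      # 4 known scene classes in a toy 4-D space
known = protos + 0.1 * torch.randn(4, 4)
unknown = torch.full((2, 4), 10.0)
print(open_set_predict(torch.cat([known, unknown]), protos, threshold=3.0))
# tensor([ 0,  1,  2,  3, -1, -1])
```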
50. JHU-CROWD++: Large-Scale Crowd Counting Dataset and A Benchmark Method.
- Author
-
Sindagi, Vishwanath A., Yasarla, Rajeev, and Patel, Vishal M.
- Subjects
- *
CROWDS , *COUNTING , *MAPS - Abstract
We introduce a new large-scale unconstrained crowd counting dataset (JHU-CROWD++) that contains 4,372 images with 1.51 million annotations. In comparison to existing datasets, the proposed dataset is collected under a variety of diverse scenarios and environmental conditions. Specifically, the dataset includes several images with weather-based degradations and illumination variations, making it a very challenging dataset. Additionally, the dataset consists of a rich set of annotations at both the image level and the head level. Several recent methods are evaluated and compared on this dataset. The dataset can be downloaded from http://www.crowd-counting.com. Furthermore, we propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation. The proposed method uses VGG16 as the backbone network and employs the density map generated by the final layer as a coarse prediction, which is refined into finer density maps in a progressive fashion using residual learning. Additionally, the residual learning is guided by an uncertainty-based confidence weighting mechanism that permits the flow of only high-confidence residuals in the refinement path. The proposed Confidence Guided Deep Residual Counting Network (CG-DRCN) is evaluated on recent complex datasets, and it achieves significant improvements in error. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
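A hedged sketch of confidence-guided residual refinement for density maps as described in the entry above: a coarse density map is upsampled and corrected by a predicted residual only where a predicted confidence is high. The module definitions are simplified stand-ins, not the paper's CG-DRCN blocks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualRefiner(nn.Module):
    def __init__(self, feat_ch=16):
        super().__init__()
        self.residual = nn.Conv2d(feat_ch + 1, 1, 3, padding=1)    # predicts the correction
        self.confidence = nn.Conv2d(feat_ch + 1, 1, 3, padding=1)  # predicts its reliability

    def forward(self, coarse_density, features):
        up = F.interpolate(coarse_density, size=features.shape[-2:],
                           mode="bilinear", align_corners=False)
        x = torch.cat([up, features], dim=1)
        conf = torch.sigmoid(self.confidence(x))
        return up + conf * self.residual(x)       # only high-confidence residuals flow through

coarse = torch.rand(1, 1, 32, 32)                 # coarse density map from a deep layer
feats = torch.rand(1, 16, 64, 64)                 # shallower, higher-resolution features
print(ResidualRefiner()(coarse, feats).shape)     # torch.Size([1, 1, 64, 64])
```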