21 results on '"Zhang, Xu-Yao"'
Search Results
2. Online semi-supervised learning with learning vector quantization
- Author
- Shen, Yuan-Yuan, Zhang, Yan-Ming, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Published
- 2020
- Full Text
- View/download PDF
3. Large-scale continual learning for ancient Chinese character recognition.
- Author
- Xu, Yue, Zhang, Xu-Yao, Zhang, Zhaoxiang, and Liu, Cheng-Lin
- Subjects
- *CHINESE characters, *PATTERN recognition systems, *FEATURE extraction, *PROBLEM solving
- Abstract
Ancient Chinese character recognition is a challenging problem in the field of pattern recognition. It is difficult to collect all character classes during the training stage due to the numerous classes of ancient Chinese characters and the likelihood of discovering new characters over time. A solution to address this problem is continual learning. However, most continual learning methods are not well-suited for large-scale applications, making them insufficient for solving the problem of ancient Chinese character recognition. Although saving raw data for old classes is a good approach for continual learning to address large-scale problems, it is often infeasible due to the lack of data accessibility in reality. To solve these problems, we propose a large-scale continual learning framework based on the convolutional prototype network (CPN), which does not save raw data for old classes. In this paper, several basic strategies have been proposed for the initial training stage to enhance the feature extraction ability and robustness of the network, which can improve the performance of the model in continual learning. In addition, we propose two practical methods in varying feature space (parameters of feature extractor are changeable) and fixed feature space (parameters of feature extractor are fixed), which enable the model to carry out large-scale continual learning. The proposed method does not save the raw data of old classes and enables simultaneous classification of all existing classes without knowing the incremental batch number. Experiments on the CASIA-AHCDB dataset with 5000 character classes demonstrate the effectiveness and superiority of the proposed method.
• We propose a prototype-based framework for large-scale continual learning.
• The proposed strategies in the initial stage are good for continual learning.
• We propose two continual learning methods in varying and fixed feature space.
• The proposed methods achieve SOTA results on the CASIA-AHCDB dataset with 5000 classes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
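The abstract above describes prototype-based classification that does not keep raw data for old classes. The following is only a minimal sketch of that general idea, not the authors' CPN: it assumes features have already been extracted (here, hypothetical 2-D vectors), stores one mean vector per class, and classifies by nearest prototype. The class name `PrototypeClassifier` and all sample values are illustrative assumptions.

```python
import math


class PrototypeClassifier:
    """Nearest-prototype classifier that keeps one mean vector per class."""

    def __init__(self):
        self.prototypes = {}  # class label -> mean feature vector

    def add_class(self, label, features):
        # Only the class mean is stored; the raw samples are not kept.
        count = len(features)
        dim = len(features[0])
        self.prototypes[label] = [
            sum(f[d] for f in features) / count for d in range(dim)
        ]

    def predict(self, x):
        # Assign x to the class whose prototype is closest (Euclidean distance).
        return min(self.prototypes, key=lambda c: math.dist(x, self.prototypes[c]))


# Hypothetical pre-extracted features; a real system would take them from a network.
clf = PrototypeClassifier()
clf.add_class("A", [[0.0, 0.0], [0.2, 0.0]])  # initial batch
clf.add_class("B", [[1.0, 1.0], [1.0, 0.8]])  # initial batch
clf.add_class("C", [[3.0, 0.0], [3.2, 0.2]])  # class added in a later batch
```

Because each new class only adds one entry to `prototypes`, all classes seen so far can be classified together without knowing which batch a sample came from. This sketch corresponds loosely to the fixed feature space setting, since the features of old classes never change here.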
4. Deep representation learning for domain generalization with information bottleneck principle.
- Author
- Zhang, Jiao, Zhang, Xu-Yao, Wang, Chuang, and Liu, Cheng-Lin
- Subjects
- *ARTIFICIAL neural networks, *GENERALIZATION
- Abstract
• A theoretical framework for domain generalization (DG) from the perspective of information bottleneck (IB) principle is established.
• Based on the framework, a feasible solution by class-wise instance discrimination (CID) combined with maximizing feature entropy (MFE) is proposed to learn the desired representation for DG.
• The proposed method achieves excellent performance on multiple datasets without knowing domain labels.
• The proposed regularization rule (MFE) improves other invariance-based DG methods consistently.
Although deep neural networks have achieved superior performance on many classical tasks, they deteriorate in real applications due to the unpredictable distribution shift. Domain generalization (DG) focuses on improving the generalization ability of the predictive model in unseen domains by training on multiple available source domains. All these domains share the same categories but commonly obey different distributions. In this paper, we establish a new theoretical framework for domain generalization from the perspective of the information bottleneck (IB) principle, which links representation learning in DG with domain-invariant representation learning and maximizing feature entropy (MFE). Based on the theoretical framework, we provide a feasible solution by class-wise instance discrimination combined with inter-dimension decorrelation and intra-dimension uniformity to learn the desired representation for domain generalization, which achieves excellent performance on multiple datasets without knowing domain labels. Extensive experiments show that the proposed regularization rule (MFE) can improve invariance-based DG methods consistently. Moreover, as an extreme case of domain generalization, we also show that MFE is promising to improve adversarial robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
5. LG-CNN: From local parts to global discrimination for fine-grained recognition.
- Author
- Xie, Guo-Sen, Zhang, Xu-Yao, Yang, Wenhan, Xu, Mingliang, Yan, Shuicheng, and Liu, Cheng-Lin
- Subjects
- *PATTERN recognition systems, *IMAGE analysis, *SIGNAL convolution, *ARTIFICIAL neural networks, *PROBLEM solving
- Abstract
Fine-grained recognition is one of the most difficult topics in visual recognition, which aims at distinguishing confusing categories such as bird species within a genus. The information of part and bounding boxes in fine-grained images is very important for improving the performance. However, in real applications, the part and/or bounding box annotations may not exist. This makes fine-grained recognition a challenging problem. In this paper, we propose a jointly trained Convolutional Neural Network (CNN) architecture to solve the fine-grained recognition problem without using part and bounding box information. In this framework, we first detect part candidates by calculating the gradients of feature maps of a trained CNN model w.r.t. the input image and then filter out unnecessary ones by fusing two saliency detection methods. Meanwhile, two groups of global object locations are obtained based on the saliency detection methods and a segmentation method. With the filtered part candidates and approximate object locations as inputs, we construct the CNN architecture with local parts and global discrimination (LG-CNN) which consists of two CNN networks with shared weights. The upper stream of LG-CNN is focused on the part information of the input image, the bottom stream of LG-CNN is focused on the global input image. LG-CNN is jointly trained by two stream loss functions to guide the updating of the shared weights. Experiments on three popular fine-grained datasets well validate the effectiveness of our proposed LG-CNN architecture. Applying our LG-CNN architecture to generic object recognition datasets also yields superior performance over the directly fine-tuned CNN architecture with a large margin. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
6. Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark.
- Author
- Zhang, Xu-Yao, Bengio, Yoshua, and Liu, Cheng-Lin
- Subjects
- *HANDWRITING recognition (Computer science), *CHINESE characters, *PATTERN recognition systems, *DEEP learning, *ARTIFICIAL neural networks
- Abstract
Recent deep learning based methods have achieved state-of-the-art performance for handwritten Chinese character recognition (HCCR) by learning discriminative representations directly from raw data. Nevertheless, we believe that the long-and-well investigated domain-specific knowledge should still help to boost the performance of HCCR. By integrating the traditional normalization-cooperated direction-decomposed feature map (directMap) with the deep convolutional neural network (convNet), we are able to obtain new highest accuracies for both online and offline HCCR on the ICDAR-2013 competition database. With this new framework, we can eliminate the need for data augmentation and model ensemble, which are widely used in other systems to achieve their best results. This makes our framework efficient and effective for both training and testing. Furthermore, although directMap+convNet can achieve the best results and surpass human-level performance, we show that writer adaptation in this case is still effective. A new adaptation layer is proposed to reduce the mismatch between training and test data on a particular source layer. The adaptation process can be efficiently and effectively implemented in an unsupervised manner. By adding the adaptation layer into the pre-trained convNet, it can adapt to the new handwriting styles of particular writers, and the recognition accuracy can be further improved consistently and significantly. This paper gives an overview and comparison of recent deep learning based approaches for HCCR, and also sets new benchmarks for both online and offline HCCR. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
7. Towards prior gap and representation gap for long-tailed recognition.
- Author
- Zhang, Ming-Liang, Zhang, Xu-Yao, Wang, Chuang, and Liu, Cheng-Lin
- Subjects
- *DEEP learning, *FEATURE extraction, *IMAGE recognition (Computer vision)
- Abstract
• A unified theoretical framework for long-tailed recognition is established.
• Corresponding mitigation solutions for prior gap and representation gap are proposed.
• The existing methods and the proposed methods are analyzed theoretically in terms of their impact on the two gaps.
• The proposed methods yield superior performance on five long-tailed benchmarks.
Most deep learning models are elaborately designed for balanced datasets, and thus they inevitably suffer performance degradation in practical long-tailed recognition tasks, especially on the minority classes. There are two crucial issues in learning from imbalanced datasets: a skewed decision boundary and an unrepresentative feature space. In this work, we establish a theoretical framework to analyze the sources of these two issues from a Bayesian perspective, and find that they are closely related to the prior gap and the representation gap, respectively. Under this framework, we show that existing long-tailed recognition methods manage to remove either the prior gap or the representation gap. Different from these methods, we propose to simultaneously remove the two gaps to achieve more accurate long-tailed recognition. Specifically, we propose the prior calibration strategy to remove the prior gap and introduce three strategies (representative feature extraction, optimization strategy adjustment and effective sample modeling) to mitigate the representation gap. Extensive experiments on five benchmark datasets validate the superiority of our method against the state-of-the-art competitors. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
8. Cross-modal prototype learning for zero-shot handwritten character recognition.
- Author
- Ao, Xiang, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
- *PATTERN recognition systems, *HANDWRITING recognition (Computer science), *ARTIFICIAL neural networks, *PROTOTYPES, *CHINESE characters
- Abstract
• We extend the cross-modal prototype learning (CMPL) framework to three modalities.
• CMPL achieves state-of-the-art results on online and offline zero-shot handwritten character recognition.
• CMPL shows promising cross-domain generalization ability in zero-shot handwritten character recognition.
Traditional methods of handwritten character recognition rely on extensive labeled data. However, humans can generalize to unseen handwritten characters by watching a few printed examples in textbooks. To simulate this ability, we propose a cross-modal prototype learning method (CMPL) to realize zero-shot recognition. For each character class, a prototype is generated by mapping the printed character into a deep neural network feature space. For an unseen character class, its prototype can be directly produced from a printed character sample, so no handwritten samples are required to realize class-incremental learning. Specifically, CMPL considers different modalities simultaneously: online handwritten trajectories, offline handwritten images, and auxiliary printed character images. The joint learning of the above modalities is achieved through sharing printed prototypes between online and offline data. In zero-shot inference, we feed CMPL the printed samples to obtain corresponding class prototypes, and then the unseen handwritten character can be recognized by the nearest prototype. Our experimental results demonstrate that CMPL outperforms the state-of-the-art methods in both online and offline zero-shot handwritten Chinese character recognition. Moreover, we also show the cross-domain generalization of CMPL from two perspectives: cross-language and modern-to-ancient handwritten character recognition, focusing on the transferability between different languages and different styles (i.e., modern and historical handwritings). [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
9. Imitating the oracle: Towards calibrated model for class incremental learning.
- Author
- Zhu, Fei, Cheng, Zhen, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
- *MACHINE learning, *CALIBRATION
- Abstract
Class-incremental learning (CIL) aims to recognize classes that emerged in different phases. The joint-training (JT), which trains the model jointly with all classes, is often considered as the upper bound of CIL. In this paper, we thoroughly analyze the difference between CIL and JT in feature space and weight space. Motivated by the comparative analysis, we propose two types of calibration: feature calibration and weight calibration to imitate the oracle (ItO), i.e., JT. Specifically, on the one hand, feature calibration introduces deviation compensation to maintain the class decision boundary of old classes in feature space. On the other hand, weight calibration leverages forgetting-aware weight perturbation to increase transferability and reduce forgetting in parameter space. With those two calibration strategies, the model is forced to imitate the properties of joint-training at each incremental learning stage, thus yielding better CIL performance. Our ItO is a plug-and-play method and can be implemented into existing methods easily. Extensive experiments on several benchmark datasets demonstrate that ItO can significantly and consistently improve the performance of existing state-of-the-art methods. Our code is publicly available at https://github.com/Impression2805/ItO4CIL.
• We explore and study how class incremental learning (CIL) differs from joint training (i.e., the oracle), and identify the crucial difference in both feature space and weight space. Therefore, we propose to improve CIL by imitating the oracle (ItO).
• In the feature space, the proposed feature calibration introduces deviation compensation to maintain the class decision boundary of old classes for CIL.
• In the weight space, the proposed weight calibration leverages forgetting-aware weight perturbation to increase transferability and reduce forgetting for CIL.
• Extensive experiments demonstrate that our ItO can significantly and consistently improve the performance of existing state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. Discovery of novel AHLs as potent antiproliferative agents.
- Author
- Ren, Jing-Li, Zhang, Xu-Yao, Yu, Bin, Wang, Xi-Xin, Shao, Kun-Peng, Zhu, Xiao-Ge, and Liu, Hong-Min
- Subjects
- *DRUG development, *ANTINEOPLASTIC agents, *LACTONES, *ORGANIC synthesis, *CANCER cells, *CELL-mediated cytotoxicity, *SUBSTITUENTS (Chemistry), *CELL cycle regulation
- Abstract
Three series of novel AHL analogs were synthesized and evaluated for their in vitro cytotoxic activity against four human cancer cell lines. The SAR investigation indicated that AHLs with a terminal phenyl group, especially those with the chalcone scaffold, had remarkably higher cytotoxicity than those with hydrophobic side chains. In addition, some of these compounds were much more potent than 5-Fu and natural OdDHL. Through detailed SAR discussions, we found that compounds 10a-k and 14 with the 4-amino chalcone scaffold showed excellent inhibition against all the tested cancer cell lines and were much more potent than 5-Fu and AHLs. Such a scaffold may act as a template for further lead optimization. Compound 10i with a 3,4,5-trimethoxy group was the most potent one against all the tested cancer cell lines. Flow cytometry analysis indicated that analog 11e induced cellular apoptosis and cell cycle arrest of MCF-7 cells at G2/M phase in a concentration- and time-dependent manner. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
11. Synthesis and anticancer activities of novel 1,2,4-triazolo[3,4-a]phthalazine derivatives.
- Author
- Xue, Deng-Qi, Zhang, Xu-Yao, Wang, Chao-Jie, Ma, Li-Ying, Zhu, Nan, He, Peng, Shao, Kun-Peng, Chen, Peng-Ju, Gu, Yi-Fei, Zhang, Xiao-Song, Wang, Cai-Feng, Ji, Cong-Hui, Zhang, Qiu-Rong, and Liu, Hong-Min
- Subjects
- *ANTINEOPLASTIC agents, *PHTHALAZINE, *CHEMICAL synthesis, *DRUG design, *FLUOROURACIL, *FLOW cytometry, *APOPTOSIS
- Abstract
To develop potent and selective anticancer agents, two series of novel 1,2,4-triazolo[3,4-a]phthalazine derivatives were designed and synthesized. Their antitumor activities were evaluated by the MTT method against four selected human cancer cell lines (MGC-803, EC-9706, HeLa and MCF-7). Our results showed that compound 11h exhibited good anticancer activities compared to 5-fluorouracil against the four tested cell lines, with IC50 values ranging from 2.0 to 4.5 μM. Flow cytometry analysis indicated that compound 11h induced cellular early apoptosis and cell cycle arrest at G2/M phase in EC-9706. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
12. Evaluation of weighted Fisher criteria for large category dimensionality reduction in application to Chinese handwriting recognition
- Author
- Zhang, Xu-Yao and Liu, Cheng-Lin
- Subjects
- *FISHER discriminant analysis, *DIMENSION reduction (Statistics), *HANDWRITING recognition (Computer science), *APPROXIMATION theory, *LINEAR statistical models, *COMPUTATIONAL complexity
- Abstract
To improve the class separability of Fisher linear discriminant analysis (FDA) for large category problems, we investigate the weighted Fisher criterion (WFC) by integrating weighting functions for dimensionality reduction. The objective of WFC is to maximize the sum of weighted distances of all class pairs. By setting larger weights for the most confusable classes, WFC can improve the class separation while the solution remains an eigen-decomposition problem. We evaluate five weighting functions in three different weighting spaces in a typical large category problem of handwritten Chinese character recognition. The weighting functions include four based on existing methods, namely, FDA, approximate pairwise accuracy criterion (aPAC), power function (POW), confused distance maximization (CDM), and a new one based on K-nearest neighbors (KNN). All the weighting functions can be calculated in the original feature space, low-dimensional space, or fractional space. Our experiments on a 3,755-class Chinese handwriting database demonstrate that WFC can improve the classification accuracy significantly compared to FDA. Among the weighting functions, the KNN method in the original space is the most competitive model, which achieves significantly higher classification accuracy and has a low computational complexity. To further improve the performance, we propose a nonparametric extension of the KNN method from the class level to the sample level. The sample level KNN (SKNN) method is shown to significantly outperform other methods in Chinese handwriting recognition such as the locally linear discriminant analysis (LLDA), neighbor class linear discriminant analysis (NCLDA), and heteroscedastic linear discriminant analysis (HLDA). [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
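The abstract above describes a criterion that sums weighted distances over all class pairs, with larger weights for the most confusable classes. A minimal sketch of how such a pairwise-weighted between-class scatter matrix might be assembled is shown below. It is an assumption-laden illustration, not the paper's implementation: the function names, the Euclidean distance between class means, and the inverse-square weight are all chosen here for simplicity and are not claimed to match any of the five weighting functions the authors evaluate.

```python
import math


def weighted_between_scatter(means, priors, weight_fn):
    """Sum over class pairs of p_i * p_j * w(d_ij) * (m_i - m_j)(m_i - m_j)^T."""
    dim = len(means[0])
    scatter = [[0.0] * dim for _ in range(dim)]
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            diff = [a - b for a, b in zip(means[i], means[j])]
            distance = math.dist(means[i], means[j])
            # Pair weight: class priors times a function of the pair distance.
            w = priors[i] * priors[j] * weight_fn(distance)
            for r in range(dim):
                for c in range(dim):
                    scatter[r][c] += w * diff[r] * diff[c]
    return scatter


def inverse_square(distance):
    # Illustrative weight: close (confusable) class pairs count more.
    return 1.0 / distance ** 2


# Hypothetical class means: classes 0 and 1 are close, class 2 is far away.
means = [[0.0, 0.0], [1.0, 0.0], [0.0, 4.0]]
priors = [1 / 3, 1 / 3, 1 / 3]

unweighted = weighted_between_scatter(means, priors, lambda d: 1.0)
weighted = weighted_between_scatter(means, priors, inverse_square)
```

With a constant weight, the scatter is dominated by the distant class along the second axis; with the inverse-square weight, the close pair contributes comparably. Finding the projection would still require an eigen-decomposition involving the within-class scatter, which this sketch omits.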
13. Realtime multi-scale scene text detection with scale-based region proposal network.
- Author
- He, Wenhao, Zhang, Xu-Yao, Yin, Fei, Luo, Zhenbo, Ogier, Jean-Marc, and Liu, Cheng-Lin
- Subjects
- *DETECTORS
- Abstract
• We propose a novel network named SRPN to realize both text/non-text localization and scale estimation efficiently.
• A two-stage detection scheme based on SRPN is proposed to avoid using multi-scale pyramid input and achieve faster detection speed.
• The proposed method achieves remarkable speedup on ICDAR2015, ICDAR2013 and MSRA-TD500 while keeping competitive performance.
• Ablation experiments are given to demonstrate the reasonableness of the proposed method from different aspects.
Multi-scale approaches have been widely used for achieving high accuracy for scene text detection, but they usually slow down the speed of the whole system. In this paper, we propose a two-stage framework for realtime multi-scale scene text detection. The first stage employs a novel Scale-based Region Proposal Network (SRPN) which can localize text of wide scale range and estimate text scale efficiently. Based on SRPN, non-text regions are filtered out, and text region proposals are generated. Moreover, based on text scale estimation by SRPN, small or big texts in region proposals are resized into a unified normal scale range. The second stage then adopts a Fully Convolutional Network based scene text detector to localize text words from proposals of the first stage. The text detector in the second stage detects texts of a narrow scale range, but accurately. Since most non-text regions are eliminated through SRPN efficiently, and texts in proposals are properly scaled to avoid multi-scale pyramid processing, the whole system is quite fast. We evaluate both performance and speed of the proposed method on datasets ICDAR2015, ICDAR2013, and MSRA-TD500. On ICDAR2015, our system can reach the state-of-the-art F-measure score of 85.40% at 16.5 fps (frames per second), and competitive performance of 79.66% at 35.1 fps, either of which is more than 5 times faster than previous best methods. On ICDAR2013 and MSRA-TD500, we also achieve remarkable speedup while keeping competitive performance. Ablation experiments are also provided to demonstrate the reasonableness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
14. LightweightNet: Toward fast and lightweight convolutional neural networks via architecture distillation.
- Author
- Xu, Ting-Bing, Yang, Peipei, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
- *ARTIFICIAL neural networks, *PATTERN recognition systems, *IMAGING systems, *INFORMATION retrieval, *COMPUTER networks
- Abstract
• We present a new framework of deep convolutional neural network architecture distillation, namely LightweightNet, for acceleration and compression.
• We exploit the prior knowledge of pre-defined network architecture to guide the efficient design of acceleration/compression strategies, while not using a pre-trained model.
• The proposed framework consists of network parameter compression, network structure acceleration, and non-tensor layer improvement.
• The proposed framework demonstrates a higher acceleration/compression rate than previous methods in experiments, including a large category handwritten Chinese character recognition task with state-of-the-art performance.
In recent years, deep neural networks have achieved remarkable successes in many pattern recognition tasks. However, the high computational cost and large memory overhead hinder them from applications on resource-limited devices. To address this problem, many deep network acceleration and compression methods have been proposed. One group of methods adopts decomposition and pruning techniques to accelerate and compress a pre-trained model. Another group designs a single compact unit to stack their own networks. These methods are subject to complicated training processes, or lack generality and extensibility. In this paper, we propose a general framework of architecture distillation, namely LightweightNet, to accelerate and compress convolutional neural networks. Rather than compressing a pre-trained model, we directly construct the lightweight network based on a baseline network architecture. The LightweightNet, designed based on a comprehensive analysis of the network architecture, consists of network parameter compression, network structure acceleration, and non-tensor layer improvement. Specifically, we propose the strategy of low-dimensional features of fully-connected layers for substantial memory saving, and design multiple efficient compact blocks to distill convolutional layers of the baseline network with an accuracy-sensitive distillation rule for notable time saving. Finally, it can effectively reduce the computational cost and the model size by more than 4× with negligible accuracy loss. Benchmarks on MNIST, CIFAR-10, ImageNet and HCCR (handwritten Chinese character recognition) datasets demonstrate the advantages of the proposed framework in terms of speed, performance, storage and training process. In HCCR, our method even outperforms traditional handcrafted features-based classifiers in terms of speed and storage while maintaining state-of-the-art recognition performance. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
15. Adversarial training with distribution normalization and margin balance.
- Author
- Cheng, Zhen, Zhu, Fei, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
- *CORRUPTION, *EXPLANATION, *CLASSIFICATION, *RADAR in aeronautics
- Abstract
• We propose distribution normalization to constrain the covariance to be an identity matrix to eliminate the vulnerability induced by features with smaller variance and provide a theoretical explanation.
• We incorporate margin balance to enlarge the minimal margin of classes to boost adversarial robustness, contributing to an equal margin between classes.
• We show that DNMB achieves better adversarial robustness than state-of-the-art methods under white-box attacks, black-box attacks, adaptive attacks, unseen attacks, and common corruptions.
Adversarial training is the most effective method to improve adversarial robustness. However, it does not explicitly regularize the feature space during training. Adversarial attacks usually move a sample iteratively along the direction which causes the steepest ascent of classification loss by crossing the decision boundary. To alleviate this problem, we propose to regularize the distributions of different classes to increase the difficulty of finding an attacking direction. Specifically, we propose two strategies named Distribution Normalization (DN) and Margin Balance (MB) for adversarial training. The purpose of DN is to normalize the features of each class to have identical variance in every direction, in order to eliminate easy-to-attack intra-class directions. The purpose of MB is to balance the margins between different classes, making it harder to find confusing class directions (i.e., those with smaller margins) to attack. When integrated with adversarial training, our method can significantly improve adversarial robustness. Extensive experiments under white-box, black-box, and adaptive attacks demonstrate the effectiveness of our method over other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. Adaptive spatial pooling for image classification.
- Author
- Liu, Yinglu, Zhang, Yan-Ming, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
- *FEATURE extraction, *CLASSIFICATION algorithms, *BOOSTING algorithms, *BENCHMARK problems (Computer science), *ADAPTIVE computing systems
- Abstract
In this paper, we propose an adaptive spatial pooling method for enhancing the discriminability of feature representation for image classification. The core idea is to adopt a spatial distribution matrix to define how the image patches are pooled together. By formulating the pooling distribution learning and classifier training jointly, our method can extract multiple spatial layouts of arbitrary shapes rather than regular rectangular regions. By proper mathematical transformation, the distributions can be learned via a boosting-like algorithm, which improves the efficiency of learning especially for large distribution matrices. Further, our method allows category-specific pooling operations to take advantage of the different spatial layouts of different categories. Experimental results on three benchmark datasets UIUC-Sports, 21-Land-Use and Scene 15 demonstrate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
17. Automatic recognition of serial numbers in bank notes.
- Author
- Feng, Bo-Yuan, Ren, Mingwu, Zhang, Xu-Yao, and Suen, Ching Y.
- Subjects
- *AUTOMATION, *BANK notes, *FEATURE extraction, *FORGERY, *COMMERCIAL crimes, *RELIABILITY in engineering
- Abstract
This paper presents a new topic of automatic recognition of bank note serial numbers, which will not only facilitate the prevention of forgery crimes, but also have a positive impact on the economy. Among all the different currencies, we focus on the study of RMB (renminbi bank note, the paper currency used in China) serial numbers. For evaluation, a new database NUST-RMB2013 has been collected from scanned RMB images, which contains the serial numbers of 35 categories with 17,262 training samples and 7,000 testing samples in total. We comprehensively implement and compare two classic and one newly merged feature extraction methods (namely gradient direction feature, Gabor feature, and CNN trainable feature), four different types of well-known classifiers (SVM, LDF, MQDF, and CNN), and five multiple classifier combination strategies (including a specially designed novel cascade method). To further improve the recognition accuracy, the enhancements of three different kinds of distortions have been tested. Since high reliability is more important than accuracy in financial applications, we introduce three rejection schemes of first rank measurement (FRM), first two ranks measurement (FTRM) and linear discriminant analysis based measurement (LDAM). All the classifiers and classifier combination schemes are combined with different rejection criteria. A novel cascade rejection measurement achieves 100% reliability with a lower rejection rate than the existing methods. Experimental results show that MQDF reaches the accuracy of 99.59% using the gradient direction feature trained with gray level normalized data; the cascade classifier combination achieves the best performance of 99.67%. The distortions have been proved to be very helpful because the performance of the CNNs improves by at least 0.5% when training with transformed samples. With the cascade rejection method, 100% reliability has been obtained by rejecting 1.01% of the test samples. [Copyright Elsevier]
- Published
- 2014
- Full Text
- View/download PDF
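The abstract above names rejection schemes based on the first rank (FRM) and the first two ranks (FTRM) of classifier outputs. The sketch below illustrates one common reading of those names, which is an assumption here rather than the paper's exact definition: FRM uses the top score as a confidence value, FTRM uses the gap between the top two scores, and a sample is rejected when the chosen value falls below a threshold. The function names, scores, and thresholds are hypothetical.

```python
def first_rank_measure(scores):
    # Confidence is the highest classifier output.
    return max(scores)


def first_two_ranks_measure(scores):
    # Confidence is the gap between the best and second-best outputs.
    top, second = sorted(scores, reverse=True)[:2]
    return top - second


def classify_with_rejection(scores, measure, threshold):
    """Return the index of the top class, or None if the sample is rejected."""
    if measure(scores) < threshold:
        return None
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical score vectors for a three-class problem.
confident = [0.1, 0.8, 0.1]
ambiguous = [0.45, 0.44, 0.11]

accepted = classify_with_rejection(confident, first_rank_measure, 0.5)
rejected = classify_with_rejection(ambiguous, first_two_ranks_measure, 0.1)
```

Raising the threshold trades more rejected samples for fewer accepted errors, which is the reliability versus rejection-rate trade-off the abstract reports.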
18. Dynamics-aware loss for learning with label noise.
- Author
- Li, Xiu-Chuan, Xia, Xiaobo, Zhu, Fei, Liu, Tongliang, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
- *ARTIFICIAL neural networks
- Abstract
Label noise poses a serious threat to deep neural networks (DNNs). Employing robust loss functions which reconcile fitting ability with robustness is a simple but effective strategy to handle this problem. However, the widely-used static trade-off between these two factors contradicts the dynamics of DNNs learning with label noise, leading to inferior performance. Therefore, we propose a dynamics-aware loss (DAL) to solve this problem. Considering that DNNs tend to first learn beneficial patterns, then gradually overfit harmful label noise, DAL strengthens the fitting ability initially, then gradually improves robustness. Moreover, at the later stage, to further reduce the negative impact of label noise and combat underfitting simultaneously, we let DNNs put more emphasis on easy examples than hard ones and introduce a bootstrapping term. Both the detailed theoretical analyses and extensive experimental results demonstrate the superiority of our method.
• Based on the dynamic nature of DNNs learning with label noise, we propose Dynamics-Aware Loss (DAL).
• We present detailed analyses to certify the superiority of DAL.
• Our extensive experiments demonstrate the superior performance and practicality of DAL. [ABSTRACT FROM AUTHOR]
- Published
- 2023
19. MuLTReNets: Multilingual text recognition networks for simultaneous script identification and handwriting recognition.
- Author
- Chen, Zhuo, Yin, Fei, Zhang, Xu-Yao, Yang, Qing, and Liu, Cheng-Lin
- Subjects
- *TEXT recognition, *HANDWRITING recognition (Computer science), *SCRIPTS, *PATTERN recognition systems, *IDENTIFICATION
- Abstract
• A novel multi-task system, named MuLTReNets, to optimize script identification and handwriting recognition jointly for multilingual handwritten text recognition.
• The MuLTReNets are extended into two versions: one for multi-lingual text recognition with merged alphabet (MuLTReNetV1), one for cascaded script identification and unilingual text recognition with joint training (MuLTReNetV2).
• Auto-weighter keeps the balance among datasets of different scripts.
• Performance is superior to cascade systems and unilingual recognition systems.
• Experimental analysis for better understanding the system.
Multilingual handwritten text recognition is often accomplished in two cascaded steps: script identification and handwriting recognition. However, this scheme is not optimal due to error accumulation. To perform simultaneous script identification and handwriting recognition, in this paper, we propose a new framework named multilingual text recognition networks (MuLTReNets). Specifically, the system has four major modules: feature extractor, script identifier, handwriting recognizer and auto-weighter. The feature extractor integrates both spatial and temporal knowledge to encode text images into features shared by the script identifier and recognizer. The script identifier predicts script category from a variable-length sequence incorporating an auto-weighter for balancing different scripts, while the handwriting recognizer adopts long short-term memory (LSTM) and Connectionist Temporal Classification (CTC) to accomplish sequence decoding. Via multi-task learning, the proposed framework can benefit both multilingual recognition schemes: unified recognition with merged alphabet (MuLTReNetV1) and cascaded script identification-single script recognition with joint training (MuLTReNetV2). We evaluated the performance of the proposed method on handwritten text databases of five languages, which are English, French, Kannada, Urdu, and Bangla. Experimental results demonstrate that our method performs well on both script identification and handwriting recognition. The accuracy of script identification reaches 99.9%. In handwriting recognition, the proposed system not only outperforms cascade systems but also surpasses systems particularly designed for specific scripts. [ABSTRACT FROM AUTHOR]
- Published
- 2020
20. Self-information of radicals: A new clue for zero-shot Chinese character recognition.
- Author
- Luo, Guo-Feng, Wang, Da-Han, Du, Xia, Yin, Hua-Yi, Zhang, Xu-Yao, and Zhu, Shunzhi
- Subjects
- *CHINESE characters, *PATTERN recognition systems, *RADICALS, *INFORMATION theory
- Abstract
• We propose the self-information of radicals (SIR) from the information theory perspective to measure the importance of radicals in recognizing Chinese characters.
• The proposed SIR can be easily adopted by two commonly used radical-based zero-shot Chinese Character Recognition (ZSCCR) frameworks, i.e., sequence matching based and attribute embedding based.
• For sequence matching based ZSCCR, we propose a novel Chinese character uncertainty elimination (CUE) framework, which is capable of alleviating the sequence mismatch problem.
• For attribute embedding based ZSCCR, we propose a novel radical information embedding (RIE) method, which can highlight the importance of indispensable radicals.
• Comprehensive experiments on the CASIA-HWDB, ICDAR2013, CTW, and AHCDB datasets demonstrate the effectiveness and high extensibility of the proposed SIR.
Zero-shot Chinese character recognition (ZSCCR) is an important research topic in Chinese character recognition as it attempts to recognize unseen Chinese characters. As basic components and mid-level representations, radicals are significant for ZSCCR. However, previous methods treat the importance of radicals equally, ignoring the different contributions of radicals in distinguishing characters. In this paper, we propose the self-information of radicals (SIR) to measure the importance of radicals in recognizing Chinese characters. The proposed SIR can be easily adopted by two commonly used radical-based ZSCCR frameworks, i.e., sequence matching based and attribute embedding based. For sequence matching based ZSCCR, we propose a novel Chinese character uncertainty elimination (CUE) framework to alleviate the radical sequence mismatch problem. For attribute embedding based ZSCCR, we propose a novel radical information embedding (RIE) method that can highlight the importance of indispensable radicals and weaken the influence of some unnecessary radicals. We conducted comprehensive experiments on the CASIA-HWDB, ICDAR2013, CTW, and AHCDB datasets to evaluate the proposed method. Experiments show that our proposed methods can achieve superior performance to the state-of-the-art methods, which demonstrates the effectiveness and the high extensibility of the proposed SIR. [ABSTRACT FROM AUTHOR]
- Published
- 2023
21. A survey of robust adversarial training in pattern recognition: Fundamental, theory, and methodologies.
- Author
- Qian, Zhuang, Huang, Kaizhu, Wang, Qiu-Feng, and Zhang, Xu-Yao
- Subjects
- *ARTIFICIAL neural networks, *COMPUTER vision, *DEEP learning, *PATTERN recognition systems, *MACHINE learning, *BRAIN stimulation
- Abstract
• We present a timely and comprehensive survey on robust adversarial training.
• This survey offers the fundamentals of adversarial training, a unified theory that can be used to interpret various methods, and a comprehensive summarization of different methodologies.
• This survey also addresses three important research foci in adversarial training: interpretability, robust generalization, and robustness evaluation, which can stimulate future inspirations as well as research outlook.
Deep neural networks have achieved remarkable success in machine learning, computer vision, and pattern recognition in the last few decades. Recent studies, however, show that neural networks (both shallow and deep) may be easily fooled by certain imperceptibly perturbed input samples called adversarial examples. Such security vulnerability has resulted in a large body of research in recent years because real-world threats could be introduced due to the vast applications of neural networks. To address the robustness issue to adversarial examples particularly in pattern recognition, robust adversarial training has become a mainstream approach. Various ideas, methods, and applications have boomed in the field. Yet, a deep understanding of adversarial training including characteristics, interpretations, theories, and connections among different models has remained elusive. This paper presents a comprehensive survey trying to offer a systematic and structured investigation on robust adversarial training in pattern recognition. We start with fundamentals including definition, notations, and properties of adversarial examples. We then introduce a general theoretical framework with gradient regularization for defending against adversarial samples (robust adversarial training), with visualizations and interpretations on why adversarial training can lead to model robustness. Connections will also be established between adversarial training and other traditional learning theories. After that, we summarize, review, and discuss various methodologies with defense/training algorithms in a structured way. Finally, we present analysis, outlook, and remarks on adversarial training. [ABSTRACT FROM AUTHOR]
- Published
- 2022