54,619 results for "Pattern recognition (psychology)"
Search Results
102. A new two-stage hybrid feature selection algorithm and its application in Chinese medicine
- Author
-
Guoliang Xu, Wangping Xiong, Zhiqin Li, Bin Nie, Jigen Luo, and Jianqiang Du
- Subjects
Markov blanket ,Lasso (statistics) ,Artificial Intelligence ,Computer science ,Feature (computer vision) ,Pattern recognition (psychology) ,Stability (learning theory) ,Feature selection ,Computer Vision and Pattern Recognition ,Overfitting ,Algorithm ,Software ,Curse of dimensionality - Abstract
High-dimensional small-sample data are prone to the curse of dimensionality and overfitting and contain many irrelevant and redundant features. To address these feature selection problems, a new Two-stage Hybrid Feature Selection Algorithm (Ts-HFSA) is proposed. The first stage combines a filter method with a wrapper method to adaptively remove irrelevant features. The second stage applies a De-redundancy Algorithm fusing an Approximate Markov Blanket with an L1 Regular Term (DA2MBL1) to address the approximate Markov blanket (AMB)'s information loss when deleting redundant features, as well as the residual redundancy in the feature subset AMB produces. Experimental results on multiple UCI public datasets and on datasets from the material foundation of Chinese medicine showed that Ts-HFSA removed irrelevant and redundant features more effectively, found smaller and higher-quality feature subsets, and improved stability, giving it an advantage over AMB, FCBF, RF, GBDT, XGBoost, Lasso, and CI_AMB. Moreover, on material-foundation data of Chinese medicine, which have higher feature dimensionality and smaller sample sizes, Ts-HFSA performed even better, improving model precision while greatly reducing dimensionality. These results indicate that Ts-HFSA is an effective feature selection method for high-dimensional small samples and a strong research method for the material foundation of Chinese medicine.
- Published
- 2021
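The two-stage idea in the abstract above can be illustrated with a minimal sketch: a relevance filter first discards features weakly correlated with the target, then an FCBF-style approximate-Markov-blanket pass drops features already "covered" by a stronger feature. This is an illustrative simplification, not the paper's Ts-HFSA or DA2MBL1 (there is no wrapper stage or L1 term here), and the threshold is arbitrary.

```python
import numpy as np

def two_stage_select(X, y, relevance_thresh=0.1):
    """Toy two-stage hybrid feature selection.

    Stage 1 filters out irrelevant features by absolute Pearson correlation
    with the target; stage 2 removes approximately redundant features in the
    spirit of an approximate Markov blanket: a kept feature "covers" a
    candidate when their mutual correlation is at least as strong as the
    candidate's correlation with y."""
    n_features = X.shape[1]
    # Stage 1: relevance filter.
    rel = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)])
    candidates = [j for j in range(n_features) if rel[j] >= relevance_thresh]
    # Stage 2: de-redundancy, scanning candidates by decreasing relevance.
    candidates.sort(key=lambda j: -rel[j])
    selected = []
    for j in candidates:
        covered = any(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) >= rel[j]
                      for k in selected)
        if not covered:
            selected.append(j)
    return selected
```

With a relevant feature, a near-duplicate of it, and pure noise, only the original relevant feature survives both stages.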
103. Handling non-stationarity in E-nose design: a review
- Author
-
Sanjay Singh, Santanu Chaudhury, and Vishakha Pareek
- Subjects
Artificial neural network ,Sensor array ,Point (typography) ,Computer science ,Pattern recognition (psychology) ,Perspective (graphical) ,Electrical and Electronic Engineering ,Machine olfaction ,Data science ,Industrial and Manufacturing Engineering ,Field (computer science) ,Bridge (nautical) - Abstract
Purpose The electronic nose is an array of chemical or gas sensors coupled with a pattern-recognition framework capable of identifying and classifying odorant or non-odorant and simple or complex gases. Despite more than 30 years of research, robust e-nose devices remain scarce. Most of the challenges to reliable e-nose devices stem from the non-stationary environment and non-stationary sensor behaviour: the data distribution of the sensor array response evolves over time, which is referred to as non-stationarity. The purpose of this paper is to provide a comprehensive introduction to the challenges related to non-stationarity in e-nose design and to review the existing literature from application, system and algorithm perspectives to provide an integrated and practical view. Design/methodology/approach The authors discuss non-stationary data in general and the challenges related to a non-stationary environment or non-stationary sensor behaviour in e-nose design. The challenges are categorised and discussed from the perspective of learning with data obtained from sensor systems. E-nose technology is then reviewed from system, application and algorithmic points of view to assess its current status. Findings The discussed challenges in e-nose design will benefit researchers as well as practitioners, as the paper presents a comprehensive view of multiple aspects of non-stationary learning, systems, algorithms and applications for e-noses. The paper reviews pattern-recognition techniques and the public datasets commonly used in olfactory research. Generic techniques for learning in a non-stationary environment are also presented. The authors discuss future research directions and major open problems related to handling non-stationarity in e-nose design.
Originality/value The authors review, for the first time, the existing literature on learning with e-noses in a non-stationary environment alongside generic pattern-recognition algorithms for learning in non-stationary environments, bridging the gap between the two. They also present details of publicly available sensor array datasets, which will benefit upcoming researchers in this field. The authors further highlight several open problems and future directions that should be considered to provide efficient solutions that handle non-stationarity and make the e-nose an everyday device.
- Published
- 2021
104. Development of the classifier based on a multilayer perceptron using genetic algorithm and CART decision tree
- Author
-
Lyudmila Dobrovska and Olena Nosovets
- Subjects
neural network ,Computer science ,multilayer perceptron using a genetic algorithm ,CART decision tree ,Decision tree ,Evolutionary algorithm ,Energy Engineering and Power Technology ,Machine learning ,computer.software_genre ,Industrial and Manufacturing Engineering ,Management of Technology and Innovation ,Genetic algorithm ,Classifier (linguistics) ,T1-995 ,Industry ,Electrical and Electronic Engineering ,Technology (General) ,Artificial neural network ,business.industry ,Applied Mathematics ,Mechanical Engineering ,HD2321-4730.9 ,Computer Science Applications ,Control and Systems Engineering ,Test set ,Multilayer perceptron ,Pattern recognition (psychology) ,Artificial intelligence ,business ,computer - Abstract
The problem of developing universal classifiers for biomedical data, in particular data characterized by a large number of parameters, inaccuracy and uncertainty, is pressing. Many studies aim to develop methods for analyzing such data, among them methods based on a neural network (NN) in the form of a multilayer perceptron (MP) using a genetic algorithm (GA). The application of evolutionary algorithms (EA) for configuring and training the neural network is considered. The theories of neural networks, genetic algorithms and decision trees intersect and interpenetrate, and newly developed neural networks and their applications constantly appear. An example problem solved with EA is considered: developing and studying a classifier for the diagnosis of breast cancer, obtained by combining the capabilities of a multilayer perceptron tuned by a genetic algorithm with the CART decision tree. The study establishes that biomedical data classifiers in the form of GA-based NNs can be improved by appropriately preparing the biomedical data with the CART decision tree. The results indicate that these classifiers are most effective on the training set and with minimal pruning of the decision trees; increasing the number of prunings usually degrades the simulation result. On two datasets, the simulation accuracy on the test set was approximately 83–87%. The experiments confirm the effectiveness of the proposed method for synthesizing neural networks and support recommending it for practical use in processing datasets for further diagnostics, prediction, or pattern recognition.
- Published
- 2021
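As a toy illustration of tuning a multilayer perceptron with a genetic algorithm rather than backpropagation, the sketch below evolves the 9 weights of a 2-2-1 network on XOR using truncation selection, Gaussian mutation and elitism. It is a hedged, minimal stand-in: the paper's classifier, its CART-based data preparation and its breast-cancer data are not reproduced, and all hyperparameters are arbitrary.

```python
import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])  # XOR targets

def forward(w, X):
    # Unpack a flat 9-weight vector into a 2-2-1 perceptron.
    W1, b1 = w[:4].reshape(2, 2), w[4:6]
    W2, b2 = w[6:8], w[8]
    h = np.tanh(X @ W1 + b1)
    return 1 / (1 + np.exp(-(h @ W2 + b2)))  # sigmoid output

def loss(w):
    return float(np.mean((forward(w, X) - y) ** 2))

def evolve(pop_size=60, gens=200, sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, 9))
    history = []
    for _ in range(gens):
        fitness = np.array([loss(w) for w in pop])
        history.append(fitness.min())
        parents = pop[np.argsort(fitness)[:pop_size // 4]]  # truncation selection
        children = parents[rng.integers(0, len(parents), pop_size - 1)]
        children = children + rng.normal(scale=sigma, size=children.shape)
        pop = np.vstack([parents[:1], children])  # elitism keeps the best intact
    fitness = np.array([loss(w) for w in pop])
    history.append(fitness.min())
    return pop[np.argmin(fitness)], history
```

Because the elite individual is copied into each new generation unchanged, the best loss in the population can never increase from one generation to the next.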
105. The system identification of friend and enemy using night vision camera on battle robot (CQB) using pattern recognition method
- Author
-
Eko Wahyu Pratama, Mohammad Ansori, and Kusno Suryadi
- Subjects
Battle ,Computer science ,business.industry ,media_common.quotation_subject ,Pattern recognition ,Light intensity ,Urban warfare ,Identification (information) ,Obstacle ,Night vision ,Pattern recognition (psychology) ,Robot ,Artificial intelligence ,business ,media_common - Abstract
– The Indonesian Army's main task is to protect against enemy attacks, in forest battles as well as urban battles. The obstacle faced today is that urban warfare operations, infiltration and hostage rescue in buildings are still inefficient and suboptimal. The robot is designed with a system that can identify friends and foes using a night vision camera and a pattern recognition method. Pattern recognition is the automatic grouping of symbols performed by a computer to identify objects or patterns. Testing showed that the night vision camera is able to detect human objects at a maximum distance of 6 meters, with a fairly high accuracy rate of 83%. Light intensity influences the identification process: if the light intensity exceeds 200 lux, the system is unable to identify the object.
- Published
- 2021
106. Pain detection from facial expressions using domain adaptation technique
- Author
-
Poonam Sheoran, Sudesh Pahal, and Neeru Rathee
- Subjects
Facial expression ,Visual analogue scale ,business.industry ,Computer science ,Deep learning ,Machine learning ,computer.software_genre ,Field (computer science) ,Artificial Intelligence ,Pain assessment ,Pattern recognition (psychology) ,Benchmark (computing) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Adaptation (computer science) ,business ,computer - Abstract
Pain management is gaining the attention of clinical practitioners seeking to relieve patients from pain effectively, and it depends primarily on pain measurement. Researchers have proposed various techniques to measure pain from facial expressions, improving the accuracy and efficiency of traditional pain measurements such as self-reporting and the visual analog scale. Developments in the field of deep learning have further enhanced pain assessment techniques. Despite the state-of-the-art performance of deep learning algorithms, adaptation to new subjects remains a problem because only a few samples of each new subject are available. The authors address this issue by employing a model-agnostic meta-learning algorithm for pain detection and fast adaptation of the trained algorithm to new subjects using only a few labeled images. In the presented work, the model is pre-trained with labeled images of subjects at five pain levels to acquire meta-knowledge. This meta-knowledge is then used to adapt the model to a new learning task in the form of a new subject. The proposed model is evaluated on a benchmark dataset, the UNBC-McMaster pain archive database. Experimental results show that the model adapts easily to new subjects, with accuracies of 96% and 98% for 1-shot and 5-shot learning respectively, proving the potential of the proposed algorithm for clinical use.
- Published
- 2021
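The meta-learning idea can be sketched with a first-order simplification of model-agnostic meta-learning (MAML) on toy one-parameter regression tasks y = a·x: meta-training positions the parameter so that a single gradient step on a handful of samples adapts it to an unseen task. This is an assumed toy setup, not the paper's pain model or the UNBC-McMaster data; all task ranges and learning rates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def task_batch(a, n=10):
    # A task is defined by its slope a; samples are (x, a * x).
    x = rng.normal(size=n)
    return x, a * x

def inner_loss(w, x, y):
    return float(np.mean((w * x - y) ** 2))

def grad(w, x, y):
    return float(np.mean(2 * (w * x - y) * x))

def meta_train(meta_steps=200, inner_lr=0.1, outer_lr=0.05):
    w = 0.0
    for _ in range(meta_steps):
        a = rng.uniform(2.0, 4.0)                   # sample a training task
        x_s, y_s = task_batch(a)                    # support set
        w_adapted = w - inner_lr * grad(w, x_s, y_s)
        x_q, y_q = task_batch(a)                    # query set
        w -= outer_lr * grad(w_adapted, x_q, y_q)   # first-order outer update
    return w

w_meta = meta_train()
# Fast adaptation to an unseen task from only five labelled samples.
x_new, y_new = task_batch(3.5, n=5)
w_new = w_meta - 0.1 * grad(w_meta, x_new, y_new)
```

One inner gradient step on the few-shot samples should already reduce the loss on the new task relative to the unadapted meta-parameter.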
107. A critical state identification approach to inverse reinforcement learning for autonomous systems
- Author
-
Maxwell Hwang, Wei-Cheng Jiang, and Yu-Jen Chen
- Subjects
Computer science ,business.industry ,Computational intelligence ,Function (mathematics) ,Space (commercial competition) ,Identification (information) ,Artificial Intelligence ,Pattern recognition (psychology) ,State space ,Reinforcement learning ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Set (psychology) ,business ,Software - Abstract
Inverse reinforcement learning derives a reward function from reward features and positive demonstrations. When complex learning tasks are performed, the entire state space is used to form the set of reward features, but this large set results in a long computational time. Retrieving the important states from the full state space addresses this problem. This study formulates a method that extracts critical features by combining negative and positive demonstrations, searching for critical states across the entire state space to increase learning efficiency. Two types of demonstrations are used: positive demonstrations, which are given by experts and which agents imitate, and negative demonstrations, which show incorrect motions that agents should avoid. All significant features are extracted by identifying the critical states over the entire state space, achieved by comparing the difference between the negative and positive demonstrations. Once identified, these critical states form the set of reward features, and a reward function is derived that enables agents to quickly learn a policy using reinforcement learning. A speeding-car simulation was used to verify the proposed method. The simulation results demonstrate that the proposed approach allows an agent to search for a positive strategy, after which the agent displays intelligent, expert-like behavior.
- Published
- 2021
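A minimal sketch of the core idea, under assumed toy demonstrations: states whose visit frequencies differ most between positive and negative demonstrations are flagged as critical and become the reward features. The state names below are invented for illustration; the scoring rule is a simple frequency difference, not the paper's exact criterion.

```python
from collections import Counter

def critical_states(positive_demos, negative_demos, top_k=2):
    """Rank states by how differently positive (expert) and negative
    (to-be-avoided) demonstrations visit them; keep the top_k."""
    pos = Counter(s for demo in positive_demos for s in demo)
    neg = Counter(s for demo in negative_demos for s in demo)
    n_pos = sum(pos.values()) or 1
    n_neg = sum(neg.values()) or 1
    states = set(pos) | set(neg)
    # Visit-frequency gap between the two demonstration sets.
    score = {s: abs(pos[s] / n_pos - neg[s] / n_neg) for s in states}
    return sorted(states, key=lambda s: -score[s])[:top_k]
```

States visited equally by both kinds of demonstrations (like a shared start state) score zero and are excluded from the reward features.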
108. Dual-Attention-Guided Network for Ghost-Free High Dynamic Range Imaging
- Author
-
Dong Gong, Anton van den Hengel, Ian Reid, Qingsen Yan, Yanning Zhang, Chunhua Shen, and Javen Shi
- Subjects
business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Optical flow ,Artificial Intelligence ,Hallucinating ,Feature (computer vision) ,High-dynamic-range imaging ,Pattern recognition (psychology) ,Computer vision ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Ghosting ,business ,Software ,High dynamic range ,Block (data storage) - Abstract
Ghosting artifacts caused by moving objects and misalignments are a key challenge in constructing high dynamic range (HDR) images. Current methods first register the input low dynamic range (LDR) images using optical flow before merging them. This process is error-prone, and often causes ghosting in the resulting merged image. We propose a novel dual-attention-guided end-to-end deep neural network, called DAHDRNet, which produces high-quality ghost-free HDR images. Unlike previous methods that directly stack the LDR images or features for merging, we use dual-attention modules to guide the merging according to the reference image. DAHDRNet thus exploits both spatial attention and feature channel attention to achieve ghost-free merging. The spatial attention modules automatically suppress undesired components caused by misalignments and saturation, and enhance the fine details in the non-reference images. The channel attention modules adaptively rescale channel-wise features by considering the inter-dependencies between channels. The dual-attention approach is applied recurrently to further improve feature representation, and thus alignment. A dilated residual dense block is devised to make full use of the hierarchical features and increase the receptive field when hallucinating missing details. We employ a hybrid loss function, which consists of a perceptual loss, a total variation loss, and a content loss to recover photo-realistic images. Although DAHDRNet is not flow-based, it can be applied to flow-based registration to reduce artifacts caused by optical-flow estimation errors. Experiments on different datasets show that the proposed DAHDRNet achieves state-of-the-art quantitative and qualitative results.
- Published
- 2021
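The dual-attention gating can be sketched as follows, with random weights standing in for learned 1x1 convolutions: a spatial attention map suppresses misaligned regions of the non-reference features, and a channel attention vector rescales channels. The shapes, the pooling choice and the weight forms are illustrative assumptions, not the exact DAHDRNet modules.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def dual_attention(feat_ref, feat_nonref, W_spatial, w_channel):
    """feat_*: (H, W, C) feature maps; W_spatial: (2C, C); w_channel: (C,)."""
    concat = np.concatenate([feat_ref, feat_nonref], axis=-1)  # (H, W, 2C)
    spatial = sigmoid(concat @ W_spatial)       # per-pixel gates in (0, 1)
    gated = feat_nonref * spatial               # suppress misaligned regions
    pooled = gated.mean(axis=(0, 1))            # global average pool, (C,)
    channel = sigmoid(pooled * w_channel)       # per-channel gates in (0, 1)
    return gated * channel

rng = np.random.default_rng(0)
H, W, C = 4, 4, 8
ref = rng.normal(size=(H, W, C))
nonref = rng.normal(size=(H, W, C))
out = dual_attention(ref, nonref, rng.normal(size=(2 * C, C)), rng.normal(size=C))
```

Since both gates lie in (0, 1), the module can only attenuate the non-reference features, never amplify them, which is the suppression behaviour the abstract describes.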
109. Visual and linguistic semantic representations are aligned at the border of human visual cortex
- Author
-
Natalia Y. Bilenko, Sara F Popham, Fatma Deniz, Alexander G. Huth, James S. Gao, Anwar O. Nunez-Elizalde, and Jack L. Gallant
- Subjects
Communication ,medicine.diagnostic_test ,Computer science ,business.industry ,General Neuroscience ,Semantic map ,Visual Physiology ,Human brain ,Semantics ,Visual cortex ,medicine.anatomical_structure ,Pattern recognition (psychology) ,medicine ,Semantic information ,Functional magnetic resonance imaging ,business ,Neuroscience - Abstract
Semantic information in the human brain is organized into multiple networks, but the fine-grain relationships between them are poorly understood. In this study, we compared semantic maps obtained from two functional magnetic resonance imaging experiments in the same participants: one that used silent movies as stimuli and another that used narrative stories. Movies evoked activity from a network of modality-specific, semantically selective areas in visual cortex. Stories evoked activity from another network of semantically selective areas immediately anterior to visual cortex. Remarkably, the pattern of semantic selectivity in these two distinct networks corresponded along the boundary of visual cortex: for visual categories represented posterior to the boundary, the same categories were represented linguistically on the anterior side. These results suggest that these two networks are smoothly joined to form one contiguous map.
- Published
- 2021
110. A new approach for H∞ deconvolution filtering of 2D systems described by the Fornasini–Marchesini model and discrete moments
- Author
-
Mostafa El Mallahi, Abdelaziz Hmamed, Ismail Boumhidi, Bensalem Boukili, Amal Zouhri, and Abderrahim El Amrani
- Subjects
Matrix (mathematics) ,Artificial Intelligence ,Feature vector ,Pattern recognition (psychology) ,Linear matrix inequality ,Computer Vision and Pattern Recognition ,Deconvolution ,Coupling (probability) ,Stability (probability) ,Algorithm ,Mathematics ,Convolution - Abstract
This study proposes a new approach to H∞ deconvolution filtering of 2D systems described by the Fornasini–Marchesini model and Tchebichef moments. The challenge of this method is to transmit an unknown 2D signal over a transmission channel; the channel comprises a convolution system and a deconvolution filter used to rebuild the output signal. To solve this problem, we first use Tchebichef moments to extract the feature vectors of a medicinal Cannabis sativa plant, generating the system input with minimal information. Next, we implement the system with the Fornasini–Marchesini model for convolution and deconvolution. Free matrix variables are used to eliminate the coupling between the Lyapunov matrix and the system matrices, yielding sufficient conditions in linear matrix inequality form that ensure the desired stability and performance of the error systems. Experimental results show that the new approach for H∞ deconvolution filtering of 2D systems described by the Fornasini–Marchesini model and Tchebichef moments achieves better performance than recent works.
- Published
- 2021
111. Scheduling Hardware-Accelerated Cloud Functions
- Author
-
Wayne Luk, Jose G. F. Coutinho, and Jessica Vandebon
- Subjects
Multi-core processor ,business.industry ,media_common.quotation_subject ,Cloud computing ,Theoretical Computer Science ,Scheduling (computing) ,Resource (project management) ,Hardware and Architecture ,Control and Systems Engineering ,Modeling and Simulation ,Signal Processing ,Pattern recognition (psychology) ,Function (engineering) ,business ,Reduced cost ,Field-programmable gate array ,Computer hardware ,Information Systems ,media_common - Abstract
This paper presents a Function-as-a-Service (FaaS) approach for deploying managed cloud functions onto heterogeneous cloud infrastructures. Current FaaS systems, such as AWS Lambda, allow domain-specific functionality, such as AI, HPC and image processing, to be deployed in the cloud while abstracting users from infrastructure and platform concerns. Existing approaches, however, use a single type of resource configuration to execute all function requests. In this paper, we present a novel FaaS approach that allows cloud functions to be effectively executed across heterogeneous compute resources, including hardware accelerators such as GPUs and FPGAs. We implement heterogeneous scheduling to tailor resource selection to each request, taking into account performance and cost concerns. In this way, our approach makes use of different processor types and quantities (e.g. 2 CPU cores), uniquely suited to handle different types of workload, potentially providing improved performance at a reduced cost. We validate our approach in three application domains: machine learning, bio-informatics, and physics, and target a hardware platform with a combined computational capacity of 24 FPGAs and 12 CPU cores. Compared to traditional FaaS, our approach achieves a cost improvement for non-uniform traffic of up to 8.9 times, while maintaining performance objectives.
- Published
- 2021
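The per-request heterogeneous scheduling described above can be sketched as a cost-minimizing choice among resource configurations that meet a request's latency objective, with a best-effort fallback when none do. The configuration names, latencies and costs below are invented placeholders, not figures from the paper.

```python
# Hypothetical resource configurations: (name, estimated latency, unit cost).
CONFIGS = [
    {"name": "1xCPU",  "latency_ms": 800, "cost": 1.0},
    {"name": "2xCPU",  "latency_ms": 450, "cost": 1.8},
    {"name": "1xFPGA", "latency_ms": 120, "cost": 3.5},
    {"name": "1xGPU",  "latency_ms": 90,  "cost": 5.0},
]

def schedule(request_deadline_ms):
    """Pick the cheapest configuration whose estimated latency meets the
    request's deadline; if none qualifies, fall back to the fastest."""
    feasible = [c for c in CONFIGS if c["latency_ms"] <= request_deadline_ms]
    if not feasible:
        return min(CONFIGS, key=lambda c: c["latency_ms"])  # best effort
    return min(feasible, key=lambda c: c["cost"])
```

A relaxed deadline is served by cheap CPU cores, a tight one by an accelerator, which is the performance-versus-cost tailoring the abstract describes.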
112. Representation learning based on hybrid polynomial approximated extreme learning machine
- Author
-
Tinghui Ouyang and Xun Shen
- Subjects
Computer science ,business.industry ,Random projection ,Dimensionality reduction ,Pattern recognition ,Autoencoder ,Discriminative model ,Artificial Intelligence ,Pattern recognition (psychology) ,Feature (machine learning) ,Artificial intelligence ,business ,Feature learning ,Extreme learning machine - Abstract
As an effective algorithm for feature learning, the autoencoder (AE) and its variants have been widely applied in machine learning. To avoid the expensive time consumption of backpropagation learning and iterative parameter tuning, the extreme learning machine (ELM) has been combined with the AE, yielding the ELM-based AE (ELM-AE) for unsupervised feature learning. However, the random projection in ELM makes the learned features unstable for the final target recognition. On the other hand, common methods for enhancing high-order nonlinear expression in ELM-AE increase the computational overhead. This paper therefore proposes a new ELM-AE based on the approximation of hybrid high-order polynomial functions. The proposed model keeps a fast learning speed by linearizing the high-order nonlinear expression, is robust to the random projection issue, and learns discriminative features for pattern recognition. Two feature learning application scenarios, feature reconstruction and dimension reduction, are discussed based on different ELM-AE models. Experiments on publicly available datasets, both small and large, demonstrate the proposed model's feasibility and superiority in feature learning.
- Published
- 2021
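A minimal sketch of a plain ELM-based autoencoder, under assumed toy data: a fixed random nonlinear hidden layer replaces trained encoder weights, and the output weights are obtained in one closed-form least-squares solve instead of backpropagation. The hidden size and data are illustrative, and the polynomial-approximation extension of the paper is not reproduced.

```python
import numpy as np

def elm_ae(X, n_hidden=64, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # fixed random input weights
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # random nonlinear hidden layer
    beta = np.linalg.pinv(H) @ X                  # closed-form output weights
    # X @ beta.T would serve as the learned feature representation.
    return beta, H @ beta                         # weights and reconstruction

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
beta, X_hat = elm_ae(X)
```

The single pseudoinverse solve is what makes ELM-AE training fast; the random projection it relies on is exactly the stability issue the abstract raises.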
113. A Physically Constrained Variational Autoencoder for Geochemical Pattern Recognition
- Author
-
Renguang Zuo, Yihui Xiong, Zijing Luo, and Xueqiu Wang
- Subjects
Generalization ,business.industry ,Deep learning ,Pattern recognition ,Fractal analysis ,Autoencoder ,Consistency (database systems) ,Mathematics (miscellaneous) ,Pattern recognition (psychology) ,General Earth and Planetary Sciences ,Domain knowledge ,Artificial intelligence ,business ,Interpretability - Abstract
Quantification and recognition of geochemical patterns are extremely important for geochemical prospecting and can facilitate a better understanding of regional metallogenesis. Recognition of such patterns with deep learning (DL) algorithms has attracted considerable attention, as these algorithms can generally extract high-level geochemical features and thus create models in which geochemical patterns are fully exploited. These DL algorithms are usually constructed to be compatible and consistent with the underlying data, but their interpretability and pertinent physical constraints (such as granitic intrusions related to magmatic-hydrothermal mineralization) are generally overlooked. This paper introduces a physically constrained variational autoencoder (VAE) architecture to identify geochemical patterns associated with tungsten polymetallic mineralization. We first identify physical constraints from geological characteristics and metallogenic regulation via the methods of fry analysis, standard deviation ellipses, and fractal analysis to reveal the controlling function of granitic intrusions on mineralization. Subsequently, we construct the physical constraints based on the nonlinear controlling function and add the geological constraints into the VAE loss function as a penalty term. After optimization of the network parameters, the well-trained physically constrained VAE architecture can recognize the geochemical anomaly patterns that the conventional VAE cannot. These extracted geochemical anomaly patterns generally show a strong spatial relationship with the granitic intrusions. In addition, the performance measures involving the receiver operating characteristic curve and success-rate further indicate that the generalization accuracy of the conventional VAE can be enhanced via physics-based regularization. 
These results suggest that the proposed physically constrained VAE, which integrates physical knowledge into the VAE loss function, not only improves the geochemical pattern recognition performance but also demonstrates consistency with geological and physical domain knowledge.
- Published
- 2021
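The physics-penalty idea, adding a geological constraint to the VAE loss as an extra term, can be sketched as below. The distance-decay weighting is an assumed stand-in for the paper's intrusion-controlling function, and `lam` is an arbitrary penalty weight; no actual VAE network is trained here.

```python
import numpy as np

def vae_loss(x, x_hat, mu, logvar):
    """Standard VAE objective: reconstruction error plus KL divergence
    of the Gaussian posterior from the standard normal prior."""
    recon = np.mean((x - x_hat) ** 2)
    kl = -0.5 * np.mean(1 + logvar - mu ** 2 - np.exp(logvar))
    return recon + kl

def physics_penalty(anomaly_score, dist_to_intrusion, scale=5.0):
    # Penalize high anomaly scores far from granitic intrusions; the
    # exponential decay is an illustrative controlling function.
    weight = 1.0 - np.exp(-dist_to_intrusion / scale)
    return np.mean(weight * anomaly_score)

def constrained_loss(x, x_hat, mu, logvar, anomaly_score, dist, lam=0.1):
    return vae_loss(x, x_hat, mu, logvar) + lam * physics_penalty(anomaly_score, dist)
```

Anomalies flagged right at an intrusion incur no penalty, while the same score far from any intrusion is increasingly discouraged, steering the model toward geologically plausible patterns.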
114. Learning a Robust Part-Aware Monocular 3D Human Pose Estimator via Neural Architecture Search
- Author
-
Liang Wang, Zerui Chen, Yan Huang, and Hongyuan Yu
- Subjects
Monocular ,business.industry ,Computer science ,Stability (learning theory) ,Estimator ,FLOPS ,Machine learning ,computer.software_genre ,Artificial Intelligence ,Robustness (computer science) ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Architecture ,business ,computer ,Pose ,Software - Abstract
Even though most existing monocular 3D human pose estimation methods achieve very competitive performance, they are limited by estimating heterogeneous human body parts with the same decoder architecture. In this work, we present an approach to building a part-aware 3D human pose estimator that better handles these heterogeneous body parts. Our proposed method consists of two learning stages: (1) searching for suitable decoder architectures for specific parts and (2) training the part-aware 3D human pose estimator built with these optimized neural architectures. Consequently, our searched model is efficient and compact and can automatically select a suitable decoder architecture for estimating each human body part. In comparison with previous state-of-the-art models built on the ResNet-50 network, our method achieves better performance while reducing parameters by 64.4% and FLOPs (multiply-adds) by 8.5%. We validate the robustness and stability of our searched models through extensive and rigorous ablation experiments. Our method advances state-of-the-art accuracy on both single-person and multi-person 3D human pose estimation benchmarks at affordable computational cost.
- Published
- 2021
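The per-part search stage can be caricatured with an exhaustive search over a tiny decoder space, choosing each body part's configuration to minimize a validation score. The part names, search space and toy score below are invented; the paper uses neural architecture search over real decoder designs, not enumeration of a toy grid.

```python
import itertools

# Hypothetical search space and body parts, purely for illustration.
SEARCH_SPACE = {"depth": [2, 3, 4], "width": [64, 128, 256]}
PARTS = ["torso", "arms", "legs"]

def validation_error(part, config):
    # Stand-in for training and evaluating a decoder: a fixed toy score
    # whose optimum differs per part, mimicking heterogeneous body parts.
    target = {"torso": (2, 256), "arms": (4, 128), "legs": (4, 64)}[part]
    return abs(config["depth"] - target[0]) + abs(config["width"] - target[1]) / 64

def search(parts=PARTS):
    best = {}
    for part in parts:
        candidates = [dict(zip(SEARCH_SPACE, v))
                      for v in itertools.product(*SEARCH_SPACE.values())]
        best[part] = min(candidates, key=lambda c: validation_error(part, c))
    return best
```

Each part ends up with its own decoder configuration, which is the part-aware property the abstract argues for.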
115. Adaptive coefficient-based kernelized network for personalized activity recognition
- Author
-
Zheng Huo, Xinlong Jiang, Lisha Hu, and Chunyu Hu
- Subjects
Forgetting ,Computational complexity theory ,Computer science ,business.industry ,Process (computing) ,Computational intelligence ,Machine learning ,computer.software_genre ,Activity recognition ,Artificial Intelligence ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Adaptation (computer science) ,computer ,Software ,Wearable technology - Abstract
Human activity recognition (HAR) based on wearable devices has found wide applications in fitness, health care, etc. Given the personalized wearing styles of such devices and distinctive motion patterns, the activities of daily living normally vary from person to person in terms of strength, amplitude, speed, category, etc. The specialization of a universal HAR model to a specific subject without experiencing catastrophic forgetting is a significant challenge. In this paper, we propose a novel incremental learning method, namely, an adaptive coefficient-based kernelized and regularized network (KeRNet-AC), for personalized activity recognition. During the adaptation stage of the model training process, KeRNet-AC consistently monitors the probable ill-conditioned degree of the generated solution, which we believe is strongly correlated with the catastrophic forgetting problem, and automatically makes the solution well conditioned. To reduce the computational complexity of KeRNet-AC, we also introduce an active data selection principle into KeRNet-AC. This variation is called A-KeRNet-AC. To evaluate the performance of KeRNet-AC and A-KeRNet-AC, we conduct extensive experiments on five public activity datasets. The experimental results demonstrate that KeRNet-AC outperforms related state-of-the-art methods in most cases and that A-KeRNet-AC can quickly perform model training and activity prediction. Moreover, the performance of the proposed methods steadily improves during the adaptation stage and ultimately converges without degradation, demonstrating the strong potential of KeRNet-AC and A-KeRNet-AC for personalized activity recognition.
- Published
- 2021
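The condition-monitoring idea, watching for an ill-conditioned solution and strengthening regularization until the kernel system is well conditioned, can be sketched as below. The condition-number target and the multiplicative schedule are illustrative assumptions, not the paper's adaptive coefficient rule.

```python
import numpy as np

def adaptive_solve(K, y, lam=1e-8, max_cond=1e6):
    """Solve (K + lam*I) alpha = y, raising lam until the system's
    condition number falls below max_cond (a proxy for avoiding the
    instability the paper links to catastrophic forgetting)."""
    n = K.shape[0]
    while np.linalg.cond(K + lam * np.eye(n)) > max_cond:
        lam *= 10.0                      # strengthen regularization
    return np.linalg.solve(K + lam * np.eye(n), y), lam
```

On a rank-deficient kernel matrix the loop raises the coefficient automatically until the solve is numerically safe.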
116. Content authentication and tampering detection of Arabic text: an approach based on zero-watermarking and natural language processing
- Author
-
Mohammad Mahzari, Mohammed Medani, Fahd N. Al-Wesabi, Anwer Mustafa Hilal, Khalid Mahmood, and Manar Ahmed Hamza
- Subjects
Soft computing ,Authentication ,Alphanumeric ,Computer science ,Digital content ,Watermark ,computer.software_genre ,Artificial Intelligence ,Robustness (computer science) ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Data mining ,computer ,Digital watermarking - Abstract
Due to the rapid increase in the exchange of text information over the Internet, the security and reliability of digital content have become major research issues. The main challenges faced by researchers are content authentication, integrity verification and tampering detection of digital content. This paper addresses these issues with an emphasis on text information, which is natural-language dependent. A novel intelligent zero-watermarking approach is proposed for content authentication and tampering detection of Arabic text. In the proposed approach, both embedding and extraction of the watermark are implemented logically, causing no change to the digital text. This is achieved by using a fourth-level-order alphanumeric Markov model mechanism as a soft computing technique to analyze the Arabic text and obtain its features, which constitute the digital watermark. This digital watermark is later used to detect any tampering attack on the received Arabic text. An extensive set of experiments using four datasets of varying lengths proves the effectiveness of the proposed approach in terms of robustness, effectiveness and applicability under insertion, reordering and deletion attacks at multiple random locations. Compared with baseline approaches, the proposed approach improves watermark robustness and tampering detection accuracy.
- Published
- 2021
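The zero-watermarking idea, deriving the watermark from the text itself so nothing is ever embedded, can be sketched with a first-order character-transition count. This is a deliberate simplification: the paper uses a fourth-level-order alphanumeric Markov model on Arabic text, whereas this toy uses order 1 on English.

```python
from collections import Counter

def extract_watermark(text, order=1):
    """The 'watermark' is a bag of character transitions of the given
    order, computed from the text itself; the text is left untouched."""
    return Counter(text[i:i + order + 1] for i in range(len(text) - order))

def is_tampered(original_watermark, received_text, order=1):
    # Any insertion, deletion or reordering changes the transition counts.
    return extract_watermark(received_text, order) != original_watermark
```

The sender registers the watermark of the original text; the receiver recomputes it and any mismatch signals tampering.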
117. Mixed attention hourglass network for robust face alignment
- Author
-
Xiongkai Shao, Zou Yang, Gao Rong, Jun Wan, and Zhihui Lai
- Subjects
Computer science ,business.industry ,Complex system ,Pattern recognition ,Computational intelligence ,law.invention ,Discriminative model ,Artificial Intelligence ,law ,Position (vector) ,Face (geometry) ,Pattern recognition (psychology) ,Benchmark (computing) ,Computer Vision and Pattern Recognition ,Hourglass ,Artificial intelligence ,business ,Software - Abstract
Unconstrained face alignment is still a challenging problem due to large poses, partial occlusions and complicated illumination. To address these issues, we propose a mixed attention hourglass network (MAttHG) that learns more discriminative representations by modeling the correlated relationships between features. Specifically, by integrating attention modules over features at different levels of the stacked hourglass network, MAttHG captures rich contextual correlations, which are further used to combine local features and better model the spatial position relationships of facial landmarks. Furthermore, by combining the hourglass network with the attention module, MAttHG effectively models global and local attention to enhance facial shape constraints for robust face alignment. Moreover, a head pose prediction module is designed to adaptively adjust the weight of each training sample and redefine the loss function to address data imbalance. Experimental results on challenging benchmark datasets demonstrate the superiority of MAttHG over state-of-the-art face alignment methods.
- Published
- 2021
118. Dual discriminator adversarial distillation for data-free model compression
- Author
-
Haoran Zhao, Junyu Dong, Hui Yu, Milos Manic, Xin Sun, and Huiyu Zhou
- Subjects
Discriminator ,Artificial neural network ,Edge device ,Computer science ,business.industry ,Computational intelligence ,Machine learning ,computer.software_genre ,law.invention ,Generative model ,Artificial Intelligence ,law ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,computer ,Distillation ,Software ,Generator (mathematics) - Abstract
Knowledge distillation has been widely used to produce portable and efficient neural networks that can be deployed on edge devices for computer vision tasks. However, almost all top-performing knowledge distillation methods need access to the original training data, which is usually huge and often unavailable. To tackle this problem, we propose a novel data-free approach, named Dual Discriminator Adversarial Distillation (DDAD), which distills a neural network without any training data or meta-data. Specifically, we use a generator to create samples that mimic the original training data through dual discriminator adversarial distillation. The generator not only exploits the pre-trained teacher's intrinsic statistics in its batch normalization layers but also seeks the maximum discrepancy from the student model. The generated samples are then used to train the compact student network under the supervision of the teacher. The proposed method yields an efficient student network that closely approximates its teacher without using the original training data. Extensive experiments demonstrate the effectiveness of the approach on the CIFAR, Caltech101 and ImageNet datasets for classification. Moreover, we extend our method to semantic segmentation on several public datasets, including CamVid, NYUv2, Cityscapes and VOC 2012. To the best of our knowledge, this is the first work on generative-model-based data-free knowledge distillation on large-scale datasets such as ImageNet, Cityscapes and VOC 2012. Experiments show that our method outperforms all baselines for data-free knowledge distillation.
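The student-supervision step can be illustrated with the standard distillation loss: the KL divergence between temperature-softened teacher and student outputs. This is a generic sketch, not the authors' full DDAD objective (their generator additionally maximizes this discrepancy while matching batch-normalization statistics).

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over raw logits."""
    exps = [math.exp(z / temperature) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs; zero
    when the student exactly matches the teacher, positive otherwise."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

In a data-free setup, the generator is trained to maximize this discrepancy on its synthetic samples while the student is trained to minimize it.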
- Published
- 2021
119. Roza: a new and comprehensive metric for evaluating classification systems
- Author
-
Negin Melek and Mesut Melek
- Subjects
Jaccard index ,business.industry ,Computer Science::Information Retrieval ,Biomedical Engineering ,Bioengineering ,Pattern recognition ,General Medicine ,Measure (mathematics) ,Imbalanced data ,Computer Science Applications ,Machine Learning ,Human-Computer Interaction ,Cohen's kappa ,Area Under Curve ,Computer Science::Computer Vision and Pattern Recognition ,Pattern recognition (psychology) ,Metric (mathematics) ,Area under curve ,Cluster Analysis ,Artificial intelligence ,business ,Mathematics - Abstract
Many metrics, such as accuracy rate (ACC), area under the curve (AUC), the Jaccard index (JI), and Cohen's kappa coefficient, are available to measure the success of pattern recognition and machine/deep learning systems. However, the superiority of one system over another cannot be determined from these metrics alone, because a system may score well on one metric but not on the others. Moreover, such metrics are insufficient when the number of samples in the classes is unequal (imbalanced data). In that case, a sensible comparison between two given systems cannot be made using these metrics. In the present study, the comprehensive, fair, and accurate Roza metric is introduced for evaluating classification systems. (Roza means rose in Persian: when different permutations of the metrics used are superimposed in a polygon format, it looks like a flower, hence the name.) This metric, which facilitates the comparison of systems, expresses the summary of many metrics as a single value. To verify the stability and validity of the metric and to conduct comprehensive, fair, and accurate comparisons, the Roza metric is calculated for systems tested under the same conditions. For this, systems tested with three different strategies on three different datasets are considered. The results show that the performance of a system can be summarized by a single value and that the Roza metric can be used, as a powerful metric, in any system that includes a classification process.
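The abstract does not give the Roza formula, but its description of superimposed metrics in a polygon format suggests a radar-chart-style aggregation. The sketch below is a speculative reading, not the published definition: it scores a system by the normalized area of the polygon spanned by its metric values (e.g. ACC, AUC, JI, kappa).

```python
import math

def polygon_score(metrics):
    """Area of the radar-chart polygon whose spokes are the metric
    values (each in [0, 1], at least 3 of them), normalized so that a
    perfect system scores 1.0. One plausible single-value summary;
    the actual Roza formula may differ."""
    k = len(metrics)
    wedge = math.sin(2 * math.pi / k)  # angle between adjacent spokes
    area = 0.5 * wedge * sum(
        metrics[i] * metrics[(i + 1) % k] for i in range(k))
    perfect = 0.5 * wedge * k  # area when every metric equals 1.0
    return area / perfect
```

Note that a radar-polygon area depends on the ordering of the axes, which is presumably why the paper considers different permutations of the metrics.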
- Published
- 2021
120. Pattern recognition from light delivery vehicle crash characteristics
- Author
-
Subasish Das, Anandi Dutta, M. Ashifur Rahman, and Xiaoduan Sun
- Subjects
Transport engineering ,Computer science ,Pattern recognition (psychology) ,Transportation ,Food delivery ,Light delivery ,human activities ,Safety Research ,Motor vehicle crash - Abstract
In the era of food delivery and grocery delivery startups, traffic crashes associated with light delivery vehicles have increased significantly. Since the number of these crashes is increasing, it ...
- Published
- 2021
121. Neural Networks As A Tool For Pattern Recognition of Fasteners
- Author
-
Amer Tahseen Abu-Jassar, S. Sotnik, Yasser Mohammad Al-Sharo, and V. Lyashenko
- Subjects
Artificial neural network ,Computer science ,business.industry ,Pattern recognition (psychology) ,General Engineering ,Pattern recognition ,Artificial intelligence ,business
- Published
- 2021
122. Sparse robust multiview feature selection via adaptive-weighting strategy
- Author
-
Jing Zhong, Yuqing Chen, Ping Zhong, and Zhi Wang
- Subjects
Computational complexity theory ,business.industry ,Iterative method ,Computer science ,Feature selection ,Pattern recognition ,Computational intelligence ,Discriminative model ,Artificial Intelligence ,Robustness (computer science) ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software ,Block (data storage) - Abstract
Due to the rich and comprehensive information of multiview data, multi-view learning has attracted wide attention. Efficiently exploiting multiview data to select discriminative features and improve classification performance is very important in multi-view learning. Most existing supervised methods learn an entire projection matrix by concatenating multiple views into a long vector, and thus often ignore the relationships between views. To solve this problem, we propose in this paper a novel sparse robust multiview feature selection model that simultaneously considers the robustness, individuality and commonality of views via an adaptive-weighting strategy. The model adopts the soft capped-norm loss to calculate the residual in each view, effectively reducing the impact of noise and outliers. Moreover, it employs the adaptive-weighting strategy to capture the individuality and commonality of views without introducing extra parameters, and it introduces structured sparsity regularization to select discriminative features. An efficient iterative algorithm is proposed to learn each block of the projection matrix individually with low computational complexity, and the convergence of the optimization algorithm is verified theoretically and experimentally. Comparative experiments on multiview datasets against several state-of-the-art algorithms show that the proposed method outperforms the others.
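The capped-norm idea mentioned here can be sketched as follows; this is a minimal illustration of capping a per-view residual, and the paper's exact soft formulation, including its adaptive view weighting, may differ.

```python
import math

def capped_residual(residual, cap):
    """Capped L2 norm of one residual vector: residuals larger than
    `cap` contribute only `cap`, limiting the influence of outliers."""
    norm = math.sqrt(sum(r * r for r in residual))
    return min(norm, cap)

def capped_loss(residuals, cap):
    """Total capped loss over a list of residual vectors."""
    return sum(capped_residual(r, cap) for r in residuals)
```

Capping bounds the contribution of any single sample, so gross outliers cannot dominate the view-wise objective.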
- Published
- 2021
123. Weak-label-based global and local multi-view multi-label learning with three-way clustering
- Author
-
Changming Zhu, Lai Wei, Duoqian Miao, Ri-Gui Zhou, Dujuan Cao, YiLing Dong, and Shuaiping Guo
- Subjects
Exploit ,business.industry ,Computer science ,Computational intelligence ,Sample (statistics) ,Machine learning ,computer.software_genre ,Data set ,Core (game theory) ,Artificial Intelligence ,Pattern recognition (psychology) ,Convergence (routing) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Cluster analysis ,business ,computer ,Software - Abstract
This paper develops a weak-label-based global and local multi-view multi-label learning method with three-way clustering (WL-GLMVML-ATC) to handle multi-view multi-label data sets and to exploit more authentic global and local label correlations of both the whole data set and each view simultaneously. Unlike traditional learning methods, WL-GLMVML-ATC pays particular attention to weak-label cases and to the uncertain relationships between instances and clusters, using the Universum notion and active three-way clustering. Under the Universum notion, even though there are far fewer labeled instances than unlabeled ones, the useful sample information can still be enhanced. Under the active three-way clustering strategy, whether an instance belongs to a cluster depends on the probability of that uncertain instance belonging to the cluster's core region. This strategy yields a more authentic local label correlation, since many traditional methods assume that an instance either definitely belongs or definitely does not belong to a cluster, a hypothesis that does not hold in many real-world applications. The experiments show that WL-GLMVML-ATC (1) achieves better performance and is statistically superior to classical multi-view learning and multi-label learning methods, advancing the development of these methods; (2) does not add much running time; and (3) converges well and can process multi-view multi-label data sets.
- Published
- 2021
124. Segmentation-based multi-scale attention model for KRAS mutation prediction in rectal cancer
- Author
-
Jiawen Wang, Muhammad Bilal Zia, Zijuan Zhao, Juanjuan Zhao, Kai Song, and Yan Qiang
- Subjects
Computer science ,business.industry ,Deep learning ,Computational intelligence ,Machine learning ,computer.software_genre ,Identification (information) ,Discriminative model ,Artificial Intelligence ,Mutation (genetic algorithm) ,Pattern recognition (psychology) ,Segmentation ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,computer ,Encoder ,Software - Abstract
Kirsten Ras (KRAS) mutation identification has great clinical significance for formulating rectal cancer treatment schemes. Recently, deep learning has done much to improve computer-aided diagnosis. However, deep learning models are usually designed for a single task, ignoring the potential benefits of performing related tasks jointly. In this paper, we propose a joint network, the segmentation-based multi-scale attention model (SMSAM), to predict the mutation status of the KRAS gene in rectal cancer. More specifically, the network performs the segmentation and prediction tasks at the same time, and the two tasks transfer knowledge to each other by sharing the same encoder. Meanwhile, two universal multi-scale attention blocks are introduced to ensure that the network focuses more on the region of interest. Besides, an entropy branch provides more discriminative features for the model. Finally, the method is evaluated on internal and external datasets. The results show that the comprehensive performance of SMSAM is better than that of existing methods. The code and model have been made publicly available.
- Published
- 2021
125. Evidence theory based optimal scale selection for multi-scale ordered decision systems
- Author
-
Wei-Zhi Wu, Anhui Tan, Jia-Wen Zheng, and Han Bao
- Subjects
Mathematical optimization ,Artificial Intelligence ,Decision system ,Pattern recognition (psychology) ,Complex system ,Information system ,Equivalence relation ,Scale (descriptive set theory) ,Computational intelligence ,Computer Vision and Pattern Recognition ,Rough set ,Software ,Mathematics - Abstract
In real data sets, objects are usually measured at multiple scales under the same attribute. Many information systems carry dominance relations arising from various factors, which change the classical equivalence relations accordingly. This paper investigates optimal scale selection for multi-scale ordered decision systems based on evidence theory. Five concepts of optimal scales related to rough set theory and the Dempster–Shafer theory of evidence in multi-scale ordered information/decision systems are first defined. Relationships are then clarified among $\ge$-optimal scales, $\ge$-lower-approximation and $\ge$-upper-approximation optimal scales, and $\ge$-belief and $\ge$-plausibility optimal scales in multi-scale ordered information systems and consistent multi-scale ordered decision systems, respectively. Finally, in inconsistent multi-scale ordered decision systems, by introducing the notion of a $\ge$-generalized decision optimal scale, relationships among the different types of optimal scales are also examined.
- Published
- 2021
126. Feature definition and comprehensive analysis on the robust identification of intraretinal cystoid regions using optical coherence tomography images
- Author
-
Plácido L. Vidal, Manuel G. Penedo, Marcos Ortega, Jorge Novo, Joaquim de Moura, and José Rouco
- Subjects
Modality (human–computer interaction) ,Optical coherence tomography ,medicine.diagnostic_test ,business.industry ,Computer science ,Feature selection ,Pattern recognition ,Computer-aided diagnosis ,Classification ,Identification (information) ,Statistical classification ,Texture analysis ,Artificial Intelligence ,Feature (computer vision) ,Pattern recognition (psychology) ,Medical imaging ,medicine ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Feature analysis - Abstract
[Abstract] Currently, optical coherence tomography is one of the most widely used medical imaging modalities, offering cross-sectional representations of the studied tissues. This modality is especially relevant for the analysis of the retina, the internal part of the human body that allows an almost direct examination without invasive techniques. One of the most representative uses of this modality is the identification and characterization of intraretinal fluid accumulations, critical for the diagnosis of one of the main causes of blindness in developed countries: diabetic macular edema. The study of these fluid accumulations is particularly interesting both from the point of view of pattern recognition and for the different branches of the health sciences. As these fluid accumulations are intermingled with retinal tissues, present numerous variants according to their severity, and change appearance depending on the configuration of the device, they are a perfect subject for in-depth research, being considered a problem without a strict solution. In this work, we propose a comprehensive and detailed analysis of the patterns that characterize them. We employed a pool of 11 different texture and intensity feature families (510 markers in total), which we analyzed using three feature selection strategies and seven complementary classification algorithms. In this way, we were able to narrow down and explain the factors affecting these accumulations and tissue lesions by means of machine learning techniques, with a pipeline specially designed for this purpose.
- Published
- 2021
127. An end-to-end network for irregular printed Mongolian recognition
- Author
-
YaTu Ji, ShaoDong Cui, Ren Qing dao er ji, and YiLa Su
- Subjects
Sequence ,business.industry ,Computer science ,Process (computing) ,Context (language use) ,Pattern recognition ,Computer Science Applications ,Image (mathematics) ,Convolution ,End-to-end principle ,Feature (computer vision) ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software - Abstract
Mongolian is a language spoken in Inner Mongolia, China. During recognition, images and text may be deformed by the shooting angle and other factors, which makes recognition difficult. This paper proposes a triplet attention Mogrifier network (TAMN) for printed Mongolian text recognition. The network uses a spatial transformer network to correct deformed Mongolian images, and gated recurrent convolution layers (GRCL) combined with a triplet attention module to extract features from the corrected images. A Mogrifier long short-term memory (LSTM) network captures the contextual sequence information in the features, and the decoder's LSTM attention finally produces the prediction. Experimental results show that the spatial transformer network can effectively handle deformed Mongolian images, with a recognition accuracy of 90.30%. The network performs well in Mongolian text recognition compared with current mainstream text recognition networks. The dataset has been made publicly available at https://github.com/ShaoDonCui/Mongolian-recognition .
- Published
- 2021
128. Optimal Channel-set and Feature-set Assessment for Foot Movement Based EMG Pattern Recognition
- Author
-
Neelesh Kumar and Neha Hooda
- Subjects
Foot (prosody) ,Set (abstract data type) ,Artificial Intelligence ,Computer science ,Movement (music) ,business.industry ,Pattern recognition (psychology) ,Pattern recognition ,Artificial intelligence ,business ,Feature set ,Communication channel
- Published
- 2021
129. Physical Representation Learning and Parameter Identification from Video Using Differentiable Physics
- Author
-
Michael Moeller, Rama Krishna Kandukuri, Jan Achterhold, and Joerg Stueckler
- Subjects
business.industry ,Machine learning ,computer.software_genre ,ENCODE ,Field (computer science) ,Identification (information) ,Artificial Intelligence ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Observability ,Artificial intelligence ,Differentiable function ,Representation (mathematics) ,business ,Feature learning ,computer ,Software - Abstract
Representation learning for video is increasingly gaining attention in the field of computer vision. For instance, video prediction models enable activity and scene forecasting or vision-based planning and control. In this article, we investigate the combination of differentiable physics and spatial transformers in a deep action conditional video representation network. By this combination our model learns a physically interpretable latent representation and can identify physical parameters. We propose supervised and self-supervised learning methods for our architecture. In experiments, we consider simulated scenarios with pushing, sliding and colliding objects, for which we also analyze the observability of the physical properties. We demonstrate that our network can learn to encode images and identify physical properties like mass and friction from videos and action sequences. We evaluate the accuracy of our training methods, and demonstrate the ability of our method to predict future video frames from input images and actions.
- Published
- 2021
130. DeMoCap: Low-Cost Marker-Based Motion Capture
- Author
-
Stefanos Kollias, Petros Daras, Anargyros Chatzitofis, and Dimitrios Zarpalas
- Subjects
Ground truth ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Motion capture ,Rendering (computer graphics) ,Set (abstract data type) ,Artificial Intelligence ,Position (vector) ,Pattern recognition (psychology) ,Computer vision ,Computer Vision and Pattern Recognition ,Differentiable function ,Artificial intelligence ,Noise (video) ,business ,Software ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Optical marker-based motion capture (MoCap) remains the predominant way to acquire high-fidelity articulated body motions. We introduce DeMoCap, the first data-driven approach for end-to-end marker-based MoCap, using only a sparse setup of spatio-temporally aligned, consumer-grade infrared-depth cameras. Trading off some of their typical features, our approach is the sole robust option for far lower-cost marker-based MoCap than high-end solutions. We introduce an end-to-end differentiable markers-to-pose model to solve a set of challenges such as under-constrained position estimates, noisy input data and spatial configuration invariance. We simultaneously handle depth and marker detection noise, label and localize the markers, and estimate the 3D pose by introducing a novel spatial 3D coordinate regression technique under a multi-view rendering and supervision concept. DeMoCap is driven by a special dataset captured with 4 spatio-temporally aligned low-cost Intel RealSense D415 sensors and a 24 MXT40S camera professional MoCap system, used as input and ground truth, respectively.
- Published
- 2021
131. Deep Trajectory Post-Processing and Position Projection for Single & Multiple Camera Multiple Object Tracking
- Author
-
Xiaodong Xie, Wen Gao, Huizhu Jia, Cong Ma, Yuan Li, and Fan Yang
- Subjects
Matching (graph theory) ,Computer science ,business.industry ,Tracking (particle physics) ,Artificial Intelligence ,Position (vector) ,Video tracking ,Pattern recognition (psychology) ,Trajectory ,Computer vision ,Anomaly detection ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Projection (set theory) ,Software - Abstract
Multiple Object Tracking (MOT) has attracted increasing interest in recent years and plays a significant role in video analysis. MOT aims to track specific targets as whole trajectories and to locate the positions of each trajectory at different times. These trajectories are usually applied in action recognition, anomaly detection, crowd analysis, multiple-camera tracking, etc. However, complex scenes still challenge existing methods, and generating false (impure or incomplete) tracklets directly degrades the performance of subsequent tasks. Therefore, we propose a novel architecture, Siamese bi-directional GRU, to construct a cleaving network and a re-connection network for trajectory post-processing. The cleaving network splits impure tracklets into several pure sub-tracklets, and the re-connection network re-connects tracklets that belong to the same person into a whole trajectory. We further extend our method to multiple-camera tracking; however, current methods rarely consider the spatial-temporal constraint, which increases redundant trajectory matching. Therefore, we present a Position Projection Network (PPN) to convert trajectory positions from local camera coordinates to global world coordinates, providing adequate and accurate spatio-temporal information for trajectory association. The proposed technique is evaluated on two widely used datasets, MOT16 and Duke-MTMCT, and experiments demonstrate its superior effectiveness compared with the state of the art.
- Published
- 2021
132. Selection of data products: a hybrid AFSA-MABAC approach
- Author
-
Lining Xing, Suizhi Luo, and Witold Pedrycz
- Subjects
Computer science ,Swarm behaviour ,Particle swarm optimization ,Computational intelligence ,computer.software_genre ,Artificial Intelligence ,Robustness (computer science) ,Pattern recognition (psychology) ,Genetic algorithm ,Fuzzy number ,Computer Vision and Pattern Recognition ,Data mining ,computer ,Software ,Selection (genetic algorithm) - Abstract
With the growing demand for data products, the selection of satellite image data products has become a challenging decision issue for customers. The objective of this study is to propose a practically sound decision-making approach for solving satellite image data product selection problems. First, the factors influencing the selection of satellite image data products are identified. Then, hybrid evaluation information is recommended to represent the criteria: numerical and interval-valued quantification for quantitative criteria, and picture fuzzy numbers (PFNs) for qualitative criteria. To reflect decision makers' preferences, a non-linear optimization with constraints is used to treat the criteria weights. Thereafter, penalty functions are defined and the artificial fish swarm algorithm (AFSA) is improved to compute the weight values, and six main parameters of AFSA are analyzed. Compared with other commonly used algorithms, such as the genetic algorithm (GA) and particle swarm optimization (PSO), the largest advantage of AFSA is its high robustness to parameters and initial values. Finally, the traditional multi-attributive border approximation area comparison (MABAC) method is modified with likelihood measures to obtain the best data product in hybrid evaluation environments. The feasibility and effectiveness of the proposed approach are validated by comparison with existing methods from representative literature. The results demonstrate that the proposed method is feasible and can provide useful guidelines for the selection and pricing of satellite image data products.
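For reference, the classical (crisp, benefit-criteria) MABAC ranking that the paper modifies can be sketched as below; the likelihood-measure extension for PFNs and interval values is not shown.

```python
def mabac_rank(matrix, weights):
    """Rank alternatives with classical MABAC (rows = alternatives,
    columns = benefit criteria whose values must vary). The paper's
    variant extends this with likelihood measures for hybrid
    (interval / picture-fuzzy) evaluations."""
    m, n = len(matrix), len(matrix[0])
    cols = list(zip(*matrix))
    # 1) min-max normalize each criterion to [0, 1]
    norm = [[(matrix[i][j] - min(cols[j])) / (max(cols[j]) - min(cols[j]))
             for j in range(n)] for i in range(m)]
    # 2) weighted matrix: v_ij = w_j * (norm_ij + 1)
    v = [[weights[j] * (norm[i][j] + 1) for j in range(n)] for i in range(m)]
    # 3) border approximation area: geometric mean of each column of v
    g = []
    for j in range(n):
        p = 1.0
        for i in range(m):
            p *= v[i][j]
        g.append(p ** (1.0 / m))
    # 4) score = total signed distance to the border approximation area
    return [sum(v[i][j] - g[j] for j in range(n)) for i in range(m)]
```

Alternatives with positive scores lie above the border approximation area and are preferred.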
- Published
- 2021
133. 3D-FUTURE: 3D Furniture Shape with TextURE
- Author
-
Stephen J. Maybank, Binqiang Zhao, Rongfei Jia, Dacheng Tao, Mingming Gong, Lin Gao, and Huan Fu
- Subjects
FOS: Computer and information sciences ,Online model ,Computer science ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Object (computer science) ,3D modeling ,Texture (geology) ,Artificial Intelligence ,Pattern recognition (psychology) ,Computer vision ,Polygon mesh ,Segmentation ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Pose ,Software ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
The 3D CAD shapes in current 3D benchmarks are mostly collected from online model repositories. Thus, they typically have insufficient geometric detail and less informative textures, making them less attractive for comprehensive and subtle research in areas such as high-quality 3D mesh and texture recovery. This paper presents 3D Furniture shape with TextURE (3D-FUTURE): a richly annotated and large-scale repository of 3D furniture shapes in the household scenario. At the time of this technical report, 3D-FUTURE contains 20,240 clean and realistic synthetic images of 5,000 different rooms, with 9,992 unique, detailed 3D instances of furniture with high-resolution textures. Experienced designers developed the room scenes, and the 3D CAD shapes in the scenes are used for industrial production. Given the well-organized 3D-FUTURE, we provide baseline experiments on several widely studied tasks, such as joint 2D instance segmentation and 3D object pose estimation, image-based 3D shape retrieval, 3D object reconstruction from a single image, and texture recovery for 3D shapes, to facilitate related future research on our database. Project page: https://tianchi.aliyun.com/specials/promotion/alibaba-3d-future
- Published
- 2021
134. Analytical Comparison of Clustering Techniques for the Recognition of Communication Patterns
- Author
-
Mareike Schoop and Muhammed-Fatih Kaya
- Subjects
Computer science ,Process (engineering) ,Strategy and Management ,media_common.quotation_subject ,General Social Sciences ,General Decision Sciences ,computer.software_genre ,Term (time) ,Negotiation ,Arts and Humanities (miscellaneous) ,Management of Technology and Innovation ,Pattern recognition (psychology) ,Cluster (physics) ,Milestone (project management) ,Data mining ,Cluster analysis ,computer ,media_common ,Curse of dimensionality - Abstract
The systematic processing of unstructured communication data, and the milestone of recognizing patterns in order to determine communication groups in negotiations, pose many challenges for machine learning. In particular, the so-called curse of dimensionality makes the pattern recognition process demanding and requires further research in the negotiation environment. In this paper, selected renowned clustering approaches are evaluated with regard to their pattern recognition potential on high-dimensional negotiation communication data. A research approach is presented to evaluate the application potential of the selected methods via a holistic framework with three main evaluation milestones: determining the optimal number of clusters, the main clustering application, and the performance evaluation. Quantified term-document matrices are first pre-processed and then used as the underlying databases to investigate the pattern recognition potential of the clustering techniques, considering the optimal number of clusters and measuring the respective internal as well as external performances. The overall results show that certain cluster separations are recommended by internal and external performance measures within the holistic evaluation approach, whereas three of the cluster separations are eliminated based on the evaluation results.
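A typical internal measure used in such a framework is the silhouette coefficient. This generic sketch (not the paper's specific measure set) shows how a cluster separation of, say, term-document vectors can be scored without any ground truth:

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient, a standard *internal* clustering
    measure (assumes every cluster has >= 2 points; higher means
    tighter, better-separated clusters)."""
    n = len(points)
    scores = []
    for i in range(n):
        # a: mean distance to the other members of the same cluster
        same = [math.dist(points[i], points[j]) for j in range(n)
                if j != i and labels[j] == labels[i]]
        a = sum(same) / len(same)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j])
                for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / n
```

External measures would instead compare the produced labels against a known reference grouping.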
- Published
- 2021
135. Multi-label space reshape for semantic-rich label-specific features learning
- Author
-
Shufang Pang, Yusheng Cheng, and Chao Zhang
- Subjects
Computer science ,Covariance matrix ,business.industry ,Cosine similarity ,Pattern recognition ,Design structure matrix ,Matrix (mathematics) ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial Intelligence ,Pattern recognition (psychology) ,Local consistency ,Embedding ,Logical matrix ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software - Abstract
Existing label-specific feature learning techniques mainly use embedding-based methods, but they suffer from problems such as inadequate consideration of label semantics and the sparseness of the selected features. The LSR-LSF (multi-label space reshaping for semantic-rich label-specific feature learning) algorithm is proposed in this paper to solve these problems. First, the sparse logical label matrix is turned into a numerical label matrix through a label-propagation dependency matrix. Second, constraint propagation is added to avoid discrepancies that may arise in the label matrix before and after the reshaping process, and the numerical label vectors are obtained by alternate iteration. At the same time, the reshaped label correlation matrix is constructed via cosine similarity to constrain the solution space. The algorithm then measures whether the learning ability of the label-specific features has improved. Finally, extensive experiments on benchmark datasets show the superiority of LSR-LSF over other state-of-the-art label-specific feature learning methods.
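The cosine-similarity label correlation matrix used to constrain the solution space can be sketched directly; this is a generic implementation, and the reshaped numerical label matrix it operates on would come from the earlier propagation step.

```python
import math

def label_correlation(Y):
    """Pairwise cosine similarity between the label columns of a
    numerical label matrix Y (rows = instances, columns = labels)."""
    cols = list(zip(*Y))

    def cos(u, v):
        num = sum(a * b for a, b in zip(u, v))
        den = (math.sqrt(sum(a * a for a in u))
               * math.sqrt(sum(b * b for b in v)))
        return num / den if den else 0.0

    return [[cos(u, v) for v in cols] for u in cols]
```

Labels that always co-occur get correlation 1.0, disjoint labels get 0.0, and the matrix is symmetric with a unit diagonal.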
- Published
- 2021
136. Applications of Machine Learning in Ambulatory ECG
- Author
-
Long Yu and Joel Xue
- Subjects
noise reduction ,Signal processing ,Computer science ,business.industry ,Event (computing) ,Deep learning ,pattern recognition ,SIGNAL (programming language) ,deep learning ,Machine learning ,computer.software_genre ,Beat detection ,machine learning ,Pattern recognition (psychology) ,Key (cryptography) ,Medicine ,Preprocessor ,Artificial intelligence ,ambulatory ECG ,business ,computer ,Holter ECG - Abstract
The ambulatory ECG (AECG) is an important diagnostic tool for many heart electrophysiology-related cases. AECG covers a wide spectrum of devices and applications. At the core of these devices and applications are the algorithms responsible for signal conditioning, ECG beat detection and classification, and event detection. Over the years, there has been huge progress in algorithm development and implementation thanks to great efforts by researchers, engineers, and physicians, alongside the rapid development of electronics and signal processing, especially machine learning (ML). The current efforts and progress in machine learning are unprecedented, and many ML algorithms have been successfully applied to AECG applications. This review covers some key AECG applications of ML algorithms. However, instead of giving a general review of ML algorithms, we focus on the central tasks of AECG and discuss what ML can bring to solve the key challenges AECG is facing. The central tasks of AECG signal processing covered in the review include signal preprocessing, beat detection and classification, event detection, and event prediction. Each AECG device or system might implement different portions and forms of these signal components depending on its application and target, but they are the topics most relevant and of greatest concern to people working in this area.
- Published
- 2021
137. The design of error-correcting output codes algorithm for the open-set recognition
- Author
-
Qingqi Hong, Yi-Fan Liang, Wang-Ping Zhan, Junfeng Yao, Kun-Hong Liu, Hong-Zhou Guo, Qingqiang Wu, and Ya-Nan Zhang
- Subjects
Class (computer programming) ,Artificial Intelligence ,Computer science ,Pattern recognition (psychology) ,Code (cryptography) ,Feature (machine learning) ,Cluster analysis ,Row ,Algorithm ,Field (computer science) ,Test data - Abstract
Open-set recognition is an important topic in the pattern recognition research field. Different from the close-set recognition task, in the open-set recognition problem the test data contains unknown classes that do not appear in the training phase. Consequently, recognizing open-set data is much more difficult than the close-set problem. This study applies the Error-Correcting Output Codes (ECOC) framework to the open-set problem by dynamically adding new functions to deal with the unknown classes; the resulting algorithm is named ECOC-OS. Our algorithm includes two steps: (1) an unknown data discovery step based on a rejection strategy; (2) a code matrix expanding step that separates the unknown classes from the known classes. Because the unknown class samples are widely and chaotically distributed, this paper refines the unknown class into multiple sub-classes, each with its own feature distribution. After preliminary row and column expansion and class splitting for the unknown class, a clustering algorithm is used to continuously refine the characteristics of the unknown class, dividing it into several sub-classes. The algorithm then adds multiple coding rows and multiple "one-to-all" basic classifiers so as to distinguish each unknown sub-class from the known classes. Finally, without re-training the existing learners, the zero symbols in the code matrix are selectively re-encoded according to the basic learners' preferences. The experiments deploy 16 data sets, and the results confirm that the ECOC-OS algorithm effectively improves performance compared with other open-set recognition methods.
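The code-matrix expansion idea can be sketched with a toy ternary encoding (our own simplified scheme, not the paper's exact codes): append one row per unknown sub-class and one "one-to-all" column per sub-class separating it from every known class.

```python
# Sketch: expanding a ternary ECOC code matrix for unknown sub-classes.
# Rows = classes, columns = dichotomies (+1 / -1 / 0 = not involved).
def expand_code_matrix(matrix, n_unknown_subclasses):
    n_known = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    # Copy known rows and give them zeros on the new columns for now.
    expanded = [row[:] + [0] * n_unknown_subclasses for row in matrix]
    # One new row per unknown sub-class: positive on its own column only.
    for s in range(n_unknown_subclasses):
        row = [0] * (n_cols + n_unknown_subclasses)
        row[n_cols + s] = +1
        expanded.append(row)
    # Known classes sit on the negative side of each new one-to-all column.
    for i in range(n_known):
        for s in range(n_unknown_subclasses):
            expanded[i][n_cols + s] = -1
    return expanded

known = [[+1, -1], [-1, +1], [+1, +1]]   # 3 known classes, 2 dichotomies
M = expand_code_matrix(known, n_unknown_subclasses=2)
```

Each appended column corresponds to one new "one-to-all" base classifier; the original dichotomies (and their trained learners) are left untouched, mirroring the no-re-training property described above.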
- Published
- 2021
138. Two‐parametric generalized fuzzy knowledge measure and accuracy measure with applications
- Author
-
Surender Singh and Abdul Haseeb Ganie
- Subjects
Computer science ,business.industry ,Fuzzy set ,Measure (physics) ,Pattern recognition ,Fuzzy logic ,Theoretical Computer Science ,Human-Computer Interaction ,Fuzzy entropy ,Artificial Intelligence ,Pattern recognition (psychology) ,Artificial intelligence ,business ,Software ,Parametric statistics - Published
- 2021
139. LineSeg: line segmentation of scanned newspaper documents
- Author
-
Simpel Rani Jindal, Rupinder Pal Kaur, M. K. Jindal, Shikha Tuteja, and Munish Kumar
- Subjects
Space (punctuation) ,Computer science ,business.industry ,computer.software_genre ,Field (computer science) ,Newspaper ,Artificial Intelligence ,Pattern recognition (psychology) ,Font ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Segmentation ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Line (text file) ,business ,computer ,Natural language processing ,Block (data storage) - Abstract
Segmentation is a significant stage in the recognition of old newspapers. Text-line extraction in documents with very complex layouts, such as newspaper pages, poses a significant challenge. Old newspaper documents printed in Gurumukhi script present several hurdles to segmentation due to noise, degradation, bleed-through of ink, multiple font styles and sizes, little space between neighboring text lines, overlapping of lines, etc. Because of the low quality and complexity of these documents, automatic text-line segmentation remains an open research field. Very little research is available in the literature on segmenting news articles in Gurumukhi script; this is one of the first few attempts to recognize Gurumukhi newspaper text. The goal of this paper is to present a new methodology for text-line extraction that integrates median calculation and strip height calculation techniques. The non-suitability of existing techniques for segmenting newspaper text lines has also been discussed, with results, in the article. The efficiency of the proposed algorithm is demonstrated by experiments on two diverse self-constructed datasets: (a) single-column documents with a headline block, and (b) multi-column documents with a headline block.
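For context, the classic baseline that such methods improve upon is the horizontal projection profile (this is a generic sketch, not the paper's median/strip-height method): rows containing ink form peaks, and the gaps between lines form valleys that mark line boundaries.

```python
# Sketch: text-line segmentation of a binarized page via the horizontal
# projection profile. 1 = ink pixel, 0 = background.
def segment_lines(binary_image):
    profile = [sum(row) for row in binary_image]  # ink count per pixel row
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink and start is None:
            start = y                      # entering a text line
        elif not ink and start is not None:
            lines.append((start, y - 1))   # leaving it at a blank row
            start = None
    if start is not None:                  # line touching the page bottom
        lines.append((start, len(profile) - 1))
    return lines

page = [
    [0, 0, 0],
    [1, 1, 0],   # line 1
    [0, 1, 1],
    [0, 0, 0],   # inter-line gap
    [1, 0, 1],   # line 2
]
bands = segment_lines(page)
```

This baseline fails exactly where the abstract says newspapers are hard: overlapping or touching lines leave no zero-ink valley, which is what motivates the median and strip-height refinements.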
- Published
- 2021
140. Towards unified on-road object detection and depth estimation from a single image
- Author
-
Guancheng Chen, Huabiao Qin, Yan Wang, and Guofei Lian
- Subjects
Computer science ,business.industry ,Context (language use) ,Object (computer science) ,Convolutional neural network ,Object detection ,Artificial Intelligence ,Feature (computer vision) ,Pattern recognition (psychology) ,Computer vision ,Computer Vision and Pattern Recognition ,Pyramid (image processing) ,Artificial intelligence ,business ,Feature learning ,Software - Abstract
On-road object detection based on convolutional neural networks (CNNs) is an important problem in the field of automatic driving. However, traditional 2D object detection only accomplishes object classification and localization in image space, lacking the ability to acquire depth information. Moreover, cascading an object detection network with a monocular depth estimation network is an inefficient way to realize 2.5D object detection. To address this problem, we propose a unified multi-task learning mechanism for object detection and depth estimation. Firstly, we propose an innovative loss function, the projective consistency loss, which uses the perspective projection principle to model the transformation relationship between target size and depth value, so that the object detection task and the depth estimation task mutually constrain each other. Then, we propose a global multi-scale feature extraction scheme that combines Global Context (GC) and Atrous Spatial Pyramid Pooling (ASPP) blocks in an appropriate way, promoting effective feature learning and collaborative learning between object detection and depth estimation. Comprehensive experiments conducted on the KITTI and Cityscapes datasets show that our approach achieves high mAP and low distance estimation error, outperforming other state-of-the-art methods.
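The perspective-projection relation behind such a consistency term can be sketched as follows (the symbols, the focal length, and the car-height prior are our illustrative assumptions, not the paper's exact loss):

```python
# Sketch: under a pinhole camera, the pixel height h of an object of real
# height H at depth Z obeys h = f * H / Z, so a predicted box height and a
# predicted depth can constrain each other.
def projected_height(focal_px, real_height_m, depth_m):
    return focal_px * real_height_m / depth_m

def consistency_residual(focal_px, real_height_m, depth_pred_m, box_height_px):
    # Zero when the detection (box height) and the depth estimate agree.
    return abs(projected_height(focal_px, real_height_m, depth_pred_m)
               - box_height_px)

f = 700.0     # focal length in pixels (assumed)
H_car = 1.5   # typical car height in metres (assumed prior)
h = projected_height(f, H_car, depth_m=15.0)        # 70 px at 15 m
res_good = consistency_residual(f, H_car, 15.0, 70.0)  # consistent pair
res_bad = consistency_residual(f, H_car, 30.0, 70.0)   # depth twice too far
```

A training loss built on such a residual penalizes detector and depth head jointly whenever box size and predicted depth disagree.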
- Published
- 2021
141. A new method for textile pattern recognition and recoloring
- Author
-
Dejun Zheng
- Subjects
business.industry ,Computer science ,General Chemical Engineering ,Pattern recognition (psychology) ,Human Factors and Ergonomics ,Pattern recognition ,General Chemistry ,Artificial intelligence ,Textile (markup language) ,business ,Automation - Published
- 2021
142. Semi-supervised label enhancement via structured semantic extraction
- Author
-
Xiuyi Jia, Tao Wen, Lei Chen, and Weiwei Li
- Subjects
Computer science ,business.industry ,Feature vector ,media_common.quotation_subject ,Rank (computer programming) ,Pattern recognition ,Sample (statistics) ,Computational intelligence ,Ambiguity ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial Intelligence ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Focus (optics) ,business ,Representation (mathematics) ,Software ,media_common - Abstract
Label enhancement (LE) is the process of recovering the label distribution from logical labels in datasets, with the goal of better expressing label ambiguity through the form of a label distribution. Existing LE work mainly focuses on exploring the data distribution in the feature space based on complete features and complete logical labels. However, it is not always easy to obtain multi-label datasets with logical labels for all samples in the real world; most datasets have only a few samples with annotated labels. To this end, we propose a novel semi-supervised label enhancement method via structured semantic extraction (SLE-SSE), which can recover the complete label distribution from only a few logical labels. Firstly, we extract the self-semantics of samples by appropriately expressing the inherent ambiguity of each sample in the input space, and fill in the missing labels based on this information. Secondly, we take advantage of low-rank representation to extract the inter-semantics between samples and between labels, respectively. Finally, we apply a simple but effective linear model to recover the complete label distribution by utilizing the structured semantic information, including intra-sample, inter-sample, and inter-label information. Extensive comparative experiments validate the effectiveness of the proposed method.
- Published
- 2021
143. A Loser-Take-All DNA Circuit
- Author
-
Lulu Qian, Namita Sarraf, and Kellen R. Rodriguez
- Subjects
Neurons ,Artificial neural network ,Computer science ,Biomedical Engineering ,DNA ,General Medicine ,Biochemistry, Genetics and Molecular Biology (miscellaneous) ,Signal ,Winner-take-all ,Set (abstract data type) ,Genetic Techniques ,Scalability ,Pattern recognition (psychology) ,Neural Networks, Computer ,Representation (mathematics) ,Algorithm ,Electronic circuit - Abstract
DNA-based neural networks are a type of DNA circuit capable of molecular pattern recognition tasks. Winner-take-all DNA networks have been developed to scale up the complexity of molecular pattern recognition with a simple molecular implementation. This simplicity was achieved by replacing negative weights in individual neurons with lateral inhibition and competition across neurons, eliminating the need for dual-rail representation. Here we introduce a new type of DNA circuit that is called loser-take-all: an output signal is ON if and only if the corresponding input has the smallest analog value among all inputs. We develop a DNA strand-displacement implementation of loser-take-all circuits that is cascadable without dual-rail representation, maintaining the simplicity desired for scalability. We characterize the impact of effective signal concentrations and reaction rates on the circuit performance, and derive solutions for compensating undesired signal loss and rate differences. Using these approaches, we successfully demonstrate a three-input loser-take-all circuit with nine unique input combinations. Complementary to winner-take-all, loser-take-all DNA circuits could be used for recognition of molecular patterns based on their least similarities to a set of memories, allowing classification decisions for patterns that are extremely noisy. Moreover, the design principle of loser-take-all could be more generally applied in other DNA circuit implementations including k-winner-take-all.
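Abstracted away from the strand-displacement chemistry, the computation itself is simple to state (ordinary code, shown only to contrast the two circuit types): only the output whose input has the smallest analog value turns ON, the mirror image of winner-take-all.

```python
# Sketch: loser-take-all vs winner-take-all on analog input values.
def loser_take_all(inputs):
    lo = min(inputs)
    return [1 if x == lo else 0 for x in inputs]

def winner_take_all(inputs):
    hi = max(inputs)
    return [1 if x == hi else 0 for x in inputs]

signals = [0.7, 0.2, 0.5]          # three analog input concentrations
losers = loser_take_all(signals)    # middle input is smallest -> ON
winners = winner_take_all(signals)  # first input is largest -> ON
```

In the memory-recall framing above, winner-take-all selects the stored pattern most similar to the input, while loser-take-all selects the least similar one.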
- Published
- 2021
144. Multinomial Sampling of Latent Variables for Hierarchical Change-Point Detection
- Author
-
Pablo Moreno-Muñoz, Lorena Romero-Medrano, and Antonio Artés-Rodríguez
- Subjects
Computer science ,Bayesian inference ,Bayesian probability ,Inference ,Latent variable ,Change-point detection (CPD) ,Manifold ,Theoretical Computer Science ,Multinomial likelihoods ,Hardware and Architecture ,Control and Systems Engineering ,Modeling and Simulation ,Signal Processing ,Pattern recognition (psychology) ,Segmentation ,Baseline (configuration management) ,Latent variable models ,Algorithm ,Change detection ,Information Systems - Abstract
Bayesian change-point detection with latent variable models allows segmentation of high-dimensional time-series with heterogeneous statistical nature. We assume that change-points lie on a lower-dimensional manifold where we aim to infer a discrete representation via subsets of latent variables. For this particular model, full inference is computationally infeasible, and pseudo-observations based on point-estimates of the latent variables are used instead. However, if their estimation is not certain enough, change-point detection suffers. To circumvent this problem, we propose a multinomial sampling methodology that improves the detection rate and reduces the delay while keeping complexity stable and inference analytically tractable. Our experiments show results that outperform the baseline method, and we also provide an example oriented to a human behavioral study.
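The core idea, sampling pseudo-observations from the multinomial posterior instead of collapsing it to a point estimate, can be sketched with a toy posterior (our own simplification, not the paper's model):

```python
# Sketch: a hard argmax discards the uncertainty of a latent assignment;
# multinomial draws preserve it in the stream of pseudo-observations.
import random

def point_estimate(posterior):
    return max(range(len(posterior)), key=lambda k: posterior[k])

def multinomial_sample(posterior, n_draws, seed=0):
    rng = random.Random(seed)
    return [rng.choices(range(len(posterior)), weights=posterior)[0]
            for _ in range(n_draws)]

posterior = [0.55, 0.45]           # an uncertain latent assignment
hard = point_estimate(posterior)    # always class 0: uncertainty lost
draws = multinomial_sample(posterior, n_draws=1000)
frac_one = draws.count(1) / len(draws)  # close to 0.45: uncertainty kept
```

Downstream, a change-point detector fed with such draws sees the ambiguity of the latent state rather than a falsely confident label, which is what improves detection rate and delay in the uncertain regime.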
- Published
- 2021
145. THE METHOD OF STRUCTURAL ADJUSTMENT OF NEURAL NETWORK MODELS TO ENSURE INTERPRETATION
- Author
-
Andrii Oliinyk, O. V. Korniienko, Sergey Subbotin, Serhii Leoshchenko, and Ye. O. Gofman
- Subjects
Neuroevolution ,Speedup ,Artificial neural network ,business.industry ,Computer science ,Big data ,Context (language use) ,General Medicine ,Machine learning ,computer.software_genre ,Field (computer science) ,Encoding (memory) ,Pattern recognition (psychology) ,Artificial intelligence ,business ,computer - Abstract
Context. The problem of structural modification of pre-synthesized models based on artificial neural networks, to ensure the property of interpretability when working with big data, is considered. The object of the study is the process of structural modification of artificial neural networks using adaptive mechanisms. Objective. The objective of this work is to develop a method for the structural modification of neural networks that increases their speed and reduces resource consumption when processing big data. Method. A method of structural adjustment of neural networks based on adaptive mechanisms borrowed from neuroevolutionary synthesis methods is proposed. First, the method uses a system of indicators to evaluate the existing structure of an artificial neural network; the assessment is based on the structural features of the neuromodels. The obtained indicator estimates are then compared with criterion values to choose the type of structural change. Mutational changes from the group of methods for neuroevolutionary modification of the topology and weights of a neural network are used as the variants of structural change. The method reduces resource consumption during the operation of neuromodels by accelerating the processing of big data, which expands the field of practical application of artificial neural networks. Results. The developed method was implemented and investigated using the example of a recurrent long short-term memory network solving a classification problem. The developed method sped up the neuromodel on a test sample by 25.05%, depending on the computing resources used. Conclusions. The conducted experiments confirmed the operability of the proposed mathematical software and allow us to recommend it for use in practice for the structural adjustment of pre-synthesized neuromodels, for further solving problems of diagnosis, forecasting, evaluation, and pattern recognition using big data. Prospects for further research may consist in finer tuning of the indicator system to determine the connections encoding noisy data, in order to further improve the accuracy of models based on neural networks.
- Published
- 2021
146. DepNet: An automated industrial intelligent system using deep learning for video‐based depression analysis
- Author
-
Lang He, Prayag Tiwari, Wei Dang, Hari Mohan Pandey, Chenguang Guo, and Rui Su
- Subjects
Human-Computer Interaction ,Artificial Intelligence ,business.industry ,Computer science ,Speech recognition ,Deep learning ,Pattern recognition (psychology) ,Artificial intelligence ,business ,Video based ,Software ,Depression (differential diagnoses) ,Theoretical Computer Science - Published
- 2021
147. Hierarchical Domain-Adapted Feature Learning for Video Saliency Prediction
- Author
-
Concetto Spampinato, Daniela Giordano, Francesco Rundo, Giovanni Bellitto, Simone Palazzo, and F. Proietto Salanitri
- Subjects
FOS: Computer and information sciences ,Normalization (statistics) ,Source code ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,media_common.quotation_subject ,Computer Science - Computer Vision and Pattern Recognition ,Video saliency Prediction ,Machine learning ,computer.software_genre ,Hierarchical database model ,Domain (software engineering) ,Artificial Intelligence ,media_common ,Domain adaptation ,Domain specific learning ,Gradient reversal layer ,Conspicuity networks ,business.industry ,Conspicuity maps ,Pattern recognition (psychology) ,Benchmark (computing) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,computer ,Feature learning ,Software ,Smoothing - Abstract
In this work, we propose a 3D fully convolutional architecture for video saliency prediction that employs hierarchical supervision on intermediate maps (referred to as conspicuity maps) generated using features extracted at different abstraction levels. We provide the base hierarchical learning mechanism with two techniques for domain adaptation and domain-specific learning. For the former, we encourage the model to unsupervisedly learn hierarchical general features using gradient reversal at multiple scales, to enhance generalization capabilities on datasets for which no annotations are provided during training. As for domain specialization, we employ domain-specific operations (namely, priors, smoothing and batch normalization) by specializing the learned features on individual datasets in order to maximize performance. The results of our experiments show that the proposed model yields state-of-the-art accuracy on supervised saliency prediction. When the base hierarchical model is empowered with domain-specific modules, performance improves, outperforming state-of-the-art models on three out of five metrics on the DHF1K benchmark and reaching the second-best results on the other two. When, instead, we test it in an unsupervised domain adaptation setting, by enabling hierarchical gradient reversal layers, we obtain performance comparable to supervised state-of-the-art. Source code, trained models and example outputs are publicly available at https://github.com/perceivelab/hd2s.
- Published
- 2021
148. Really natural adversarial examples
- Author
-
Oscar Deniz, Gloria Bueno, and Anibal Pedraza
- Subjects
Computer science ,business.industry ,Deep learning ,Cognitive neuroscience of visual object recognition ,Contrast (statistics) ,Computational intelligence ,Machine learning ,computer.software_genre ,Adversarial system ,Artificial Intelligence ,Pattern recognition (psychology) ,Natural (music) ,Computer Vision and Pattern Recognition ,Noise (video) ,Artificial intelligence ,business ,computer ,Software - Abstract
The phenomenon of adversarial examples has become one of the most intriguing topics in deep learning. The so-called adversarial attacks have the ability to fool deep neural networks with imperceptible perturbations. While the effect is striking, it has been suggested that such carefully selected injected noise does not necessarily appear in real-world scenarios. In contrast, some authors have looked for ways to generate adversarial noise in physical scenarios (traffic signs, shirts, etc.), thus showing that attackers can indeed fool the networks. In this paper we go beyond that and show that adversarial examples also appear in the real world without any attacker or maliciously selected noise involved. We show this using images from microscopy-related tasks and from general object recognition with the well-known ImageNet dataset. A comparison between these natural and the artificially generated adversarial examples is performed using distance metrics and image quality metrics. We also show that the natural adversarial examples are in fact at a greater distance from the originals than the artificially generated adversarial examples.
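The kind of distance comparison described above can be sketched as follows (the L2 choice and the toy pixel values are ours, purely illustrative):

```python
# Sketch: measuring how far an adversarial image sits from its original.
# Crafted attacks use tiny perturbations; natural variation is much larger.
import math

def l2_distance(img_a, img_b):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(img_a, img_b)))

original = [0.50, 0.40, 0.90, 0.10]    # flattened pixel intensities
artificial = [0.51, 0.39, 0.91, 0.11]  # tiny crafted perturbation
natural = [0.70, 0.20, 0.60, 0.40]     # real-world appearance change

d_art = l2_distance(original, artificial)
d_nat = l2_distance(original, natural)  # natural example lies farther away
```

The same comparison is typically complemented with perceptual image quality metrics, since raw pixel distance alone does not capture how visible a change is.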
- Published
- 2021
149. Combining semi-supervised and active learning to rank algorithms: application to Document Retrieval
- Author
-
Faiza Dammak and Hager Kammoun
- Subjects
Active learning (machine learning) ,Computer science ,media_common.quotation_subject ,Rank (computer programming) ,Supervised learning ,Library and Information Sciences ,Ranking (information retrieval) ,Pattern recognition (psychology) ,Learning to rank ,Document retrieval ,Function (engineering) ,Algorithm ,Information Systems ,media_common - Abstract
Generally, the purpose of learning-to-rank methods is to combine the results of existing ranking models within a single ranking function that orders documents as efficiently as possible, improving the quality of the returned result lists. However, learning to rank has several limitations, namely the creation and size of the labeled database. We have considered the two frameworks of semi-supervised and active learning in order to look for solutions to these problems. We have been interested in semi-supervised, active, and semi-active learning-to-rank algorithms for Document Retrieval (DR), which is a ranking-of-alternatives application. A good balance between exploration and exploitation has a positive impact on learning performance. Thus, we first focused on two active learning-to-rank algorithms that use supervised learning and semi-supervised learning as auxiliaries and employ an automatic method for labeling the selected unlabeled pairs. These algorithms are named "Semi-Active Learning to Rank: SAL2R" and "Active-Semi-Supervised Learning to Rank: ASSL2R". We have been particularly interested in providing efficient and effective algorithms to handle a large set of unlabeled data. Second, we considered improving these semi-active SAL2R and ASSL2R algorithms by using a multi-pair selection step. Our contribution lies particularly in the in-depth experimental study of the performance of these algorithms, and precisely of the influence of certain fixed parameters on the learned ranking function.
- Published
- 2021
150. Fault detection and diagnosis in photovoltaic panels by radiometric sensors embedded in unmanned aerial vehicles
- Author
-
Isaac Segovia Ramírez, Fausto Pedro García Márquez, and Bikramaditya Das
- Subjects
Renewable Energy, Sustainability and the Environment ,Computer science ,Pattern recognition (psychology) ,Photovoltaic system ,Radiometric dating ,Electrical and Electronic Engineering ,Condensed Matter Physics ,Fault detection and isolation ,Electronic, Optical and Magnetic Materials ,Remote sensing - Published
- 2021