Descriptor: "I.5.5" / Language: undetermined - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"I.5.5"' showing total 38 results

1. Magnification Invariant Medical Image Analysis: A Comparison of Convolutional Networks, Vision Transformers, and Token Mixers

Author: Jeevan, Pranav, Kurian, Nikhil Cherian, and Sethi, Amit
Subjects: I.4.0, FOS: Computer and information sciences, I.5.1, J.3, I.5.2, I.5.4, Computer Vision and Pattern Recognition (cs.CV), I.5.5, Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing, I.4.10, I.2.1, I.4.8, I.4.9, FOS: Electrical engineering, electronic engineering, information engineering
Abstract: Convolutional Neural Networks (CNNs) are widely used in medical image analysis, but their performance degrades when the magnification of testing images differs from that of the training images. The inability of CNNs to generalize across magnification scales can result in sub-optimal performance on external datasets. This study aims to evaluate the robustness of various deep learning architectures in the analysis of breast cancer histopathological images with varying magnification scales at training and testing stages. Here we explore and compare the performance of multiple deep learning architectures, including CNN-based ResNet and MobileNet, self-attention-based Vision Transformers and Swin Transformers, and token-mixing models, such as FNet, ConvMixer, MLP-Mixer, and WaveMix. The experiments are conducted using the BreakHis dataset, which contains breast cancer histopathological images at varying magnification levels. We show that the performance of WaveMix is invariant to the magnification of training and testing data and can provide stable and good classification accuracy. These evaluations are critical in identifying deep learning architectures that can robustly handle changes in magnification scale, ensuring that scale changes across anatomical structures do not disturb the inference results., Comment: 6 pages, 3 figures
Published: 2023
Full Text: View/download PDF

2. Real-time Speech Emotion Recognition Based on Syllable-Level Feature Extraction

Author: Rehman, Abdul, Liu, Zhen-Tao, Wu, Min, Cao, Wei-Hua, and Jiang, Cheng-Shan
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Sound (cs.SD), I.5.2, I.5.5, Computer Science - Human-Computer Interaction, Computer Science - Sound, Human-Computer Interaction (cs.HC), Machine Learning (cs.LG), ComputingMethodologies_PATTERNRECOGNITION, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Speech emotion recognition systems have high prediction latency because of the high computational requirements of deep learning models, and low generalizability mainly because of the poor reliability of emotional measurements across multiple corpora. To solve these problems, we present a speech emotion recognition system based on a reductionist approach of decomposing and analyzing syllable-level features. The Mel-spectrogram of an audio stream is decomposed into syllable-level components, which are then analyzed to extract statistical features. The proposed method uses formant attention, noise-gate filtering, and rolling normalization contexts to increase feature processing speed and tolerance to adversity. A set of syllable-level formant features is extracted and fed into a single hidden layer neural network that makes predictions for each syllable, as opposed to the conventional approach of using a sophisticated deep learner to make sentence-wide predictions. The syllable-level predictions help to achieve real-time latency and lower the aggregated error in utterance-level cross-corpus predictions. The experiments on the IEMOCAP (IE), MSP-Improv (MI), and RAVDESS (RA) databases show that the method achieves real-time latency while predicting with state-of-the-art cross-corpus unweighted accuracy of 47.6% for IE to MI and 56.2% for MI to IE., Comment: Significant revisions
Published: 2022
Full Text: View/download PDF

3. Needle In A Haystack, Fast: Benchmarking Image Perceptual Similarity Metrics At Scale

Author: Vallez, Cyril, Kucharavy, Andrei, and Dolamic, Ljiljana
Subjects: Performance (cs.PF), FOS: Computer and information sciences, Computer Science - Performance, I.5.4, Computer Vision and Pattern Recognition (cs.CV), I.5.5, Computer Science - Computer Vision and Pattern Recognition, I.4.7, H.3.1, I.4.10, K.4
Abstract: The advent of the internet, followed shortly by social media, made consuming and sharing information ubiquitous among anyone with access to it. The evolution in media consumption driven by this change led to the emergence of images as a means to express oneself, convey information, and convince others efficiently. With computer vision algorithms progressing radically over the last decade, it has become easier and easier to study at scale the role of images in the flow of information online. While the research questions and overall pipelines differ radically, almost all start with a crucial first step: the evaluation of global perceptual similarity between different images. That initial step is crucial for overall pipeline performance and processes most images. A number of algorithms are available and currently used to perform it, but so far no comprehensive review has been available to guide researchers in choosing the algorithm best suited to their question, assumptions, and computational resources. With this paper we aim to fill this gap, showing that classical computer vision methods are not necessarily the best approach, whereas a pair of relatively little-used methods, the Dhash perceptual hash and SimCLR v2 ResNets, achieve excellent performance, scale well, and are computationally efficient., Comment: 26 pages, 10 figures
Published: 2022
Full Text: View/download PDF
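Note on result 3: for readers unfamiliar with the Dhash perceptual hash named in the abstract, the following is a minimal sketch of the commonly described difference-hash idea, not the authors' benchmark code. It assumes the image has already been converted to grayscale and resized to 8 rows by 9 columns (that step is omitted here), and the function names `dhash_bits` and `hamming_distance` are hypothetical.

```python
def dhash_bits(gray, hash_size=8):
    """Return a difference hash as an integer of hash_size * hash_size bits.

    `gray` is assumed to be a list of `hash_size` rows, each holding
    `hash_size + 1` grayscale values (the image already resized).
    """
    if len(gray) != hash_size or any(len(row) != hash_size + 1 for row in gray):
        raise ValueError("expected hash_size rows of hash_size + 1 values")
    bits = 0
    for row in gray:
        # One bit per horizontally adjacent pair: 1 if brightness drops left to right.
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | int(left > right)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


# Toy 8x9 "images" (made-up data): one brightening left to right, one darkening.
rising = [[10 * c for c in range(9)] for _ in range(8)]
falling = [[10 * (8 - c) for c in range(9)] for _ in range(8)]
brighter = [[v + 5 for v in row] for row in rising]  # uniform brightness shift

print(hamming_distance(dhash_bits(rising), dhash_bits(brighter)))
print(hamming_distance(dhash_bits(rising), dhash_bits(falling)))
```

Because each bit encodes only whether brightness rises or falls between neighbouring pixels, a uniform brightness shift leaves the hash unchanged, while the mirrored gradient differs in every bit. Near-duplicate search then reduces to comparing Hamming distances between integers.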

4. Differentiable Microscopy for Content and Task Aware Compressive Fluorescence Imaging

Author: Haputhanthri, Udith, Seeber, Andrew, and Wadduwage, Dushan
Subjects: FOS: Computer and information sciences, J.2, J.3, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, FOS: Physical sciences, Data_CODINGANDINFORMATIONTHEORY, Quantitative Biology - Quantitative Methods, FOS: Electrical engineering, electronic engineering, information engineering, Quantitative Methods (q-bio.QM), I.2.6, I.4.1, I.4.2, I.4.3, I.4.5, I.4.6, I.4.7, I.5.1, I.5.2, I.5.5, Image and Video Processing (eess.IV), Electrical Engineering and Systems Science - Image and Video Processing, FOS: Biological sciences, Physics - Optics, Optics (physics.optics)
Abstract: The trade-off between throughput and image quality is an inherent challenge in microscopy. To improve throughput, compressive imaging under-samples image signals; the images are then computationally reconstructed by solving a regularized inverse problem. Compared to traditional regularizers, Deep Learning based methods have achieved greater success in compression and image quality. However, the information loss in the acquisition process sets the compression bounds. Further improvement in compression, without compromising the reconstruction quality is thus a challenge. In this work, we propose differentiable compressive fluorescence microscopy ($\partial \mu$) which includes a realistic generalizable forward model with learnable-physical parameters (e.g. illumination patterns), and a novel physics-inspired inverse model. The cascaded model is end-to-end differentiable and can learn optimal compressive sampling schemes through training data. With our model, we performed thousands of numerical experiments on various compressive microscope configurations. We show that learned sampling encodes important information about the specimens in the illumination field of the microscope allowing higher compression up to $\times 1024$. We further utilize our framework for Task Aware Compression. The experimental results show superior performance on the cell segmentation task.
Published: 2022
Full Text: View/download PDF

5. GSGP-CUDA — A CUDA framework for Geometric Semantic Genetic Programming

Author: Leonardo Trujillo, Jose Manuel Muñoz Contreras, Daniel E. Hernandez, Mauro Castelli, Juan J. Tapia, NOVA Information Management School (NOVA IMS), and Information Management Research Center (MagIC) - NOVA Information Management School
Subjects: I.2.2, FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Performance, I.5.5, Genetic Programming, GPU, Computer Science - Neural and Evolutionary Computing, CUDA, Machine Learning (cs.LG), Computer Science Applications, Performance (cs.PF), Geometric Semantic Genetic Programming, Neural and Evolutionary Computing (cs.NE), Software
Abstract: Geometric Semantic Genetic Programming (GSGP) is a state-of-the-art machine learning method based on evolutionary computation. GSGP performs search operations directly at the level of program semantics, which can be done more efficiently than operating at the syntax level like most GP systems. Efficient implementations of GSGP in C++ exploit this fact, but not to its full potential. This paper presents GSGP-CUDA, the first CUDA implementation of GSGP and the most efficient, which exploits the intrinsic parallelism of GSGP using GPUs. Results show speedups greater than 1,000X relative to the state-of-the-art sequential implementation., Comment: 14 pages, 3 figures
Published: 2022

6. Robust PDF Document Conversion Using Recurrent Neural Networks

Author: Livathinos, Nikolaos, Berrospi, Cesar, Lysak, Maksym, Kuropiatnyk, Viktor, Nassar, Ahmed, Carvalho, Andre, Dolfi, Michele, Auer, Christoph, Dinkla, Kasper, and Staar, Peter
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, I.5.1, I.5.2, I.5.4, Computer Vision and Pattern Recognition (cs.CV), I.7.5, I.5.5, I.2.1, Computer Science - Computer Vision and Pattern Recognition, General Medicine, Computer Science - Information Retrieval, Machine Learning (cs.LG), Information Retrieval (cs.IR)
Abstract: The number of published PDF documents has increased exponentially in recent decades. There is a growing need to make their rich content discoverable to information retrieval tools. In this paper, we present a novel approach to document structure recovery in PDF using recurrent neural networks to process the low-level PDF data representation directly, instead of relying on a visual re-interpretation of the rendered PDF page, as has been proposed in previous literature. We demonstrate how a sequence of PDF printing commands can be used as input into a neural network and how the network can learn to classify each printing command according to its structural function in the page. This approach has three advantages: First, it can distinguish among more fine-grained labels (typically 10-20 labels as opposed to 1-5 with visual methods), which results in a more accurate and detailed document structure resolution. Second, it can take into account the text flow across pages more naturally compared to visual methods because it can concatenate the printing commands of sequential pages. Last, our proposed method needs less memory and it is computationally less expensive than visual methods. This allows us to deploy such models in production environments at a much lower cost. Through extensive architectural search in combination with advanced feature engineering, we were able to implement a model that yields a weighted average F1 score of 97% across 17 distinct structural labels. The best model we achieved is currently served in production environments on our Corpus Conversion Service (CCS), which was presented at KDD18 (arXiv:1806.02284). This model enhances the capabilities of CCS significantly, as it eliminates the need for human annotated label ground-truth for every unseen document layout. This proved particularly useful when applied to a huge corpus of PDF articles related to COVID-19., Comment: 9 pages, 2 tables, 4 figures, uses aaai21.sty. 
Accepted at the "Thirty-Third Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-21)". Received the "IAAI-21 Innovative Application Award"
Published: 2021
Full Text: View/download PDF
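Note on result 6: the weighted average F1 score quoted in the abstract is commonly computed as the per-label F1 weighted by each label's support (its number of true instances). The sketch below follows that assumption; the labels are made-up examples, the function name `weighted_f1` is hypothetical, and the paper's own evaluation code is not part of this record.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-label F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    support = Counter(y_true)      # true instances per label
    predicted = Counter(y_pred)    # predictions per label
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = 0.0
    for label, n_true in support.items():
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n_true
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        total += f1 * n_true       # weight each label by its support
    return total / len(y_true)


# Made-up structural labels for six printing commands.
y_true = ["title", "text", "text", "text", "table", "table"]
y_pred = ["title", "text", "text", "table", "table", "table"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Weighting by support means frequent labels dominate the score, which is worth keeping in mind when label frequencies are very uneven.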

7. Counting Objects by Diffused Index: geometry-free and training-free approach

Author: Mengyi Tang, Maryam Yashtini, and Sung Ha Kang
Subjects: FOS: Computer and information sciences, 65Z05(Primary), 65S05(Secondary), Computer Vision and Pattern Recognition (cs.CV), I.4.6, I.5.5, Computer Science - Computer Vision and Pattern Recognition, I.4.9, Numerical Analysis (math.NA), Signal Processing, Media Technology, FOS: Mathematics, Mathematics - Numerical Analysis, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering
Abstract: Counting objects is a fundamental but challenging problem. In this paper, we propose diffusion-based, geometry-free, and learning-free methodologies to count the number of objects in images. The main idea is to represent each object by a unique index value regardless of its intensity or size, and to simply count the number of index values. First, we place different vectors, referred to as seed vectors, uniformly throughout the mask image. The mask image has boundary information of the objects to be counted. Secondly, the seeds are diffused using an edge-weighted harmonic variational optimization model within each object. We propose an efficient algorithm based on an operator splitting approach and alternating direction minimization method, and a theoretical analysis of this algorithm is given. An optimal solution of the model is obtained when the distributed seeds are completely diffused such that there is a unique intensity within each object, which we refer to as an index. For computational efficiency, we stop the diffusion process before full convergence, and propose to cluster these diffused index values. We refer to this approach as Counting Objects by Diffused Index (CODI). We explore scalar and multi-dimensional seed vectors. For scalar seeds, we use Gaussian fitting on the histogram to count, while for vector seeds, we exploit a high-dimensional clustering method for the final step of counting via clustering. The proposed method is flexible even if the boundary of the object is neither clear nor fully enclosed. We present counting results in various applications such as biological cells, agriculture, concert crowds, and transportation. Some comparisons with existing methods are presented.
Published: 2021
Full Text: View/download PDF

8. Soft Expectation and Deep Maximization for Image Feature Detection

Author: Mai, Alexander, Yang, Allen, and Meyer, Dominique E.
Subjects: FOS: Computer and information sciences, I.4.1, I.4.2, I.5.1, I.5.2, I.2.6, I.4.5, Computer Vision and Pattern Recognition (cs.CV), I.5.5, Computer Science - Computer Vision and Pattern Recognition
Abstract: Central to the application of many multi-view geometry algorithms is the extraction of matching points between multiple viewpoints, enabling classical tasks such as camera pose estimation and 3D reconstruction. Many approaches that characterize these points have been proposed based on hand-tuned appearance models or data-driven learning methods. We propose Soft Expectation and Deep Maximization (SEDM), an iterative unsupervised learning process that directly optimizes the repeatability of the features by posing the problem in a similar way to expectation maximization (EM). We found convergence to be reliable and the new model to be more lighting invariant and better at localizing the underlying 3D points in a scene, improving SfM quality when compared to other state-of-the-art deep learning detectors., Comment: 9 pages, 3 figures, 2 tables
Published: 2021
Full Text: View/download PDF

9. GRAPE for Fast and Scalable Graph Processing and random walk-based Embedding

Author: Cappelletti, Luca, Fontana, Tommaso, Casiraghi, Elena, Ravanmehr, Vida, Callahan, Tiffany J., Cano, Carlos, Joachimiak, Marcin P., Mungall, Christopher J., Robinson, Peter N., Reese, Justin, and Valentini, Giorgio
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing, E.2, I.2.6, I.5.5, Distributed, Parallel, and Cluster Computing (cs.DC), D.m, Machine Learning (cs.LG)
Abstract: Graph Representation Learning (GRL) methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE, a software resource for graph processing and embedding that can scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as a competitive edge and node label prediction performance. GRAPE comprises about 1.7 million well-documented lines of Python and Rust code and provides 69 node embedding methods, 25 inference models, a collection of efficient graph processing utilities and over 80,000 graphs from the literature and other sources. Standardized interfaces allow seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of GRL methods, therefore also positioning GRAPE as a software resource to perform a fair comparison between methods and libraries for graph processing and embedding.
Published: 2021
Full Text: View/download PDF

10. HeAT – a Distributed and GPU-accelerated Tensor Framework for Data Analytics

Author: Björn Hagemeier, Claudia Comito, Kai Krajsek, Achim Streit, Simon Hanselmann, Daniel Coquelin, Martin Siggel, Achim Basermann, Philipp Knechtges, Michael Tarnawa, Charlotte Debus, and Markus Götz
Subjects: FOS: Computer and information sciences, Data Analysis, 0301 basic medicine, Computer Science - Machine Learning, Computer science, Big data, GPU, G.1.3, 02 engineering and technology, Parallel computing, Dask, computer.software_genre, Machine Learning (cs.LG), Machine Learning, NumPy, C.2.4, 0202 electrical engineering, electronic engineering, information engineering, Parallel Application Frameworks, computer.programming_language, I.5.5, Message Passing Interface, High-performance Computing, Computer Science - Distributed, Parallel, and Cluster Computing, Parallel processing (DSP implementation), Data analysis, Tensor Framework, Distributed memory, G.4, Model Parallelism, Neural Networks, 03 medical and health sciences, C.1.2, Big Data Analytics, High-performanceComputing, 020204 information systems, D.1.3, I.2.5, business.industry, Node (networking), DATA processing & computer science, I.2.0, HeAT, Software framework, 030104 developmental biology, PyTorch, Computer Science - Mathematical Software, Distributed, Parallel, and Cluster Computing (cs.DC), ddc:004, business, Mathematical Software (cs.MS), computer
Abstract: To cope with the rapid growth in available data, the efficiency of data analysis and machine learning libraries has recently received increased attention. Although great advancements have been made in traditional array-based computations, most are limited by the resources available on a single computation node. Consequently, novel approaches must be made to exploit distributed resources, e.g. distributed memory architectures. To this end, we introduce HeAT, an array-based numerical programming framework for large-scale parallel processing with an easy-to-use NumPy-like API. HeAT utilizes PyTorch as a node-local eager execution engine and distributes the workload on arbitrarily large high-performance computing systems via MPI. It provides both low-level array computations, as well as assorted higher-level algorithms. With HeAT, it is possible for a NumPy user to take full advantage of their available resources, significantly lowering the barrier to distributed data analysis. When compared to similar frameworks, HeAT achieves speedups of up to two orders of magnitude., 10 pages, 8 figures, 5 listings, 1 table
Published: 2020

11. Synthetic Observational Health Data with GANs: from slow adoption to a boom in medical research and ultimately digital twins?

Author: Jeremy Georges-Filteau and Elisa Cirillo
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, G.3, I.2.1, I.2.6, I.5.1, I.5.2, I.5.5, I.6.4, I.6.5, I.6.8, J.3, Computer science, Machine Learning (stat.ML), Quantitative Biology - Quantitative Methods, Health informatics, Synthetic data, Machine Learning (cs.LG), Data modeling, Consistency (negotiation), Statistics - Machine Learning, Health care, Quantitative Methods (q-bio.QM), Reproducibility, business.industry, Benchmarking, Medical research, Data science, FOS: Biological sciences, Observational study, business, Healthcare system
Abstract: After being collected for patient care, Observational Health Data (OHD) can further benefit patient well-being by sustaining the development of health informatics and medical research. Vast potential is unexploited because of the fiercely private nature of patient-related data and regulations to protect it. Generative Adversarial Networks (GANs) have recently emerged as a groundbreaking way to learn generative models that produce realistic synthetic data. They have revolutionized practices in multiple domains such as self-driving cars, fraud detection, digital twin simulations in industrial sectors, and medical imaging. The digital twin concept could readily apply to modelling and quantifying disease progression. In addition, GANs possess many capabilities relevant to common problems in healthcare: lack of data, class imbalance, rare diseases, and preserving privacy. Unlocking open access to privacy-preserving OHD could be transformative for scientific research. In the midst of COVID-19, the healthcare system is facing unprecedented challenges, many of which are data-related for the reasons stated above. Considering these facts, publications concerning GANs applied to OHD seemed to be severely lacking. To uncover the reasons for this slow adoption, we broadly reviewed the published literature on the subject. Our findings show that the properties of OHD were initially challenging for the existing GAN algorithms (unlike medical imaging, for which state-of-the-art models were directly transferable) and that the evaluation of synthetic data lacked clear metrics. We find more publications on the subject than expected, starting slowly in 2017, and since then at an increasing rate. The difficulties of OHD remain, and we discuss issues relating to evaluation, consistency, benchmarking, data modelling, and reproducibility., Comment: 31 pages (10 in previous version), not including references and glossary, 51 in total. Inclusion of a large number of recent publications and expansion of the discussion accordingly
Published: 2020

12. Two-Stream Aural-Visual Affect Analysis in the Wild

Author: Jörn Ostermann, Felix Kuhnke, and Lars Rumberg
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Exploit, Computer science, Computer Vision and Pattern Recognition (cs.CV), Speech recognition, Feature extraction, Computer Science - Computer Vision and Pattern Recognition, Machine Learning (stat.ML), 02 engineering and technology, Convolutional neural network, Facial recognition system, Machine Learning (cs.LG), Statistics - Machine Learning, 0202 electrical engineering, electronic engineering, information engineering, Code (cryptography), Artificial neural network, business.industry, I.5.5, Visualization, I.4.9, Task analysis, 020201 artificial intelligence & image processing, Artificial intelligence, business
Abstract: Human affect recognition is an essential part of natural human-computer interaction. However, current methods are still in their infancy, especially for in-the-wild data. In this work, we introduce our submission to the Affective Behavior Analysis in-the-wild (ABAW) 2020 competition. We propose a two-stream aural-visual analysis model to recognize affective behavior from videos. Audio and image streams are first processed separately and fed into a convolutional neural network. Instead of applying recurrent architectures for temporal analysis we only use temporal convolutions. Furthermore, the model is given access to additional features extracted during face-alignment. At training time, we exploit correlations between different emotion representations to improve performance. Our model achieves promising results on the challenging Aff-Wild2 database., Comment: 6 pages, 2 figures, Face and Gesture 2020 Workshop Paper (ABAW2020 competition)
Published: 2020

13. Perfecting the Crime Machine

Author: Alparslan, Yigit, Panagiotou, Ioanna, Livengood, Willow, Kane, Robert, and Cohen, Andrew
Subjects: FOS: Computer and information sciences, Computer Science - Computers and Society, Computer Science - Machine Learning, I.5.1, I.5.3, I.5.4, I.5.5, Computers and Society (cs.CY), Applications (stat.AP), Statistics - Applications, Machine Learning (cs.LG)
Abstract: This study explores using different machine learning techniques and workflows to predict crime-related statistics, specifically crime type in Philadelphia. We use crime location and time as the main features, extract different features from the two features that our raw data has, and build models that work with a large number of class labels. We use different techniques to extract various features, including combining unsupervised learning techniques, and try to predict the crime type. Some of the models that we use are Support Vector Machines, Decision Trees, Random Forest, and K-Nearest Neighbors. We report Random Forest as the best-performing model for predicting crime type, with an error log loss of 2.3120., Comment: 11 pages, 55 figures, fixed typos, added references in Introduction section
Published: 2020
Full Text: View/download PDF
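Note on result 13: the "error log loss" reported in the abstract is presumably the multi-class logarithmic loss, that is, the mean negative log of the probability a model assigns to the true class. The sketch below illustrates that definition only; the function name `multiclass_log_loss` and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability of the true class.

    `y_true` holds integer class indices; `probs[i]` is the predicted
    probability distribution over classes for sample i.
    """
    if len(y_true) != len(probs) or not y_true:
        raise ValueError("y_true and probs must be non-empty and equally long")
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) when a model assigns zero probability.
        p = min(max(row[label], eps), 1.0)
        total -= math.log(p)
    return total / len(y_true)


# Toy example with 4 classes (made-up numbers).
uniform = [[0.25, 0.25, 0.25, 0.25]] * 3
confident = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]]
labels = [0, 1, 2]

print(multiclass_log_loss(labels, uniform))
print(multiclass_log_loss(labels, confident))
```

A uniform guess over K classes scores log K under this definition, so a log loss is easiest to interpret relative to the number of class labels in the task.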

14. QReLU and m-QReLU: Two novel quantum activation functions to aid medical diagnostics

Author: Parisi, L., Neagu, D., Ma, R., and Campean, F.
Subjects: FOS: Computer and information sciences, I.2.1, I.2.10, I.4.9, I.5.1, I.5.4, I.5.5, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing, Electrical Engineering and Systems Science - Image and Video Processing, Machine Learning (cs.LG), 68T07, 68T10, 68T45, 68U35, FOS: Electrical engineering, electronic engineering, information engineering, Neural and Evolutionary Computing (cs.NE)
Abstract: The ReLU activation function (AF) has been extensively applied in deep neural networks, in particular Convolutional Neural Networks (CNN), for image classification despite its unresolved dying ReLU problem, which poses challenges to reliable applications. This issue has obvious important implications for critical applications, such as those in healthcare. Recent approaches are just proposing variations of the activation function within the same unresolved dying ReLU challenge. This contribution reports a different research direction by investigating the development of an innovative quantum approach to the ReLU AF that avoids the dying ReLU problem by disruptive design. The Leaky ReLU was leveraged as a baseline on which the two quantum principles of entanglement and superposition were applied to derive the proposed Quantum ReLU (QReLU) and the modified-QReLU (m-QReLU) activation functions. Both QReLU and m-QReLU are implemented and made freely available in TensorFlow and Keras. This original approach is effective and validated extensively in case studies that facilitate the detection of COVID-19 and Parkinson Disease (PD) from medical images. The two novel AFs were evaluated in a two-layered CNN against nine ReLU-based AFs on seven benchmark datasets, including images of spiral drawings taken via graphic tablets from patients with Parkinson Disease and healthy subjects, and point-of-care ultrasound images on the lungs of patients with COVID-19, those with pneumonia and healthy controls. 
Despite a higher computational cost, results indicated an overall higher classification accuracy, precision, recall and F1-score brought about by either quantum AF on five of the seven benchmark datasets, thus demonstrating its potential to be the new benchmark or gold standard AF in CNNs and aid image classification tasks involved in critical applications, such as medical diagnoses of COVID-19 and PD., Comment: 30 pages, 4 listings/Python code snippets, 2 figures, 8 tables
Published: 2020
Full Text: View/download PDF

15. Data-Driven Neuromorphic DRAM-based CNN and RNN Accelerators

Author: Shih-Chii Liu and Tobi Delbruck
Subjects: Spiking neural network, FOS: Computer and information sciences, Hardware_MEMORYSTRUCTURES, Computer science, Computer Vision and Pattern Recognition (cs.CV), 020208 electrical & electronic engineering, I.5.5, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing, 02 engineering and technology, Data-driven, Computer architecture, Neuromorphic engineering, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Static random-access memory, Neural and Evolutionary Computing (cs.NE), Latency (engineering), Random access, Dram
Abstract: The energy consumed by running large deep neural networks (DNNs) on hardware accelerators is dominated by the need for lots of fast memory to store both states and weights. This large required memory is currently only economically viable through DRAM. Although DRAM is high-throughput and low-cost memory (costing 20X less than SRAM), its long random access latency is bad for the unpredictable access patterns in spiking neural networks (SNNs). In addition, accessing data from DRAM costs orders of magnitude more energy than doing arithmetic with that data. SNNs are energy-efficient if local memory is available and few spikes are generated. This paper reports on our developments over the last 5 years of convolutional and recurrent deep neural network hardware accelerators that exploit either spatial or temporal sparsity similar to SNNs but achieve SOA throughput, power efficiency and latency even with the use of DRAM for the required storage of the weights and states of large DNNs., Comment: To appear in 2019 IEEE Sig. Proc. Soc. Asilomar Conference on Signals, Systems, and Computers Session MP6b: Neuromorphic Computing (Invited)
Published: 2020
Full Text: View/download PDF

16. Adversarial Attacks on Convolutional Neural Networks in Facial Recognition Domain

Author: Alparslan, Yigit, Alparslan, Ken, Keim-Shenk, Jeremy, Khade, Shweta, and Greenstadt, Rachel
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, I.5.1, I.5.4, Computer Vision and Pattern Recognition (cs.CV), I.5.5, Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Machine Learning (stat.ML), Electrical Engineering and Systems Science - Image and Video Processing, Machine Learning (cs.LG), Statistics - Machine Learning, FOS: Electrical engineering, electronic engineering, information engineering
Abstract: Numerous recent studies have demonstrated how Deep Neural Network (DNN) classifiers can be fooled by adversarial examples, in which an attacker adds perturbations to an original sample, causing the classifier to misclassify the sample. Adversarial attacks that render DNNs vulnerable in real life represent a serious threat in autonomous vehicles, malware filters, or biometric authentication systems. In this paper, we apply Fast Gradient Sign Method to introduce perturbations to a facial image dataset and then test the output on a different classifier that we trained ourselves, to analyze transferability of this method. Next, we craft a variety of different black-box attack algorithms on a facial image dataset assuming minimal adversarial knowledge, to further assess the robustness of DNNs in facial recognition. While experimenting with different image distortion techniques, we focus on modifying single optimal pixels by a large amount, or modifying all pixels by a smaller amount, or combining these two attack approaches. While our single-pixel attacks achieved about a 15% average decrease in classifier confidence level for the actual class, the all-pixel attacks were more successful and achieved up to an 84% average decrease in confidence, along with an 81.6% misclassification rate, in the case of the attack that we tested with the highest levels of perturbation. Even with these high levels of perturbation, the face images remained identifiable to a human. Understanding how these noised and perturbed images baffle the classification algorithms can yield valuable advances in the training of DNNs against defense-aware adversarial attacks, as well as adaptive noise reduction techniques. We hope our research may help to advance the study of adversarial attacks on DNNs and defensive mechanisms to counteract them, particularly in the facial recognition domain., Comment: 18 pages, 8 figures, fixed typos, replotted figures, restyled the plots and tables
Published: 2020
Full Text: View/download PDF
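
The Fast Gradient Sign Method named in the abstract above perturbs an input by a small step in the direction of the sign of the loss gradient. The following is a minimal sketch of that idea only, assuming a toy logistic-regression classifier in place of the paper's CNN; the weights, input, and epsilon are hypothetical, and clipping to a valid pixel range is omitted.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Return x + eps * sign(dL/dx) for cross-entropy loss L and label y in {0, 1}."""
    p = predict(w, b, x)
    # For logistic regression, dL/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and sample, for illustration only.
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.8, 0.2, 0.5], 1
x_adv = fgsm(w, b, x, y, eps=0.1)

clean_conf = predict(w, b, x)      # confidence in the true class before the attack
adv_conf = predict(w, b, x_adv)    # confidence in the true class after the attack
```

Every component of the input moves by at most eps, which is the "modifying all pixels by a smaller amount" regime the abstract contrasts with single-pixel attacks.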

17. PACSET (Packed Serialized Trees): Reducing Inference Latency for Tree Ensemble Deployment

Author: Madhyastha, Meghana, Lillaney, Kunal, Browne, James, Vogelstein, Joshua, and Burns, Randal
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing, I.5.5, Distributed, Parallel, and Cluster Computing (cs.DC), Machine Learning (cs.LG)
Abstract: We present methods to serialize and deserialize tree ensembles that optimize inference latency when models are not already loaded into memory. This arises whenever models are larger than memory, but also systematically when models are deployed on low-resource devices, such as in the Internet of Things, or run as Web micro-services where resources are allocated on demand. Our packed serialized trees (PACSET) encode reference locality in the layout of a tree ensemble using principles from external memory algorithms. The layout interleaves correlated nodes across multiple trees, uses leaf cardinality to collocate the nodes on the most popular paths and is optimized for the I/O blocksize. The result is that each I/O yields a higher fraction of useful data, leading to a 2-6 times reduction in classification latency for interactive workloads.
Published: 2020
Full Text: View/download PDF

18. Microsoft Recommenders: Tools to Accelerate Developing Recommender Systems

Author: Scott B. Graham, Jun-Ki Min, and Tao Wu
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer science, I.5.5, Python (programming language), Recommender system, H.3.3, H.3.4, Computer Science - Information Retrieval, Machine Learning (cs.LG), World Wide Web, Open source, computer, Information Retrieval (cs.IR), computer.programming_language
Abstract: The purpose of this work is to highlight the content of the Microsoft Recommenders repository and show how it can be used to reduce the time involved in developing recommender systems. The open source repository provides Python utilities to simplify common recommender-related data science work as well as example Jupyter notebooks that demonstrate use of the algorithms and tools under various environments., Comment: pages: 2; submitted to: RecSys '19
Published: 2020
Full Text: View/download PDF

19. Scalable Distributed Approximation of Internal Measures for Clustering Evaluation

Author: Fabio Vandin, Federico Altieri, Andrea Pietracaprina, and Geppino Pucci
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Sublinear function, I.5.3, Computer science, I.5.4, Computation, I.5.5, Approximation algorithm, Measure (mathematics), Silhouette, Machine Learning (cs.LG), Computer Science - Data Structures and Algorithms, Metric (mathematics), Data Structures and Algorithms (cs.DS), Cluster analysis, Heuristics, Algorithm
Abstract: The most widely used internal measure for clustering evaluation is the silhouette coefficient, whose naive computation requires a quadratic number of distance calculations, which is clearly unfeasible for massive datasets. Surprisingly, there are no known general methods to efficiently approximate the silhouette coefficient of a clustering with rigorously provable high accuracy. In this paper, we present the first scalable algorithm to compute such a rigorous approximation for the evaluation of clusterings based on any metric distances. Our algorithm hinges on a Probability Proportional to Size (PPS) sampling scheme, and, for any fixed $\varepsilon, \delta \in (0,1)$, it approximates the silhouette coefficient within a mere additive error $O(\varepsilon)$ with probability $1-\delta$, using a very small number of distance calculations. We also prove that the algorithm can be adapted to obtain rigorous approximations of other internal measures of clustering quality, such as cohesion and separation. Importantly, we provide a distributed implementation of the algorithm using the MapReduce model, which runs in constant rounds and requires only sublinear local space at each worker, which makes our estimation approach applicable to big data scenarios. We perform an extensive experimental evaluation of our silhouette approximation algorithm, comparing its performance to a number of baseline heuristics on real and synthetic datasets. The experiments provide evidence that, unlike other heuristics, our estimation strategy not only provides tight theoretical guarantees but is also able to return highly accurate estimations while running in a fraction of the time required by the exact computation, and that its distributed implementation is highly scalable, thus enabling the computation of internal measures for very large datasets for which the exact computation is prohibitive., Comment: 16 pages, 4 tables, 1 figure
Published: 2020
Full Text: View/download PDF
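
For context on the quadratic cost mentioned in the abstract above, the following sketch computes the exact silhouette coefficient naively. It is not the paper's PPS-sampling approximation; it assumes Euclidean distance, at least two clusters, and the common convention that a point in a singleton cluster scores 0. The function name and sample data are illustrative.

```python
import math

def silhouette(points, labels):
    """Exact mean silhouette; performs n*(n-1) distance evaluations."""
    n = len(points)
    scores = []
    for i in range(n):
        sums, counts = {}, {}
        for j in range(n):
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            sums[labels[j]] = sums.get(labels[j], 0.0) + d
            counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if own not in counts:
            # Singleton cluster: silhouette defined as 0 by convention.
            scores.append(0.0)
            continue
        a = sums[own] / counts[own]                                # mean intra-cluster distance
        b = min(sums[c] / counts[c] for c in sums if c != own)     # nearest other cluster
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / n

pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(pts, [0, 0, 1, 1])   # well-separated clustering
bad = silhouette(pts, [0, 1, 0, 1])    # clusters cut across the natural groups
```

The nested loop over all pairs is what becomes prohibitive for massive datasets and what a sampling-based estimate avoids.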

20. hyper-sinh: An Accurate and Reliable Function from Shallow to Deep Learning in TensorFlow and Keras

Author: Luca Parisi, Matteo Lanzillotta, Narrendar RaviChandran, and Renfei Ma
Subjects: FOS: Computer and information sciences, 68T07, 68T10, 68T45, 68T50, 68U35, Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer science, TensorFlow, Reliability (computer networking), Computer Vision and Pattern Recognition (cs.CV), Activation function, Computer Science - Computer Vision and Pattern Recognition, Activation, Convolutional Neural Network, I.2.1, I.2.7, I.2.10, I.4.9, I.5.1, I.5.4, I.5.5, Convolutional neural network, Machine Learning (cs.LG), Long short-term memory, Neural and Evolutionary Computing (cs.NE), computer.programming_language, Computer Science - Computation and Language, business.industry, Deep learning, Supervised learning, Computer Science - Neural and Evolutionary Computing, Pattern recognition, QA75.5-76.95, Python (programming language), Recurrent neural network, Artificial Intelligence (cs.AI), Electronic computers. Computer science, Benchmark (computing), Q300-390, Artificial intelligence, business, computer, Cybernetics, Computation and Language (cs.CL), Keras
Abstract: This paper presents the 'hyper-sinh', a variation of the m-arcsinh activation function suitable for Deep Learning (DL)-based algorithms for supervised learning, such as Convolutional Neural Networks (CNN). hyper-sinh, developed in the open source Python libraries TensorFlow and Keras, is thus described and validated as an accurate and reliable activation function for both shallow and deep neural networks. Improvements in accuracy and reliability in image and text classification tasks on five (N = 5) benchmark data sets available from Keras are discussed. Experimental results demonstrate the overall competitive classification performance of both shallow and deep neural networks, obtained via this novel function. This function is evaluated with respect to gold standard activation functions, demonstrating its overall competitive accuracy and reliability for both image and text classification., Comment: 19 pages, 6 listings/Python code snippets, 4 figures, 5 tables
Published: 2020
Full Text: View/download PDF

21. Static analysis of executable files by machine learning methods

Author: Prudkovskiy, Nikolay
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Cryptography and Security, I.2.6, I.5.4, I.2.8, I.5.5, G.3, 62H30, 68T05, 68T09, 68W40, 91B12, Machine Learning (stat.ML), Machine Learning (cs.LG), Statistics - Machine Learning, Cryptography and Security (cs.CR)
Abstract: The paper describes how to detect malicious executable files based on static analysis of their binary content. The stages of pre-processing and cleaning data extracted from different areas of executable files are analyzed. Methods of encoding categorical attributes of executable files are considered, as are ways to reduce the feature field dimension and select characteristic features in order to effectively represent samples of binary executable files for further training classifiers. An ensemble training approach was applied in order to aggregate forecasts from each classifier, and an ensemble of classifiers of various feature groups of executable file attributes was created in order to subsequently develop a system for detecting malicious files in an uninsulated environment., Comment: 36 pages, 13 figures, 6 tables
Published: 2020
Full Text: View/download PDF

22. Large-Scale Location-Aware Services in Access: Hierarchical Building/Floor Classification and Location Estimation Using Wi-Fi Fingerprinting Based on Deep Neural Networks

Author: Haowei Song, Zikun Tan, Ruihao Wang, Zhenghang Zhong, Jaehoon Cha, Kyeong Soo Kim, and Sanghyuk Lee
Subjects: FOS: Computer and information sciences, Computer science, RSS, Feature vector, 02 engineering and technology, computer.software_genre, Computer Science - Networking and Internet Architecture, Multiclass classification, C.2.1, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Architecture, Networking and Internet Architecture (cs.NI), Multi-label classification, Service system, I.5.1, I.5.2, I.2.6, business.industry, Visitor pattern, I.5.4, Deep learning, Dimensionality reduction, I.5.5, Pattern recognition, computer.file_format, Autoencoder, Atomic and Molecular Physics, and Optics, Electronic, Optical and Magnetic Materials, Scalability, 020201 artificial intelligence & image processing, Data mining, Artificial intelligence, business, computer, Classifier (UML)
Abstract: One of the key technologies for future large-scale location-aware services in access is a scalable indoor localization technique. In this paper, we report preliminary results from our investigation on the use of deep neural networks (DNNs) for hierarchical building/floor classification and floor-level location estimation based on Wi-Fi fingerprinting, which we carried out as part of a feasibility study project on the Xi'an Jiaotong-Liverpool University (XJTLU) Campus Information and Visitor Service System. To take into account the hierarchical nature of the building/floor classification problem, we propose a new DNN architecture based on a stacked autoencoder for the reduction of feature space dimension and a feed-forward classifier for multi-label classification with argmax functions to convert multi-label classification results into multi-class classification ones. We also describe the demonstration of a prototype DNN-based indoor localization system for floor-level location estimation using real received signal strength (RSS) data collected at one of the buildings on the XJTLU campus. The preliminary results for both building/floor classification and floor-level location estimation clearly show the strengths of DNN-based approaches, which can provide near state-of-the-art performance with less parameter tuning and higher scalability., Comment: 5 pages, 6 figures, FOAN 2017 (Munich, Germany, Oct. 2017)
Published: 2018

23. DEEP BV: A FULLY AUTOMATED SYSTEM FOR BRAIN VENTRICLE LOCALIZATION AND SEGMENTATION IN 3D ULTRASOUND IMAGES OF EMBRYONIC MICE

Author: Daniel H. Turnbull, Nitin Nair, Jack Langerman, Ziming Qiu, Jeffrey A. Ketterling, Orlando Aristizabal, Jonathan Mamou, and Yao Wang
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, J.3, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (stat.ML), Convolutional neural network, Quantitative Biology - Quantitative Methods, Article, Machine Learning (cs.LG), 030218 nuclear medicine & medical imaging, 03 medical and health sciences, 0302 clinical medicine, Minimum bounding box, Statistics - Machine Learning, Sliding window protocol, FOS: Electrical engineering, electronic engineering, information engineering, medicine, 3D ultrasound, Segmentation, Quantitative Methods (q-bio.QM), Brain Ventricle, I.5.1, medicine.diagnostic_test, business.industry, I.2.6, Deep learning, I.5.4, Image and Video Processing (eess.IV), I.4.6, I.5.5, Pattern recognition, Electrical Engineering and Systems Science - Image and Video Processing, Fully automated, FOS: Biological sciences, Artificial intelligence, business, 030217 neurology & neurosurgery
Abstract: Volumetric analysis of brain ventricle (BV) structure is a key tool in the study of central nervous system development in embryonic mice. High-frequency ultrasound (HFU) is the only non-invasive, real-time modality available for rapid volumetric imaging of embryos in utero. However, manual segmentation of the BV from HFU volumes is tedious, time-consuming, and requires specialized expertise. In this paper, we propose a novel deep-learning-based BV segmentation system for whole-body HFU images of mouse embryos. Our fully automated system consists of two modules: localization and segmentation. It first applies a volumetric convolutional neural network on a 3D sliding window over the entire volume to identify a 3D bounding box containing the entire BV. It then employs a fully convolutional network to segment the detected bounding box into BV and background. The system achieves a Dice Similarity Coefficient (DSC) of 0.8956 for BV segmentation on an unseen test set of 111 HFU volumes, surpassing the previous state-of-the-art method (DSC of 0.7119) by a margin of 25%., Comment: IEEE Signal Processing in Medicine and Biology Symposium - 2018, 6 pages, 5 figures
Published: 2019
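
The Dice Similarity Coefficient reported in the abstract above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch follows, assuming flattened binary masks and treating two empty masks as a perfect match (a convention chosen here, not taken from the paper). The last lines check the arithmetic behind the reported margin, which appears to be relative rather than absolute.

```python
def dice(pred, truth):
    """Dice similarity coefficient for two equal-length binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total

# Toy masks, for illustration only.
example = dice([1, 1, 0, 0], [1, 0, 0, 0])

# Figures quoted in the abstract.
new_dsc, old_dsc = 0.8956, 0.7119
absolute_gain = new_dsc - old_dsc                 # difference in DSC points
relative_gain = (new_dsc - old_dsc) / old_dsc     # improvement relative to the old score
```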

24. A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks

Author: Tarek El-Ghazawi, Mario Miscuglio, Volker J. Sorger, Armin Mehrabian, and Yousra Alkabani
Subjects: I.2, Signal Processing (eess.SP), FOS: Computer and information sciences, Computer Science - Machine Learning, I.4, I.5, Computer science, C.5, B.7, Computer Science - Emerging Technologies, 02 engineering and technology, I.6, Convolutional neural network, Convolution, Machine Learning (cs.LG), C.1.4, C.1.2, 020210 optoelectronics & photonics, Wavelength-division multiplexing, Specialization (functional), 0202 electrical engineering, electronic engineering, information engineering, FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Signal Processing, Electrical and Electronic Engineering, I.2.10, I.2.11, I.2.5, I.5.2, Artificial neural network, business.industry, I.5.4, I.6.3, I.5.5, Atomic and Molecular Physics, and Optics, Power (physics), Emerging Technologies (cs.ET), Computer Science - Distributed, Parallel, and Cluster Computing, Computer engineering, C.1, Distributed, Parallel, and Cluster Computing (cs.DC), C.3, Photonics, business, B.0, Efficient energy use
Abstract: Neural Networks (NNs) have become the mainstream technology in the artificial intelligence (AI) renaissance over the past decade. Among different types of neural networks, convolutional neural networks (CNNs) have been widely adopted as they have achieved leading results in many fields such as computer vision and speech recognition. This success is due in part to the widespread availability of capable underlying hardware platforms. Applications have always been a driving factor for the design of such hardware architectures. Hardware specialization can expose us to novel architectural solutions, which can outperform general-purpose computers for the tasks at hand. Although different applications demand different performance measures, they all share speed and energy efficiency as high priorities. Meanwhile, photonics processing has seen a resurgence due to its inherent high speed and low power nature. Here, we investigate the potential of using photonics in CNNs by proposing a CNN accelerator design based on the Winograd filtering algorithm. Our evaluation results show that while a photonic accelerator can compete with current state-of-the-art electronic platforms in terms of both speed and power, it has the potential to improve the energy efficiency by up to three orders of magnitude., Comment: 12 pages, photonics, artificial intelligence, convolutional neural networks, Winograd
Published: 2019
Full Text: View/download PDF
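
As background on the Winograd filtering algorithm named in the abstract above, the sketch below shows the one-dimensional case F(2,3): two outputs of a 3-tap filter computed with four multiplications instead of six. It illustrates the general technique only and is not the paper's photonic design; the function names and sample values are illustrative.

```python
def winograd_f23(d, g):
    """Two outputs of a 3-tap filter g over a 4-element tile d, using 4 multiplications."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform: depends only on g, so it can be computed once per filter.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2
    # The four element-wise multiplications.
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3
    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]

def direct_f23(d, g):
    """Reference: the same two outputs by direct sliding dot products (6 multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]

tile, taps = [1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0]
fast = winograd_f23(tile, taps)
slow = direct_f23(tile, taps)
```

The saving in multiplications comes at the cost of extra additions, a trade that tends to favor hardware in which multiplication is the expensive operation.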

25. Significance of parallel computing on the performance of Digital Image Correlation algorithms in MATLAB

Author: Thoma, Andreas and Ravi, Sridhar
Subjects: Performance (cs.PF), FOS: Computer and information sciences, G.1.6, I.5.5, J.2, Computer Science - Performance, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: Digital Image Correlation (DIC) is a powerful tool used to evaluate displacements and deformations in a non-intrusive manner. By comparing two images, one of the undeformed reference state of a specimen and another of the deformed target state, the relative displacement between those two states is determined. DIC is well known and often used for post-processing analysis of in-plane displacements and deformation of specimens. Increasing the analysis speed to enable real-time DIC analysis will be beneficial and extend the field of use of this technique. Here we tested several combinations of the most common DIC methods with different parallelization approaches in MATLAB and evaluated their performance to determine whether real-time analysis is possible with these methods. To reflect improvements in computing technology different hardware settings were also analysed. We found that implementation problems can reduce the efficiency of a theoretically superior algorithm such that it becomes practically slower than a sub-optimal algorithm. The Newton-Raphson algorithm in combination with a modified Particle Swarm algorithm in parallel image computation was found to be most effective. This is contrary to theory, which suggests that the inverse-compositional Gauss-Newton algorithm is superior. As expected, the Brute Force Search algorithm is the least effective method. We also found that the correct choice of parallelization tasks is crucial to achieve improvements in computing speed. A poorly chosen parallelization approach with high parallel overhead leads to inferior performance. Finally, irrespective of the computing mode the correct choice of combinations of integer-pixel and sub-pixel search algorithms is decisive for an efficient analysis. With currently available hardware, real-time analysis at high frame rates remains an aspiration., Comment: 17 pages, 5 figures, 6 tables
Published: 2019
Full Text: View/download PDF

26. Movement Coordination in Human–Robot Teams: A Dynamical Systems Approach

Author: Laurel D. Riek, Samantha Rack, and Tariq Iqbal
Subjects: FOS: Computer and information sciences, 0209 industrial biotechnology, Dynamical systems theory, Computer Science - Artificial Intelligence, I.2.9, I.2.11, H.5.3, I.5.5, J.5, Computer science, Context (language use), 02 engineering and technology, Motion (physics), Human–robot interaction, Computer Science - Robotics, 03 medical and health sciences, 020901 industrial engineering & automation, 0302 clinical medicine, Human–computer interaction, Electrical and Electronic Engineering, business.industry, Mobile robot, Robotics, Computer Science Applications, Artificial Intelligence (cs.AI), Control and Systems Engineering, Anticipation (artificial intelligence), Robot, Artificial intelligence, business, Robotics (cs.RO), 030217 neurology & neurosurgery
Abstract: In order to be effective teammates, robots need to be able to understand high-level human behavior to recognize, anticipate, and adapt to human motion. We have designed a new approach to enable robots to perceive human group motion in real-time, anticipate future actions, and synthesize their own motion accordingly. We explore this within the context of joint action, where humans and robots move together synchronously. In this paper, we present an anticipation method which takes high-level group behavior into account. We validate the method within a human-robot interaction scenario, where an autonomous mobile robot observes a team of human dancers, and then successfully and contingently coordinates its movements to "join the dance". We compared the results of our anticipation method to move the robot with another method which did not rely on high-level group behavior, and found our method performed better both in terms of more closely synchronizing the robot's motion to the team, and also exhibiting more contingent and fluent motion. These findings suggest that the robot performs better when it has an understanding of high-level group behavior than when it does not. This work will help enable others in the robotics community to build more fluent and adaptable robots in the future., Comment: 11 pages, 7 figures, IEEE Transactions on Robotics 2016 preprint
Published: 2016

27. Naive Dictionary On Musical Corpora: From Knowledge Representation To Pattern Recognition

Author: Wu, Qiuyi and Fokoue, Ernest
Subjects: I.1.3, FOS: Computer and information sciences, Computer Science - Machine Learning, I.1.4, Sound (cs.SD), I.2.4, I.7.0, E.2, I.2.6, I.5.5, Machine Learning (stat.ML), F.1.1, F.2.0, I.2.1, Computer Science - Sound, Computer Science - Information Retrieval, Machine Learning (cs.LG), 62P15, 62P25, 62P99, 68W40, 68W01, 91E10, 91E45, 82-08, 62-07, Statistics - Machine Learning, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Information Retrieval (cs.IR), Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: In this paper, we propose and develop the novel idea of treating musical sheets as literary documents in the traditional text analytics parlance, to fully benefit from the vast amount of research already existing in statistical text mining and topic modelling. We specifically introduce the idea of representing any given piece of music as a collection of "musical words" that we codenamed "muselets", which are essentially musical words of various lengths. Given the novelty and therefore the extreme difficulty of properly forming a complete version of a dictionary of muselets, the present paper focuses on a simpler albeit naive version of the ultimate dictionary, which we refer to as a Naive Dictionary because all the words are of the same length. We specifically herein construct a naive dictionary featuring a corpus made up of African American, Chinese, Japanese and Arabic music, on which we perform both topic modelling and pattern recognition. Although some of the results based on the Naive Dictionary are reasonably good, we anticipate phenomenal predictive performances once we get around to actually building a full scale complete version of our intended dictionary of muselets., Comment: 25 pages
Published: 2018
Full Text: View/download PDF

28. Robots Learning to Say 'No': Prohibition and Rejective Mechanisms in Acquisition of Linguistic Negation

Author: Joe Saunders, Frank Förster, Chrystopher L. Nehaniv, and Hagen Lehmann
Subjects: FOS: Computer and information sciences, Computer Science - Artificial Intelligence, Developmental robotics, 02 engineering and technology, Settore M-PED/04 - Pedagogia Sperimentale, 050105 experimental psychology, Human–robot interaction, H.5.2, Computer Science - Robotics, Negation, Artificial Intelligence, Volition (linguistics), 0202 electrical engineering, electronic engineering, information engineering, 0501 psychology and cognitive sciences, Computer Science - Computation and Language, I.2.6, I.2.7, I.5.5, 05 social sciences, Language acquisition, Linguistics, Human-Computer Interaction, Symbol grounding, Artificial Intelligence (cs.AI), Salient, 020201 artificial intelligence & image processing, Psychology, Computation and Language (cs.CL), Robotics (cs.RO), iCub
Abstract: 'No' belongs to the first ten words used by children and embodies the first active form of linguistic negation. Despite its early occurrence the details of its acquisition process remain largely unknown. The circumstance that 'no' cannot be construed as a label for perceptible objects or events puts it outside of the scope of most modern accounts of language acquisition. Moreover, most symbol grounding architectures will struggle to ground the word due to its non-referential character. In an experimental study involving the child-like humanoid robot iCub that was designed to illuminate the acquisition process of negation words, the robot is deployed in several rounds of speech-wise unconstrained interaction with naïve participants acting as its language teachers. The results corroborate the hypothesis that affect or volition plays a pivotal role in the socially distributed acquisition process. Negation words are prosodically salient within prohibitive utterances and negative intent interpretations such that they can be easily isolated from the teacher's speech signal. These words subsequently may be grounded in negative affective states. However, observations of the nature of prohibitive acts and the temporal relationships between its linguistic and extra-linguistic components raise serious questions over the suitability of Hebbian-type algorithms for language grounding., Comment: Submitted journal article. 21 pages main paper plus 28 pages supplementary information / appendix. 8 figures in main paper
Published: 2018
Full Text: View/download PDF

29. VideoKifu, or the automatic transcription of a Go game

Author: Corsolini, Mario and Carta, Andrea
Subjects: FOS: Computer and information sciences, I.2.10, Computer Vision and Pattern Recognition (cs.CV), I.5.5, Computer Science - Computer Vision and Pattern Recognition, I.4.8
Abstract: In two previous papers [arXiv:1508.03269, arXiv:1701.05419] we described the techniques we employed for reconstructing the whole move sequence of a Go game. That task was at first accomplished by means of a series of photographs, manually shot, as explained during the scientific conference held within the LIX European Go Congress (Liberec, CZ). The photographs were subsequently replaced by a possibly unattended video live stream (provided by webcams, videocameras, smartphones and so on) or, were the live stream not available, by means of a pre-recorded video of the game itself, on condition that the goban and the stones were clearly visible more often than not. As we hinted in the latter paper, in the last two years we have improved the algorithms employed both for reconstructing the grid and for detecting the stones, making extensive use of the multicore capabilities offered by modern CPUs. Those capabilities prompted us to develop some asynchronous routines, capable of double-checking the position of the grid and the number and colour of any stone previously detected, in order to get rid of minor errors that may have occurred during the main analysis and that may pass undetected, especially in the course of an unattended live stream. Those routines will be described in detail, as they address some problems that are of general interest when reconstructing the move sequence, for example what to do when large movements of the whole goban occur (deliberate or not) and how to deal with captures of dead stones, which could be wrongly detected and recorded as "fresh" moves if not promptly removed., Comment: 14 pages, 6 figures. Accepted for the "International Conference on Research in Mind Games" (August 7-8, 2018) at the EGC in Pisa, Italy. Datasets available from http://www.oipaz.net/VideoKifu.html
Published: 2018
Full Text: View/download PDF

30. Moving to VideoKifu: the last steps toward a fully automatic record-keeping of a Go game

Author: Corsolini, Mario and Carta, Andrea
Subjects: FOS: Computer and information sciences, I.2.10, Computer Vision and Pattern Recognition (cs.CV), I.5.5, Computer Science - Computer Vision and Pattern Recognition, I.4.8
Abstract: In a previous paper [arXiv:1508.03269] we described the techniques we successfully employed for automatically reconstructing the whole move sequence of a Go game by means of a set of pictures. Now we describe how it is possible to reconstruct the move sequence by means of a video stream (which may be provided by an unattended webcam), possibly in real-time. Although the basic algorithms remain the same, we will discuss the new problems that arise when dealing with videos, with special care for the ones that could block a real-time analysis and require an improvement of our previous techniques or even a completely new approach. Finally, we present a number of preliminary but positive experimental results supporting the effectiveness of the software we are developing, built on the ideas outlined here., Comment: 20 pages, 14 figures. Accepted for publication in the "Journal of Baduk Studies", datasets available from http://www.oipaz.net/PhotoKifu.html
Published: 2017
Full Text: View/download PDF

31. UI-Net: Interactive Artificial Neural Networks for Iterative Image Segmentation Based on a User Model

Author: Amrehn, Mario, Gaube, Sven, Unberath, Mathias, Schebesch, Frank, Horz, Tim, Strumia, Maddalena, Steidl, Stefan, Kowarschik, Markus, and Maier, Andreas
Subjects: FOS: Computer and information sciences, Computer Science - Artificial Intelligence, I.2.6, Computer Vision and Pattern Recognition (cs.CV), I.4.6, I.5.5, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing, 68T05, 68T45, Machine Learning (cs.LG), Computer Science - Learning, Artificial Intelligence (cs.AI), Neural and Evolutionary Computing (cs.NE)
Abstract: For complex segmentation tasks, fully automatic systems are inherently limited in their achievable accuracy for extracting relevant objects. Especially in cases where only a few data sets need to be processed for a highly accurate result, semi-automatic segmentation techniques exhibit a clear benefit for the user. One area of application is medical image processing during an intervention for a single patient. We propose a learning-based cooperative segmentation approach which includes the computing entity as well as the user in the task. Our system builds upon a state-of-the-art fully convolutional artificial neural network (FCN) as well as an active user model for training. During the segmentation process, a user of the trained system can iteratively add hints in the form of pictorial scribbles as seed points into the FCN system to achieve an interactive and precise segmentation result. The segmentation quality of interactive FCNs is evaluated. Iterative FCN approaches can yield superior results compared to networks without the user input channel component, due to a consistent improvement in segmentation quality after each interaction., Comment: This work is submitted to the 2017 Eurographics Workshop on Visual Computing for Biology and Medicine
Published: 2017
Full Text: View/download PDF

32. Synchronization Detection in Networks of Coupled Oscillators for Pattern Recognition

Author: Damien Querlioz, Julie Grollier, Damir Vodenicarevic, and Nicolas Locatelli
Subjects: FOS: Computer and information sciences, Signal processing, B.8.1, Artificial neural network, Computer science, Oscillation, Noise (signal processing), I.5.5, Computer Science - Emerging Technologies, 02 engineering and technology, 021001 nanoscience & nanotechnology, Synchronization, Emerging Technologies (cs.ET), CMOS, Synchronization (computer science), Pattern recognition (psychology), 0202 electrical engineering, electronic engineering, information engineering, Electronic engineering, 020201 artificial intelligence & image processing, State (computer science), 0210 nano-technology, Realization (systems), Electronic circuit
Abstract: Coupled oscillator-based networks are an attractive approach for implementing hardware neural networks based on emerging nanotechnologies. However, the readout of the state of a coupled oscillator network is a difficult challenge in hardware implementations, as it necessitates complex signal processing to evaluate the degree of synchronization between oscillators, possibly more complicated than the coupled oscillator network itself. In this work, we focus on a coupled oscillator network particularly adapted to emerging technologies, and evaluate two schemes for reading synchronization patterns that can be readily implemented with basic CMOS circuits. Through simulation of a simple generic coupled oscillator network, we compare the operation of these readout techniques with a previously proposed full statistics evaluation scheme. Our approaches provide results nearly identical to the mathematical method, but also show better resilience to moderate noise, which is a major concern for hardware implementations. These results open the door to widespread realization of hardware coupled oscillator-based neural systems., Comment: Accepted to 2016 IEEE World Congress on Computational Intelligence 8 pages, 8 figures
Published: 2016
Full Text: View/download PDF

33. Hardware Architecture for Large Parallel Array of Random Feature Extractors applied to Image Recognition

Author: Shanlan Shen, Aakash Patil, Enyi Yao, and Arindam Basu
Subjects: FOS: Computer and information sciences, Adder, Computer science, Cognitive Neuroscience, Computer Science - Emerging Technologies, 02 engineering and technology, C.5.4, Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Computer vision, Pruning (decision trees), Neural and Evolutionary Computing (cs.NE), Extreme learning machine, Hardware architecture, C.3, I.5.5, Artificial neural network, business.industry, 020208 electrical & electronic engineering, Computer Science - Neural and Evolutionary Computing, Computer Science Applications, Emerging Technologies (cs.ET), Feature (computer vision), 020201 artificial intelligence & image processing, Artificial intelligence, business, MNIST database
Abstract: We demonstrate a low-power and compact hardware implementation of a Random Feature Extractor (RFE) core. With complex tasks like image recognition requiring a large set of features, we show how a weight reuse technique can virtually expand the random features available from the RFE core. Further, we show how to avoid the computation cost wasted on propagating "incognizant" or redundant random features. As a proof of concept, we validated our approach by using our RFE core as the first stage of an Extreme Learning Machine (ELM)--a two-layer neural network--and were able to achieve $>97\%$ accuracy on the MNIST database of handwritten digits. ELM's first stage of RFE is done on an analog ASIC occupying $5$mm$\times5$mm area in $0.35\mu$m CMOS and consuming $5.95$ $\mu$J/classify while using $\approx 5000$ effective hidden neurons. The ELM second stage, consisting of just adders, can be implemented as a digital circuit with estimated power consumption of $20.9$ nJ/classify. With a total energy consumption of only $5.97$ $\mu$J/classify, this low-power mixed-signal ASIC can act as a co-processor in portable electronic gadgets with cameras., Comment: Submitted for ELM special issue in Neurocomputing, 18 pages, 7 figures, 3 tables. ACM class: "Hardware/Emerging Technologies"
Published: 2015
Full Text: View/download PDF

34. Delegating Custom Object Detection Tasks to a Universal Classification System

Author: Gleibman, Andrew
Subjects: FOS: Computer and information sciences, I.2.10, I.5, I.5.2, I.4.7, I.4.8, I.4.9, I.5.4, I.5.5, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, 68T10
Abstract: In this paper, the concept of a multipurpose object detection system, recently introduced in our previous work, is clarified. The business aspect of this method is the transformation of a classifier into an object detector/locator via an image grid. This is a universal framework for locating objects of interest through classification. The framework standardizes and simplifies the implementation of custom systems by requiring only a custom analysis of the classification results on the image grid., Comment: 3 pages, 2 figures, 6 refs. arXiv admin note: substantial text overlap with arXiv:1310.7170
Published: 2014
Full Text: View/download PDF

35. Squiggle - A Glyph Recognizer for Gesture Input

Author: Lee, Jeremy
Subjects: H.5.2, FOS: Computer and information sciences, I.4.7, I.5.5, G.1.3, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Human-Computer Interaction, Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, ComputingMethodologies_COMPUTERGRAPHICS, Human-Computer Interaction (cs.HC)
Abstract: Squiggle is a template-based glyph recognizer in the lineage of the '$1 Recognizer' and 'Protractor'. It seeks a good-fit linear affine mapping between the input and template glyphs, which are represented as lists of milestone points along the glyph path. The algorithm can recognize input glyphs invariant to rotation, scaling, skew, and reflection symmetries. In practice the algorithm is fast and robust enough to recognize user-generated glyphs as they are being drawn in real time, and to project 'shadows' of the matching templates as feedback., Comment: 10 pages
Published: 2011
Full Text: View/download PDF

36. The Cyborg Astrobiologist: First Field Experience

Author: Markus Oesker, Javier Gómez-Elvira, Joerg Ontrup, Patrick C. McGuire, Helge Ritter, Enrique Díaz-Martínez, José Antonio Rodríguez-Manfredi, Jens Ormö, Ministerio de Ciencia y Tecnología (España), and CSIC-INTA - Centro de Astrobiología (CAB)
Subjects: FOS: Computer and information sciences, Physics and Astronomy (miscellaneous), Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Computer Science - Human-Computer Interaction, Wearable computer, Astrophysics, Computational Engineering, Finance, and Science (cs.CE), Computer Science - Software Engineering, Software, Computer graphics (images), Earth and Planetary Sciences (miscellaneous), Wearable computers, Computer Science - Computational Engineering, Finance, and Science, Function (engineering), Interest map, media_common, I.2.10, Image segmentation, I.5.4, Co-occurrence histograms, I.5.5, Astrophysics (astro-ph), Robotics, I.4.8, I.4.6, I.4.0, I.2.9, J.2, I.4.9, Neurons and Cognition (q-bio.NC), Uncommon map, Robotics (cs.RO), Computer Science - Artificial Intelligence, media_common.quotation_subject, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, FOS: Physical sciences, Mars, Human-Computer Interaction (cs.HC), Image (mathematics), Computer Science - Robotics, Field experience, Software system, Ecology, Evolution, Behavior and Systematics, business.industry, Miocene, Gypsum, Software Engineering (cs.SE), Artificial Intelligence (cs.AI), Space and Planetary Science, Quantitative Biology - Neurons and Cognition, FOS: Biological sciences, Computer vision, business, Field Geology
Abstract: We present results from the first geological field tests of the 'Cyborg Astrobiologist', which is a wearable computer and video camcorder system that we are using to test and train a computer-vision system towards having some of the autonomous decision-making capabilities of a field-geologist and field-astrobiologist. The Cyborg Astrobiologist platform has thus far been used for testing and development of these algorithms and systems: robotic acquisition of quasi-mosaics of images, real-time image segmentation, and real-time determination of interesting points in the image mosaics. The hardware and software systems function reliably, and the computer-vision algorithms are adequate for the first field tests. In addition to the proof-of-concept aspect of these field tests, the main result of these field tests is the enumeration of those issues that we can improve in the future, including: first, detection and accounting for shadows caused by 3D jagged edges in the outcrop; second, reincorporation of more sophisticated texture-analysis algorithms into the system; third, creation of hardware and software capabilities to control the camera's zoom lens in an intelligent manner; and fourth, development of algorithms for interpretation of complex geological scenery. Nonetheless, despite these technical inadequacies, this Cyborg Astrobiologist system, consisting of a camera-equipped wearable-computer and its computer-vision algorithms, has demonstrated its ability of finding genuinely interesting points in real-time in the geological scenery, and then gathering more information about these interest points in an automated manner., Comment: 29 pages, 10 figures. Final editor version available at: http://dx.doi.org/10.1017/S147355040500220X. P. McGuire, J. Ormö and E. Díaz Martínez would all like to thank the Ramon y Cajal Fellowship program in Spain. The work by J. Ormö was partially supported by a grant from the Spanish Ministry for Science and Technology (AYA2003-01203). The equipment used in this work was purchased by grants to our Center for Astrobiology from its sponsoring research organizations, CSIC and INTA.
Published: 2004

37. Programming in Alma-0, or Imperative and Declarative Programming Reconciled

Author: Apt, Krzysztof R. and Schaerf, Andrea
Subjects: FOS: Computer and information sciences, Computer Science - Logic in Computer Science, Computer Science - Programming Languages, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, D.3.2, I.2.8, I.5.5, F.3.2, F.3.3, Logic in Computer Science (cs.LO), Programming Languages (cs.PL)
Abstract: In (Apt et al, TOPLAS 1998) we introduced the imperative programming language Alma-0 that supports declarative programming. In this paper we illustrate the hybrid programming style of Alma-0 by means of various examples that complement those presented in (Apt et al, TOPLAS 1998). The presented Alma-0 programs illustrate the versatility of the language and show that "don't know" nondeterminism can be naturally combined with assignment., Comment: With updated references with respect to the published version
Published: 2000
Full Text: View/download PDF

38. The Cyborg Astrobiologist: scouting red beds for uncommon features with geological significance

Author: Eduardo Sebastián-Martínez, Enrique Díaz-Martínez, Jens Ormö, Markus Oesker, Javier Gómez-Elvira, Jörg Ontrup, Helge Ritter, Robert Haschke, Patrick C. McGuire, and Jose Antonio Rodriguez-Manfredi
Subjects: FOS: Computer and information sciences, J.2, Physics - Instrumentation and Detectors, J.3, Physics and Astronomy (miscellaneous), Computer Science - Artificial Intelligence, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Computer Science - Human-Computer Interaction, Geochemistry, FOS: Physical sciences, Red sandstone, Astrophysics, Human-Computer Interaction (cs.HC), Computational Engineering, Finance, and Science (cs.CE), Computer Science - Robotics, Computer Science - Software Engineering, Earth and Planetary Sciences (miscellaneous), Computer Science - Computational Engineering, Finance, and Science, Ecology, Evolution, Behavior and Systematics, I.2.10, I.4.6, I.4.8, I.4.9, I.2.9, I.5.4, I.5.5, D.2, D.1.7, D.4.7, Red beds, Astrophysics (astro-ph), Instrumentation and Detectors (physics.ins-det), Paleosol, Software Engineering (cs.SE), Artificial Intelligence (cs.AI), Space and Planetary Science, Homogeneous, Quantitative Biology - Neurons and Cognition, FOS: Biological sciences, Neurons and Cognition (q-bio.NC), Robotics (cs.RO), Geologist
Abstract: The 'Cyborg Astrobiologist' (CA) has undergone a second geological field trial, at a red sandstone site in northern Guadalajara, Spain, near Riba de Santiuste. The Cyborg Astrobiologist is a wearable computer and video camera system that has demonstrated a capability to find uncommon interest points in geological imagery in real-time in the field. The first (of three) geological structures that we studied was an outcrop of nearly homogeneous sandstone, which exhibits oxidized-iron impurities in red and an absence of these iron impurities in white. The white areas in these "red beds" have turned white because the iron has been removed by chemical reduction, perhaps by a biological agent. The computer vision system found in one instance several (iron-free) white spots to be uncommon and therefore interesting, as well as several small and dark nodules. The second geological structure contained white, textured mineral deposits on the surface of the sandstone, which were found by the CA to be interesting. The third geological structure was a 50 cm thick paleosol layer, with fossilized root structures of some plants, which were found by the CA to be interesting. A quasi-blind comparison of the Cyborg Astrobiologist's interest points for these images with the interest points determined afterwards by a human geologist shows that the Cyborg Astrobiologist concurred with the human geologist 68% of the time (true positive rate), with a 32% false positive rate and a 32% false negative rate. (abstract has been abridged)., Comment: to appear in Int'l J. Astrobiology, vol.4, iss.2 (June 2005); 19 pages, 7 figs
Published: 2005
