Author: "Hideki Asoh" / Language: english - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Hideki Asoh"' showing total 20 results

Start Over Author "Hideki Asoh" Language english

20 results on '"Hideki Asoh"'

1. Multi-Sensor Integration for Key-Frame Extraction From First-Person Videos

Author: Yujie Li, Atsunori Kanemura, Hideki Asoh, Taiki Miyanishi, and Motoaki Kawanabe
Subjects: Video summarization, multi-sensors, key-frame extraction, sparse estimation, graph model, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Key-frame extraction for first-person vision (FPV) videos is a core technology for selecting important scenes and memorizing impressive life experiences in our daily activities. The difficulty of selecting key frames is the scene instability caused by head-mounted cameras used for capturing FPV videos. Because head-mounted cameras tend to frequently shake, the frames in an FPV video are noisier than those in a third-person vision (TPV) video. However, most existing algorithms for key-frame extraction mainly focus on handling the stable scenes in TPV videos. The technical development of key-frame extraction techniques for noisy FPV videos is currently immature. Moreover, most key-frame extraction algorithms mainly use visual information from FPV videos, even though our visual experience in daily activities is associated with human motions. To incorporate the features of dynamically changing scenes in FPV videos into our methods, integrating motions with visual scenes is essential. In this paper, we propose a novel key-frame extraction method for FPV videos that uses multi-modal sensor signals to reduce noise and detect salient activities via projecting multi-modal sensor signals onto a common space by canonical correlation analysis (CCA). We show that the two proposed multi-sensor integration models for key-frame extraction (a sparse-based model and a graph-based model) work well on the common space. The experimental results obtained using various datasets suggest that the proposed key-frame extraction techniques improve the precision of extraction and the coverage of entire video sequences.
Published: 2020
Full Text: View/download PDF

2. Generating In-Between Images Through Learned Latent Space Representation Using Variational Autoencoders

Author: Paulino Cristovao, Hidemoto Nakada, Yusuke Tanimura, and Hideki Asoh
Subjects: Image interpolation, latent variables, representation learning, variational autoencoder, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Image interpolation is often implemented using one of two methods: optical flow or convolutional neural networks. These methods are typically pixel-based; they do not work well on objects between images far apart. Because they either rely on a simple frame average or pixel motion, they do not have the required knowledge of the semantic structure of the data. In this paper, we propose a method for image interpolation based on latent representations. We use a simple network structure based on a variational autoencoder and an adjustable hyperparameter that imposes the latent space distribution to generate accurate interpolation. To visualize the effects of the proposed approach, we evaluate a synthetic dataset. We demonstrate that our method outperforms both pixel-based methods and a conventional variational autoencoder, with particular improvements in nonsuccessive images.
Published: 2020
Full Text: View/download PDF

3. Supervised Saliency Mapping for First-Person Videos With an Inverse Sparse Coding Framework

Author: Yujie Li, Shotaro Akaho, Hideki Asoh, and Benying Tan
Subjects: inverse sparse coding, orthogonal matching pursuit (OMP), salience superpixel matrix, saliency mapping, supervised, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Saliency mapping is an efficient means of processing large amounts of incoming visual information from images and videos. Existing methods based on feature integration and construction error provide a rough saliency mapping and are sensitive to real-world noise. Furthermore, for first-person-vision video related to human activities, existing methods identify objects in the image without considering the human actions being performed. To address this issue, we propose a novel supervised saliency mapping based on a sparse coding framework. Different from the normal sparse representation, which uses dictionary atoms to represent signals, we use the inverse expression, whereby the use original image or video frame signals provide the normal dictionary, and these signals are used to represent the salience superpixel matrix, which is learned from a class of training images. We then construct the salient map by inverse sparse coding. In this paper, we first describe how the salience superpixel matrix is extracted by supervised selection, and then explain how to obtain the salient map through an inverse sparse coding framework. The experimental results show the enhanced accuracy of the proposed method compared with previous methods, as well as the time efficiency of the proposed method.
Published: 2019
Full Text: View/download PDF

4. Independent Low-Rank Matrix Analysis-Based Automatic Artifact Reduction Technique Applied to Three BCI Paradigms

Author: Suguru Kanoga, Takayuki Hoshino, and Hideki Asoh
Subjects: electroencephalogram, brain–computer interface, independent component analysis, artifact reduction, independent low-rank matrix analysis, Neurosciences. Biological psychiatry. Neuropsychiatry, RC321-571
Abstract: Electroencephalogram (EEG)-based brain-computer interfaces (BCIs) can potentially enable people to non-invasively and directly communicate with others using brain activities. Artifacts generated from body activities (e.g., eyeblinks and teeth clenches) often contaminate EEGs and make EEG-based classification/identification hard. Although independent component analysis (ICA) is the gold-standard technique for attenuating the effects of such contamination, the estimated independent components are still mixed with artifactual and neuronal information because ICA relies only on the independence assumption. The same problem occurs when using independent vector analysis (IVA), an extended ICA method. To solve this problem, we designed an independent low-rank matrix analysis (ILRMA)-based automatic artifact reduction technique that clearly models sources from observations under the independence assumption and a low-rank nature in the frequency domain. For automatic artifact reduction, we combined the signal separation technique with an independent component classifier for EEGs named ICLabel. To assess the comparative efficiency of the proposed method, the discriminabilities of artifact-reduced EEGs using ICA, IVA, and ILRMA were determined using an open-access EEG dataset named OpenBMI, which contains EEG data obtained through three BCI paradigms [motor-imagery (MI), event-related potential (ERP), and steady-state visual evoked potential (SSVEP)]. BCI performances were obtained using these three paradigms after applying artifact reduction techniques, and the results suggested that our proposed method has the potential to achieve higher discriminability than ICA and IVA for BCIs. In addition, artifact reduction using the ILRMA approach clearly improved (by over 70%) the averaged BCI performances using artifact-reduced data sufficiently for most needs of the BCI community. The extension of ICA families to supervised separation that leaves the discriminative ability would further improve the usability of BCIs for real-life environments in which artifacts frequently contaminate EEGs.
Published: 2020
Full Text: View/download PDF

5. Improved Surprise Adequacy Tools for Corner Case Data Description and Detection

Author: Tinghui Ouyang, Vicent Sanz Marco, Yoshinao Isobe, Hideki Asoh, Yutaka Oiwa, and Yoshiki Seo
Subjects: corner case data detection, surprise adequacy, modified distanced-based SA, AI quality testing, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Facing the increasing quantity of AI models applications, especially in life- and property-related fields, it is crucial for designers to construct safety- and security-critical systems. As a major factor affecting the safety of AI models, corner case data and its related description/detection techniques are important in the AI design phase and quality assurance. In this paper, inspired by surprise adequacy (SA), a tool having advantages on capture data behaviors, we developed three modified versions of distance-based-SA (DSA) for detecting corner cases in classification problems. Through the experiment analysis on MNIST, CIFAR, and industrial example data, the feasibility and usefulness of the proposed tools on corner case data detection are verified. Moreover, Qualitative and quantitative experiments validated that the developed DSA tools can achieve improved performance in describing corner cases’ behaviors.
Published: 2021
Full Text: View/download PDF

6. Segmenting Continuous Motions with Hidden Semi-markov Models and Gaussian Processes

Author: Tomoaki Nakamura, Takayuki Nagai, Daichi Mochihashi, Ichiro Kobayashi, Hideki Asoh, and Masahide Kaneko
Subjects: motion segmentation, Gaussian process, hidden semi-Markov model, motion capture data, Neurosciences. Biological psychiatry. Neuropsychiatry, RC321-571
Abstract: Humans divide perceived continuous information into segments to facilitate recognition. For example, humans can segment speech waves into recognizable morphemes. Analogously, continuous motions are segmented into recognizable unit actions. People can divide continuous information into segments without using explicit segment points. This capacity for unsupervised segmentation is also useful for robots, because it enables them to flexibly learn languages, gestures, and actions. In this paper, we propose a Gaussian process-hidden semi-Markov model (GP-HSMM) that can divide continuous time series data into segments in an unsupervised manner. Our proposed method consists of a generative model based on the hidden semi-Markov model (HSMM), the emission distributions of which are Gaussian processes (GPs). Continuous time series data is generated by connecting segments generated by the GP. Segmentation can be achieved by using forward filtering-backward sampling to estimate the model's parameters, including the lengths and classes of the segments. In an experiment using the CMU motion capture dataset, we tested GP-HSMM with motion capture data containing simple exercise motions; the results of this experiment showed that the proposed GP-HSMM was comparable with other methods. We also conducted an experiment using karate motion capture data, which is more complex than exercise motion capture data; in this experiment, the segmentation accuracy of GP-HSMM was 0.92, which outperformed other methods.
Published: 2017
Full Text: View/download PDF

7. Statistical detection of EEG synchrony using empirical bayesian inference.

Author: Archana K Singh, Hideki Asoh, Yuji Takeda, and Steven Phillips
Subjects: Medicine, Science
Abstract: There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Published: 2015
Full Text: View/download PDF

8. Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface

Author: Futoshi Asano, Kiyoshi Yamamoto, Isao Hara, Jun Ogata, Takashi Yoshimura, Yoichi Motomura, Naoyuki Ichimura, and Hideki Asoh
Subjects: information fusion, sound localization, human tracking, adaptive beamformer, speech recognition., Telecommunication, TK5101-6720, Electronics, TK7800-8360
Abstract: A method of detecting speech events in a multiple-sound-source condition using audio and video information is proposed. For detecting speech events, sound localization using a microphone array and human tracking by stereo vision is combined by a Bayesian network. From the inference results of the Bayesian network, information on the time and location of speech events can be known. The information on the detected speech events is then utilized in the robust speech interface. A maximum likelihood adaptive beamformer is employed as a preprocessor of the speech recognizer to separate the speech signal from environmental noise. The coefficients of the beamformer are kept updated based on the information of the speech events. The information on the speech events is also used by the speech recognizer for extracting the speech segment.
Published: 2004
Full Text: View/download PDF

9. Tracking Intermittently Speaking Multiple Speakers Using a Particle Filter

Author: Angela Quinlan, Mitsuru Kawamoto, Yosuke Matsusaka, Hideki Asoh, and Futoshi Asano
Subjects: Acoustics. Sound, QC221-246, Electronic computers. Computer science, QA75.5-76.95
Abstract: The problem of tracking multiple intermittently speaking speakers is difficult as some distinct problems must be addressed. The number of active speakers must be estimated, these active speakers must be identified, and the locations of all speakers including inactive speakers must be tracked. In this paper we propose a method for tracking intermittently speaking multiple speakers using a particle filter. In the proposed algorithm the number of active speakers is firstly estimated based on the Exponential Fitting Test (EFT), a source number estimation technique which we have proposed. The locations of the speakers are then tracked using a particle filtering framework within which the decomposed likelihood is used in order to decouple the observed audio signal and associate each element of the decomposed signal with an active speaker. The tracking accuracy is then further improved by the inclusion of a silence region detection step and estimation of the noise-only covariance matrix. The method was evaluated using live recordings of 3 speakers and the results show that the method produces highly accurate tracking results.
Published: 2009
Full Text: View/download PDF

10. Generating In-Between Images Through Learned Latent Space Representation Using Variational Autoencoders

Author: Hidemoto Nakada, Paulino Cristovao, Hideki Asoh, and Yusuke Tanimura
Subjects: General Computer Science, Computer science, Optical flow, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Convolutional neural network, representation learning, 0202 electrical engineering, electronic engineering, information engineering, Image scaling, Image interpolation, General Materials Science, variational autoencoder, Representation (mathematics), Hyperparameter, latent variables, Pixel, business.industry, General Engineering, Pattern recognition, 021001 nanoscience & nanotechnology, Autoencoder, Computer Science::Computer Vision and Pattern Recognition, 020201 artificial intelligence & image processing, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, 0210 nano-technology, business, lcsh:TK1-9971, Interpolation
Abstract: Image interpolation is often implemented using one of two methods: optical flow or convolutional neural networks. These methods are typically pixel-based; they do not work well on objects between images far apart. Because they either rely on a simple frame average or pixel motion, they do not have the required knowledge of the semantic structure of the data. In this paper, we propose a method for image interpolation based on latent representations. We use a simple network structure based on a variational autoencoder and an adjustable hyperparameter that imposes the latent space distribution to generate accurate interpolation. To visualize the effects of the proposed approach, we evaluate a synthetic dataset. We demonstrate that our method outperforms both pixel-based methods and a conventional variational autoencoder, with particular improvements in nonsuccessive images.
Published: 2020

11. Improved Surprise Adequacy Tools for Corner Case Data Description and Detection

Author: Yutaka Oiwa, Tinghui Ouyang, Yoshinao Isobe, Vicent Sanz Marco, Hideki Asoh, and Yoshiki Seo
Subjects: Technology, Computer science, QH301-705.5, 020209 energy, media_common.quotation_subject, QC1-999, AI quality testing, 02 engineering and technology, computer.software_genre, Data description, Corner case, Factor (programming language), surprise adequacy, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Biology (General), Instrumentation, QD1-999, media_common, computer.programming_language, Fluid Flow and Transfer Processes, business.industry, Process Chemistry and Technology, Physics, General Engineering, Construct (python library), 021001 nanoscience & nanotechnology, Engineering (General). Civil engineering (General), corner case data detection, Computer Science Applications, Surprise, Improved performance, Chemistry, Data mining, TA1-2040, 0210 nano-technology, business, modified distanced-based SA, Quality assurance, computer, MNIST database
Abstract: Facing the increasing quantity of AI models applications, especially in life- and property-related fields, it is crucial for designers to construct safety- and security-critical systems. As a major factor affecting the safety of AI models, corner case data and its related description/detection techniques are important in the AI design phase and quality assurance. In this paper, inspired by surprise adequacy (SA), a tool having advantages on capture data behaviors, we developed three modified versions of distance-based-SA (DSA) for detecting corner cases in classification problems. Through the experiment analysis on MNIST, CIFAR, and industrial example data, the feasibility and usefulness of the proposed tools on corner case data detection are verified. Moreover, Qualitative and quantitative experiments validated that the developed DSA tools can achieve improved performance in describing corner cases’ behaviors.
Published: 2021

12. Supervised Saliency Mapping for First-Person Videos With an Inverse Sparse Coding Framework

Author: Hideki Asoh, Yujie Li, Shotaro Akaho, and Benying Tan
Subjects: General Computer Science, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Inverse, 02 engineering and technology, inverse sparse coding, Salience (neuroscience), orthogonal matching pursuit (OMP), 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, salience superpixel matrix, saliency mapping, Sparse matrix, business.industry, General Engineering, 020206 networking & telecommunications, Pattern recognition, Sparse approximation, Salient, First person, 020201 artificial intelligence & image processing, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, Neural coding, business, supervised, lcsh:TK1-9971
Abstract: Saliency mapping is an efficient means of processing large amounts of incoming visual information from images and videos. Existing methods based on feature integration and construction error provide a rough saliency mapping and are sensitive to real-world noise. Furthermore, for first-person-vision video related to human activities, existing methods identify objects in the image without considering the human actions being performed. To address this issue, we propose a novel supervised saliency mapping based on a sparse coding framework. Different from the normal sparse representation, which uses dictionary atoms to represent signals, we use the inverse expression, whereby the use original image or video frame signals provide the normal dictionary, and these signals are used to represent the salience superpixel matrix, which is learned from a class of training images. We then construct the salient map by inverse sparse coding. In this paper, we first describe how the salience superpixel matrix is extracted by supervised selection, and then explain how to obtain the salient map through an inverse sparse coding framework. The experimental results show the enhanced accuracy of the proposed method compared with previous methods, as well as the time efficiency of the proposed method.
Published: 2019

13. Corner case data description and detection

Author: Yutaka Oiwa, Tinghui Ouyang, Hideki Asoh, Yoshinao Isobe, Yoshiki Seo, and Vicent Sanz Marco
Subjects: FOS: Computer and information sciences, Structure (mathematical logic), Computer Science - Machine Learning, business.industry, Computer science, Computer Science - Artificial Intelligence, Deep learning, computer.software_genre, Machine Learning (cs.LG), Software Engineering (cs.SE), Computer Science - Software Engineering, Artificial Intelligence (cs.AI), Corner case, Simple (abstract algebra), Robustness (computer science), Metric (mathematics), Data mining, Artificial intelligence, business, Quality assurance, computer, MNIST database
Abstract: As the major factors affecting the safety of deep learning models, corner cases and related detection are crucial in AI quality assurance for constructing safety- and security-critical systems. The generic corner case researches involve two interesting topics. One is to enhance DL models’ robustness to corner case data via the adjustment on parameters/structure. The other is to generate new corner cases for model retraining and improvement. However, the complex architecture and the huge amount of parameters make the robust adjustment of DL models not easy, meanwhile it is not possible to generate all real-world corner cases for DL training. Therefore, this paper proposes a simple and novel approach aiming at corner case data detection via a specific metric. This metric is developed on surprise adequacy (SA) which has advantages on capture data behaviors. Furthermore, targeting at characteristics of corner case data, three modifications on distanced-based SA are developed for classification applications in this paper. Consequently, through the experiment analysis on MNIST data and industrial data, the feasibility and usefulness of the proposed method on corner case data detection are verified.
Published: 2021

14. Model-based and actual independence for fairness-aware classification

Author: Shotaro Akaho, Toshihiro Kamishima, Jun Sakuma, and Hideki Asoh
Subjects: Computer Networks and Communications, business.industry, Computer science, 02 engineering and technology, Decision rule, Machine learning, computer.software_genre, Computer Science Applications, Support vector machine, ComputingMethodologies_PATTERNRECOGNITION, Categorization, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Neutrality, business, Classifier (UML), computer, Independence (probability theory), Generative grammar, Information Systems, Model bias
Abstract: The goal of fairness-aware classification is to categorize data while taking into account potential issues of fairness, discrimination, neutrality, and/or independence. For example, when applying data mining technologies to university admissions, admission criteria must be non-discriminatory and fair with regard to sensitive features, such as gender or race. In this context, such fairness can be formalized as statistical independence between classification results and sensitive features. The main purpose of this paper is to analyze this formal fairness in order to achieve better trade-offs between fairness and prediction accuracy, which is important for applying fairness-aware classifiers in practical use. We focus on a fairness-aware classifier, Calders and Verwer’s two-naive-Bayes (CV2NB) method, which has been shown to be superior to other classifiers in terms of fairness. We hypothesize that this superiority is due to the difference in types of independence. That is, because CV2NB achieves actual independence, rather than satisfying model-based independence like the other classifiers, it can account for model bias and a deterministic decision rule. We empirically validate this hypothesis by modifying two fairness-aware classifiers, a prejudice remover method and a reject option-based classification (ROC) method, so as to satisfy actual independence. The fairness of these two modified methods was drastically improved, showing the importance of maintaining actual independence, rather than model-based independence. We additionally extend an approach adopted in the ROC method so as to make it applicable to classifiers other than those with generative models, such as SVMs.
Published: 2017

15. Symbol Emergence in Robotics: A Survey

Author: Tetsuya Ogata, Tomoaki Nakamura, Naoto Iwahashi, Takayuki Nagai, Tadahiro Taniguchi, and Hideki Asoh
Subjects: FOS: Computer and information sciences, 0209 industrial biotechnology, Computer science, Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), media_common.quotation_subject, Computer Science - Computer Vision and Pattern Recognition, 02 engineering and technology, Constructive, Multimodal interaction, Developmental robotics, Computer Science - Robotics, 020901 industrial engineering & automation, Human–computer interaction, 0202 electrical engineering, electronic engineering, information engineering, media_common, Computer Science - Computation and Language, business.industry, Language acquisition, Computer Science Applications, Human-Computer Interaction, Symbol, Artificial Intelligence (cs.AI), Symbol grounding, Hardware and Architecture, Control and Systems Engineering, Embodied cognition, Robot, 020201 artificial intelligence & image processing, Artificial intelligence, business, Computation and Language (cs.CL), Robotics (cs.RO), Software
Abstract: Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans can form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding human social interactions and developing a robot that can smoothly communicate with human users in the long term, requires an understanding of the dynamics of symbol systems and is crucially important. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system. The emergent symbol system is socially self-organized through both semiotic communications and physical interactions with autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-art research topics concerning SER, e.g., multimodal categorization, word discovery, and a double articulation analysis, that enable a robot to obtain words and their embodied meanings from raw sensory--motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER., submitted to Advanced Robotics
Published: 2015

16. Statistical detection of EEG synchrony using empirical bayesian inference

Author: Hideki Asoh, Yuji Takeda, Archana K. Singh, and Steven Phillips
Subjects: False discovery rate, Computer science, Science, Posterior probability, Inference, Probability density function, Bayesian inference, Bioinformatics, Bayes' theorem, False positive paradox, Humans, Computer Simulation, False Positive Reactions, Statistic, Probability, Statistical hypothesis testing, Models, Statistical, Multidisciplinary, business.industry, Bayes Theorem, Electroencephalography, Pattern recognition, Data Interpretation, Statistical, Multiple comparisons problem, Probability distribution, Medicine, Artificial intelligence, business, Null hypothesis, Research Article
Abstract: There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Published: 2015

17. Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface

Author: Kiyoshi Yamamoto, Jun Ogata, Naoyuki Ichimura, Yoichi Motomura, Futoshi Asano, Hideki Asoh, Isao Hara, and Takashi Yoshimura
Subjects: Audio mining, Microphone array, Voice activity detection, Computer science, adaptive beamformer, Speech recognition, Speech coding, lcsh:Electronics, Acoustic model, lcsh:TK7800-8360, sound localization, Linear predictive coding, Speech processing, lcsh:Telecommunication, Codec2, Hardware and Architecture, lcsh:TK5101-6720, Signal Processing, information fusion, Electrical and Electronic Engineering, Environmental noise, human tracking, Adaptive beamformer, speech recognition
Abstract: A method of detecting speech events in a multiple-sound-source condition using audio and video information is proposed. For detecting speech events, sound localization using a microphone array and human tracking by stereo vision is combined by a Bayesian network. From the inference results of the Bayesian network, information on the time and location of speech events can be known. The information on the detected speech events is then utilized in the robust speech interface. A maximum likelihood adaptive beamformer is employed as a preprocessor of the speech recognizer to separate the speech signal from environmental noise. The coefficients of the beamformer are kept updated based on the information of the speech events. The information on the speech events is also used by the speech recognizer for extracting the speech segment.
Published: 2004

18. Tracking Intermittently Speaking Multiple Speakers Using a Particle Filter

Author: Yosuke Matsusaka, Angela Quinlan, Hideki Asoh, Futoshi Asano, and Mitsuru Kawamoto
Subjects: Audio signal, Acoustics and Ultrasonics, Computer science, Covariance matrix, Speech recognition, Region detection, lcsh:QC221-246, Tracking (particle physics), Signal, lcsh:QA75.5-76.95, Exponential function, lcsh:Acoustics. Sound, lcsh:Electronic computers. Computer science, Electrical and Electronic Engineering, Particle filter
Abstract: The problem of tracking multiple intermittently speaking speakers is difficult as some distinct problems must be addressed. The number of active speakers must be estimated, these active speakers must be identified, and the locations of all speakers including inactive speakers must be tracked. In this paper we propose a method for tracking intermittently speaking multiple speakers using a particle filter. In the proposed algorithm the number of active speakers is firstly estimated based on the Exponential Fitting Test (EFT), a source number estimation technique which we have proposed. The locations of the speakers are then tracked using a particle filtering framework within which the decomposed likelihood is used in order to decouple the observed audio signal and associate each element of the decomposed signal with an active speaker. The tracking accuracy is then further improved by the inclusion of a silence region detection step and estimation of the noise-only covariance matrix. The method was evaluated using live recordings of 3 speakers and the results show that the method produces highly accurate tracking results.
Published: 2009

19. Verification of Effectiveness of a Probabilistic Algorithm for Latent Structure Extraction Using an Associative Memory Model.

Author: Kensuke Wakasugi, Tatsu Kuwatani, Kenji Nagata, Hideki Asoh, and Masato Okada
Abstract: Multivariate analysis techniques are widely used to analyze high-dimensional data in many fields of scientific research. In most cases, however, analysis is conducted under the assumption that there is a priori knowledge of the form of the latent structures in the data. Recently, Kemp and Tenenbaum proposed a new method of analysis that can select the forms from a set of several primitive forms and structures from a set of latent structures for those forms. It is important to evaluate the validity and the effectiveness of the proposed method by using synthetic data sets so that we can control their form. In this study, we apply the Kemp-Tenenbaum method to synthetic data sets that are artificially generated by an associative memory model. The forms and the structures that had been embedded in the data sets were successfully reconstructed, which demonstrates the validity of the Kemp-Tenenbaum method. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

20. Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface.

Author: Futoshi Asano, Kiyoshi Yamamoto, Isao Hara, Jun Ogata, Takashi Yoshimura, Yoichi Motomura, Naoyuki Ichimura, and Hideki Asoh
Abstract: A method of detecting speech events in a multiple-sound-source condition using audio and video information is proposed. For detecting speech events, sound localization using a microphone array and human tracking by stereo vision is combined by a Bayesian network. From the inference results of the Bayesian network, information on the time and location of speech events can be known. The information on the detected speech events is then utilized in the robust speech interface. A maximum likelihood adaptive beamformer is employed as a preprocessor of the speech recognizer to separate the speech signal from environmental noise. The coefficients of the beamformer are kept updated based on the information of the speech events. The information on the speech events is also used by the speech recognizer for extracting the speech segment. [ABSTRACT FROM AUTHOR]
Published: 2004
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

20 results on '"Hideki Asoh"'

1. Multi-Sensor Integration for Key-Frame Extraction From First-Person Videos

2. Generating In-Between Images Through Learned Latent Space Representation Using Variational Autoencoders

3. Supervised Saliency Mapping for First-Person Videos With an Inverse Sparse Coding Framework

4. Independent Low-Rank Matrix Analysis-Based Automatic Artifact Reduction Technique Applied to Three BCI Paradigms

5. Improved Surprise Adequacy Tools for Corner Case Data Description and Detection

6. Segmenting Continuous Motions with Hidden Semi-markov Models and Gaussian Processes

7. Statistical detection of EEG synchrony using empirical bayesian inference.

8. Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface

9. Tracking Intermittently Speaking Multiple Speakers Using a Particle Filter

10. Generating In-Between Images Through Learned Latent Space Representation Using Variational Autoencoders

11. Improved Surprise Adequacy Tools for Corner Case Data Description and Detection

12. Supervised Saliency Mapping for First-Person Videos With an Inverse Sparse Coding Framework

13. Corner case data description and detection

14. Model-based and actual independence for fairness-aware classification

15. Symbol Emergence in Robotics: A Survey

16. Statistical detection of EEG synchrony using empirical bayesian inference

17. Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface

18. Tracking Intermittently Speaking Multiple Speakers Using a Particle Filter

19. Verification of Effectiveness of a Probabilistic Algorithm for Latent Structure Extraction Using an Associative Memory Model.

20. Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface.

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Publication Type

Journal

Database

Publisher

20 results on '"Hideki Asoh"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources