Author: "Sarkar, Achintya Kumar" / Database: OpenAIRE - Searchworks@Jio Institute Digital Library Search Results

Your search for "Sarkar, Achintya Kumar" returned 4 results.

1. Data Generation Using Pass-phrase-dependent Deep Auto-encoders for Text-Dependent Speaker Verification

Author: Sarkar, Achintya Kumar; Sahidullah, Md; and Tan, Zheng-Hua
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Sound (cs.SD), Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Machine Learning (cs.LG)
Abstract: In this paper, we propose a novel method that trains pass-phrase specific deep neural network (PP-DNN) based auto-encoders for creating augmented data for text-dependent speaker verification (TD-SV). Each PP-DNN auto-encoder is trained using the utterances of a particular pass-phrase available in the target enrollment set with two methods: (i) transfer learning and (ii) training from scratch. Next, feature vectors of a given utterance are fed to the PP-DNNs, and the frame-level output from each PP-DNN is treated as one new set of generated data. The generated data from each PP-DNN is then used for building a TD-SV system, in contrast to the conventional method that considers only the evaluation data available. The proposed approach can be considered as the transformation of data to the pass-phrase specific space using a non-linear transformation learned by each PP-DNN. The method builds several TD-SV systems for the evaluation, one per PP-DNN, with each PP-DNN trained separately for one pass-phrase. Finally, the scores of the different TD-SV systems are fused for decision making. Experiments are conducted on the RedDots challenge 2016 database for TD-SV using short utterances. Results show that the proposed method improves the performance for both conventional cepstral features and deep bottleneck features under both the Gaussian mixture model-universal background model (GMM-UBM) and i-vector frameworks.
Published: 2021
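
The final step of the abstract above (fusing the scores of the per-pass-phrase TD-SV systems before making a decision) can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so equal-weight averaging is an assumption here, and the names `fuse_scores`, `accept`, and the threshold value are hypothetical.

```python
from typing import Dict, List


def fuse_scores(system_scores: Dict[str, List[float]]) -> List[float]:
    """Average per-trial scores across systems.

    Equal weights are an assumption; the abstract only says scores are fused.
    """
    if not system_scores:
        raise ValueError("need at least one system")
    score_lists = list(system_scores.values())
    num_trials = len(score_lists[0])
    if any(len(scores) != num_trials for scores in score_lists):
        raise ValueError("all systems must score the same trials")
    # zip(*score_lists) groups the scores that belong to the same trial.
    return [sum(trial) / len(score_lists) for trial in zip(*score_lists)]


def accept(fused: List[float], threshold: float) -> List[bool]:
    """Accept a trial when its fused score reaches the (hypothetical) threshold."""
    return [score >= threshold for score in fused]


if __name__ == "__main__":
    # Illustrative scores only: one list per PP-DNN-specific system.
    scores = {"pp_dnn_1": [1.0, -1.0], "pp_dnn_2": [3.0, -3.0]}
    fused = fuse_scores(scores)
    print(fused, accept(fused, threshold=0.0))
```

In practice the fusion weights and decision threshold would be tuned on held-out data; this sketch only shows the shape of the computation.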

2. On Bottleneck Features for Text-Dependent Speaker Verification Using X-vectors

Author: Sarkar, Achintya Kumar and Tan, Zheng-Hua
Subjects: FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Machine Learning, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Sound, Machine Learning (cs.LG), Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Applying x-vectors for speaker verification has recently attracted great interest, with the focus being on text-independent speaker verification. In this paper, we study x-vectors for text-dependent speaker verification (TD-SV), which remains unexplored. We further investigate the impact of different bottleneck (BN) features on the performance of x-vectors, including the recently introduced time-contrastive-learning (TCL) BN features and phone-discriminant BN features. TCL is a weakly supervised learning approach that constructs training data by uniformly partitioning each utterance into a predefined number of segments and then assigning each segment a class label according to its position in the utterance. We also compare TD-SV performance for different modeling techniques, including the Gaussian mixture model-universal background model (GMM-UBM), i-vector, and x-vector. Experiments are conducted on the RedDots 2016 challenge database. It is found that the type of features has a marginal impact on the performance of x-vectors, with the TCL BN feature achieving the lowest equal error rate, while the impact of features is significant for i-vector and GMM-UBM. The fusion of x-vector and i-vector systems gives a large gain in performance. The GMM-UBM technique shows its advantage for TD-SV using short utterances.
Published: 2020
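
The TCL labeling scheme described in the abstract above (uniform partitioning of an utterance into a predefined number of segments, with each segment labeled by its position) can be sketched at the frame level as follows. The function name `tcl_labels` is hypothetical, and the way leftover frames are distributed when the frame count is not divisible by the segment count is an assumption; the abstract does not specify it.

```python
from typing import List


def tcl_labels(num_frames: int, num_segments: int) -> List[int]:
    """Return one class label per frame: the index of its uniform segment."""
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    # Frame i falls in segment floor(i * num_segments / num_frames), which
    # splits the utterance into near-equal contiguous chunks (assumed rule).
    return [i * num_segments // num_frames for i in range(num_frames)]


if __name__ == "__main__":
    # A 10-frame utterance split into 5 segments.
    print(tcl_labels(10, 5))
```

Because the labels depend only on position within the utterance, no speaker or phonetic annotation is needed to build the training targets.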

3. Time-Contrastive Learning Based DNN Bottleneck Features for Text-Dependent Speaker Verification

Author: Sarkar, Achintya Kumar and Tan, Zheng-Hua
Subjects: Time-contrastive learning, Bottleneck Features, Text-Dependent, Speaker Verification, DNN
Abstract: In this paper, we present a time-contrastive learning (TCL) based bottleneck (BN) feature extraction method for speech signals with an application to text-dependent (TD) speaker verification (SV). It is well-known that speech signals exhibit quasi-stationary behavior in and only in a short interval, and the TCL method aims to exploit this temporal structure. More specifically, it trains deep neural networks (DNNs) to discriminate temporal events obtained by uniformly segmenting speech signals, in contrast to existing DNN based BN feature extraction methods that train DNNs using labeled data to discriminate speakers or pass-phrases or phones or a combination of them. In the context of speaker verification, speech data of fixed pass-phrases are used for TCL-BN training, while the pass-phrases used for TCL-BN training are excluded from being used for SV, so that the learned features can be considered generic. The method is evaluated on the RedDots Challenge 2016 database. Experimental results show that TCL-BN is superior to the existing speaker and pass-phrase discriminant BN features and the Mel-frequency cepstral coefficient feature for text-dependent speaker verification.
Published: 2017

4. Multi-class UBM-Based MLLR m-Vector system for speaker verification

Author: Sarkar, Achintya Kumar and Barras, Claude
Affiliation: Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), Université Paris Saclay (COmUE)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université - UFR d'Ingénierie (UFR 919), and Sorbonne Université (SU)-Université Paris-Saclay-Université Paris-Sud - Paris 11 (UP11)
Subjects: UBM, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], Statistical clustering algorithm, MLLR super-vector, Speaker verification, [INFO] Computer Science [cs], Multi-class m-vector
Abstract: In this paper, we extend the recently introduced Maximum Likelihood Linear Regression (MLLR) super-vector based m-vector speaker verification system to a multi-class MLLR m-vector system. In the conventional case, a global class MLLR transformation is estimated with respect to the Universal Background Model (UBM) for the given speech data, which is then used in the form of a super-vector for the m-vector system. In the proposed system, Gaussian mean vectors of the UBM are first clustered into several classes. Then, MLLR transformations are estimated from the speech data for each class, and are used in the form of super-vectors for speaker characterization using the m-vector technique. We consider two clustering approaches: one is based on the conventional K-means and the other is proposed based on Expectation Maximization (EM) and Maximum Likelihood (ML). Both systems yield better performance than the conventional m-vector system and allow for multiple MLLR transforms without additional temporal alignment of the data with respect to the UBM. Furthermore, we show that, contrary to conventional K-means, the proposed clustering is not affected by the random initialization, and also provides equal or comparable system performance. The system performances are shown on the NIST 2008 SRE core condition over various tasks.
Published: 2013
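
The first step of the multi-class system in the abstract above, clustering the UBM's Gaussian mean vectors into classes, can be sketched with conventional K-means, one of the two approaches the abstract names. This is a minimal illustration, not the authors' implementation: the function name `kmeans_classes`, the iteration count, and the seeded random initialization are all assumptions, and the proposed EM/ML-based clustering is not shown.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_classes(means: List[Vector], num_classes: int,
                   iters: int = 20, seed: int = 0) -> List[int]:
    """Assign each Gaussian mean vector to one of num_classes clusters."""
    rng = random.Random(seed)
    # Random initialization: pick distinct mean vectors as starting centroids.
    centroids = [list(m) for m in rng.sample(means, num_classes)]
    assign = [0] * len(means)
    for _ in range(iters):
        # Assignment step: nearest centroid for every mean vector.
        assign = [min(range(num_classes),
                      key=lambda k: _sq_dist(m, centroids[k]))
                  for m in means]
        # Update step: recompute each centroid from its members.
        for k in range(num_classes):
            members = [m for m, a in zip(means, assign) if a == k]
            if members:  # keep the old centroid if a class goes empty
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign


if __name__ == "__main__":
    # Toy 2-D "mean vectors" forming two well-separated groups.
    toy_means = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]]
    print(kmeans_classes(toy_means, num_classes=2))
```

The dependence on `seed` in this sketch reflects the sensitivity to random initialization that the abstract attributes to conventional K-means.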
