Author: "Mariño Acebal, José Bernardo" / Topic: telecomunicacio - Searchworks@Jio Institute Digital Library Search Results

1. Monolingual and bilingual spanish-catalan speech recognizers developed from SpeechDat databases

Author: Mariño Acebal, José Bernardo, Padrell, J, Moreno Bilbao, M. Asunción, Nadeu Camprubí, Climent, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: Under the SpeechDat specifications, the Spanish member of the SpeechDat consortium has recorded a Catalan database that includes one thousand speakers. This communication describes experimental work carried out using both the Spanish and the Catalan speech material. A speech recognition system was trained for the Spanish language using a selection of the phonetically balanced utterances from the 4500 SpeechDat training sessions. Utterances with mispronounced or incomplete words and with intermittent noise were discarded. A set of 26 allophones was selected to account for the Spanish sounds, and clustered demiphones were used as context-dependent sub-lexical units. Following the same methodology, a recognition system was trained from the Catalan SpeechDat database. Catalan sounds were described with 32 allophones. Additionally, a bilingual recognition system was built for both the Spanish and Catalan languages. Clustering techniques were used to determine a suitable set of allophones covering both languages simultaneously; 33 allophones were selected. The training material comprised the whole Catalan training material and the Spanish material from the Eastern region of Spain (the region where Catalan is spoken). The performance of the Spanish, Catalan and bilingual systems was assessed under the same framework. The Spanish system exhibits significantly better performance than the other systems owing to its better training. The bilingual system provides performance equivalent to that of the language-specific systems trained with the Eastern Spanish material or the Catalan SpeechDat corpus.
Published: 2000

2. A second opinion approach for speech recognition verification

Author: Hernández-Ábrego, G, Mariño Acebal, José Bernardo, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: In order to improve the reliability of speech recognition results, a verification system that takes advantage of the information given by an alternative recognition step is proposed. The alternative results are considered as a second opinion about the nature of the speech recognition process. Some features are extracted from both opinion sources and compiled, through a fuzzy inference system, into a more discriminant confidence measure able to verify correct results and disregard wrong ones. This approach is tested in a keyword spotting task taken from the Spanish SpeechDat database. Results show a considerable reduction of false rejections at a fixed false alarm rate compared to baseline systems.
Published: 1999

3. Fuzzy reasoning in confidence evaluation of speech recognition

Author: Hernández-Ábrego, G, Mariño Acebal, José Bernardo, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: Confidence measures represent a systematic way to express the reliability of speech recognition results. A common approach to confidence measuring is to take advantage of the information that several recognition-related features offer and to combine them, through a given compilation mechanism, into a more effective way to distinguish between correct and incorrect recognition results. We propose to use a fuzzy reasoning scheme to perform the information compilation step. Our approach differs from previously proposed ones because it treats the uncertainty of recognition hypotheses in terms of
Published: 1999

4. Minimum confusibility training of context dependent demiphones

Author: Nogueiras Rodríguez, Albino (ORCID 0000-0002-3159-1718), Mariño Acebal, José Bernardo (ORCID 0000-0002-9471-8675), Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: In recent years, two different approaches have been widely used to improve acoustic modeling in continuous speech recognition systems: discriminative training algorithms and context-dependent subword units. However, while the use of each of these techniques leads to much better results than standard maximum likelihood trained phone models, their combination, i.e. discriminative training of context-dependent units, has proved to be a much more difficult task. In this paper we deal with minimum confusibility training of demiphones using the TIMIT database. By applying this approach, recently introduced by the authors, the string error rate in the recognition of TIDIGITS using demiphones is reduced by some 24% with respect to maximum likelihood training. This improvement is added to the 8% reduction already provided by demiphones with respect to minimum confusibility trained phones.
Published: 1999

5. Using x-gram for efficient speech recognition

Author: Bonafonte Cávez, Antonio, Mariño Acebal, José Bernardo, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: X-grams are a generalization of n-grams in which the number of previous conditioning words is different for each case and is decided from the training data. X-grams reduce perplexity with respect to trigrams and need fewer parameters. In this paper, the representation of x-grams using finite state automata is considered. This representation leads to a new model, the non-deterministic x-gram, an approximation that is much more efficient while suffering only a small degradation in modeling capability. Empirical experiments for a continuous speech recognition task show how, for each ending word, the number of transitions is reduced from 1222 (the size of the lexicon) to around 66.
Published: 1998
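
Note: the variable-length conditioning described in the abstract above can be illustrated with a minimal sketch. It assumes the model is stored as a table keyed by word contexts of differing lengths, and that a lookup uses the longest stored context matching the end of the history. The function names, the `max_order` parameter and the toy probabilities are hypothetical; how the paper chooses the contexts from training data, and its finite state representation, are not reproduced here.

```python
def longest_stored_context(history, model, max_order=3):
    """Return the longest suffix of `history` that the model stores as a context."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])  # k == 0 gives the empty context ()
        if context in model:
            return context
    raise KeyError("model must contain the empty context ()")


def xgram_prob(word, history, model, max_order=3):
    """Probability of `word` given the longest stored context of `history`."""
    context = longest_stored_context(history, model, max_order)
    return model[context].get(word, 0.0)


# Toy table (invented numbers): each context has its own length.
TOY_MODEL = {
    (): {"the": 0.5, "cat": 0.3, "sat": 0.2},
    ("the",): {"cat": 0.9, "sat": 0.1},
    ("the", "cat"): {"sat": 1.0},
}
```

With this table, the history `["saw", "the"]` is conditioned on one word, `["the", "cat"]` on two, and `["a", "cat"]` on none.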

6. Spanish dialects: phonetic transcription

Author: Moreno Bilbao, M. Asunción (ORCID 0000-0002-1823-5970), Mariño Acebal, José Bernardo (ORCID 0000-0002-9471-8675), Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: It is well known that canonical Spanish, the 'central' dialectal variant of Spain, so-called Castilian, can be transcribed by rules. This paper deals with automatic grapheme-to-phoneme transcription rules for several Spanish dialects from Latin America. Spanish is a language spoken by more than 300 million people, has a wide geographical dispersion compared with other languages, and has been historically influenced by many native languages. In this paper the authors expand the Castilian transcription rules to a set of different dialectal variants of Latin America. Transcriptions are based on SAMPA symbols. The paper identifies sounds that do not appear in Castilian, extends the accepted SAMPA symbols for Spanish (Castilian) to different dialectal variants, describes the rules needed to implement automatic orthographic-to-phonetic transcription in several dialectal Spanish variants, and shows some quantitative results on dialectal differences.
Published: 1998

7. Low delay phone recognition

Author: Rodríguez Fonollosa, José Adrián, Mariño Acebal, José Bernardo, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, education, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Published: 1998

8. Maximum likelihood based discriminative training of acoustic models

Author: Nogueiras Rodríguez, Albino (ORCID 0000-0002-3159-1718), Mariño Acebal, José Bernardo, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, ComputingMethodologies_PATTERNRECOGNITION, Computer Science::Sound, Telecommunication, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: In this paper, a framework for discriminative training of acoustic models based on the Generalised Probabilistic Descent (GPD) method is presented. The key feature of our proposal, Maximum Likelihood based Discriminative Training of Acoustic Models (MLDT), is the use of maximum likelihood trained HMMs instead of the original speech signal. We focus our attention on discriminative training applied to a discrete hidden Markov model continuous speech recogniser, achieving a 4.6% error rate reduction on a Spanish speaker-independent phoneme recognition task.
Published: 1995

9. Multiple multilabelling applied to hmm-based noisy speech recognition

Author: Hernando Pericás, Francisco Javier, Mariño Acebal, José Bernardo, Moreno Bilbao, M. Asunción, Nadeu Camprubí, Climent, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, education, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Published: 1993

10. An efficient algorithm to find the best state sequence in hsmm

Author: Bonafonte Cávez, Antonio, Mariño Acebal, José Bernardo, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: Hidden Markov Modeling (HMM) techniques have been applied successfully to speech analysis. However, it has been claimed [1-7] that a major weakness of HMM is that the state duration probability density functions (SDPDF) are exponential, which is not appropriate for modelling speech events. In order to cope with this deficiency, some authors have proposed modelling the state duration explicitly. In these models the first-order Markov hypothesis is broken in the loop transitions; the new models have therefore been called Hidden Semi-Markov Models (HSMM). Different solutions have been proposed, their main common drawback being an increase in computation time by a factor D, where D is the maximum time allowed in each state. In this paper a modified Viterbi algorithm which finds the best state sequence of an HSMM is proposed. The proposed algorithm deals with log-convex parametric SDPDF. The log-convex property is fulfilled by the parametric functions usually applied. This method increases the computational burden with respect to conventional HMM by an empirical factor of just 3.2, without losing optimality and without increasing the storage with respect to other approaches. A more efficient algorithm is presented for the case in which the duration of the states is modeled by bounded functions.
Published: 1993
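
Note: for orientation, the following is a minimal sketch of the conventional explicit-duration Viterbi search, i.e. the kind of baseline whose cost grows with the maximum duration D, as mentioned in the abstract above. It is not the modified algorithm proposed in the paper, and it does not exploit the log-convex property. The function name, argument layout and log-domain convention are assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")


def hsmm_viterbi(obs_loglik, log_pi, log_a, log_dur):
    """Best state sequence of an explicit-duration HSMM (baseline search).

    obs_loglik[t][j]: log-likelihood of frame t under state j
    log_pi[j]:        log initial probability of state j
    log_a[i][j]:      log transition probability i -> j (self-loops are ignored)
    log_dur[j][d-1]:  log probability that state j lasts d frames, d = 1..D
    """
    T, N, D = len(obs_loglik), len(log_pi), len(log_dur[0])
    # delta[t][j]: best score of the first t frames with a segment of state j ending at t
    delta = [[NEG_INF] * N for _ in range(T + 1)]
    back = [[None] * N for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(N):
            emit = 0.0
            for d in range(1, min(D, t) + 1):   # this loop is the extra factor D
                emit += obs_loglik[t - d][j]    # extend the segment one frame back
                start = t - d
                if start == 0:
                    prev, prev_state = log_pi[j], None
                else:
                    prev, prev_state = NEG_INF, None
                    for i in range(N):
                        if i != j and delta[start][i] + log_a[i][j] > prev:
                            prev, prev_state = delta[start][i] + log_a[i][j], i
                score = prev + log_dur[j][d - 1] + emit
                if score > delta[t][j]:
                    delta[t][j], back[t][j] = score, (d, prev_state)
    # Backtrack segment by segment from the best final state.
    j = max(range(N), key=lambda s: delta[T][s])
    path, t = [], T
    while t > 0:
        d, i = back[t][j]
        path[:0] = [j] * d
        t, j = t - d, i
    return path
```

The innermost loops make the cost of each frame proportional to D, which is the overhead the abstract says the proposed method avoids. The sketch assumes at least one segmentation has non-zero probability.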

11. Reconocimiento del habla continua mediante modelos ocultos de Markov utilizando la técnica de búsqueda en haz [Continuous speech recognition with hidden Markov models using beam search]

Author: Lleida Solano, Eduardo, Mariño Acebal, José Bernardo, Bonafonte Cávez, Antonio, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, education, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Published: 1992

12. Two level continuous speech recognition using demisyllable-based HMM word spotting

Author: Lleida Solano, Eduardo, Mariño Acebal, José Bernardo, Nadeu Camprubí, Climent, Oliveras Vergés, Albert, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla, and Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
Subjects: Telecomunicació, Telecommunication, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: This paper describes a two-level Spanish Continuous Speech Recognition System based on demisyllable HMM modelling, word spotting, and finite-state lexical and syntactic knowledge. The first level, the word level, is based on a spotting algorithm which takes as input the unknown utterance, the HMMs of the reference demisyllables and the lexical knowledge in terms of a finite-state network. The output of the word level is a lattice of word hypotheses [1]. The second level, the phrase level, searches in a time-synchronous procedure for the best sentence that ends at each time instant. It takes as input the word lattice and the syntactic knowledge in terms of a finite-state network, giving as output the best legal sentence. The proposed two-level system was tested on recognizing the integers from 0 to 1000 in a speaker-independent approach. We obtain a word accuracy of 93.2% with a sentence accuracy of 84.5%. Keywords: Speech Recognition, Hidden Markov Model, Fuzzy Training, Demisyllable, Word-spotting, Multiple Hypothesis, Finite State Networks.
Published: 1991

13. 62: Una herramienta interactiva para el estudio del tratamiento de la señal [62: An interactive tool for the study of signal processing]

Author: Mariño Acebal, José Bernardo, Vallverdú Bayés, Sisco, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, education, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]

14. The demiphone: an efficient subword unit for Continuous Speech Recognition

Author: Mariño Acebal, José Bernardo (ORCID 0000-0002-9471-8675), Nogueiras Rodríguez, Albino (ORCID 0000-0002-3159-1718), Bonafonte Cávez, Antonio (ORCID 0000-0002-6240-9915), Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC]
Abstract: In this paper we introduce the demiphone as a contextual phonetic unit for continuous speech recognition. A phone is divided into two parts: a left demiphone that accounts for the left-side coarticulation and a right demiphone that copes with the right-side context. This new unit discards the dependence between the effects of the two side contexts, but provides better training of the transitions between phones. The demiphone can be seen as a heuristic clustering of states that allows smoother training of hidden Markov models and additionally supplies a simple way to create unseen triphones. We report experimental evidence that demiphones outperform the usual combination of triphones, right-side and left-side biphones and monophones.
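
Note: the split described in the abstract above can be sketched in a few lines. The `left-phone` and `phone+right` labels, the `sil` boundary symbol and the function name are assumptions made for illustration, not the paper's notation.

```python
def to_demiphones(phones, boundary="sil"):
    """Split each phone into a left and a right context-dependent demiphone."""
    units = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] if i > 0 else boundary
        right = phones[i + 1] if i + 1 < len(phones) else boundary
        units.append(f"{left}-{phone}")   # left demiphone: depends on left context only
        units.append(f"{phone}+{right}")  # right demiphone: depends on right context only
    return units
```

Under this labelling, a triphone with left context `a` and right context `c` around phone `b` is the concatenation of `a-b` and `b+c`. This is why a triphone never seen in training can still be assembled, provided both of its halves occurred separately.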

15. Albayzin speech database: design of the phonetic corpus

Author: Moreno Bilbao, M. Asunción (ORCID 0000-0002-1823-5970), Poig, D, Bonafonte Cávez, Antonio (ORCID 0000-0002-6240-9915), Lleida, E, Llisterri, J, Mariño Acebal, José Bernardo (ORCID 0000-0002-9471-8675), Nadeu Camprubí, Climent (ORCID 0000-0002-5863-0983), Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Subjects: Telecomunicació, ComputingMethodologies_PATTERNRECOGNITION, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Telecommunication, Enginyeria de la telecomunicació [Àrees temàtiques de la UPC], ComputingMethodologies_ARTIFICIALINTELLIGENCE
Abstract: This paper describes the phonetic content of Albayzin, a spoken database for Spanish designed for speech recognition purposes. A statistical study of a large sample of spontaneous speech is presented, and the phonetic and statistical criteria for the final constitution of the database are discussed. Finally, the contents of the phonetic database are analyzed.

15 results on '"Mariño Acebal, José Bernardo"'
