18 results for "Brian Roark"
Search Results
2. Approximating Probabilistic Models as Weighted Finite Automata
- Author
-
Ananda Theertha Suresh, Brian Roark, Michael Riley, and Vlad Schogol
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Abstract
Weighted finite automata (WFAs) are often used to represent probabilistic models, such as n-gram language models, because among other things, they are efficient for recognition tasks in time and space. The probabilistic source to be represented as a WFA, however, may come in many forms. Given a generic probabilistic model over sequences, we propose an algorithm to approximate it as a WFA such that the Kullback-Leibler divergence between the source model and the WFA target model is minimized. The proposed algorithm involves a counting step and a difference of convex optimization step, both of which can be performed efficiently. We demonstrate the usefulness of our approach on various tasks, including distilling n-gram models from neural models, building compact language models, and building open-vocabulary character models. The algorithms used for these experiments are available in an open-source software library.
- Published
- 2021
- Full Text
- View/download PDF
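The abstract above describes a two-step procedure: a counting step followed by a difference-of-convex optimization that minimizes KL divergence. The sketch below is not the paper's algorithm; it is only a naive Monte Carlo stand-in for the counting step, estimating bigram transition weights of a target automaton from samples drawn from an arbitrary source model. The `sample_from_source` callable is a hypothetical interface.

```python
import random
from collections import defaultdict

def approximate_as_bigram(sample_from_source, num_samples=10000, eos="</s>", bos="<s>"):
    """Naive stand-in for the counting step: draw sequences from the source
    model, count symbol bigrams, and normalize them into per-state transition
    weights of a bigram "WFA" (each state is simply the previous symbol)."""
    counts = defaultdict(lambda: defaultdict(int))
    for _ in range(num_samples):
        prev = bos
        for sym in list(sample_from_source()) + [eos]:
            counts[prev][sym] += 1
            prev = sym
    wfa = {}
    for state, nxt in counts.items():
        total = sum(nxt.values())
        wfa[state] = {sym: c / total for sym, c in nxt.items()}
    return wfa

# Toy source model: emits "ab" or "abb" with unequal probability.
toy_source = lambda: "ab" if random.random() < 0.7 else "abb"
print(approximate_as_bigram(toy_source)["b"])  # transition weights out of state "b"
```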
3. Graph-Based Word Alignment for Clinical Language Evaluation
- Author
-
Emily Prud'hommeaux and Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2022
- Full Text
- View/download PDF
4. Phonotactic Complexity and Its Trade-offs
- Author
-
Tiago Pimentel, Brian Roark, and Ryan Cotterell
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Abstract
We present methods for calculating a measure of phonotactic complexity, bits per phoneme, that permits a straightforward cross-linguistic comparison. When given a word, represented as a sequence of phonemic segments such as symbols in the International Phonetic Alphabet, and a statistical model trained on a sample of word types from the language, we can approximately measure bits per phoneme using the negative log-probability of that word under the model. This simple measure allows us to compare the entropy across languages, giving insight into how complex a language's phonotactics is. Using a collection of 1016 basic concept words across 106 languages, we demonstrate a very strong negative correlation of -0.74 between bits per phoneme and the average length of words.
- Published
- 2020
- Full Text
- View/download PDF
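The measure in the abstract above is the negative log2-probability of a word under a trained phonotactic model, averaged over its segments. A minimal sketch follows, assuming an add-one-smoothed phoneme bigram model and toy "phoneme" strings; none of this is the paper's data or code.

```python
import math
from collections import defaultdict

def train_bigram(words):
    """Add-one-smoothed phoneme bigram model; each word is a sequence of segments."""
    counts, vocab = defaultdict(lambda: defaultdict(int)), {"</w>"}
    for w in words:
        prev = "<w>"
        for p in list(w) + ["</w>"]:
            counts[prev][p] += 1
            vocab.add(p)
            prev = p
    V = len(vocab)
    def prob(prev, p):
        total = sum(counts[prev].values())
        return (counts[prev][p] + 1) / (total + V)
    return prob

def bits_per_phoneme(word, prob):
    """Negative log2-probability of the word, averaged over its segments."""
    prev, bits = "<w>", 0.0
    segments = list(word) + ["</w>"]
    for p in segments:
        bits -= math.log2(prob(prev, p))
        prev = p
    return bits / len(segments)

model = train_bigram(["kat", "kit", "tak", "tik"])  # toy "phoneme" strings
print(round(bits_per_phoneme("kat", model), 3))
```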
5. Neural Models of Text Normalization for Speech Applications
- Author
-
Hao Zhang, Richard Sproat, Axel H. Ng, Felix Stahlberg, Xiaochang Peng, Kyle Gorman, and Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Abstract
Machine learning, including neural network techniques, has been applied to virtually every domain in natural language processing. One problem that has been somewhat resistant to effective machine learning solutions is text normalization for speech applications such as text-to-speech synthesis (TTS). In this application, one must decide, for example, that 123 is verbalized as one hundred twenty three in 123 pages but as one twenty three in 123 King Ave. For this task, state-of-the-art industrial systems depend heavily on hand-written language-specific grammars. We propose neural network models that treat text normalization for TTS as a sequence-to-sequence problem, in which the input is a text token in context, and the output is the verbalization of that token. We find that the most effective model, in accuracy and efficiency, is one where the sentential context is computed once and the results of that computation are combined with the computation of each token in sequence to compute the verbalization. This model allows for a great deal of flexibility in terms of representing the context, and also allows us to integrate tagging and segmentation into the process. These models perform very well overall, but occasionally they will predict wildly inappropriate verbalizations, such as reading 3 cm as three kilometers. Although rare, such verbalizations are a major issue for TTS applications. We thus use finite-state covering grammars to guide the neural models, either during training and decoding, or just during decoding, away from such “unrecoverable” errors. Such grammars can largely be learned from data.
- Published
- 2019
- Full Text
- View/download PDF
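The constraint mechanism described above intersects the neural model's hypotheses with a finite-state covering grammar so that "unrecoverable" verbalizations are never emitted. The sketch below replaces both components with stand-ins: `score` plays the role of the neural model and `covering` is a toy predicate standing in for the finite-state grammar; both names and the candidate list are assumptions for illustration only.

```python
def normalize_token(token, candidates, score, covering):
    """Pick the best verbalization for `token`, restricted to candidates
    the covering grammar permits (falling back to all candidates if the
    grammar rejects everything)."""
    allowed = [c for c in candidates if covering(token, c)]
    pool = allowed or candidates
    return max(pool, key=lambda c: score(token, c))

# Toy covering "grammar": a measure expression must keep its unit family.
def covering(token, verbalization):
    if token.endswith("cm"):
        return "centimeter" in verbalization
    return True

# Toy "neural" scores that would otherwise prefer the unrecoverable error.
scores = {"three kilometers": 0.6, "three centimeters": 0.4}
score = lambda tok, v: scores.get(v, 0.0)

print(normalize_token("3 cm", list(scores), score, covering))  # "three centimeters"
```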
6. Probabilistic Top-Down Parsing and Language Modeling
- Author
-
Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2021
- Full Text
- View/download PDF
7. Applications of Lexicographic Semirings to Problems in Speech and Language Processing
- Author
-
Richard Sproat, Mahsa Yarmohammadi, Izhak Shafran, and Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2021
- Full Text
- View/download PDF
8. Finite-State Chart Constraints for Reduced Complexity Context-Free Parsing Pipelines
- Author
-
Brian Roark, Kristy Hollingshead, and Nathan Bodenstab
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2021
- Full Text
- View/download PDF
9. Putting Linguistics into Speech Recognition: The Regulus Grammar Compiler, by Manny Rayner, Beth Ann Hockey, and Pierrette Bouillon (NASA Ames Research Center and University of Geneva). Stanford, CA: CSLI Publications (CSLI Studies in Computational Linguistics, edited by Ann Copestake), 2006, xiv+305 pp; hardbound, ISBN 1-57586-525-4, $65.00; paperbound, ISBN 1-57586-526-2, $25.00
- Author
-
Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2021
- Full Text
- View/download PDF
10. Huffman scanning: Using language models within fixed-grid keyboard emulation
- Author
-
Brian Roark, Chris Gibbons, Russell Beckley, and Melanie Fried-Oken
- Subjects
Emulation, Computer science, Speech recognition, Binary number, Huffman coding, Column (database), Theoretical Computer Science, Human-Computer Interaction, Canonical Huffman code, Asynchronous communication, Symbol (programming), Binary code, Algorithm, Software
- Abstract
Individuals with severe motor impairments commonly enter text using a single binary switch and symbol scanning methods. We present a new scanning method – Huffman scanning – which uses Huffman coding to select the symbols to highlight during scanning, thus minimizing the expected bits per symbol. With our method, the user can select the intended symbol even after switch activation errors. We describe two varieties of Huffman scanning – synchronous and asynchronous – and present experimental results, demonstrating speedups over row/column and linear scanning.
- Published
- 2013
- Full Text
- View/download PDF
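Huffman scanning, as described above, derives a binary code for each symbol from language-model probabilities and, at each step, highlights the subset of symbols whose next code bit is 1; a switch press selects that subset. The sketch below is a generic heapq-based Huffman construction plus the highlight set, not the paper's implementation, and the probabilities are made up.

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Standard Huffman coding over a symbol distribution."""
    tick = count()  # tie-breaker so heapq never compares dicts
    heap = [(p, next(tick), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]

def highlight_set(codes, bits_so_far):
    """Symbols to highlight next: consistent with the bits already entered
    and whose next code bit is '1'."""
    n = len(bits_so_far)
    return {s for s, c in codes.items()
            if c.startswith(bits_so_far) and len(c) > n and c[n] == "1"}

probs = {"e": 0.4, "t": 0.25, "a": 0.2, "_": 0.15}  # toy LM probabilities
codes = huffman_codes(probs)
print(codes, highlight_set(codes, ""))
```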
11. Speech and Language processing as assistive technologies
- Author
-
Kathleen F. McCoy, Leo Ferres, Brian Roark, Melanie Fried-Oken, and John L. Arnott
- Subjects
Human-Computer Interaction, Multimedia, Computer science, Assistive technology, Software, Linguistics, Theoretical Computer Science
- Abstract
We are delighted to bring you this special issue on speech and language processing for assistive technology. It addresses an important research area that is gaining increased recognition from researchers in speech and language processing, as a rich and fulfilling area on which to focus their work, and from researchers in assistive technology, as the means to dramatically improve communication technologies for individuals with disabilities. This special issue brings together a wide swath of approaches and applications, highlighting the variety this area offers.
- Published
- 2013
- Full Text
- View/download PDF
12. The Application of Natural Language Processing to Augmentative and Alternative Communication
- Author
-
Gregory W. Lesher, Bryan Moulton, Brian Roark, and D. Jeffery Higginbotham
- Subjects
Computer science, Rehabilitation, Physical Therapy, Sports Therapy and Rehabilitation, Communication Aids for Disabled, Augmentative and alternative communication, Assistive technology, Humans, Artificial intelligence, Speech Recognition Software, Interface design, Natural language processing
- Abstract
Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state of the science of NLP in AAC and discuss its future applications for the development of the next generation of AAC technology.
- Published
- 2012
- Full Text
- View/download PDF
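Word prediction, named in the abstract above as one of the main NLP contributions to AAC, can be illustrated with a very small sketch: offer the most frequent words consistent with what the user has typed so far. This is a toy unigram predictor with made-up data, not any particular AAC system.

```python
from collections import Counter

def predict(prefix, counts, k=3):
    """Return the k most frequent words that start with the typed prefix."""
    matches = Counter({w: c for w, c in counts.items() if w.startswith(prefix)})
    return [w for w, _ in matches.most_common(k)]

counts = Counter("the cat sat on the mat the cat ran".split())
print(predict("ca", counts))   # ['cat']
print(predict("th", counts))   # ['the']
```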
13. Discriminative n-gram language modeling
- Author
-
Michael Collins, Brian Roark, and Murat Saraclar
- Subjects
Finite-state machine, Computer science, Speech recognition, Word error rate, Initialization, Pattern recognition, Perceptron, Theoretical Computer Science, Human-Computer Interaction, Reduction (complexity), n-gram, Discriminative model, Language model, Artificial intelligence, Software
- Abstract
This paper describes discriminative language modeling for a large vocabulary speech recognition task. We contrast two parameter estimation methods: the perceptron algorithm, and a method based on maximizing the regularized conditional log-likelihood. The models are encoded as deterministic weighted finite state automata, and are applied by intersecting the automata with word-lattices that are the output from a baseline recognizer. The perceptron algorithm has the benefit of automatically selecting a relatively small feature set in just a couple of passes over the training data. We describe a method based on regularized likelihood that makes use of the feature set given by the perceptron algorithm, and initialization with the perceptron's weights; this method gives an additional 0.5% reduction in word error rate (WER) over training with the perceptron alone. The final system achieves a 1.8% absolute reduction in WER for a baseline first-pass recognition system (from 39.2% to 37.4%), and a 0.9% absolute reduction in WER for a multi-pass recognition system (from 28.9% to 28.0%).
- Published
- 2007
- Full Text
- View/download PDF
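The perceptron training described above scores each recognition hypothesis with n-gram feature weights plus the baseline recognizer score and updates the weights toward the reference transcript. The sketch below uses an n-best list as a stand-in for the paper's weighted lattices, with made-up data; it is a generic structured perceptron, not the paper's system.

```python
from collections import Counter

def ngram_feats(words, n=2):
    """Bag of n-gram features for a hypothesis string."""
    toks = ["<s>"] + words.split() + ["</s>"]
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

def perceptron_rerank(data, epochs=5):
    """Structured-perceptron training over n-best lists: `data` pairs a
    reference transcript with candidate (hypothesis, baseline_score) tuples."""
    w = Counter()
    for _ in range(epochs):
        for ref, nbest in data:
            score = lambda h, s: s + sum(w[f] * c for f, c in ngram_feats(h).items())
            best = max(nbest, key=lambda hs: score(*hs))[0]
            if best != ref:
                w.update(ngram_feats(ref))    # promote reference n-grams
                w.subtract(ngram_feats(best)) # demote the erroneous hypothesis
    return w

data = [("recognize speech", [("recognize speech", -1.2), ("wreck a nice beach", -1.0)])]
weights = perceptron_rerank(data)
print(weights[("recognize", "speech")])  # positive weight after training
```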
14. Utterance classification with discriminative language modeling
- Author
-
Murat Saraclar and Brian Roark
- Subjects
Linguistics and Language, Vocabulary, Computer science, Communication, Speech recognition, Linear model, Word error rate, Pattern recognition, Perceptron, Linear discriminant analysis, Language and Linguistics, Computer Science Applications, Discriminative model, Modeling and Simulation, Computer Vision and Pattern Recognition, Language model, Artificial intelligence, Software, Utterance
- Abstract
This paper investigates discriminative language modeling in a scenario with two kinds of observed errors: errors in ASR transcription and errors in utterance classification. We train joint language and class models either independently or simultaneously, under various parameter update conditions. On a large vocabulary customer service call-classification application, we show that simultaneous optimization of class, n-gram, and class/n-gram feature weights results in a significant WER reduction over a model using just n-gram features, while additionally significantly outperforming a deployed baseline in classification error rate. A range of parameter estimation approaches, based on either the perceptron algorithm or conditional log-linear models, for various feature sets are presented and evaluated. The resulting models are encoded as weighted finite-state automata, and are used by intersecting the model with word lattices.
- Published
- 2006
- Full Text
- View/download PDF
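The joint model in the abstract above combines class features, n-gram features, and class/n-gram conjunction features in one linear model over (class, transcription) pairs. The sketch below is a generic linear-model illustration of that feature layout with invented weights and hypotheses, not the deployed call-classification system.

```python
from collections import Counter

def joint_features(cls, words):
    """Class, n-gram, and class/n-gram conjunction features for one pair."""
    feats = Counter({("class", cls): 1})
    toks = words.split()
    for i in range(len(toks) - 1):
        bigram = (toks[i], toks[i + 1])
        feats[("ngram",) + bigram] += 1
        feats[("class-ngram", cls) + bigram] += 1
    return feats

def best_pair(hypotheses, classes, weights):
    """Jointly pick the hypothesis and class that maximize the linear score."""
    score = lambda c, h: sum(weights[f] * v for f, v in joint_features(c, h).items())
    return max(((c, h) for c in classes for h in hypotheses), key=lambda ch: score(*ch))

weights = Counter({("class-ngram", "billing", "my", "bill"): 2.0,
                   ("class", "billing"): 0.5})
print(best_pair(["check my bill", "check my will"], ["billing", "other"], weights))
```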
15. MAP adaptation of stochastic grammars
- Author
-
Brian Roark, Michael Riley, Richard Sproat, and Michiel Bacchiani
- Subjects
Domain adaptation, Parsing, Computer science, Speech recognition, Probabilistic logic, Machine learning, Adaptation strategies, Theoretical Computer Science, Human-Computer Interaction, Rule-based machine translation, MAP adaptation, Maximum a posteriori estimation, Artificial intelligence, Language model, Software
- Abstract
This paper investigates supervised and unsupervised adaptation of stochastic grammars, including n-gram language models and probabilistic context-free grammars (PCFGs), to a new domain. It is shown that the commonly used approaches of count merging and model interpolation are special cases of a more general maximum a posteriori (MAP) framework, which additionally allows for alternate adaptation approaches. This paper investigates the effectiveness of different adaptation strategies, and, in particular, focuses on the need for supervision in the adaptation process. We show that n-gram models as well as PCFGs benefit from either supervised or unsupervised MAP adaptation in various tasks. For n-gram models, we compare the benefit from supervised adaptation with that of unsupervised adaptation on a speech recognition task with an adaptation sample of limited size (about 17h), and show that unsupervised adaptation can obtain 51% of the 7.7% adaptation gain obtained by supervised adaptation. We also investigate the benefit of using multiple word hypotheses (in the form of a word lattice) for unsupervised adaptation on a speech recognition task for which there was a much larger adaptation sample available. The use of word lattices for adaptation required the derivation of a generalization of the well-known Good-Turing estimate. Using this generalization, we derive a method that uses Monte Carlo sampling for building Katz backoff models. The adaptation results show that, for adaptation samples of limited size (several tens of hours), unsupervised adaptation on lattices gives a performance gain over using transcripts. The experimental results also show that with a very large adaptation sample (1050h), the benefit from transcript-based adaptation matches that of lattice-based adaptation. Finally, we show that PCFG domain adaptation using the MAP framework provides similar gains in F-measure accuracy on a parsing task as was seen in ASR accuracy improvements with n-gram adaptation. Experimental results show that unsupervised adaptation provides 37% of the 10.35% gain obtained by supervised adaptation.
- Published
- 2006
- Full Text
- View/download PDF
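The abstract above notes that count merging and model interpolation are special cases of MAP estimation. A minimal sketch of count merging for bigram counts follows: in-domain (adaptation) counts are scaled by a prior weight beta and added to out-of-domain counts before per-history normalization. The counts and the value of beta are illustrative, not the paper's.

```python
from collections import Counter

def count_merge(out_domain, in_domain, beta=5.0):
    """MAP-style count merging of bigram counts; beta plays the role of the
    prior weight on the adaptation sample."""
    merged = Counter(out_domain)
    for ngram, c in in_domain.items():
        merged[ngram] += beta * c
    # Normalize per history to get conditional bigram probabilities.
    history_totals = Counter()
    for (h, w), c in merged.items():
        history_totals[h] += c
    return {(h, w): c / history_totals[h] for (h, w), c in merged.items()}

out_domain = Counter({("the", "court"): 50, ("the", "patient"): 2})
in_domain = Counter({("the", "patient"): 10})   # small in-domain adaptation sample
probs = count_merge(out_domain, in_domain)
print(round(probs[("the", "patient")], 3))      # boosted by the adaptation data
```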
16. THE DESIGN PRINCIPLES AND ALGORITHMS OF A WEIGHTED GRAMMAR LIBRARY
- Author
-
Cyril Allauzen, Mehryar Mohri, and Brian Roark
- Subjects
Theoretical computer science, Grammar, Programming language, Computer science, Biosequence, Design elements and principles, Variety (linguistics), Automaton, Rule-based machine translation, Computer Science (miscellaneous), Software design, Representation (mathematics), Algorithm
- Abstract
We present the software design principles, algorithms, and utilities of a general weighted grammar library, the GRM Library, that can be used in a variety of applications in text, speech, and biosequence processing. Several of the algorithms and utilities of this library are described, including in some cases their pseudocodes and pointers to their use in applications. The algorithms and the utilities were designed to support a wide variety of semirings and the representation and use of large grammars and automata of several hundred million rules or transitions.
- Published
- 2005
- Full Text
- View/download PDF
17. Robust garden path parsing
- Author
-
Brian Roark
- Subjects
Linguistics and Language, Parsing, Computer science, Probabilistic logic, Recursive descent parser, Top-down parsing, Language and Linguistics, Canonical LR parser, Simple LR parser, Parser combinator, Artificial intelligence, GLR parser, Software, Natural language processing
- Abstract
This paper presents modifications to a standard probabilistic context-free grammar that enable a predictive parser to avoid garden pathing without resorting to any ad-hoc heuristic repair. The resulting parser is shown to apply efficiently to both newspaper text and telephone conversations with complete coverage and excellent accuracy. The distribution over trees is peaked enough to allow the parser to find parses efficiently, even with the much larger search space resulting from overgeneration. Empirical results are provided for both Wall St. Journal and Switchboard test corpora.
- Published
- 2004
- Full Text
- View/download PDF
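The parser described above avoids garden pathing by carrying a beam of weighted partial analyses word by word rather than committing to a single analysis. The sketch below shows only the generic beam-pruning step (keep analyses within a probability factor of the best); the `extend` hook and the grammar fragment are invented stand-ins, not the paper's grammar or parser.

```python
def advance_beam(beam, word, extend, base_beam=1e-4):
    """Extend every surviving partial analysis with the next word, then keep
    only analyses whose probability is within base_beam of the best."""
    candidates = [cand for analysis, p in beam for cand in extend(analysis, p, word)]
    if not candidates:
        return []
    best = max(p for _, p in candidates)
    return [(a, p) for a, p in candidates if p >= base_beam * best]

# Toy grammar hook: "raced" is ambiguous between a main-verb and a
# reduced-relative reading, so both analyses stay on the beam.
def extend(analysis, p, word):
    if word == "raced":
        return [(analysis + ["V:raced"], p * 0.8), (analysis + ["RC:raced"], p * 0.1)]
    return [(analysis + [word], p * 0.9)]

beam = [([], 1.0)]
for w in ["the", "horse", "raced", "past", "the", "barn", "fell"]:
    beam = advance_beam(beam, w, extend)
print([a[2] for a, _ in beam])  # both readings of "raced" survive
```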
18. Offline analysis of context contribution to ERP-based typing BCI performance
- Author
-
Brian Roark, Barry Oken, Melanie Fried-Oken, Umut Orhan, and Deniz Erdogmus
- Subjects
Male, Computer science, Speech recognition, Biomedical Engineering, Electroencephalography, Linear discriminant analysis, Cellular and Molecular Neuroscience, Discriminant, Event-related potential, Brain-Computer Interfaces, Humans, Female, Typing, Language model, Symbol rate, Evoked Potentials, Photic Stimulation, Brain–computer interface
- Abstract
Objective. We aim to increase the symbol rate of electroencephalography (EEG) based brain–computer interface (BCI) typing systems by utilizing context information. Approach. Event-related potentials (ERPs) corresponding to a stimulus in EEG can be used to detect the intended target of a person for BCI. This paradigm is widely utilized to build letter-by-letter BCI typing systems. Nevertheless, currently available BCI typing systems still require improvement due to low typing speeds, mainly because of the reliance on multiple repetitions before making a decision to achieve higher typing accuracy. Another possible approach to increasing the speed of typing without significantly reducing accuracy is to use additional context information. In this paper, we study the effect of using a language model (LM) as additional evidence for intent detection. Bayesian fusion of an n-gram symbol model with EEG features is proposed, and regularized discriminant analysis is used to obtain the EEG-based features. The target detection accuracies are rigorously evaluated for varying LM orders, as well as the number of ERP-inducing repetitions. Main results. The results demonstrate that the LMs contribute significantly to letter classification accuracy. For instance, we find that single-trial ERP detection supported by a 4-gram LM may achieve the same performance as 3-trial ERP classification for the non-initial letters of words. Significance. Overall, the fusion of evidence from EEG and LMs yields a significant opportunity to increase the symbol rate of a BCI typing system.
- Published
- 2013
- Full Text
- View/download PDF
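The Bayesian fusion described above makes the posterior over the intended letter proportional to the EEG evidence times the n-gram LM probability given the letters already typed. A minimal sketch with made-up numbers follows; it is not the study's classifier or data.

```python
def fuse(eeg_likelihood, lm_prob):
    """Posterior over candidate symbols: EEG likelihood times LM prior,
    renormalized over the candidate set."""
    scores = {s: eeg_likelihood[s] * lm_prob[s] for s in eeg_likelihood}
    z = sum(scores.values())
    return {s: v / z for s, v in scores.items()}

# Typed context "th_": EEG alone slightly prefers 'r', but the LM prior
# for the next letter pushes the decision to 'e'.
eeg_likelihood = {"e": 0.30, "r": 0.35, "a": 0.35}
lm_prob = {"e": 0.70, "r": 0.20, "a": 0.10}
posterior = fuse(eeg_likelihood, lm_prob)
print(max(posterior, key=posterior.get), round(posterior["e"], 2))
```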