20 results for "Brian Roark"
Search Results
2. Approximating Probabilistic Models as Weighted Finite Automata
- Author
-
Ananda Theertha Suresh, Brian Roark, Michael Riley, and Vlad Schogol
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Abstract
Weighted finite automata (WFAs) are often used to represent probabilistic models, such as n-gram language models, because among other things, they are efficient for recognition tasks in time and space. The probabilistic source to be represented as a WFA, however, may come in many forms. Given a generic probabilistic model over sequences, we propose an algorithm to approximate it as a WFA such that the Kullback-Leibler divergence between the source model and the WFA target model is minimized. The proposed algorithm involves a counting step and a difference of convex optimization step, both of which can be performed efficiently. We demonstrate the usefulness of our approach on various tasks, including distilling n-gram models from neural models, building compact language models, and building open-vocabulary character models. The algorithms used for these experiments are available in an open-source software library.
- Published
- 2021
- Full Text
- View/download PDF
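The counting step from the abstract above can be illustrated on a toy source. The sketch below is not the paper's algorithm (it omits the difference-of-convex step needed for backoff topologies, and the source distribution is invented): it computes expected bigram counts under an explicit source model, normalizes them into a bigram "WFA", and evaluates the KL divergence on the source's support.

```python
import math
from collections import defaultdict

# Toy source: explicit distribution over short strings on {a, b}.
source = {"ab": 0.4, "ba": 0.3, "aab": 0.2, "b": 0.1}

BOS, EOS = "<s>", "</s>"

# Counting step: expected bigram counts under the source distribution.
counts = defaultdict(float)
for s, p in source.items():
    syms = [BOS] + list(s) + [EOS]
    for prev, cur in zip(syms, syms[1:]):
        counts[(prev, cur)] += p

# Build the bigram "WFA": states are previous symbols, arc weights are
# conditional probabilities from the expected counts (the KL minimizer
# within this fixed-topology, no-backoff model family).
totals = defaultdict(float)
for (prev, cur), c in counts.items():
    totals[prev] += c
bigram = {(prev, cur): c / totals[prev] for (prev, cur), c in counts.items()}

def wfa_prob(s):
    syms = [BOS] + list(s) + [EOS]
    p = 1.0
    for prev, cur in zip(syms, syms[1:]):
        p *= bigram.get((prev, cur), 0.0)
    return p

# KL(source || WFA), summed over the source's support (non-negative,
# since the bigram model leaks some mass onto strings outside it).
kl = sum(p * math.log2(p / wfa_prob(s)) for s, p in source.items())
print(f"KL divergence: {kl:.4f} bits")
```

The nonzero KL reflects the bigram family's inability to capture the source's longer-range constraints exactly.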
3. Graph-Based Word Alignment for Clinical Language Evaluation
- Author
-
Emily Prud'hommeaux and Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2022
- Full Text
- View/download PDF
4. Phonotactic Complexity and Its Trade-offs
- Author
-
Tiago Pimentel, Brian Roark, and Ryan Cotterell
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Abstract
We present methods for calculating a measure of phonotactic complexity—bits per phoneme—that permits a straightforward cross-linguistic comparison. When given a word, represented as a sequence of phonemic segments such as symbols in the International Phonetic Alphabet, and a statistical model trained on a sample of word types from the language, we can approximately measure bits per phoneme using the negative log-probability of that word under the model. This simple measure allows us to compare the entropy across languages, giving insight into how complex a language’s phonotactics is. Using a collection of 1016 basic concept words across 106 languages, we demonstrate a very strong negative correlation of −0.74 between bits per phoneme and the average length of words.
- Published
- 2020
- Full Text
- View/download PDF
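The bits-per-phoneme measure described in the abstract above is easy to sketch. The toy "lexicon" below is hypothetical, and a smoothed unigram phoneme model stands in for the paper's n-gram and LSTM models; the measure itself, negative log-probability divided by word length, is as described.

```python
import math
from collections import Counter

# Toy lexicon of words as phoneme-segment sequences (hypothetical data).
lexicon = [list(w) for w in ["tata", "kata", "taka", "pata", "kapa"]]

# Train a unigram phoneme model with add-one smoothing.
counts = Counter(p for word in lexicon for p in word)
total = sum(counts.values())
vocab = len(counts)

def prob(phoneme):
    return (counts[phoneme] + 1) / (total + vocab)

def bits_per_phoneme(word):
    # Negative log-probability of the word, normalized by its length.
    neg_log_p = -sum(math.log2(prob(p)) for p in word)
    return neg_log_p / len(word)

for w in ["taka", "papa"]:
    print(w, round(bits_per_phoneme(list(w)), 3))
```

Words built from frequent segments score fewer bits per phoneme; averaging this over a language's basic vocabulary gives the cross-linguistic complexity measure.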
5. Neural Models of Text Normalization for Speech Applications
- Author
-
Hao Zhang, Richard Sproat, Axel H. Ng, Felix Stahlberg, Xiaochang Peng, Kyle Gorman, and Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Abstract
Machine learning, including neural network techniques, has been applied to virtually every domain in natural language processing. One problem that has been somewhat resistant to effective machine learning solutions is text normalization for speech applications such as text-to-speech synthesis (TTS). In this application, one must decide, for example, that 123 is verbalized as one hundred twenty three in 123 pages but as one twenty three in 123 King Ave. For this task, state-of-the-art industrial systems depend heavily on hand-written language-specific grammars. We propose neural network models that treat text normalization for TTS as a sequence-to-sequence problem, in which the input is a text token in context, and the output is the verbalization of that token. We find that the most effective model, in accuracy and efficiency, is one where the sentential context is computed once and the results of that computation are combined with the computation of each token in sequence to compute the verbalization. This model allows for a great deal of flexibility in terms of representing the context, and also allows us to integrate tagging and segmentation into the process. These models perform very well overall, but occasionally they will predict wildly inappropriate verbalizations, such as reading 3 cm as three kilometers. Although rare, such verbalizations are a major issue for TTS applications. We thus use finite-state covering grammars to guide the neural models, either during training and decoding, or just during decoding, away from such “unrecoverable” errors. Such grammars can largely be learned from data.
- Published
- 2019
- Full Text
- View/download PDF
6. Probabilistic Top-Down Parsing and Language Modeling
- Author
-
Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2021
- Full Text
- View/download PDF
7. Applications of Lexicographic Semirings to Problems in Speech and Language Processing
- Author
-
Richard Sproat, Mahsa Yarmohammadi, Izhak Shafran, and Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2021
- Full Text
- View/download PDF
8. Finite-State Chart Constraints for Reduced Complexity Context-Free Parsing Pipelines
- Author
-
Brian Roark, Kristy Hollingshead, and Nathan Bodenstab
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2021
- Full Text
- View/download PDF
9. Putting Linguistics into Speech Recognition: The Regulus Grammar Compiler Manny Rayner, Beth Ann Hockey, and Pierette Bouillon (NASA Ames Research Center and University of Geneva) Stanford, CA: CSLI Publications (CSLI studies in computational linguistics, edited by Ann Copestake), 2006, xiv+305 pp; hardbound, ISBN 1-57586-525-4, $65.00; paperbound, ISBN 1-57586-526-2, $25.00
- Author
-
Brian Roark
- Subjects
Computational linguistics. Natural language processing, P98-98.5
- Published
- 2021
- Full Text
- View/download PDF
10. Finding Concept-specific Biases in Form–Meaning Associations
- Author
-
Brian Roark, Damián E. Blasi, Ryan Cotterell, Tiago Pimentel, and Søren Wichmann
- Subjects
Computer science, Phone, Language family, Lexicon, Cognitive psychology
- Abstract
This work presents an information-theoretic operationalisation of cross-linguistic non-arbitrariness. It is not a new idea that there are small, cross-linguistic associations between the forms and meanings of words. For instance, it has been claimed (Blasi et al., 2016) that the word for “tongue” is more likely than chance to contain the phone [l]. By controlling for the influence of language family and geographic proximity within a very large concept-aligned, cross-lingual lexicon, we extend methods previously used to detect within-language non-arbitrariness (Pimentel et al., 2019) to measure cross-linguistic associations. We find that there is a significant effect of non-arbitrariness, but it is unsurprisingly small (less than 0.5% on average according to our information-theoretic estimate). We also provide a concept-level analysis which shows that a quarter of the concepts considered in our work exhibit a significant level of cross-linguistic non-arbitrariness. In sum, the paper provides new methods to detect cross-linguistic associations at scale, and confirms that their effects are minor. (Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, ISBN 978-1-954085-46-6.)
- Published
- 2021
11. Language-agnostic Multilingual Modeling
- Author
-
Anjuli Kannan, Brian Roark, Bhuvana Ramabhadran, Jesse Emond, and Arindrima Datta
- Subjects
Computer science, Sound (cs.SD), Machine Learning (stat.ML), Audio and Speech Processing (eess.AS), Word error rate, Transliteration, Writing system, Hindi, Bengali, Tamil, Kannada, Natural language processing
- Abstract
Multilingual Automated Speech Recognition (ASR) systems allow for the joint training of data-rich and data-scarce languages in a single model. This enables data and parameter sharing across languages, which is especially beneficial for the data-scarce languages. However, most state-of-the-art multilingual models require the encoding of language information and therefore are not as flexible or scalable when expanding to newer languages. Language-independent multilingual models help to address this issue, and are also better suited for multicultural societies where several languages are frequently used together (but often rendered with different writing systems). In this paper, we propose a new approach to building a language-agnostic multilingual ASR system which transforms all languages to one writing system through a many-to-one transliteration transducer. Thus, similar sounding acoustics are mapped to a single, canonical target sequence of graphemes, effectively separating the modeling and rendering problems. We show with four Indic languages, namely, Hindi, Bengali, Tamil and Kannada, that the language-agnostic multilingual model achieves up to 10% relative reduction in Word Error Rate (WER) over a language-dependent multilingual model.
- Published
- 2020
12. Phonotactic Complexity and Its Trade-offs
- Author
-
Ryan Cotterell, Tiago Pimentel, and Brian Roark
- Subjects
Phonotactics, Linguistics and Language, Computation and Language (cs.CL), Computer science, Communication, Speech recognition, International Phonetic Alphabet, Entropy (information theory), Artificial Intelligence, Human-Computer Interaction, Computational linguistics. Natural language processing, P98-98.5
- Abstract
We present methods for calculating a measure of phonotactic complexity—bits per phoneme—that permits a straightforward cross-linguistic comparison. When given a word, represented as a sequence of phonemic segments such as symbols in the International Phonetic Alphabet, and a statistical model trained on a sample of word types from the language, we can approximately measure bits per phoneme using the negative log-probability of that word under the model. This simple measure allows us to compare the entropy across languages, giving insight into how complex a language’s phonotactics is. Using a collection of 1016 basic concept words across 106 languages, we demonstrate a very strong negative correlation of −0.74 between bits per phoneme and the average length of words. (Transactions of the Association for Computational Linguistics, 8, ISSN 2307-387X.)
- Published
- 2020
13. Meaning to Form: Measuring Systematicity as Information
- Author
-
Damián E. Blasi, Arya D. McCarthy, Tiago Pimentel, Ryan Cotterell, and Brian Roark
- Subjects
Computer science, Computation and Language (cs.CL), Bigram, Mutual information, Semantics, Linguistics, Semiotics, Entropy (information theory), Semantic representation
- Abstract
A longstanding debate in semiotics centers on the relationship between linguistic signs and their corresponding semantics: is there an arbitrary relationship between a word form and its meaning, or does some systematic phenomenon pervade? For instance, does the character bigram “gl” have any systematic relationship to the meaning of words like “glisten”, “gleam” and “glow”? In this work, we offer a holistic quantification of the systematicity of the sign using mutual information and recurrent neural networks. We employ these in a data-driven and massively multilingual approach to the question, examining 106 languages. We find a statistically significant reduction in entropy when modeling a word form conditioned on its semantic representation. Encouragingly, we also recover well-attested English examples of systematic affixes. We conclude with the meta-point: our approximate effect size (measured in bits) is quite small; despite some amount of systematicity between form and meaning, an arbitrary relationship and its resulting benefits dominate human language. (Accepted for publication at ACL 2019.)
- Published
- 2019
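The entropy reduction that the abstract above measures with recurrent neural networks over 106 languages can be shown in miniature with plug-in entropy estimates. Everything in the sketch below is hypothetical toy data (echoing the “tongue”/[l] example from the related entry in these results); the quantity computed, H(form) − H(form | meaning), is the mutual information being used as a systematicity measure.

```python
import math
from collections import Counter, defaultdict

# Toy concept-aligned data: (concept, first segment of the word form).
data = [
    ("tongue", "l"), ("tongue", "l"), ("tongue", "t"), ("tongue", "l"),
    ("stone", "s"), ("stone", "k"), ("stone", "s"), ("stone", "t"),
]

def entropy(symbols):
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())

# Unconditional entropy of the form (here reduced to its first segment)...
h_form = entropy([seg for _, seg in data])

# ...versus entropy conditioned on the concept (the meaning).
by_concept = defaultdict(list)
for concept, seg in data:
    by_concept[concept].append(seg)
h_given_meaning = sum(
    len(segs) / len(data) * entropy(segs) for segs in by_concept.values()
)

# The drop is the mutual information I(form; meaning): nonzero means some
# systematicity; the paper finds the real effect is small but significant.
print(f"H(form) = {h_form:.3f} bits, H(form|meaning) = {h_given_meaning:.3f} bits")
print(f"systematicity (MI) = {h_form - h_given_meaning:.3f} bits")
```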
14. Probabilistic Simulation Framework for EEG-Based BCI Design
- Author
-
Deniz Erdogmus, Brian Roark, Hooman Nezamfar, Andrew Fowler, Umut Orhan, Melanie Fried-Oken, Barry Oken, Matt Higger, Mohammad Moghadamfalahi, and Murat Akcakaya
- Subjects
Operational performance, Computer science, Interface (computing), Monte Carlo method, Probabilistic simulation, Biomedical Engineering, Electroencephalography, Human-Computer Interaction, Behavioral Neuroscience, Electrical and Electronic Engineering, Simulation, Brain–computer interface
- Abstract
A simulation framework could decrease the burden of attending long and tiring experimental sessions on the potential users of brain-computer interface (BCI) systems. Specifically during the initial design of a BCI, a simulation framework that could replicate the operational performance of the system would be a useful tool for designers to make design choices. In this manuscript, we develop a Monte Carlo-based probabilistic simulation framework for electroencephalography (EEG) based BCI design. We employ one event-related potential (ERP) based typing and one steady-state evoked potential (SSVEP) based control interface as testbeds. We compare the results of simulations with real-time experiments. Even though over- and underestimation of the performance is possible, the statistical results over the Monte Carlo simulations show that the developed framework generally provides a good approximation of the real-time system performance.
- Published
- 2016
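The Monte Carlo idea in the abstract above can be sketched for a single ERP-based selection step. All numbers below are invented (the symbol count, the assumed score separation, the Gaussian score model); the framework in the paper calibrates such distributions from real EEG data before simulating.

```python
import random

# Monte Carlo sketch of one ERP-based BCI selection (hypothetical numbers):
# each of K candidate symbols gets a classifier score; the attended target's
# score is drawn from a shifted Gaussian, the rest from a null distribution.
random.seed(0)
K = 8              # symbols per selection
TARGET_MEAN = 1.0  # assumed target vs. non-target score separation
TRIALS = 20000

def simulate_selection():
    scores = [random.gauss(0.0, 1.0) for _ in range(K - 1)]  # non-targets
    target_score = random.gauss(TARGET_MEAN, 1.0)            # target
    return target_score > max(scores)  # correct iff the target wins argmax

accuracy = sum(simulate_selection() for _ in range(TRIALS)) / TRIALS
print(f"estimated selection accuracy: {accuracy:.3f}")
```

Repeating this over many simulated sessions lets a designer compare interface configurations (e.g., the value of K or the assumed class separation) without running tiring live experiments, which is the point of the framework.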
15. Huffman and Linear Scanning Methods with Statistical Language Models
- Author
-
Brian Roark, Chris Gibbons, and Melanie Fried-Oken
- Subjects
Computer science, Speech recognition, Access method, Huffman coding, Speech and Hearing, Communication Aids for Disabled, Software, Text generation, Natural Language Processing, Statistical models, Rehabilitation, Usability, Language model, Algorithm
- Abstract
Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning.
- Published
- 2015
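The core of Huffman scanning, as described in the abstract above, is to derive each symbol's yes/no switch sequence from a Huffman code over the language model's next-symbol probabilities. The sketch below uses a frozen, hypothetical distribution over five symbols; in a real interface the distribution is recomputed by the language model after every selection.

```python
import heapq
from itertools import count

# Hypothetical next-symbol probabilities from a language model
# ("_" stands for space); a real system updates these per context.
probs = {"e": 0.35, "t": 0.25, "a": 0.18, "o": 0.12, "_": 0.10}

def huffman_code(probs):
    """Binary Huffman code: each symbol's codeword is the sequence of
    switch decisions (0/1) needed to select it."""
    tiebreak = count()  # avoids comparing dicts when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

code = huffman_code(probs)
# Expected number of switch decisions per selection under the model:
expected = sum(probs[s] * len(w) for s, w in code.items())
for s in sorted(code, key=lambda s: -probs[s]):
    print(s, code[s])
print(f"expected scan steps: {expected:.2f}")
```

Likely symbols get short switch sequences, which is where the speedup over fixed row/column orders comes from.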
16. Continuous space discriminative language modeling
- Author
-
Puyang Xu, Matt Post, Chris Callison-Burch, Kenji Sagae, Daniel M. Bikel, Maider Lehr, Izhak Shafran, Damianos Karakos, Brian Roark, Sanjeev Khudanpur, Murat Saraclar, Eva Hasler, Keith Hall, Nathan Glenn, Darcey Riley, Adam Lopez, Emily Prud'hommeaux, Yuan Cao, and Philipp Koehn
- Subjects
Signal processing, Computer science, Speech recognition
- Abstract
Discriminative language modeling is a structured classification problem. Log-linear models have been previously used to address this problem. In this paper, the standard dot-product feature representation used in log-linear models is replaced by a non-linear function parameterized by a neural network. Embeddings are learned for each word and features are extracted automatically through the use of convolutional layers. Experimental results show that as a stand-alone model the continuous space model yields significantly lower word error rate (1% absolute), while having a much more compact parameterization (60–90% smaller). When combined with the baseline scores, our approach performs equally well.
- Published
- 2012
- Full Text
- View/download PDF
17. Efficient probabilistic top-down and left-corner parsing
- Author
-
Mark Johnson and Brian Roark
- Subjects
Parsing, Computation and Language (cs.CL), Computer science, Memoization, Semantic interpretation, Context (language use), Top-down and bottom-up design, Top-down parsing, Bottom-up parsing, Parser combinator, S-attributed grammar, Artificial intelligence, Natural language processing
- Abstract
This paper examines efficient predictive broad-coverage parsing without dynamic programming. In contrast to bottom-up methods, depth-first top-down parsing produces partial parses that are fully connected trees spanning the entire left context, from which any kind of non-local dependency or partial semantic interpretation can in principle be read. We contrast two predictive parsing approaches, top-down and left-corner parsing, and find both to be viable. In addition, we find that enhancement with non-local information not only improves parser accuracy, but also substantially improves the search efficiency. (8 pages, 3 tables, 3 figures.)
- Published
- 2000
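The depth-first top-down strategy described in the abstract above can be sketched as best-first search over prediction stacks, with no chart or dynamic programming. The grammar below is a toy PCFG invented for illustration; the paper's broad-coverage parsers additionally use beams and richer look-ahead statistics.

```python
import heapq
from itertools import count

# Toy PCFG (hypothetical rules); probabilities per left-hand side sum to 1.
GRAMMAR = {
    "S":  [(1.0, ["NP", "VP"])],
    "NP": [(0.7, ["D", "N"]), (0.3, ["N"])],
    "VP": [(0.6, ["V", "NP"]), (0.4, ["V"])],
    "D":  [(1.0, ["the"])],
    "N":  [(0.5, ["dog"]), (0.5, ["cat"])],
    "V":  [(1.0, ["saw"])],
}

def parse(words, start="S"):
    """Best-first top-down search: expand the leftmost symbol of the
    prediction stack, matching terminals against the input left to right.
    Each partial parse is a fully connected tree over the prefix so far."""
    tie = count()
    # Items: (-probability, tiebreak, prediction stack, input position, derivation).
    agenda = [(-1.0, next(tie), (start,), 0, [])]
    while agenda:
        neg_p, _, stack, i, deriv = heapq.heappop(agenda)
        if not stack:
            if i == len(words):
                return -neg_p, deriv  # probability of the best full parse
            continue
        top, rest = stack[0], stack[1:]
        if top in GRAMMAR:  # nonterminal: predict each rule expansion
            for p, rhs in GRAMMAR[top]:
                heapq.heappush(agenda, (neg_p * p, next(tie),
                                        tuple(rhs) + rest, i,
                                        deriv + [(top, rhs)]))
        elif i < len(words) and top == words[i]:  # terminal: match input
            heapq.heappush(agenda, (neg_p, next(tie), rest, i + 1, deriv))
    return None  # no parse

prob, derivation = parse(["the", "dog", "saw", "cat"])
print(f"best parse probability: {prob}")
```

Because rule probabilities only shrink a hypothesis's score, the first complete parse popped from the agenda is the most probable one.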
18. Computational Approaches to Morphology and Syntax
- Author
-
Brian Roark and Richard Sproat
- Subjects
Grammar, Comparative and general--Syntax--Data processing; Grammar, Comparative and general--Morphology--Data processing; Computational linguistics
- Abstract
The book will appeal to scholars and advanced students of morphology, syntax, computational linguistics and natural language processing (NLP). It provides a critical and practical guide to computational techniques for handling morphological and syntactic phenomena, showing how these techniques have been used and modified in practice. The authors discuss the nature and uses of syntactic parsers and examine the problems and opportunities of parsing algorithms for finite-state, context-free and various context-sensitive grammars. They relate approaches for describing syntax and morphology to formal mechanisms and algorithms, and present well-motivated approaches for augmenting grammars with weights or probabilities.
- Published
- 2007
19. The Design Principles and Algorithms of a Weighted Grammar Library.
- Author
-
Cyril Allauzen, Mehryar Mohri, and Brian Roark
- Subjects
Computer software, Algorithms, Grammar, Pseudocode (computer program language), Programming languages, Computer systems
- Abstract
We present the software design principles, algorithms, and utilities of a general weighted grammar library, the GRM Library, that can be used in a variety of applications in text, speech, and biosequence processing. Several of the algorithms and utilities of this library are described, including in some cases their pseudocodes and pointers to their use in applications. The algorithms and the utilities were designed to support a wide variety of semirings and the representation and use of large grammars and automata of several hundred million rules or transitions.
- Published
- 2005
- Full Text
- View/download PDF
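The "wide variety of semirings" design mentioned in the abstract above amounts to writing algorithms against abstract plus/times operations. The sketch below is illustrative only (the names `Semiring` and `shortest_distance` are not the GRM Library's API): a single-source shortest-distance routine over an acyclic weighted automaton, instantiated with both the real and the tropical semiring.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Semiring:
    plus: Callable   # abstract "sum" over alternative paths
    times: Callable  # abstract "product" along a path
    zero: object     # identity for plus
    one: object      # identity for times

REAL = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
TROPICAL = Semiring(min, lambda a, b: a + b, float("inf"), 0.0)

# Acyclic automaton: arcs[state] = [(next_state, weight), ...];
# state ids are already in topological order.
arcs = {0: [(1, 0.5), (2, 0.3)], 1: [(3, 0.4)], 2: [(3, 0.6)], 3: []}

def shortest_distance(arcs, start, sr):
    d = {s: sr.zero for s in arcs}
    d[start] = sr.one
    for s in sorted(arcs):  # topological order by construction
        for t, w in arcs[s]:
            d[t] = sr.plus(d[t], sr.times(d[s], w))
    return d

print(shortest_distance(arcs, 0, REAL)[3])      # total path weight: 0.5*0.4 + 0.3*0.6
print(shortest_distance(arcs, 0, TROPICAL)[3])  # min path cost under (min, +)
```

The same traversal computes total probability mass or min-cost paths depending only on which semiring is passed in, which is why a semiring-generic library can serve so many applications.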
20. Robust garden path parsing.
- Author
-
Brian Roark
- Published
- 2004
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library