Author: "O’Neill, Patrick K." - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"O’Neill, Patrick K."' showing total 12 results

1. SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

Author: O'Neill, Patrick K., Lavrukhin, Vitaly, Majumdar, Somshubra, Noroozi, Vahid, Zhang, Yuekai, Kuchaiev, Oleksii, Balam, Jagadeesh, Dovzhenko, Yuliya, Freyberg, Keenan, Shulman, Michael D., Ginsburg, Boris, Watanabe, Shinji, and Kucsko, Georg
Subjects: Computer Science - Computation and Language, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: In the English speech-to-text (STT) machine learning task, acoustic models are conventionally trained on uncased Latin characters, and any necessary orthography (such as capitalization, punctuation, and denormalization of non-standard words) is imputed by separate post-processing models. This adds complexity and limits performance, as many formatting tasks benefit from semantic information present in the acoustic signal but absent in transcription. Here we propose a new STT task: end-to-end neural transcription with fully formatted text for target labels. We present baseline Conformer-based models trained on a corpus of 5,000 hours of professionally transcribed earnings calls, achieving a CER of 1.7. As a contribution to the STT research community, we release the corpus free for non-commercial use at https://datasets.kensho.com/datasets/scribe.
Comment: 5 pages, 1 figure. Submitted to INTERSPEECH 2021
Published: 2021
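
Note: The abstract of this record reports a character error rate (CER) on fully formatted text. The following is a minimal sketch of how such a metric is commonly computed, assuming CER is the character-level Levenshtein distance divided by the reference length and expressed as a percentage (the abstract does not state the unit). The function names and the sample strings are hypothetical and are not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference characters
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """CER as a percentage of reference characters (assumed convention)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Casing and punctuation count as errors when the target text is fully formatted.
print(cer("Revenue rose 5%.", "revenue rose 5%"))  # 2 edits over 16 characters
```

Because the comparison is case-sensitive and includes punctuation, a formatting mistake is penalized in the same way as a misrecognized character.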

2. Diagnostic yield of targeted next generation sequencing in various cancer types: An information-theoretic approach

Author: Hagemann, Ian S., O'Neill, Patrick K., Erill, Ivan, and Pfeifer, John D.
Published: 2015
Full Text: View/download PDF

3. SPGISpeech: 5,000 Hours of Transcribed Financial Audio for Fully Formatted End-to-End Speech Recognition

Author: O’Neill, Patrick K., Lavrukhin, Vitaly, Majumdar, Somshubra, Noroozi, Vahid, Zhang, Yuekai, Kuchaiev, Oleksii, Balam, Jagadeesh, Dovzhenko, Yuliya, Freyberg, Keenan, Shulman, Michael D., Ginsburg, Boris, Watanabe, Shinji, and Kucsko, Georg
Published: 2021
Full Text: View/download PDF

4. Gene Regulation: from Information Theory to Biophysics

Author: O'Neill, Patrick K.
Subjects: biophysics, transcriptional regulation, information theory
Abstract: In this thesis we explore applications of information theory to gene regulatory systems. Considering molecular recognition as an abstract communication channel, we are led to explore several biological problems in transcriptional regulation from an informational point of view. In particular, we study design principles for transcription factor binding motifs using protein-DNA coevolutionary models, maximum entropy methods for motif statistics, and exact sampling strategies for hybrid biophysical / evolutionary simulations. In Chapter 1 we review the necessary background and equip ourselves with a common stock of concepts from molecular biology and information theory. In Chapter 2 we present ESTReMo, a hybrid biophysical / evolutionary model of transcriptional regulation in which a DNA-binding domain and its cognate binding sites are co-evolved in silico. Simulating the behavior of the system over the cellular timescale and selecting its desired behavior over the evolutionary timescale, we provide a general framework for the exploration of hypothesized design principles of transcriptional regulation. For example, we confirm and generalize an argument relating the information content of a binding motif to its target occupancy, and find that active selection of pseudosites can serve to modulate occupancy. In Chapter 3 we consider a problem raised in the previous chapter: how can one determine whether a certain observed function of a binding motif is statistically unusual? Put another way, what is the correct null hypothesis for gapless alignments? To answer this question we develop two distributions over the space of binding motifs of given dimensions which can serve in this role. The first is a MaxEnt distribution over the space of motifs of given dimensions, subject to mean information content, and the second is a truncated uniform distribution over all motifs having information content within a specified interval. 
With efficient sampling algorithms for these distributions in hand, we show that both prokaryotic and eukaryotic binding motifs tend to exhibit larger informational Gini coefficients than would be expected by chance, and that this feature can be used to distinguish true binding motifs from false positives with matching information content. In Chapter 4 we yoke together information theory, biophysics and evolutionary biology in order to consider the problem of optimal molecular recognition strategies in binding motifs. Comparing linear models to those that incorporate pairwise correlations in a broad collection of prokaryotic and eukaryotic motifs and subjecting them to formal model comparison, we find that while simple linear models suffice for a large majority, most still contain a greater amount of pairwise mutual information content than would be expected by chance. In Chapter 5 we take stock and indicate some directions for future research.
Published: 2016
Full Text: View/download PDF

5. Parametric bootstrapping for biological sequence motifs

Author: O’Neill, Patrick K. and Erill, Ivan
Published: 2016
Full Text: View/download PDF

6. A Bayesian inference method for the analysis of transcriptional regulatory networks in metagenomic data

Author: Hobbs, Elizabeth T., Pereira, Talmo, O’Neill, Patrick K., and Erill, Ivan
Published: 2016
Full Text: View/download PDF

7. BIITE: A Tool to Determine HLA Class II Epitopes from T Cell ELISpot Data

Author: Boelen, Lies, O’Neill, Patrick K., Quigley, Kathryn J., Reynolds, Catherine J., Maillere, Bernard, Robinson, John H., Lertmemongkolchai, Ganjana, Altmann, Daniel M., Boyton, Rosemary J., and Asquith, Becca
Published: 2016
Full Text: View/download PDF

8. Parametric bootstrapping for biological sequence motifs.

Author: O'Neill, Patrick K. and Erill, Ivan
Subjects: DNA, Statistical bootstrapping, Gene regulatory networks, Maximum entropy method, Bioinformatics, Gini coefficient
Abstract: Background: Biological sequence motifs drive the specific interactions of proteins and nucleic acids. Accordingly, the effective computational discovery and analysis of such motifs is a central theme in bioinformatics. Many practical questions about the properties of motifs can be recast as random sampling problems. In this light, the task is to determine for a given motif whether a certain feature of interest is statistically unusual among relevantly similar alternatives. Despite the generality of this framework, its use has been frustrated by the difficulties of defining an appropriate reference class of motifs for comparison and of sampling from it effectively. Results: We define two distributions over the space of all motifs of given dimension. The first is the maximum entropy distribution subject to mean information content, and the second is the truncated uniform distribution over all motifs having information content within a given interval. We derive exact sampling algorithms for each. As a proof of concept, we employ these sampling methods to analyze a broad collection of prokaryotic and eukaryotic transcription factor binding site motifs. In addition to positional information content, we consider the informational Gini coefficient of the motif, a measure of the degree to which information is evenly distributed throughout a motif’s positions. We find that both prokaryotic and eukaryotic motifs tend to exhibit higher informational Gini coefficients (IGC) than would be expected by chance under either reference distribution. As a second application, we apply maximum entropy sampling to the motif p-value problem and use it to give elementary derivations of two new estimators. Conclusions: Despite the historical centrality of biological sequence motif analysis, this study constitutes to our knowledge the first use of principled null hypotheses for sequence motifs given information content. 
Through their use, we are able to characterize for the first time differences in global motif statistics between biological motifs and their null distributions. In particular, we observe that biological sequence motifs show an unusual distribution of IGC, presumably due to biochemical constraints on the mechanisms of direct read-out.
Published: 2016
Full Text: View/download PDF
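
Note: The abstract of this record describes the informational Gini coefficient (IGC) as a measure of how evenly information is spread across a motif's positions. The sketch below illustrates one plausible reading of that idea; it is not the authors' implementation. It assumes a DNA alphabet, per-position information content computed as 2 bits minus the column's Shannon entropy with no small-sample correction, and the standard mean-absolute-difference form of the Gini coefficient. All function names and the example sites are hypothetical.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 4) -> float:
    """Information content of one alignment column in bits (no small-sample correction)."""
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(column).values())
    return math.log2(alphabet_size) - entropy


def gini(values: list[float]) -> float:
    """Gini coefficient: 0 when all values are equal, larger when they are concentrated."""
    n, total = len(values), sum(values)
    if n == 0 or total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * total)


def informational_gini(sites: list[str]) -> float:
    """Gini coefficient of per-position information content for a gapless alignment."""
    columns = ["".join(col) for col in zip(*sites)]
    return gini([column_information(col) for col in columns])


# Hypothetical binding sites: two fully conserved positions, two variable ones.
sites = ["ACGT", "ACGA", "ACCC", "ACTG"]
print(informational_gini(sites))
```

Under these assumptions, a motif whose information sits in a few highly conserved positions scores higher than one with the same total information spread uniformly across positions.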

9. Informational Requirements for Transcriptional Regulation

Author: O'Neill, Patrick K., Forder, Robert, and Erill, Ivan
Published: 2014
Full Text: View/download PDF

10. Characterization of the SOS meta-regulon in the human gut microbiome

Author: Cornish, Joseph P., Sanchez-Alberola, Neus, O’Neill, Patrick K., O'Keefe, Ronald, Gheba, Jameel, and Erill, Ivan
Published: 2014
Full Text: View/download PDF

11. scnRCA: A Novel Method to Detect Consistent Patterns of Translational Selection in Mutationally-Biased Genomes

Author: O'Neill, Patrick K., Or, Mindy, and Erill, Ivan
Published: 2013
Full Text: View/download PDF

12. A Laser Unequal Path Interferometer (LUPI)* For The Optical Shop

Author: Houston, Jr., Joseph B., Buccini, C. J., and O'Neill, Patrick K.
Published: 1967
Full Text: View/download PDF
