Author: "Uthus, David" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Uthus, David"' showing total 32 results

1. Reconsidering Sentence-Level Sign Language Translation

Author: Tanzer, Garrett, Shengelia, Maximus, Harrenstien, Ken, and Uthus, David
Subjects: Computer Science - Computation and Language
Abstract: Historically, sign language machine translation has been posed as a sentence-level task: datasets consisting of continuous narratives are chopped up and presented to the model as isolated clips. In this work, we explore the limitations of this task framing. First, we survey a number of linguistic phenomena in sign languages that depend on discourse-level context. Then as a case study, we perform the first human baseline for sign language translation that actually substitutes a human into the machine learning task framing, rather than provide the human with the entire document as context. This human baseline -- for ASL to English translation on the How2Sign dataset -- shows that for 33% of sentences in our sample, our fluent Deaf signer annotators were only able to understand key parts of the clip in light of additional discourse-level context. These results underscore the importance of understanding and sanity checking examples when adapting machine learning to new domains.
Published: 2024

2. Memory Augmented Language Models through Mixture of Word Experts

Author: Santos, Cicero Nogueira dos, Lee-Thorp, James, Noble, Isaac, Chang, Chung-Ching, and Uthus, David
Subjects: Computer Science - Computation and Language
Abstract: Scaling up the number of parameters of language models has proven to be an effective approach to improve performance. For dense models, increasing model size proportionally increases the model's computation footprint. In this work, we seek to aggressively decouple learning capacity and FLOPs through Mixture-of-Experts (MoE) style models with large knowledge-rich vocabulary based routing functions and experts. Our proposed approach, dubbed Mixture of Word Experts (MoWE), can be seen as a memory augmented model, where a large set of word-specific experts play the role of a sparse memory. We demonstrate that MoWE performs significantly better than the T5 family of models with similar number of FLOPs in a variety of NLP tasks. Additionally, MoWE outperforms regular MoE models on knowledge intensive tasks and has similar performance to more complex memory augmented approaches that often require to invoke custom mechanisms to search the sparse memory.
Comment: 14 pages
Published: 2023

3. YouTube-ASL: A Large-Scale, Open-Domain American Sign Language-English Parallel Corpus

Author: Uthus, David, Tanzer, Garrett, and Georg, Manfred
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: Machine learning for sign languages is bottlenecked by data. In this paper, we present YouTube-ASL, a large-scale, open-domain corpus of American Sign Language (ASL) videos and accompanying English captions drawn from YouTube. With ~1000 hours of videos and >2500 unique signers, YouTube-ASL is ~3x as large and has ~10x as many unique signers as the largest prior ASL dataset. We train baseline models for ASL to English translation on YouTube-ASL and evaluate them on How2Sign, where we achieve a new finetuned state of the art of 12.39 BLEU and, for the first time, report zero-shot results.
Published: 2023

4. mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences

Author: Uthus, David, Ontañón, Santiago, Ainslie, Joshua, and Guo, Mandy
Subjects: Computer Science - Computation and Language
Abstract: We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs. This model, called mLongT5, builds upon the architecture of LongT5, while leveraging the multilingual datasets used for pretraining mT5 and the pretraining tasks of UL2. We evaluate this model on a variety of multilingual summarization and question-answering tasks, and the results show stronger performance for mLongT5 when compared to existing multilingual models such as mBART or M-BERT.
Published: 2023

5. CoLT5: Faster Long-Range Transformers with Conditional Computation

Author: Ainslie, Joshua, Lei, Tao, de Jong, Michiel, Ontañón, Santiago, Brahma, Siddhartha, Zemlyanskiy, Yury, Uthus, David, Guo, Mandy, Lee-Thorp, James, Tay, Yi, Sung, Yun-Hsuan, and Sanghai, Sumit
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token. However, not all tokens are equally important, especially for longer documents. We propose CoLT5, a long-input Transformer model that builds on this intuition by employing conditional computation, devoting more resources to important tokens in both feedforward and attention layers. We show that CoLT5 achieves stronger performance than LongT5 with much faster training and inference, achieving SOTA on the long-input SCROLLS benchmark. Moreover, CoLT5 can effectively and tractably make use of extremely long inputs, showing strong gains up to 64k input length.
Comment: Accepted at EMNLP 2023
Published: 2023

6. RISE: Leveraging Retrieval Techniques for Summarization Evaluation

Author: Uthus, David and Ni, Jianmo
Subjects: Computer Science - Computation and Language
Abstract: Evaluating automatically-generated text summaries is a challenging task. While there have been many interesting approaches, they still fall short of human evaluations. We present RISE, a new approach for evaluating summaries by leveraging techniques from information retrieval. RISE is first trained as a retrieval task using a dual-encoder retrieval setup, and can then be subsequently utilized for evaluating a generated summary given an input document, without gold reference summaries. RISE is especially well suited when working on new datasets where one may not have reference summaries available for evaluation. We conduct comprehensive experiments on the SummEval benchmark (Fabbri et al., 2021) and the results show that RISE has higher correlation with human evaluations compared to many past approaches to summarization evaluation. Furthermore, RISE also demonstrates data-efficiency and generalizability across languages.
Published: 2022

7. LongT5: Efficient Text-To-Text Transformer for Long Sequences

Author: Guo, Mandy, Ainslie, Joshua, Uthus, David, Ontañón, Santiago, Ni, Jianmo, Sung, Yun-Hsuan, and Yang, Yinfei
Subjects: Computer Science - Computation and Language
Abstract: Recent work has shown that either (1) increasing the input length or (2) increasing model size can improve the performance of Transformer-based neural models. In this paper, we present a new model, called LongT5, with which we explore the effects of scaling both the input length and model size at the same time. Specifically, we integrated attention ideas from long-input transformers (ETC), and adopted pre-training strategies from summarization pre-training (PEGASUS) into the scalable T5 architecture. The result is a new attention mechanism we call Transient Global (TGlobal), which mimics ETC's local/global attention mechanism, but without requiring additional side-inputs. We are able to achieve state-of-the-art results on several summarization tasks and outperform the original T5 models on question answering tasks.
Comment: Accepted in NAACL 2022
Published: 2021

8. Augmenting Poetry Composition with Verse by Verse

Author: Uthus, David, Voitovich, Maria, and Mical, R. J.
Subjects: Computer Science - Computation and Language
Abstract: We describe Verse by Verse, our experiment in augmenting the creative process of writing poetry with an AI. We have created a group of AI poets, styled after various American classic poets, that are able to offer as suggestions generated lines of verse while a user is composing a poem. In this paper, we describe the underlying system to offer these suggestions. This includes a generative model, which is tasked with generating a large corpus of lines of verse offline and which are then stored in an index, and a dual-encoder model that is tasked with recommending the next possible set of verses from our index given the previous line of verse.
Comment: NAACL 2022 Industry Track
Published: 2021

9. Investigating Societal Biases in a Poetry Composition System

Author: Sheng, Emily and Uthus, David
Subjects: Computer Science - Computation and Language
Abstract: There is a growing collection of work analyzing and mitigating societal biases in language understanding, generation, and retrieval tasks, though examining biases in creative tasks remains underexplored. Creative language applications are meant for direct interaction with users, so it is important to quantify and mitigate societal biases in these applications. We introduce a novel study on a pipeline to mitigate societal biases when retrieving next verse suggestions in a poetry composition system. Our results suggest that data augmentation through sentiment style transfer has potential for mitigating societal biases.
Comment: 14 pages, 2nd Workshop on Gender Bias in NLP
Published: 2020

10. TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling

Author: Riley, Parker, Constant, Noah, Guo, Mandy, Kumar, Girish, Uthus, David, and Parekh, Zarana
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: We present a novel approach to the problem of text style transfer. Unlike previous approaches requiring style-labeled training data, our method makes use of readily-available unlabeled text by relying on the implicit connection in style between adjacent sentences, and uses labeled data only at inference time. We adapt T5 (Raffel et al., 2020), a strong pretrained text-to-text model, to extract a style vector from text and use it to condition the decoder to perform style transfer. As our label-free training results in a style vector space encoding many facets of style, we recast transfers as "targeted restyling" vector operations that adjust specific attributes of the input while preserving others. We demonstrate that training on unlabeled Amazon reviews data results in a model that is competitive on sentiment transfer, even compared to models trained fully on labeled data. Furthermore, applying our novel method to a diverse corpus of unlabeled web text results in a single model capable of transferring along multiple dimensions of style (dialect, emotiveness, formality, politeness, sentiment) despite no additional training and using only a handful of exemplars at inference time.
Published: 2020

11. mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences

Author: Uthus, David, primary, Ontañón, Santiago, additional, Ainslie, Joshua, additional, and Guo, Mandy, additional
Published: 2023
Full Text: View/download PDF

12. RISE: Leveraging Retrieval Techniques for Summarization Evaluation

Author: Uthus, David, primary and Ni, Jianmo, additional
Published: 2023
Full Text: View/download PDF

13. CoLT5: Faster Long-Range Transformers with Conditional Computation

Author: Ainslie, Joshua, primary, Lei, Tao, additional, de Jong, Michiel, additional, Ontañón, Santiago, additional, Brahma, Siddhartha, additional, Zemlyanskiy, Yury, additional, Uthus, David, additional, Guo, Mandy, additional, Lee-Thorp, James, additional, Tay, Yi, additional, Sung, Yun-Hsuan, additional, and Sanghai, Sumit, additional
Published: 2023
Full Text: View/download PDF

14. DFS* and the Traveling Tournament Problem

Author: Uthus, David C., Riddle, Patricia J., Guesgen, Hans W., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, van Hoeve, Willem-Jan, editor, and Hooker, John N., editor
Published: 2009
Full Text: View/download PDF

15. Ant Colony Optimization and the Single Round Robin Maximum Value Problem

Author: Uthus, David C., Riddle, Patricia J., Guesgen, Hans W., Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Dorigo, Marco, editor, Birattari, Mauro, editor, Blum, Christian, editor, Clerc, Maurice, editor, Stützle, Thomas, editor, and Winfield, Alan F. T., editor
Published: 2008
Full Text: View/download PDF

16. Multiparticipant chat analysis: A survey

Author: Uthus, David C. and Aha, David W.
Published: 2013
Full Text: View/download PDF

17. Augmenting Poetry Composition with Verse by Verse

Author: Uthus, David, primary, Voitovich, Maria, additional, and Mical, R. J., additional
Published: 2022
Full Text: View/download PDF

18. LongT5: Efficient Text-To-Text Transformer for Long Sequences

Author: Guo, Mandy, primary, Ainslie, Joshua, additional, Uthus, David, additional, Ontañón, Santiago, additional, Ni, Jianmo, additional, Sung, Yun-Hsuan, additional, and Yang, Yinfei, additional
Published: 2022
Full Text: View/download PDF

19. Solving the traveling tournament problem with iterative-deepening A∗

Author: Uthus, David C., Riddle, Patricia J., and Guesgen, Hans W.
Published: 2012
Full Text: View/download PDF

20. TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling

Author: Riley, Parker, primary, Constant, Noah, additional, Guo, Mandy, additional, Kumar, Girish, additional, Uthus, David, additional, and Parekh, Zarana, additional
Published: 2021
Full Text: View/download PDF

21. DFS* and the Traveling Tournament Problem

Author: Uthus, David C., primary, Riddle, Patricia J., additional, and Guesgen, Hans W., additional
Published: 2009
Full Text: View/download PDF

22. Ant Colony Optimization and the Single Round Robin Maximum Value Problem

Author: Uthus, David C., primary, Riddle, Patricia J., additional, and Guesgen, Hans W., additional
Full Text: View/download PDF

23. Detecting Bot-Answerable Questions in Ubuntu Chat

Author: NAVAL RESEARCH LAB WASHINGTON DC NAVY CENTER FOR APPLIED RESEARCH IN ARTIFICIAL INTELLIGENCE, Uthus, David C, and Aha, David W
Abstract: Ubuntu's Internet Relay Chat technical support channel has bots that output specific messages in response to command words from other channel users. These messages can be used to answer frequently-asked questions instead of requiring an expert to (repeatedly) type a lengthy reply. We describe an approach to automatically distinguish bot-answerable questions, which would mitigate this problem. To the best of our knowledge, this is the first work on investigating question answering in a multiparticipant chat domain. Our results indicate that for some types of questions, supervised learning algorithms perform well on this task and, in addition, that character n-grams are a better representation than traditional bag-of-words for this task and domain. In International Joint Conference on Natural Language Processing, Nagoya, Japan, 14-19 Oct 2013.
Published: 2013

24. Extending Word Highlighting in Multiparticipant Chat

Author: NAVAL RESEARCH LAB WASHINGTON DC NAVY CENTER FOR APPLIED RESEARCH IN ARTIFICIAL INTELLIGENCE, Uthus, David C, and Aha, David W
Abstract: We describe initial work on extensions to word highlighting for multiparticipant chat to aid users in finding messages of interest, especially during times of high traffic in chat rooms. We have annotated a corpus of chat messages from a technical chat domain (Ubuntu's technical support), indicating whether they are related to Ubuntu's new desktop environment Unity. We also created an unsupervised learning algorithm, in which relations are represented with a graph, and applied this to find words related to Unity so they can be highlighted in new, unseen chat messages. On the task of finding relevant messages, our approach outperformed two baseline approaches that are similar to current state-of-the-art word highlighting methods in chat clients. 2013 Florida Artificial Intelligence Research Society Conference, St. Pete Beach, FL, 22-24 May.
Published: 2013

25. The Ubuntu Chat Corpus for Multiparticipant Chat Analysis

Author: NAVAL RESEARCH LAB WASHINGTON DC, Uthus, David C, and Aha, David W
Abstract: We present the Ubuntu Chat Corpus as a data source for multiparticipant chat analysis. This addresses the lack of a large, publicly suitable corpus for research in this medium. The advantages of using this corpus for research are its large number of chat messages, its multiple languages, its technical nature, and the fact that all of the original chat messages are in the public domain. In AAAI Spring Symposium on Analyzing Microtext, Stanford, CA, 25-27 Mar 2013.
Published: 2013

26. Multiparticipant Chat Analysis: A Survey

Author: NAVAL RESEARCH LAB WASHINGTON DC NAVY CENTER FOR APPLIED RESEARCH IN ARTIFICIAL INTELLIGENCE, Uthus, David C, and Aha, David W
Abstract: We survey research on the analysis of multiparticipant chat. Multiple research and applied communities (e.g., AI, educational, law enforcement, military) have interest in this topic. After introducing some context, we describe relevant problems and how these have been addressed using AI techniques. We also identify recent research trends and unresolved issues that could benefit from more attention. Published in Artificial Intelligence, v199-200 p106-121, 2013.
Published: 2013

27. Automated Chat Generator

Author: NAVAL RESEARCH LAB WASHINGTON DC NAVY CENTER FOR APPLIED RESEARCH IN ARTIFICIAL INTELLIGENCE, Williams, Bryan, Uthus, David C, and Aha, David W
Abstract: This document summarizes work towards the development of an automated chat generator, whose components are summarized in Figure 3. Our goal is that this will automatically generate chat that (slightly) resembles Navy Combat Information Center (CIC) chat, such that the resulting data can be used in a 5514 project that concerns automated chat highlighting and chat summarization.
Published: 2012

28. Plans Toward Automated Chat Summarization

Author: NAVAL RESEARCH LAB WASHINGTON DC, Uthus, David C., and Aha, David W.
Abstract: We describe the beginning stages of our work on summarizing chat, which is motivated by our observations concerning the information overload of US Navy watchstanders. We describe the challenges of summarizing chat and focus on two chat-specific types of summarizations we are interested in: thread summaries and temporal summaries. We then discuss our plans for addressing these challenges and evaluation issues. Proceedings of the Workshop on Automatic Summarization for Different Genres, Media, and Languages, p1-7, June 23, 2011 Portland, OR. The original document contains color images.
Published: 2011

29. Solving the Traveling Tournament Problem with Iterative-Deepening A*

Author: Uthus, David, primary, Riddle, Patricia, additional, and Guesgen, Hans, additional
Published: 2013
Full Text: View/download PDF

30. Reports of the AAAI 2011 Conference Workshops

Author: Agmon, Noa, primary, Agrawal, Vikas, additional, Aha, David W., additional, Aloimonos, Yiannis, additional, Buckley, Donagh, additional, Doshi, Prashant, additional, Geib, Christopher, additional, Grasso, Floriana, additional, Green, Nancy, additional, Johnston, Benjamin, additional, Kaliski, Burt, additional, Kiekintveld, Christopher, additional, Law, Edith, additional, Lieberman, Henry, additional, Mengshoel, Ole J., additional, Metzler, Ted, additional, Modayil, Joseph, additional, Oard, Douglas W., additional, Onder, Nilufer, additional, O'Sullivan, Barry, additional, Pastra, Katerina, additional, Precup, Doina, additional, Ramachandran, Sowmya, additional, Reed, Chris, additional, Sariel‐Talay, Sanem, additional, Selker, Ted, additional, Shastri, Lokendra, additional, Singh, Satinder, additional, Smith, Stephen F., additional, Srivastava, Siddharth, additional, Sukthankar, Gita, additional, Uthus, David C., additional, and Williams, Mary‐Anne, additional
Published: 2012
Full Text: View/download PDF

31. Solving the traveling tournament problem with iterative-deepening A∗

Author: Uthus, David C., primary, Riddle, Patricia J., additional, and Guesgen, Hans W., additional
Published: 2011
Full Text: View/download PDF

32. An ant colony optimization approach to the traveling tournament problem

Author: Uthus, David C., primary, Riddle, Patricia J., additional, and Guesgen, Hans W., additional
Published: 2009
Full Text: View/download PDF
