674 results for "Kumar, Shankar"
Search Results
2. Spelling Correction through Rewriting of Non-Autoregressive ASR Lattices
- Author
-
Velikovich, Leonid, Li, Christopher, Caseiro, Diamantino, Kumar, Shankar, Rondon, Pat, Joshi, Kandarp, and Velez, Xavier
- Subjects
Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
For end-to-end Automatic Speech Recognition (ASR) models, recognizing personal or rare phrases can be hard. A promising way to improve accuracy is through spelling correction (or rewriting) of the ASR lattice, where potentially misrecognized phrases are replaced with acoustically similar and contextually relevant alternatives. However, rewriting is challenging for ASR models trained with connectionist temporal classification (CTC) due to noisy hypotheses produced by a non-autoregressive, context-independent beam search. We present a finite-state transducer (FST) technique for rewriting wordpiece lattices generated by Transformer-based CTC models. Our algorithm performs grapheme-to-phoneme (G2P) conversion directly from wordpieces into phonemes, avoiding explicit word representations and exploiting the richness of the CTC lattice. Our approach requires no retraining or modification of the ASR model. We achieved up to a 15.2% relative reduction in sentence error rate (SER) on a test set with contextually relevant entities., Comment: 8 pages, 7 figures
- Published
- 2024
3. Structural build-up model for three-dimensional concrete printing based on kinetics theory
- Author
-
Prem, Prabhat Ranjan, Ambily, P. S., Kumar, Shankar, Giridhar, Greeshma, and Jiao, Dengwu
- Published
- 2024
- Full Text
- View/download PDF
4. Effect of Deck Length on Ground Vibration in Dragline Bench Blasting Using Artificial Intelligence Methods
- Author
-
Singh, Chitranjan Prasad, Kumar, Shankar, and Mishra, Arvind Kumar
- Published
- 2024
- Full Text
- View/download PDF
5. Long-Form Speech Translation through Segmentation with Finite-State Decoding Constraints on Large Language Models
- Author
-
McCarthy, Arya D., Zhang, Hao, Kumar, Shankar, Stahlberg, Felix, and Wu, Ke
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
One challenge in speech translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations. To address this mismatch, we adapt large language models (LLMs) to split long ASR transcripts into segments that can be independently translated so as to maximize the overall translation quality. We overcome the tendency of LLMs to hallucinate by incorporating finite-state constraints during decoding; these eliminate invalid outputs without requiring additional training. We discover that LLMs are adaptable to transcripts containing ASR errors through prompt-tuning or fine-tuning. Relative to a state-of-the-art automatic punctuation baseline, our best LLM improves the average BLEU by 2.9 points for English-German, English-Spanish, and English-Arabic TED talk translation across 9 test sets, just by improving segmentation., Comment: accepted to the Findings of EMNLP 2023. arXiv admin note: text overlap with arXiv:2212.09895
- Published
- 2023
6. Heterogeneous Federated Learning Using Knowledge Codistillation
- Author
-
Lichtarge, Jared, Amid, Ehsan, Kumar, Shankar, Yang, Tien-Ju, Anil, Rohan, and Mathews, Rajiv
- Subjects
Computer Science - Machine Learning
- Abstract
Federated Averaging, and many federated learning algorithm variants which build upon it, have a limitation: all clients must share the same model architecture. This results in unused modeling capacity on many clients, which limits model performance. To address this issue, we propose a method that involves training a small model on the entire pool and a larger model on a subset of clients with higher capacity. The models exchange information bidirectionally via knowledge distillation, utilizing an unlabeled dataset on a server without sharing parameters. We present two variants of our method, which improve upon federated averaging on image classification and language modeling tasks. We show this technique can be useful even if only out-of-domain or limited in-domain distillation data is available. Additionally, the bi-directional knowledge distillation allows for domain transfer between the models when different pool populations introduce domain shift.
- Published
- 2023
7. Towards an On-device Agent for Text Rewriting
- Author
-
Zhu, Yun, Liu, Yinxiao, Stahlberg, Felix, Kumar, Shankar, Chen, Yu-hui, Luo, Liangchen, Shu, Lei, Liu, Renjie, Chen, Jindong, and Meng, Lei
- Subjects
Computer Science - Computation and Language
- Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities for text rewriting. Nonetheless, the large sizes of these models make them impractical for on-device inference, which would otherwise allow for enhanced privacy and economical inference. Creating a smaller yet potent language model for text rewriting presents a formidable challenge because it requires balancing the need for a small size with the need to retain the emergent capabilities of the LLM, which in turn requires costly data collection. To address the above challenge, we introduce a new instruction tuning approach for building a mobile-centric text rewriting model. Our strategies enable the generation of high-quality training data without any human labeling. In addition, we propose a heuristic reinforcement learning framework which substantially enhances performance without requiring preference data. To further bridge the performance gap with the larger server-side model, we propose an effective approach that combines the mobile rewrite agent with the server model using a cascade. To tailor the text rewriting tasks to mobile scenarios, we introduce MessageRewriteEval, a benchmark that focuses on text rewriting for messages through natural language instructions. Through empirical experiments, we demonstrate that our on-device model surpasses the current state-of-the-art LLMs in text rewriting while maintaining a significantly reduced model size. Notably, we show that our proposed cascading approach improves model performance.
- Published
- 2023
8. Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR
- Author
-
Huang, W. Ronny, Zhang, Hao, Kumar, Shankar, Chang, Shuo-yiin, and Sainath, Tara N.
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
We propose a method of segmenting long-form speech by separating semantically complete sentences within the utterance. This prevents the ASR decoder from needlessly processing faraway context while also preventing it from missing relevant context within the current sentence. Semantically complete sentence boundaries are typically demarcated by punctuation in written text; but unfortunately, spoken real-world utterances rarely contain punctuation. We address this limitation by distilling punctuation knowledge from a bidirectional teacher language model (LM) trained on written, punctuated text. We compare our segmenter, which is distilled from the LM teacher, against a segmenter distilled from an acoustic-pause-based teacher used in other works, on a streaming ASR pipeline. The pipeline with our segmenter achieves a 3.2% relative WER gain along with a 60 ms median end-of-segment latency reduction on a YouTube captioning task., Comment: Interspeech 2023. First 3 authors contributed equally
- Published
- 2023
9. Nasal Bone Fracture Reduction Under Local Anaesthesia: A Holistic Approach to Nasal Blocks and a Comparison with General Anaesthesia
- Author
-
Jana, Sonali, Guha, Ruma, De, Kumar Shankar, Adhikari, Biswajit, and Das, Prithvi
- Published
- 2024
- Full Text
- View/download PDF
10. Measuring Re-identification Risk
- Author
-
Carey, CJ, Dick, Travis, Epasto, Alessandro, Javanmard, Adel, Karlin, Josh, Kumar, Shankar, Medina, Andres Munoz, Mirrokni, Vahab, Nunes, Gabriel Henrique, Vassilvitskii, Sergei, and Zhong, Peilin
- Subjects
Computer Science - Cryptography and Security, Computer Science - Machine Learning
- Abstract
Compact user representations (such as embeddings) form the backbone of personalization services. In this work, we present a new theoretical framework to measure re-identification risk in such user representations. Our framework, based on hypothesis testing, formally bounds the probability that an attacker may be able to obtain the identity of a user from their representation. As an application, we show how our framework is general enough to model important real-world applications such as Chrome's Topics API for interest-based advertising. We complement our theoretical bounds by showing provably good attack algorithms for re-identification that we use to estimate the re-identification risk in the Topics API. We believe this work provides a rigorous and interpretable notion of re-identification risk and a framework to measure it that can be used to inform real-world applications.
- Published
- 2023
11. Static and dynamic performance of single batter piles embedded in slope
- Author
-
Kumar, Shankar, Najar, Danish Shafi, Sarkar, Rajib, and Nainegali, Lohitkumar
- Published
- 2024
- Full Text
- View/download PDF
12. Improved Long-Form Spoken Language Translation with Large Language Models
- Author
-
McCarthy, Arya D., Zhang, Hao, Kumar, Shankar, Stahlberg, Felix, and Ng, Axel H.
- Subjects
Computer Science - Computation and Language
- Abstract
A challenge in spoken language translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations. To address this mismatch, we fine-tune a general-purpose, large language model to split long ASR transcripts into segments that can be independently translated so as to maximize the overall translation quality. We compare to several segmentation strategies and find that our approach improves BLEU score on three languages by an average of 2.7 BLEU overall compared to an automatic punctuation baseline. Further, we demonstrate the effectiveness of two constrained decoding strategies to improve well-formedness of the model output from above 99% to 100%.
- Published
- 2022
13. Conciseness: An Overlooked Language Task
- Author
-
Stahlberg, Felix, Kumar, Aashish, Alberti, Chris, and Kumar, Shankar
- Subjects
Computer Science - Computation and Language
- Abstract
We report on novel investigations into training models that make sentences concise. We define the task and show that it is different from related tasks such as summarization and simplification. For evaluation, we release two test sets, consisting of 2000 sentences each, that were annotated by two and five human annotators, respectively. We demonstrate that conciseness is a difficult task for which zero-shot setups with large neural language models often do not perform well. Given the limitations of these approaches, we propose a synthetic data generation method based on round-trip translations. Using this data to either train Transformers from scratch or fine-tune T5 models yields our strongest baselines that can be further improved by fine-tuning on an artificial conciseness dataset that we derived from multi-annotator machine translation test sets., Comment: EMNLP 2022 Workshop on Text Simplification, Accessibility, and Readability (TSAR)
- Published
- 2022
14. Simple and Effective Gradient-Based Tuning of Sequence-to-Sequence Models
- Author
-
Lichtarge, Jared, Alberti, Chris, and Kumar, Shankar
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
Recent trends towards training ever-larger language models have substantially improved machine learning performance across linguistic tasks. However, the huge cost of training larger models can make tuning them prohibitively expensive, motivating the study of more efficient methods. Gradient-based hyperparameter optimization offers the capacity to tune hyperparameters during training, yet has not previously been studied in a sequence-to-sequence setting. We apply a simple and general gradient-based hyperparameter optimization method to sequence-to-sequence tasks for the first time, demonstrating both efficiency and performance gains over strong baselines for both Neural Machine Translation and Natural Language Understanding (NLU) tasks (via T5 pretraining). For translation, we show the method generalizes across language pairs, is more efficient than Bayesian hyperparameter optimization, and that learned schedules for some hyperparameters can outperform even optimal constant-valued tuning. For T5, we show that learning hyperparameters during pretraining can improve performance across downstream NLU tasks. When learning multiple hyperparameters concurrently, we show that the global learning rate can follow a schedule over training that improves performance and is not explainable by the 'short-horizon bias' of greedy methods (Wu et al., 2018). We release the code used to facilitate further research., Comment: 18 pages, 6 figures, In Proceedings of AutoML 2022 (Workshop track), Baltimore, MD, USA
- Published
- 2022
15. Diagnostic performance of sonographic activity scores for adult terminal ileal Crohn’s disease compared to magnetic resonance and histological reference standards: experience from the METRIC trial
- Author
-
Kumar, Shankar, Parry, Thomas, Mallett, Sue, Plumb, Andrew, Bhatnagar, Gauraang, Beable, Richard, Betts, Margaret, Duncan, Gillian, Gupta, Arun, Higginson, Antony, Hyland, Rachel, Lapham, Roger, Patel, Uday, Pilcher, James, Slater, Andrew, Tolan, Damian, Zealley, Ian, Halligan, Steve, and Taylor, Stuart A.
- Published
- 2024
- Full Text
- View/download PDF
16. Submandibular Gland Excision with Facial Artery Preservation: The Argument for Changing the Established Norms
- Author
-
Das, Prithvi, De, Kumar Shankar, and Saha, Somnath
- Published
- 2023
- Full Text
- View/download PDF
17. Text Generation with Text-Editing Models
- Author
-
Malmi, Eric, Dong, Yue, Mallinson, Jonathan, Chuklin, Aleksandr, Adamek, Jakub, Mirylenka, Daniil, Stahlberg, Felix, Krause, Sebastian, Kumar, Shankar, and Severyn, Aliaksei
- Subjects
Computer Science - Computation and Language
- Abstract
Text-editing models have recently become a prominent alternative to seq2seq models for monolingual text-generation tasks such as grammatical error correction, simplification, and style transfer. These tasks share a common trait - they exhibit a large amount of textual overlap between the source and target texts. Text-editing models take advantage of this observation and learn to generate the output by predicting edit operations applied to the source sequence. In contrast, seq2seq models generate outputs word-by-word from scratch thus making them slow at inference time. Text-editing models provide several benefits over seq2seq models including faster inference speed, higher sample efficiency, and better control and interpretability of the outputs. This tutorial provides a comprehensive overview of text-editing models and current state-of-the-art approaches, and analyzes their pros and cons. We discuss challenges related to productionization and how these models can be used to mitigate hallucination and bias, both pressing challenges in the field of text generation., Comment: Accepted as a tutorial at NAACL 2022
- Published
- 2022
18. Jam or Cream First? Modeling Ambiguity in Neural Machine Translation with SCONES
- Author
-
Stahlberg, Felix and Kumar, Shankar
- Subjects
Computer Science - Computation and Language
- Abstract
The softmax layer in neural machine translation is designed to model the distribution over mutually exclusive tokens. Machine translation, however, is intrinsically uncertain: the same source sentence can have multiple semantically equivalent translations. Therefore, we propose to replace the softmax activation with a multi-label classification layer that can model ambiguity more effectively. We call our loss function Single-label Contrastive Objective for Non-Exclusive Sequences (SCONES). We show that the multi-label output layer can still be trained on single reference training data using the SCONES loss function. SCONES yields consistent BLEU score gains across six translation directions, particularly for medium-resource language pairs and small beam sizes. By using smaller beam sizes we can speed up inference by a factor of 3.9x and still match or improve the BLEU score obtained using softmax. Furthermore, we demonstrate that SCONES can be used to train NMT models that assign the highest probability to adequate translations, thus mitigating the "beam search curse". Additional experiments on synthetic language pairs with varying levels of uncertainty suggest that the improvements from SCONES can be attributed to better handling of ambiguity., Comment: NAACL 2022 paper
- Published
- 2022
19. Design, synthesis and anticancer activity of Novel benzimidazole containing quinoline hybrids
- Author
-
Shashidhar Bharadwaj Srinivasa, Boja Poojary, Bhuvanesh Sukhlal Kalal, Usha Brahmavara, Dhanashri Vaishali, Anupam J. Das, Thobias Mwalingo Kalenga, Maruthibabu Paidikondala, and Madan Kumar Shankar
- Subjects
Benzimidazole-quinoline hybrids, One-pot, Anticancer, In-silico, In-vitro, Human melanoma cell line, Chemistry, QD1-999
- Abstract
In this work we present the synthesis of a benzimidazole-quinoline hybrid series (9a-c and 10a-f), characterized using spectroscopic studies (FT-IR, 1H NMR, and mass spectrometry). The route to the hybrid compounds proceeds in three steps: (i) synthesis of substituted quinoline-4-carboxylic acids (3a-b) from various acetophenones; (ii) synthesis of the benzimidazole-5-carboxylates (7a-c) by an efficient 'one-pot' nitro reductive cyclization reaction between ethyl 3-nitro-4-(substituted amino) benzoates (6a-c) and 5-bromothiophene-2-carbaldehyde; (iii) conversion of the benzimidazole esters (7a-c) into the corresponding hydrazides (8a-c), the key intermediates, which finally gave the benzimidazole-quinoline hybrid series (9a-c and 10a-f). Compounds 7a and 7b were crystallized and their molecular structures were determined using a single-crystal X-ray diffraction method. The resulting compounds were screened (in-silico and in-vitro) for anti-cancer activity against the human melanoma cell line (A375) and the human breast cancer cell line (MDA-MB-231). The p53 receptor protein was used for the molecular docking analysis, and compound 10b binds the target site with four hydrogen bonds (−6.25 kcal/mol). The antioxidant assay revealed compounds 9a (IC50 = 604.8 μg/mL) and 9b (IC50 = 683.7 μg/mL) to exhibit the highest percentage of inhibition and the lowest IC50 values. In addition, compounds 10a and 10b showed high scavenging activity. Compounds 9a (A375: IC50 = 34.7 ± 0.9 µg/mL and MDA-MB-231: IC50 = 20.4 ± 1.1 µg/mL), 10a (A375: IC50 = 19.6 ± 1.3 µg/mL and MDA-MB-231: IC50 = 37.0 ± 1.3 µg/mL) and 10b (A375: IC50 = 16.5 ± 1.5 µg/mL and MDA-MB-231: IC50 = 13.4 ± 1.5 µg/mL) showed significant cytotoxicity against these human cancer cell lines (melanoma and breast cancer) and are potential anti-cancer molecules.
- Published
- 2024
- Full Text
- View/download PDF
20. Uncertainty Determines the Adequacy of the Mode and the Tractability of Decoding in Sequence-to-Sequence Models
- Author
-
Stahlberg, Felix, Kulikov, Ilia, and Kumar, Shankar
- Subjects
Computer Science - Computation and Language
- Abstract
In many natural language processing (NLP) tasks the same input (e.g. source sentence) can have multiple possible outputs (e.g. translations). To analyze how this ambiguity (also known as intrinsic uncertainty) shapes the distribution learned by neural sequence models we measure sentence-level uncertainty by computing the degree of overlap between references in multi-reference test sets from two different NLP tasks: machine translation (MT) and grammatical error correction (GEC). At both the sentence- and the task-level, intrinsic uncertainty has major implications for various aspects of search such as the inductive biases in beam search and the complexity of exact search. In particular, we show that well-known pathologies such as a high number of beam search errors, the inadequacy of the mode, and the drop in system performance with large beam sizes apply to tasks with a high level of ambiguity such as MT but not to less uncertain tasks such as GEC. Furthermore, we propose a novel exact $n$-best search algorithm for neural sequence models, and show that intrinsic uncertainty affects model uncertainty as the model tends to overly spread out the probability mass for uncertain tasks and sentences., Comment: ACL 2022 paper
- Published
- 2022
21. Scaling Language Model Size in Cross-Device Federated Learning
- Author
-
Ro, Jae Hun, Breiner, Theresa, McConnaughey, Lara, Chen, Mingqing, Suresh, Ananda Theertha, Kumar, Shankar, and Mathews, Rajiv
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
Most studies in cross-device federated learning focus on small models, due to the server-client communication and on-device computation bottlenecks. In this work, we leverage various techniques for mitigating these bottlenecks to train larger language models in cross-device federated learning. With systematic applications of partial model training, quantization, efficient transfer learning, and communication-efficient optimizers, we are able to train a $21$M parameter Transformer and $20.2$M parameter Conformer that achieve the same or better perplexity as that of a similarly sized LSTM with $\sim10\times$ smaller client-to-server communication cost and $11\%$ lower perplexity than smaller LSTMs commonly studied in literature.
- Published
- 2022
22. Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition
- Author
-
Huang, W. Ronny, Peyser, Cal, Sainath, Tara N., Pang, Ruoming, Strohman, Trevor, and Kumar, Shankar
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Language model fusion helps smart assistants recognize words which are rare in acoustic data but abundant in text-only corpora (typed search logs). However, such corpora have properties that hinder downstream performance, including being (1) too large, (2) beset with domain-mismatched content, and (3) heavy-headed rather than heavy-tailed (excessively many duplicate search queries such as "weather"). We show that three simple strategies for selecting language modeling data can dramatically improve rare-word recognition without harming overall performance. First, to address the heavy-headedness, we downsample the data according to a soft log function, which tunably reduces high frequency (head) sentences. Second, to encourage rare-word exposure, we explicitly filter for words rare in the acoustic data. Finally, we tackle domain-mismatch via perplexity-based contrastive selection, filtering for examples matched to the target domain. We down-select a large corpus of web search queries by a factor of 53x and achieve better LM perplexities than without down-selection. When shallow-fused with a state-of-the-art, production speech engine, our LM achieves WER reductions of up to 24% relative on rare-word sentences (without changing overall WER) compared to a baseline LM trained on the raw corpus. These gains are further validated through favorable side-by-side evaluations on live voice search traffic., Comment: Interspeech 2022
- Published
- 2022
23. Capitalization Normalization for Language Modeling with an Accurate and Efficient Hierarchical RNN Model
- Author
-
Zhang, Hao, Cheng, You-Chi, Kumar, Shankar, Huang, W. Ronny, Chen, Mingqing, and Mathews, Rajiv
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
Capitalization normalization (truecasing) is the task of restoring the correct case (uppercase or lowercase) of noisy text. We propose a fast, accurate and compact two-level hierarchical word-and-character-based recurrent neural network model. We use the truecaser to normalize user-generated text in a Federated Learning framework for language modeling. A case-aware language model trained on this normalized text achieves the same perplexity as a model trained on text with gold capitalization. In a real user A/B experiment, we demonstrate that the improvement translates to reduced prediction error rates in a virtual keyboard application. Similarly, in an ASR language model fusion experiment, we show reduction in uppercase character error rate and word error rate., Comment: arXiv admin note: substantial text overlap with arXiv:2108.11943
- Published
- 2022
24. Coinage metal-catalyzed or-mediated oxidative heteroarylation of arenes
- Author
-
Kishor Jha, Abadh, Kumar, Shankar, Ravi, Rangnath, Akanksha, Roy, Sahil, Kumar Jha, Vikesh, Gupta, Sangeeta, Yadav, Poonam, Kumar Rauta, Akshaya, and Aggarwal, Anil K.
- Published
- 2024
- Full Text
- View/download PDF
25. Transformer-based Models of Text Normalization for Speech Applications
- Author
-
Ro, Jae Hun, Stahlberg, Felix, Wu, Ke, and Kumar, Shankar
- Subjects
Computer Science - Machine Learning
- Abstract
Text normalization, or the process of transforming text into a consistent, canonical form, is crucial for speech applications such as text-to-speech synthesis (TTS). In TTS, the system must decide whether to verbalize "1995" as "nineteen ninety five" in "born in 1995" or as "one thousand nine hundred ninety five" in "page 1995". We present an experimental comparison of various Transformer-based sequence-to-sequence (seq2seq) models of text normalization for speech and evaluate them on a variety of datasets of written text aligned to its normalized spoken form. These models include variants of the 2-stage RNN-based tagging/seq2seq architecture introduced by Zhang et al. (2019), where we replace the RNN with a Transformer in one or more stages, as well as vanilla Transformers that output string representations of edit sequences. Of our approaches, using Transformers for sentence context encoding within the 2-stage model proved most effective, with the fine-tuned BERT encoder yielding the best performance.
- Published
- 2022
26. A Structural and In Silico Investigation of Potential CDC7 Kinase Enzyme Inhibitors
- Author
-
Mohanbabu Mookkan, Saravanan Kandasamy, Abdel-Basit Al-Odayni, Naaser Ahmed Yaseen Abduh, Sugarthi Srinivasan, Bistuvalli Chandrashekara Revannasidappa, Vasantha Kumar, Kalaiarasi Chinnasamy, Sanmargam Aravindhan, and Madan Kumar Shankar
- Subjects
Chemistry, QD1-999
- Published
- 2023
- Full Text
- View/download PDF
27. Long-Term Safety and Immunogenicity of AZD1222 (ChAdOx1 nCoV-19): 2-Year Follow-Up from a Phase 3 Study
- Author
-
Kathryn Shoemaker, Karina Soboleva, Angela Branche, Shivanjali Shankaran, Deborah A. Theodore, Muhammad Bari, Victor Ezeh, Justin Green, Elizabeth Kelly, Dongmei Lan, Urban Olsson, Senthilkumar Saminathan, Nirmal Kumar Shankar, Berta Villegas, Tonya Villafana, Ann R. Falsey, and Magdalena E. Sobieszczyk
- Subjects
AZD1222 (ChAdOx1 nCoV-19), COVID-19, long-term safety, SARS-CoV-2, humoral immunogenicity, COVID-19 vaccine, Medicine
- Abstract
A better understanding of the long-term safety, efficacy, and immunogenicity of COVID-19 vaccines is needed. This phase 3, randomized, placebo-controlled study for AZD1222 (ChAdOx1 nCoV-19) primary-series vaccination enrolled 32,450 participants in the USA, Chile, and Peru between August 2020 and January 2021 (NCT04516746). Endpoints included the 2-year follow-up assessment of safety, efficacy, and immunogenicity. After 2 years, no emergent safety signals were observed for AZD1222, and no cases of thrombotic thrombocytopenia syndrome were reported. The assessment of anti-SARS-CoV-2 nucleocapsid antibody titers confirmed the durability of AZD1222 efficacy for up to 6 months, after which infection rates in the AZD1222 group increased over time. Despite this, all-cause and COVID-19-related mortality remained low through the study end, potentially reflecting the post-Omicron decoupling of SARS-CoV-2 infection rates and severe COVID-19 outcomes. Geometric mean titers were elevated for anti-SARS-CoV-2 neutralizing antibodies at the 1-year study visit and the anti-spike antibodies were elevated at year 2, providing further evidence of increasing SARS-CoV-2 infections over long-term follow-up. Overall, this 2-year follow-up of the AZD1222 phase 3 study confirms that the long-term safety profile remains consistent with previous findings and supports the continued need for COVID-19 booster vaccinations due to waning efficacy and humoral immunity.
- Published
- 2024
- Full Text
- View/download PDF
28. Position-Invariant Truecasing with a Word-and-Character Hierarchical Recurrent Neural Network
- Author
-
Zhang, Hao, Cheng, You-Chi, Kumar, Shankar, Chen, Mingqing, and Mathews, Rajiv
- Subjects
Computer Science - Computation and Language
- Abstract
Truecasing is the task of restoring the correct case (uppercase or lowercase) of noisy text generated either by an automatic system for speech recognition or machine translation or by humans. It improves the performance of downstream NLP tasks such as named entity recognition and language modeling. We propose a fast, accurate and compact two-level hierarchical word-and-character-based recurrent neural network model, the first of its kind for this problem. Using sequence distillation, we also address the problem of truecasing while ignoring token positions in the sentence, i.e. in a position-invariant manner.
- Published
- 2021
29. Machine learning approach for the prediction of mining-induced stress in underground mines to mitigate ground control disasters and accidents
- Author
-
Vinay, Lingampally Sai, Bhattacharjee, Ram Madhab, Ghosh, Nilabjendu, and Kumar, Shankar
- Published
- 2023
- Full Text
- View/download PDF
30. Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models
- Author
-
Stahlberg, Felix and Kumar, Shankar
- Subjects
Computer Science - Computation and Language
- Abstract
Synthetic data generation is widely known to boost the accuracy of neural grammatical error correction (GEC) systems, but existing methods often lack diversity or are too simplistic to generate the broad range of grammatical errors made by human writers. In this work, we use error type tags from automatic annotation tools such as ERRANT to guide synthetic data generation. We compare several models that can produce an ungrammatical sentence given a clean sentence and an error type tag. We use these models to build a new, large synthetic pre-training data set with error tag frequency distributions matching a given development set. Our synthetic data set yields large and consistent gains, improving the state-of-the-art on the BEA-19 and CoNLL-14 test sets. We also show that our approach is particularly effective in adapting a GEC system, trained on mixed native and non-native English, to a native English test set, even surpassing real training data consisting of high-quality sentence pairs., Comment: Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications, 2021. https://github.com/google-research-datasets/C4_200M-synthetic-dataset-for-grammatical-error-correction
- Published
- 2021
31. Estimation Equations for Back Break and Ground Vibration Using Genetic Programming
- Author
-
Kumar, Shankar, Mishra, Arvind Kumar, and Choudhary, Bhanwar Singh
- Published
- 2023
- Full Text
- View/download PDF
32. Lookup-Table Recurrent Language Models for Long Tail Speech Recognition
- Author
-
Huang, W. Ronny, Sainath, Tara N., Peyser, Cal, Kumar, Shankar, Rybach, David, and Strohman, Trevor
- Subjects
Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
We introduce Lookup-Table Language Models (LookupLM), a method for scaling up the size of RNN language models with only a constant increase in the floating point operations, by increasing the expressivity of the embedding table. In particular, we instantiate an (additional) embedding table which embeds the previous n-gram token sequence, rather than a single token. This allows the embedding table to be scaled up arbitrarily -- with a commensurate increase in performance -- without changing the token vocabulary. Since embeddings are sparsely retrieved from the table via a lookup, increasing the size of the table adds neither extra operations to each forward pass nor extra parameters that need to be stored on limited GPU/TPU memory. We explore scaling n-gram embedding tables up to nearly a billion parameters. When trained on a 3-billion sentence corpus, we find that LookupLM improves long tail log perplexity by 2.44 and long tail WER by 23.4% on a downstream speech recognition task over a standard RNN language model baseline, an improvement comparable to scaling up the baseline by 6.2x the number of floating point operations., Comment: Presented as conference paper at Interspeech 2021
- Published
- 2021
33. Chest radiograph classification and severity of suspected COVID-19 by different radiologist groups and attending clinicians: multi-reader, multi-case study
- Author
-
Nair, Arjun, Procter, Alexander, Halligan, Steve, Parry, Thomas, Ahmed, Asia, Duncan, Mark, Taylor, Magali, Chouhan, Manil, Gaunt, Trevor, Roberts, James, van Vucht, Niels, Campbell, Alan, Davis, Laura May, Jacob, Joseph, Hubbard, Rachel, Kumar, Shankar, Said, Ammaarah, Chan, Xinhui, Cutfield, Tim, Luintel, Akish, Marks, Michael, Stone, Neil, and Mallett, Sue
- Published
- 2023
- Full Text
- View/download PDF
34. Septoplasty- Does It Change the Quality of Life in Patients Having Nasal Obstruction?
- Author
-
Priyanka Debbarma, Sonali Jana, Ruma Guha, Kumar Shankar De, and Bivas Adhikary
- Subjects
nose score, septoplasty, diagnostic nasal endoscopy, deviated nasal septum, Medicine, Otorhinolaryngology, RF1-547
- Abstract
Introduction: Deviated nasal septum (DNS) is one of the most common causes of nasal obstruction. However, there are several other etiologies which cause difficulty in nasal breathing. The definitive treatment of symptomatic DNS is septoplasty. The efficacy of septoplasty remains controversial as there are no solid tools for clinical evaluation of patients for establishment of reliable statistical data. Our aim was to evaluate patients who had undergone septoplasty for symptomatic DNS, by following them up with the NOSE questionnaire for predicting the surgical efficacy of septoplasty. Materials and Methods: This was a prospective observational study, conducted over 1 year. 50 cases of either sex, aged 18-65 years, having symptomatic DNS not relieved by medical management, and corrected by isolated septoplasty were included. The primary outcome was measured by the NOSE questionnaire, applied before surgery and at the 6th and 12th week after the procedure, of which the 12th-week nasal score was taken into consideration. Results: The paired t-test done among the variables showed clear significance, proving the efficacy of the NOSE score in predicting the symptom-reducing efficiency of septoplasty. Conclusion: The NOSE score can be used as a regular quality scoring system in analysing the outcomes of septoplasty.
- Published
- 2023
- Full Text
- View/download PDF
35. Seq2Edits: Sequence Transduction Using Span-level Edit Operations
- Author
-
Stahlberg, Felix and Kumar, Shankar
- Subjects
Computer Science - Computation and Language
- Abstract
We propose Seq2Edits, an open-vocabulary approach to sequence editing for natural language processing (NLP) tasks with a high degree of overlap between input and output texts. In this approach, each sequence-to-sequence transduction is represented as a sequence of edit operations, where each operation either replaces an entire source span with target tokens or keeps it unchanged. We evaluate our method on five NLP tasks (text normalization, sentence fusion, sentence splitting & rephrasing, text simplification, and grammatical error correction) and report competitive results across the board. For grammatical error correction, our method speeds up inference by up to 5.2x compared to full sequence models because inference time depends on the number of edits rather than the number of target tokens. For text normalization, sentence fusion, and grammatical error correction, our approach improves explainability by associating each edit operation with a human-readable tag., Comment: Accepted at EMNLP 2020
- Published
- 2020
36. Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus
- Author
-
Peyser, Cal, Mavandadi, Sepand, Sainath, Tara N., Apfel, James, Pang, Ruoming, and Kumar, Shankar
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning
- Abstract
End-to-end (E2E) automatic speech recognition (ASR) systems lack the distinct language model (LM) component that characterizes traditional speech systems. While this simplifies the model architecture, it complicates the task of incorporating text-only data into training, which is important to the recognition of tail words that do not occur often in audio-text pairs. While shallow fusion has been proposed as a method for incorporating a pre-trained LM into an E2E model at inference time, it has not yet been explored for very large text corpora, and it has been shown to be very sensitive to hyperparameter settings in the beam search. In this work, we apply shallow fusion to incorporate a very large text corpus into a state-of-the-art E2E ASR model. We explore the impact of model size and show that intelligent pruning of the training set can be more effective than increasing the parameter count. Additionally, we show that incorporating the LM in minimum word error rate (MWER) fine-tuning makes shallow fusion far less dependent on optimal hyperparameter settings, reducing the difficulty of that tuning problem.
- Published
- 2020
37. Data Weighted Training Strategies for Grammatical Error Correction
- Author
-
Lichtarge, Jared, Alberti, Chris, and Kumar, Shankar
- Subjects
Computer Science - Computation and Language, Statistics - Machine Learning
- Abstract
Recent progress in the task of Grammatical Error Correction (GEC) has been driven by addressing data sparsity, both through new methods for generating large and noisy pretraining data and through the publication of small and higher-quality finetuning data in the BEA-2019 shared task. Building upon recent work in Neural Machine Translation (NMT), we make use of both kinds of data by deriving example-level scores on our large pretraining data based on a smaller, higher-quality dataset. In this work, we perform an empirical study to discover how to best incorporate delta-log-perplexity, a type of example scoring, into a training schedule for GEC. In doing so, we perform experiments that shed light on the function and applicability of delta-log-perplexity. Models trained on scored data achieve state-of-the-art results on common GEC test sets., Comment: Accepted to TACL (Transactions of the Association for Computational Linguistics)
- Published
- 2020
38. Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
- Author
-
Zhang, Qian, Lu, Han, Sak, Hasim, Tripathi, Anshuman, McDermott, Erik, Koo, Stephen, and Kumar, Shankar
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Computation and Language, Computer Science - Sound
- Abstract
In this paper we present an end-to-end speech recognition model with Transformer encoders that can be used in a streaming speech recognition system. Transformer computation blocks based on self-attention are used to encode both audio and label sequences independently. The activations from both audio and label encoders are combined with a feed-forward layer to compute a probability distribution over the label space for every combination of acoustic frame position and label history. This is similar to the Recurrent Neural Network Transducer (RNN-T) model, which uses RNNs for information encoding instead of Transformer encoders. The model is trained with the RNN-T loss, which is well-suited to streaming decoding. We present results on the LibriSpeech dataset showing that limiting the left context for self-attention in the Transformer layers makes decoding computationally tractable for streaming, with only a slight degradation in accuracy. We also show that the full attention version of our model beats the state-of-the-art accuracy on the LibriSpeech benchmarks. Our results also show that we can bridge the gap between full attention and limited attention versions of our model by attending to a limited number of future frames., Comment: This is the final version of the paper submitted to the ICASSP 2020 on Oct 21, 2019
- Published
- 2020
39. Trends in Thyroid Nodules and Malignancy: A Two-Year Retrospective Study in a Tertiary Care Centre
- Author
-
Guha, Ruma, Jana, Sonali, Biswas, Arpan, De, Kumar Shankar, and Das, Prithvi
- Published
- 2023
- Full Text
- View/download PDF
40. Corpora Generation for Grammatical Error Correction
- Author
-
Lichtarge, Jared, Alberti, Chris, Kumar, Shankar, Shazeer, Noam, Parmar, Niki, and Tong, Simon
- Subjects
Computer Science - Computation and Language, Statistics - Machine Learning
- Abstract
Grammatical Error Correction (GEC) has been recently modeled using the sequence-to-sequence framework. However, unlike sequence transduction problems such as machine translation, GEC suffers from the lack of plentiful parallel data. We describe two approaches for generating large parallel datasets for GEC using publicly available Wikipedia data. The first method extracts source-target pairs from Wikipedia edit histories with minimal filtration heuristics, while the second method introduces noise into Wikipedia sentences via round-trip translation through bridge languages. Both strategies yield similar sized parallel corpora containing around 4B tokens. We employ an iterative decoding strategy that is tailored to the loosely supervised nature of our constructed corpora. We demonstrate that neural GEC models trained using either type of corpora give similar performance. Fine-tuning these models on the Lang-8 corpus and ensembling allows us to surpass the state of the art on both the CoNLL-2014 benchmark and the JFLEG task. We provide systematic analysis that compares the two approaches to data generation and highlights the effectiveness of ensembling., Comment: Accepted at NAACL 2019. arXiv admin note: text overlap with arXiv:1811.01710
- Published
- 2019
41. Neural Language Modeling with Visual Features
- Author
-
Anastasopoulos, Antonios, Kumar, Shankar, and Liao, Hank
- Subjects
Computer Science - Computation and Language
- Abstract
Multimodal language models attempt to incorporate non-linguistic features for the language modeling task. In this work, we extend a standard recurrent neural network (RNN) language model with features derived from videos. We train our models on data that is two orders-of-magnitude bigger than datasets used in prior work. We perform a thorough exploration of model architectures for combining visual and text features. Our experiments on two corpora (YouCookII and 20bn-something-something-v2) show that the best performing architecture consists of middle fusion of visual and text features, yielding over 25% relative improvement in perplexity. We report analysis that provides insights into why our multimodal language model improves upon a standard RNN language model.
- Published
- 2019
42. Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
- Author
-
Shen, Jonathan, Nguyen, Patrick, Wu, Yonghui, Chen, Zhifeng, Chen, Mia X., Jia, Ye, Kannan, Anjuli, Sainath, Tara, Cao, Yuan, Chiu, Chung-Cheng, He, Yanzhang, Chorowski, Jan, Hinsu, Smit, Laurenzo, Stella, Qin, James, Firat, Orhan, Macherey, Wolfgang, Gupta, Suyog, Bapna, Ankur, Zhang, Shuyuan, Pang, Ruoming, Weiss, Ron J., Prabhavalkar, Rohit, Liang, Qiao, Jacob, Benoit, Liang, Bowen, Lee, HyoukJoong, Chelba, Ciprian, Jean, Sébastien, Li, Bo, Johnson, Melvin, Anil, Rohan, Tibrewal, Rajat, Liu, Xiaobing, Eriguchi, Akiko, Jaitly, Navdeep, Ari, Naveen, Cherry, Colin, Haghani, Parisa, Good, Otavio, Cheng, Youlong, Alvarez, Raziel, Caswell, Isaac, Hsu, Wei-Ning, Yang, Zongheng, Wang, Kuan-Chieh, Gonina, Ekaterina, Tomanek, Katrin, Vanik, Ben, Wu, Zelin, Jones, Llion, Schuster, Mike, Huang, Yanping, Chen, Dehao, Irie, Kazuki, Foster, George, Richardson, John, Macherey, Klaus, Bruguier, Antoine, Zen, Heiga, Raffel, Colin, Kumar, Shankar, Rao, Kanishka, Rybach, David, Murray, Matthew, Peddinti, Vijayaditya, Krikun, Maxim, Bacchiani, Michiel A. U., Jablin, Thomas B., Suderman, Rob, Williams, Ian, Lee, Benjamin, Bhatia, Deepti, Carlson, Justin, Yavuz, Semih, Zhang, Yu, McGraw, Ian, Galkin, Max, Ge, Qi, Pundak, Golan, Whipkey, Chad, Wang, Todd, Alon, Uri, Lepikhin, Dmitry, Tian, Ye, Sabour, Sara, Chan, William, Toshniwal, Shubham, Liao, Baohua, Nirschl, Michael, and Rondon, Pat
- Subjects
Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly within the framework, and it contains existing implementations of a large number of utilities, helper functions, and the newest research ideas. Lingvo has been used in collaboration by dozens of researchers in more than 20 papers over the last two years. This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the capabilities of the framework.
- Published
- 2019
43. Are preoperative CT variables associated with the success or failure of subsequent ventral hernia repair: nested case-control study
- Author
-
Kumar, Shankar, Rao, Nikhil, Parker, Sam, Plumb, Andrew, Windsor, Alastair, Mallett, Sue, and Halligan, Steve
- Published
- 2022
- Full Text
- View/download PDF
44. Development of concrete mixes for 3D printing using simple tools and techniques
- Author
-
Giridhar, Greeshma, Prem, Prabhat Ranjan, and Kumar, Shankar
- Published
- 2023
- Full Text
- View/download PDF
45. A case series on the determination of pre-operative risk factors for anticipation of post operative hypocalcaemia in thyroidectomy
- Author
-
Adhikary, Akash, De, Kumar Shankar, and Adhikary, Bivas
- Published
- 2024
- Full Text
- View/download PDF
46. Prediction of back break in blasting using random decision trees
- Author
-
Kumar, Shankar, Mishra, A. K., and Choudhary, B. S.
- Published
- 2022
- Full Text
- View/download PDF
47. Weakly Supervised Grammatical Error Correction using Iterative Decoding
- Author
-
Lichtarge, Jared, Alberti, Christopher, Kumar, Shankar, Shazeer, Noam, and Parmar, Niki
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
We describe an approach to Grammatical Error Correction (GEC) that is effective at making use of models trained on large amounts of weakly supervised bitext. We train the Transformer sequence-to-sequence model on 4B tokens of Wikipedia revisions and employ an iterative decoding strategy that is tailored to the loosely-supervised nature of the Wikipedia training corpus. Finetuning on the Lang-8 corpus and ensembling yields an F0.5 of 58.3 on the CoNLL'14 benchmark and a GLEU of 62.4 on JFLEG. The combination of weakly supervised training and iterative decoding obtains an F0.5 of 48.2 on CoNLL'14 even without using any labeled GEC data.
- Published
- 2018
48. No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models
- Author
-
Sainath, Tara N., Prabhavalkar, Rohit, Kumar, Shankar, Lee, Seungji, Kannan, Anjuli, Rybach, David, Schogol, Vlad, Nguyen, Patrick, Li, Bo, Wu, Yonghui, Chen, Zhifeng, and Chiu, Chung-Cheng
- Subjects
Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Statistics - Machine Learning
- Abstract
For decades, context-dependent phonemes have been the dominant sub-word unit for conventional acoustic modeling systems. This status quo has begun to be challenged recently by end-to-end models which seek to combine acoustic, pronunciation, and language model components into a single neural network. Such systems, which typically predict graphemes or words, simplify the recognition process since they remove the need for a separate expert-curated pronunciation lexicon to map from phoneme-based units to words. However, there has been little previous work comparing phoneme-based versus grapheme-based sub-word units in the end-to-end modeling framework, to determine whether the gains from such approaches are primarily due to the new probabilistic model, or from the joint learning of the various components with grapheme-based units. In this work, we conduct detailed experiments which are aimed at quantifying the value of phoneme-based pronunciation lexica in the context of end-to-end models. We examine phoneme-based end-to-end models, which are contrasted against grapheme-based ones on a large vocabulary English Voice-search task, where we find that graphemes do indeed outperform phonemes. We also compare grapheme- and phoneme-based approaches on a multi-dialect English task, which once again confirms the superiority of graphemes, greatly simplifying the system for recognizing multiple dialects.
- Published
- 2017
49. Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition
- Author
-
Kumar, Shankar, Nirschl, Michael, Holtmann-Rice, Daniel, Liao, Hank, Suresh, Ananda Theertha, and Yu, Felix
- Subjects
Statistics - Machine Learning, Computer Science - Computation and Language, Computer Science - Learning
- Abstract
Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks. However, these models are computationally more expensive than N-gram LMs for decoding, and thus, challenging to integrate into speech recognizers. Recent research has proposed the use of lattice-rescoring algorithms using RNNLMs and LSTMLMs as an efficient strategy to integrate these models into a speech recognition system. In this paper, we evaluate existing lattice rescoring algorithms along with new variants on a YouTube speech recognition task. Lattice rescoring using LSTMLMs reduces the word error rate (WER) for this task by 8% relative to the WER obtained using an N-gram LM., Comment: Accepted at ASRU 2017
- Published
- 2017
50. Differential Diagnosis of Lateral Neck Masses
- Author
-
Ruma Guha, Sonali Jana, Prithvi Das, and Kumar Shankar De
- Subjects
Neck Mass, Parapharyngeal, Schwannoma, Infiltrating Lipoma, Giant Pleomorphic Adenoma, Medicine, Otorhinolaryngology, RF1-547
- Abstract
Introduction: The neck and parapharyngeal space are among the most vital regions in the body, encompassing multiple major blood vessels, nerves, the spine and the airway itself. Lateral neck masses that present to an ENT practitioner may not only include a wide variety of differentials, but may present as emergencies in case of an airway compromise. In such situations, decision making and arriving at the diagnosis become important not only from a curative perspective but also from a lifesaving one. Here we discuss a few cases of lateral neck masses that presented to us with unusual presentations or had a rare diagnosis, along with the line of management that was followed. Cases: We present a case of a giant pleomorphic adenoma presenting with stridor, an adult-onset cystic hygroma, a schwannoma presenting with dyspnea, an isolated infiltrating lipoma and two other cases of schwannoma. Conclusion: Tumours of the neck and parapharyngeal region show a wide variety as this region contains almost every kind of tissue. Diagnosis of any lesion should be made with caution, using the appropriate history, examination and investigative tools available. Not only the common presentations, but also outliers and uncommon presentations of common tumours should be kept in mind.
- Published
- 2023