Author: "Chaturvedi, Snigdha" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Chaturvedi, Snigdha"' showing total 153 results

1. SocialGaze: Improving the Integration of Human Social Norms in Large Language Models

Author: Vijjini, Anvesh Rao, Menon, Rakesh R., Fu, Jiayi, Srivastava, Shashank, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language, Computer Science - Computers and Society
Abstract: While much research has explored enhancing the reasoning capabilities of large language models (LLMs) in the last few years, there is a gap in understanding the alignment of these models with social values and norms. We introduce the task of judging social acceptance. Social acceptance requires models to judge and rationalize the acceptability of people's actions in social situations. For example, is it socially acceptable for a neighbor to ask others in the community to keep their pets indoors at night? We find that LLMs' understanding of social acceptance is often misaligned with human consensus. To alleviate this, we introduce SocialGaze, a multi-step prompting framework, in which a language model verbalizes a social situation from multiple perspectives before forming a judgment. Our experiments demonstrate that the SocialGaze approach improves the alignment with human judgments by up to 11 F1 points with the GPT-3.5 model. We also identify biases and correlations in LLMs in assigning blame that is related to features such as the gender (males are significantly more likely to be judged unfairly) and age (LLMs are more aligned with humans for older narrators).
Published: 2024

2. Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning

Author: Sehanobish, Arijit, Dubey, Avinava, Choromanski, Krzysztof, Chowdhury, Somnath Basu Roy, Jain, Deepali, Sindhwani, Vikas, and Chaturvedi, Snigdha
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is expensive due to their large parameter counts. Parameter-efficient fine-tuning (PEFT) approaches have emerged as a viable alternative by allowing us to fine-tune models by updating only a small number of parameters. In this work, we propose a general framework for parameter efficient fine-tuning (PEFT), based on structured unrestricted-rank matrices (SURM) which can serve as a drop-in replacement for popular approaches such as Adapters and LoRA. Unlike other methods like LoRA, SURMs provide more flexibility in finding the right balance between compactness and expressiveness. This is achieved by using low displacement rank matrices (LDRMs), which have not been used in this context before. SURMs remain competitive with baselines, often providing significant quality improvements while using a smaller parameter budget. SURMs achieve 5-7% accuracy gains on various image classification tasks while replacing low-rank matrices in LoRA. They also result in up to a 12x reduction in the number of parameters in adapters (with virtually no loss in quality) on the GLUE benchmark., Comment: Work in progress
Published: 2024

3. Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

Author: Chowdhury, Somnath Basu Roy, Choromanski, Krzysztof, Sehanobish, Arijit, Dubey, Avinava, and Chaturvedi, Snigdha
Subjects: Computer Science - Machine Learning
Abstract: Machine unlearning is the process of efficiently removing the influence of a training data instance from a trained machine learning model without retraining it from scratch. A popular subclass of unlearning approaches is exact machine unlearning, which focuses on techniques that explicitly guarantee the removal of the influence of a data instance from a model. Exact unlearning approaches use a machine learning model in which individual components are trained on disjoint subsets of the data. During deletion, exact unlearning approaches only retrain the affected components rather than the entire model. While existing approaches reduce retraining costs, it can still be expensive for an organization to retrain a model component as it requires halting a system in production, which leads to service failure and adversely impacts customers. To address these challenges, we introduce an exact unlearning framework -- Sequence-aware Sharded Sliced Training (S3T), which is designed to enhance the deletion capabilities of an exact unlearning system while minimizing the impact on model's performance. At the core of S3T, we utilize a lightweight parameter-efficient fine-tuning approach that enables parameter isolation by sequentially training layers with disjoint data slices. This enables efficient unlearning by simply deactivating the layers affected by data deletion. Furthermore, to reduce the retraining cost and improve model performance, we train the model on multiple data sequences, which allows S3T to handle an increased number of deletion requests. Both theoretically and empirically, we demonstrate that S3T attains superior deletion capabilities and enhanced performance compared to baselines across a wide range of settings., Comment: Preliminary version accepted at the SafeGenAi Workshop, NeurIPS, 2024
Published: 2024

4. Fast Tree-Field Integrators: From Low Displacement Rank to Topological Transformers

Author: Choromanski, Krzysztof, Sehanobish, Arijit, Chowdhury, Somnath Basu Roy, Lin, Han, Dubey, Avinava, Sarlos, Tamas, and Chaturvedi, Snigdha
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: We present a new class of fast polylog-linear algorithms based on the theory of structured matrices (in particular low displacement rank) for integrating tensor fields defined on weighted trees. Several applications of the resulting fast tree-field integrators (FTFIs) are presented, including (a) approximation of graph metrics with tree metrics, (b) graph classification, (c) modeling on meshes, and finally (d) Topological Transformers (TTs) (Choromanski et al., 2022) for images. For Topological Transformers, we propose new relative position encoding (RPE) masking mechanisms with as few as three extra learnable parameters per Transformer layer, leading to 1.0-1.5%+ accuracy gains. Importantly, most of FTFIs are exact methods, thus numerically equivalent to their brute-force counterparts. When applied to graphs with thousands of nodes, those exact algorithms provide 5.7-13x speedups. We also provide an extensive theoretical analysis of our methods., Comment: Preprint. Comments welcome
Published: 2024

5. Exploring Safety-Utility Trade-Offs in Personalized Language Models

Author: Vijjini, Anvesh Rao, Chowdhury, Somnath Basu Roy, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: As large language models (LLMs) become increasingly integrated into daily applications, it is essential to ensure they operate fairly across diverse user demographics. In this work, we show that LLMs suffer from personalization bias, where their performance is impacted when they are personalized to a user's identity. We quantify personalization bias by evaluating the performance of LLMs along two axes - safety and utility. We measure safety by examining how benign LLM responses are to unsafe prompts with and without personalization. We measure utility by evaluating the LLM's performance on various tasks, including general knowledge, mathematical abilities, programming, and reasoning skills. We find that various LLMs, ranging from open-source models like Llama (Touvron et al., 2023) and Mistral (Jiang et al., 2023) to API-based ones like GPT-3.5 and GPT-4o (Ouyang et al., 2022), exhibit significant variance in performance in terms of safety-utility trade-offs depending on the user's identity. Finally, we discuss several strategies to mitigate personalization bias using preference tuning and prompt-based defenses., Comment: Work in Progress
Published: 2024

6. Returning to the Start: Generating Narratives with Related Endpoints

Author: Brei, Anneliese, Zhao, Chao, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Human writers often bookend their writing with ending sentences that relate back to the beginning sentences in order to compose a satisfying narrative that "closes the loop." Motivated by this observation, we propose RENarGen, a controllable story-generation paradigm that generates narratives by ensuring the first and last sentences are related and then infilling the middle sentences. Our contributions include an initial exploration of how various methods of bookending from Narratology affect language modeling for stories. Automatic and human evaluations indicate RENarGen produces better stories with more narrative closure than current autoregressive models.
Published: 2024

7. Rationale-based Opinion Summarization

Author: Li, Haoyuan and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Opinion summarization aims to generate concise summaries that present popular opinions of a large group of reviews. However, these summaries can be too generic and lack supporting details. To address these issues, we propose a new paradigm for summarizing reviews, rationale-based opinion summarization. Rationale-based opinion summaries output the representative opinions as well as one or more corresponding rationales. To extract good rationales, we define four desirable properties: relatedness, specificity, popularity, and diversity and present a Gibbs-sampling-based method to extract rationales. Overall, we propose RATION, an unsupervised extractive system that has two components: an Opinion Extractor (to extract representative opinions) and Rationales Extractor (to extract corresponding rationales). We conduct automatic and human evaluations to show that rationales extracted by RATION have the proposed properties and its summaries are more useful than conventional summaries. The implementation of our work is available at https://github.com/leehaoyuan/RATION.
Published: 2024

8. Incremental Extractive Opinion Summarization Using Cover Trees

Author: Chowdhury, Somnath Basu Roy, Monath, Nicholas, Dubey, Avinava, Zaheer, Manzil, McCallum, Andrew, Ahmed, Amr, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Extractive opinion summarization involves automatically producing a summary of text about an entity (e.g., a product's reviews) by extracting representative sentences that capture prevalent opinions in the review set. Typically, in online marketplaces user reviews accumulate over time, and opinion summaries need to be updated periodically to provide customers with up-to-date information. In this work, we study the task of extractive opinion summarization in an incremental setting, where the underlying review set evolves over time. Many of the state-of-the-art extractive opinion summarization approaches are centrality-based, such as CentroidRank (Radev et al., 2004; Chowdhury et al., 2022). CentroidRank performs extractive summarization by selecting a subset of review sentences closest to the centroid in the representation space as the summary. However, these methods are not capable of operating efficiently in an incremental setting, where reviews arrive one at a time. In this paper, we present an efficient algorithm for accurately computing the CentroidRank summaries in an incremental setting. Our approach, CoverSumm, relies on indexing review representations in a cover tree and maintaining a reservoir of candidate summary review sentences. CoverSumm's efficacy is supported by a theoretical and empirical analysis of running time. Empirically, on a diverse collection of data (both real and synthetically created to illustrate scaling considerations), we demonstrate that CoverSumm is up to 36x faster than baseline methods, and capable of adapting to nuanced changes in data distribution. We also conduct human evaluations of the generated summaries and find that CoverSumm is capable of producing informative summaries consistent with the underlying review set., Comment: Accepted at TMLR
Published: 2024
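
Note on entry 8: the abstract describes CentroidRank as selecting the review sentences closest to the centroid of the representation space. A minimal, non-incremental sketch of that selection step is given below. It is an illustration only, not the authors' implementation: the function name `centroid_rank`, the use of Euclidean distance, and the toy vectors are all assumptions, and it does not include the cover-tree indexing or candidate reservoir that CoverSumm adds.

```python
from math import sqrt


def centroid_rank(embeddings, k):
    """Return indices of the k vectors closest to the centroid.

    `embeddings` is a list of equal-length lists of floats, one per
    review sentence (hypothetical input format). Euclidean distance
    is an assumption made for this sketch.
    """
    if not embeddings:
        return []
    n = len(embeddings)
    dim = len(embeddings[0])
    # Centroid = coordinate-wise mean of all sentence vectors.
    centroid = [sum(vec[d] for vec in embeddings) / n for d in range(dim)]

    def distance_to_centroid(vec):
        return sqrt(sum((a - b) ** 2 for a, b in zip(vec, centroid)))

    # Rank sentence indices by increasing distance and keep the first k.
    ranked = sorted(range(n), key=lambda i: distance_to_centroid(embeddings[i]))
    return ranked[:k]


# Toy, made-up sentence vectors (not from any dataset).
vectors = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0]]
summary_ids = centroid_rank(vectors, k=2)
```

Recomputing this ranking from scratch whenever a new review arrives costs time linear in the number of sentences per update, which is the inefficiency the incremental setting in the abstract is meant to address.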

9. Robust Concept Erasure via Kernelized Rate-Distortion Maximization

Author: Chowdhury, Somnath Basu Roy, Monath, Nicholas, Dubey, Avinava, Ahmed, Amr, and Chaturvedi, Snigdha
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Distributed representations provide a vector space that captures meaningful relationships between data instances. The distributed nature of these representations, however, entangles together multiple attributes or concepts of data instances (e.g., the topic or sentiment of a text, characteristics of the author (age, gender, etc.), etc.). Recent work has proposed the task of concept erasure, in which rather than making a concept predictable, the goal is to remove an attribute from distributed representations while retaining other information from the original representation space as much as possible. In this paper, we propose a new distance metric learning-based objective, the Kernelized Rate-Distortion Maximizer (KRaM), for performing concept erasure. KRaM fits a transformation of representations to match a specified distance measure (defined by a labeled concept to erase) using a modified rate-distortion function. Specifically, KRaM's objective function aims to make instances with similar concept labels dissimilar in the learned representation space while retaining other information. We find that optimizing KRaM effectively erases various types of concepts: categorical, continuous, and vector-valued variables from data representations across diverse domains. We also provide a theoretical analysis of several properties of KRaM's objective. To assess the quality of the learned representations, we propose an alignment score to evaluate their similarity with the original representation space. Additionally, we conduct experiments to showcase KRaM's efficacy in various settings, from erasing binary gender variables in word embeddings to vector-valued variables in GPT-3 representations., Comment: NeurIPS 2023
Published: 2023

10. Affective and Dynamic Beam Search for Story Generation

Author: Huang, Tenghao, Qasemi, Ehsan, Li, Bangzheng, Wang, He, Brahman, Faeze, Chen, Muhao, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Storytelling's captivating potential makes it a fascinating research area, with implications for entertainment, education, therapy, and cognitive studies. In this paper, we propose Affective Story Generator (AffGen) for generating interesting narratives. AffGen introduces "intriguing twists" in narratives by employing two novel techniques: Dynamic Beam Sizing and Affective Reranking. Dynamic Beam Sizing encourages less predictable, more captivating word choices using a contextual multi-arm bandit model. Affective Reranking prioritizes sentence candidates based on affect intensity. Our empirical evaluations, both automatic and human, demonstrate AffGen's superior performance over existing baselines in generating affectively charged and interesting narratives. Our ablation study and analysis provide insights into the strengths and weaknesses of AffGen., Comment: Accepted at EMNLP-findings 2023
Published: 2023

11. Enhancing Group Fairness in Online Settings Using Oblique Decision Forests

Author: Chowdhury, Somnath Basu Roy, Monath, Nicholas, Beirami, Ahmad, Kidambi, Rahul, Dubey, Avinava, Ahmed, Amr, and Chaturvedi, Snigdha
Subjects: Computer Science - Machine Learning
Abstract: Fairness, especially group fairness, is an important consideration in the context of machine learning systems. The most commonly adopted group fairness-enhancing techniques are in-processing methods that rely on a mixture of a fairness objective (e.g., demographic parity) and a task-specific objective (e.g., cross-entropy) during the training process. However, when data arrives in an online fashion -- one instance at a time -- optimizing such fairness objectives poses several challenges. In particular, group fairness objectives are defined using expectations of predictions across different demographic groups. In the online setting, where the algorithm has access to a single instance at a time, estimating the group fairness objective requires additional storage and significantly more computation (e.g., forward/backward passes) than the task-specific objective at every time step. In this paper, we propose Aranyani, an ensemble of oblique decision trees, to make fair decisions in online settings. The hierarchical tree structure of Aranyani enables parameter isolation and allows us to efficiently compute the fairness gradients using aggregate statistics of previous decisions, eliminating the need for additional storage and forward/backward passes. We also present an efficient framework to train Aranyani and theoretically analyze several of its properties. We conduct empirical evaluations on 5 publicly available benchmarks (including vision and language datasets) to show that Aranyani achieves a better accuracy-fairness trade-off compared to baseline approaches., Comment: ICLR 2024 (Spotlight)
Published: 2023

12. A PSO-optimized novel PID neural network model for temperature control of jacketed CSTR: design, simulation, and a comparative study

Author: Chaturvedi, Snigdha, Kumar, Narendra, and Kumar, Rajesh
Published: 2024

13. Efficient Graph Field Integrators Meet Point Clouds

Author: Choromanski, Krzysztof, Sehanobish, Arijit, Lin, Han, Zhao, Yunfan, Berger, Eli, Parshakova, Tetiana, Pan, Alvin, Watkins, David, Zhang, Tianyi, Likhosherstov, Valerii, Chowdhury, Somnath Basu Roy, Dubey, Avinava, Jain, Deepali, Sarlos, Tamas, Chaturvedi, Snigdha, and Weller, Adrian
Subjects: Computer Science - Machine Learning
Abstract: We present two new classes of algorithms for efficient field integration on graphs encoding point clouds. The first class, SeparatorFactorization(SF), leverages the bounded genus of point cloud mesh graphs, while the second class, RFDiffusion(RFD), uses popular epsilon-nearest-neighbor graph representations for point clouds. Both can be viewed as providing the functionality of Fast Multipole Methods (FMMs), which have had a tremendous impact on efficient integration, but for non-Euclidean spaces. We focus on geometries induced by distributions of walk lengths between points (e.g., shortest-path distance). We provide an extensive theoretical analysis of our algorithms, obtaining new results in structural graph theory as a byproduct. We also perform exhaustive empirical evaluation, including on-surface interpolation for rigid and deformable objects (particularly for mesh-dynamics modeling), Wasserstein distance computations for point clouds, and the Gromov-Wasserstein variant.
Published: 2023

14. Grounded Keys-to-Text Generation: Towards Factual Open-Ended Generation

Author: Brahman, Faeze, Peng, Baolin, Galley, Michel, Rao, Sudha, Dolan, Bill, Chaturvedi, Snigdha, and Gao, Jianfeng
Subjects: Computer Science - Computation and Language
Abstract: Large pre-trained language models have recently enabled open-ended generation frameworks (e.g., prompt-to-text NLG) to tackle a variety of tasks going beyond the traditional data-to-text generation. While this framework is more general, it is under-specified and often leads to a lack of controllability restricting their real-world usage. We propose a new grounded keys-to-text generation task: the task is to generate a factual description about an entity given a set of guiding keys, and grounding passages. To address this task, we introduce a new dataset, called EntDeGen. Inspired by recent QA-based evaluation measures, we propose an automatic metric, MAFE, for factual correctness of generated descriptions. Our EntDescriptor model is equipped with strong rankers to fetch helpful passages and generate entity descriptions. Experimental result shows a good correlation (60.14) between our proposed metric and human judgments of factuality. Our rankers significantly improved the factual correctness of generated descriptions (15.95% and 34.51% relative gains in recall and precision). Finally, our ablation study highlights the benefit of combining keys and groundings., Comment: EMNLP 2022 Findings camera-ready
Published: 2022

15. NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization

Author: Zhao, Chao, Brahman, Faeze, Song, Kaiqiang, Yao, Wenlin, Yu, Dian, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Narrative summarization aims to produce a distilled version of a narrative to describe its most salient events and characters. Summarizing a narrative is challenging as it requires an understanding of event causality and character behaviors. To encourage research in this direction, we propose NarraSum, a large-scale narrative summarization dataset. It contains 122K narrative documents, which are collected from plot descriptions of movies and TV episodes with diverse genres, and their corresponding abstractive summaries. Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum. We hope that this dataset will promote future research in summarization, as well as broader studies of natural language understanding and generation. The dataset is available at https://github.com/zhaochaocs/narrasum., Comment: EMNLP Findings 2022
Published: 2022

16. SPE: Symmetrical Prompt Enhancement for Fact Probing

Author: Li, Yiyuan, Che, Tong, Wang, Yezhen, Jiang, Zhengbao, Xiong, Caiming, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Pretrained language models (PLMs) have been shown to accumulate factual knowledge during pretraining (Petroni et al., 2019). Recent works probe PLMs for the extent of this knowledge through prompts either in discrete or continuous forms. However, these methods do not consider symmetry of the task: object prediction and subject prediction. In this work, we propose Symmetrical Prompt Enhancement (SPE), a continuous prompt-based method for factual probing in PLMs that leverages the symmetry of the task by constructing symmetrical prompts for subject and object prediction. Our results on a popular factual probing dataset, LAMA, show significant improvement of SPE over previous probing methods., Comment: accepted at EMNLP 2022
Published: 2022

17. Towards Inter-character Relationship-driven Story Generation

Author: Vijjini, Anvesh Rao, Brahman, Faeze, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we introduce the task of modeling interpersonal relationships for story generation. For addressing this task, we propose Relationships as Latent Variables for Story Generation, (ReLiSt). ReLiSt generates stories sentence by sentence and has two major components - a relationship selector and a story continuer. The relationship selector specifies a latent variable to pick the relationship to exhibit in the next sentence and the story continuer generates the next sentence while expressing the selected relationship in a coherent way. Our automatic and human evaluations demonstrate that ReLiSt is able to generate stories with relationships that are more faithful to desired relationships while maintaining the content quality. The relationship assignments to sentences during inference bring interpretability to ReLiSt., Comment: EMNLP 2022
Published: 2022

18. Unsupervised Opinion Summarization Using Approximate Geodesics

Author: Chowdhury, Somnath Basu Roy, Monath, Nicholas, Dubey, Avinava, Ahmed, Amr, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Opinion summarization is the task of creating summaries capturing popular opinions from user reviews. In this paper, we introduce Geodesic Summarizer (GeoSumm), a novel system to perform unsupervised extractive opinion summarization. GeoSumm involves an encoder-decoder based representation learning model, that generates representations of text as a distribution over latent semantic units. GeoSumm generates these representations by performing dictionary learning over pre-trained text representations at multiple decoder layers. We then use these representations to quantify the relevance of review sentences using a novel approximate geodesic distance based scoring mechanism. We use the relevance scores to identify popular opinions in order to compose general and aspect-specific summaries. Our proposed model, GeoSumm, achieves state-of-the-art performance on three opinion summarization datasets. We perform additional experiments to analyze the functioning of our model and showcase the generalization ability of GeoSumm across different domains., Comment: Findings of EMNLP 2023
Published: 2022

19. Sustaining Fairness via Incremental Learning

Author: Chowdhury, Somnath Basu Roy and Chaturvedi, Snigdha
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society
Abstract: Machine learning systems are often deployed for making critical decisions like credit lending, hiring, etc. While making decisions, such systems often encode the user's demographic information (like gender, age) in their intermediate representations. This can lead to decisions that are biased towards specific demographics. Prior work has focused on debiasing intermediate representations to ensure fair decisions. However, these approaches fail to remain fair with changes in the task or demographic distribution. To ensure fairness in the wild, it is important for a system to adapt to such changes as it accesses new data in an incremental fashion. In this work, we propose to address this issue by introducing the problem of learning fair representations in an incremental learning setting. To this end, we present Fairness-aware Incremental Representation Learning (FaIRL), a representation learning system that can sustain fairness while incrementally learning new tasks. FaIRL is able to achieve fairness and learn new tasks by controlling the rate-distortion function of the learned representations. Our empirical evaluations show that FaIRL is able to make fair decisions while achieving high performance on the target task, outperforming several baselines., Comment: Accepted at AAAI 2023
Published: 2022

20. Revisiting Generative Commonsense Reasoning: A Pre-Ordering Approach

Author: Zhao, Chao, Brahman, Faeze, Huang, Tenghao, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Pre-trained models (PTMs) have led to great improvements in natural language generation (NLG). However, it is still unclear how much commonsense knowledge they possess. With the goal of evaluating commonsense knowledge of NLG models, recent work has proposed the problem of generative commonsense reasoning, e.g., to compose a logical sentence given a set of unordered concepts. Existing approaches to this problem hypothesize that PTMs lack sufficient parametric knowledge for this task, which can be overcome by introducing external knowledge or task-specific pre-training objectives. Different from this trend, we argue that PTM's inherent ability for generative commonsense reasoning is underestimated due to the order-agnostic property of its input. In particular, we hypothesize that the order of the input concepts can affect the PTM's ability to utilize its commonsense knowledge. To this end, we propose a pre-ordering approach to elaborately manipulate the order of the given concepts before generation. Experiments show that our approach can outperform the more sophisticated models that have access to a lot of external data and resources., Comment: NAACL 2022 Findings
Published: 2022

21. Read Top News First: A Document Reordering Approach for Multi-Document News Summarization

Author: Zhao, Chao, Huang, Tenghao, Chowdhury, Somnath Basu Roy, Chandrasekaran, Muthu Kumar, McKeown, Kathleen, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: A common method for extractive multi-document news summarization is to re-formulate it as a single-document summarization problem by concatenating all documents as a single meta-document. However, this method neglects the relative importance of documents. We propose a simple approach to reorder the documents according to their relative importance before concatenating and summarizing them. The reordering makes the salient content easier to learn by the summarization model. Experiments show that our approach outperforms previous state-of-the-art methods with more complex architectures., Comment: Accepted at Findings of ACL 2022
Published: 2022

22. Unsupervised Extractive Opinion Summarization Using Sparse Coding

Author: Chowdhury, Somnath Basu Roy, Zhao, Chao, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Opinion summarization is the task of automatically generating summaries that encapsulate information from multiple user reviews. We present Semantic Autoencoder (SemAE) to perform extractive opinion summarization in an unsupervised manner. SemAE uses dictionary learning to implicitly capture semantic information from the review and learns a latent representation of each sentence over semantic units. A semantic unit is supposed to capture an abstract semantic concept. Our extractive summarization algorithm leverages the representations to identify representative opinions among hundreds of reviews. SemAE is also able to perform controllable summarization to generate aspect-specific summaries. We report strong performance on SPACE and AMAZON datasets, and perform experiments to investigate the functioning of our model. Our code is publicly available at https://github.com/brcsomnath/SemAE., Comment: Accepted at ACL 2022
Published: 2022

23. Learning Fair Representations via Rate-Distortion Maximization

Author: Chowdhury, Somnath Basu Roy and Chaturvedi, Snigdha
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Text representations learned by machine learning models often encode undesirable demographic information of the user. Predictive models based on these representations can rely on such information, resulting in biased decisions. We present a novel debiasing technique, Fairness-aware Rate Maximization (FaRM), that removes protected information by making representations of instances belonging to the same protected attribute class uncorrelated, using the rate-distortion function. FaRM is able to debias representations with or without a target task at hand. FaRM can also be adapted to remove information about multiple protected attributes simultaneously. Empirical evaluations show that FaRM achieves state-of-the-art performance on several datasets, and learned representations leak significantly less protected attribute information against an attack by a non-linear probing network., Comment: Accepted at TACL
Published: 2022

24. Effective Forum Curation via Multi-Task Learning

Author: Brahman, Faeze, Varghese, Nikhil, Bhat, Suma, and Chaturvedi, Snigdha
Abstract: Despite several advantages of online education, lack of effective student-instructor interaction, especially when students need timely help, poses significant pedagogical challenges. Motivated by this, we address the problems of automatically identifying posts that express confusion or urgency from Massive Open Online Course (MOOC) forums. To this end, we first investigate the extent to which the tasks of confusion detection and urgency detection are correlated so as to explore the possibility of utilizing a multitasking set-up. We then propose two LSTM-based [Long Short Term Memory-based] multitask learning frameworks to leverage shared information and transfer knowledge across these related tasks. Our experiments demonstrate that the approaches improve over single-task models. Our best-performing model is especially useful in identifying posts that express both confusion and urgency, which can be of particular relevance for forum curation. [For the full proceedings, see ED607784.]
Published: 2020

25. Adversarial Scrubbing of Demographic Information for Text Classification

Author: Chowdhury, Somnath Basu Roy, Ghosh, Sayan, Li, Yiyuan, Oliva, Junier B., Srivastava, Shashank, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Contextual representations learned by language models can often encode undesirable attributes, like demographic associations of the users, while being trained for an unrelated target task. We aim to scrub such undesirable attributes and learn fair representations while maintaining performance on the target task. In this paper, we present an adversarial learning framework "Adversarial Scrubber" (ADS), to debias contextual representations. We perform theoretical analysis to show that our framework converges without leaking demographic information under certain conditions. We extend previous evaluation techniques by evaluating debiasing performance using Minimum Description Length (MDL) probing. Experimental evaluations on 8 datasets show that ADS generates representations with minimal information about demographic attributes while being maximally informative about the target task.
Comment: Accepted at EMNLP 2021
Published: 2021

26. Does Commonsense help in detecting Sarcasm?

Author: Chowdhury, Somnath Basu Roy and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Sarcasm detection is important for several NLP tasks such as sentiment identification in product reviews, user feedback, and online forums. It is a challenging task requiring a deep understanding of language, context, and world knowledge. In this paper, we investigate whether incorporating commonsense knowledge helps in sarcasm detection. For this, we incorporate commonsense knowledge into the prediction process using a graph convolution network with pre-trained language model embeddings as input. Our experiments with three sarcasm detection datasets indicate that the approach does not outperform the baseline model. We perform an exhaustive set of experiments to analyze where commonsense support adds value and where it hurts classification. Our implementation is publicly available at: https://github.com/brcsomnath/commonsense-sarcasm.
Comment: Accepted at Insights from Negative Results in NLP Workshop, EMNLP 2021
Published: 2021

27. Uncovering Implicit Gender Bias in Narratives through Commonsense Inference

Author: Huang, Tenghao, Brahman, Faeze, Shwartz, Vered, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Pre-trained language models learn socially harmful biases from their training corpora, and may repeat these biases when used for generation. We study gender biases associated with the protagonist in model-generated stories. Such biases may be expressed either explicitly ("women can't park") or implicitly (e.g. an unsolicited male character guides her into a parking space). We focus on implicit biases, and use a commonsense reasoning engine to uncover them. Specifically, we infer and analyze the protagonist's motivations, attributes, mental states, and implications on others. Our findings regarding implicit biases are in line with prior work that studied explicit biases, for example showing that female characters' portrayal is centered around appearance, while male characters' portrayal focuses on intellect.
Comment: Accepted at Findings of EMNLP 2021
Published: 2021

28. 'Let Your Characters Tell Their Story': A Dataset for Character-Centric Narrative Understanding

Author: Brahman, Faeze, Huang, Meng, Tafjord, Oyvind, Zhao, Chao, Sachan, Mrinmaya, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: When reading a literary piece, readers often make inferences about various characters' roles, personalities, relationships, intents, actions, etc. While humans can readily draw upon their past experiences to build such a character-centric view of the narrative, understanding characters in narratives can be a challenging task for machines. To encourage research in this field of character-centric narrative understanding, we present LiSCU -- a new dataset of literary pieces and their summaries paired with descriptions of characters that appear in them. We also introduce two new tasks on LiSCU: Character Identification and Character Description Generation. Our experiments with several pre-trained language models adapted for these tasks demonstrate that there is a need for better models of narrative comprehension.
Comment: Accepted to Findings of EMNLP 2021
Published: 2021

29. Is Everything in Order? A Simple Way to Order Sentences

Author: Chowdhury, Somnath Basu Roy, Brahman, Faeze, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: The task of organizing a shuffled set of sentences into a coherent text has been used to evaluate a machine's understanding of causal and temporal relations. We formulate the sentence ordering task as a conditional text-to-marker generation problem. We present Reorder-BART (Re-BART) that leverages a pre-trained Transformer-based model to identify a coherent order for a given set of shuffled sentences. The model takes a set of shuffled sentences with sentence-specific markers as input and generates a sequence of position markers of the sentences in the ordered text. Re-BART achieves state-of-the-art performance across 7 datasets in Perfect Match Ratio (PMR) and Kendall's tau (τ). We perform evaluations in a zero-shot setting, showcasing that our model is able to generalize well across other datasets. We additionally perform several experiments to understand the functioning and limitations of our framework.
Comment: Accepted at EMNLP 2021
Published: 2021

30. Cue Me In: Content-Inducing Approaches to Interactive Story Generation

Author: Brahman, Faeze, Petrusca, Alexandru, and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Automatically generating stories is a challenging problem that requires producing causally related and logical sequences of events about a topic. Previous approaches in this domain have focused largely on one-shot generation, where a language model outputs a complete story based on limited initial input from a user. Here, we instead focus on the task of interactive story generation, where the user provides the model mid-level sentence abstractions in the form of cue phrases during the generation process. This provides an interface for human users to guide the story generation. We present two content-inducing approaches to effectively incorporate this additional information. Experimental results from both automatic and human evaluations show that these methods produce more topically coherent and personalized stories compared to baseline methods.
Comment: AACL 2020
Published: 2020

31. Modeling Protagonist Emotions for Emotion-Aware Storytelling

Author: Brahman, Faeze and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Emotions and their evolution play a central role in creating a captivating story. In this paper, we present the first study on modeling the emotional trajectory of the protagonist in neural storytelling. We design methods that generate stories that adhere to given story titles and desired emotion arcs for the protagonist. Our models include Emotion Supervision (EmoSup) and two Emotion-Reinforced (EmoRL) models. The EmoRL models use special rewards designed to regularize the story generation process through reinforcement learning. Our automatic and manual evaluations demonstrate that these models are significantly better at generating stories that follow the desired emotion arcs compared to baseline methods, without sacrificing story quality.
Comment: EMNLP 2020, update: Conference version of Weber et al. (2020) is cited
Published: 2020

32. Weakly-Supervised Opinion Summarization by Leveraging External Information

Author: Zhao, Chao and Chaturvedi, Snigdha
Subjects: Computer Science - Computation and Language
Abstract: Opinion summarization from online product reviews is a challenging task, which involves identifying opinions related to various aspects of the product being reviewed. While previous works require additional human effort to identify relevant aspects, we instead apply domain knowledge from external sources to automatically achieve the same goal. This work proposes AspMem, a generative method that contains an array of memory cells to store aspect-related knowledge. This explicit memory can help obtain a better opinion representation and infer the aspect information more precisely. We evaluate this method on both aspect identification and opinion summarization tasks. Our experiments show that AspMem outperforms the state-of-the-art methods even though, unlike the baselines, it does not rely on human supervision which is carefully handcrafted for the given tasks.
Comment: Accepted by AAAI-20
Published: 2019

33. Named Entity Recognition with Partially Annotated Training Data

Author: Mayhew, Stephen, Chaturvedi, Snigdha, Tsai, Chen-Tse, and Roth, Dan
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Supervised machine learning assumes the availability of fully-labeled data, but in many cases, such as low-resource languages, the only data available is partially annotated. We study the problem of Named Entity Recognition (NER) with partially annotated training data in which a fraction of the named entities are labeled, and all other tokens, entities or otherwise, are labeled as non-entity by default. In order to train on this noisy dataset, we need to distinguish between the true and false negatives. To this end, we introduce a constraint-driven iterative algorithm that learns to detect false negatives in the noisy set and downweigh them, resulting in a weighted training set. With this set, we train a weighted NER model. We evaluate our algorithm with weighted variants of neural and non-neural NER models on data in 8 languages from several language and script families, showing strong ability to learn from partial data. Finally, to show real-world efficacy, we evaluate on a Bengali NER corpus annotated by non-speakers, outperforming the prior state-of-the-art by over 5 points F1.
Comment: Accepted to CoNLL 2019
Published: 2019

34. Discovering Archetypes to Interpret Evolution of Individual Behavior

Author: Narang, Kanika, Chung, Austin, Sundaram, Hari, and Chaturvedi, Snigdha
Subjects: Computer Science - Social and Information Networks, Physics - Physics and Society
Abstract: In this paper, we aim to discover archetypical patterns of individual evolution in large social networks. In our work, an archetype comprises progressive stages of distinct behavior. We introduce a novel Gaussian Hidden Markov Model (G-HMM) Cluster to identify archetypes of evolutionary patterns. G-HMMs are parsimonious and allow for near-limitless behavioral variation, constraints on how individuals can evolve, and different evolutionary rates. Our experiments with Academic and StackExchange datasets discover insightful archetypes. We identify four archetypes for researchers: Steady, Diverse, Evolving, and Diffuse. We observe clear differences in the evolution of male and female researchers within the same archetype. Specifically, women and men differ within an archetype (e.g. Diverse) in how they start, how they transition and the time spent in mid-career. We also found that the differences in grant income are better explained by the differences in archetype than by differences in gender. For StackOverflow, discovered archetypes could be labeled as Experts, Seekers, Enthusiasts, and Facilitators. We have strong quantitative results with competing baselines for activity prediction and perplexity. For future session prediction, the proposed G-HMM cluster model improves by an average of 32% for different Stack Exchanges and 24% for the Academic dataset. Our model also exhibits lower perplexity than the baselines.
Published: 2019

35. Learner Affect through the Looking Glass: Characterization and Detection of Confusion in Online Courses

Author: Zeng, Ziheng, Chaturvedi, Snigdha, and Bhat, Suma
Abstract: Characterizing the nature of students' affective and emotional states and detecting them is of fundamental importance in online course platforms. In this paper, we study this problem by using discussion forum posts derived from large open online courses. We find that posts identified as encoding confusion are actually manifestations of different learner affects pertaining to their informational needs--primarily seeking factual answers. We quantitatively demonstrate that the use of content-related linguistic features and community-related features derived from a post serve as reliable detectors of confusion while widely "outperforming" currently available algorithms of confusion detection. We also point out that several prediction tasks in this domain (e.g., confusion and urgency detection) can be correlated, and that a model trained for one task can effectively be used for making predictions on the other task without requiring labeled examples. Finally, we highlight a very significant problem of adapting the classifier to unseen courses. [For the full proceedings, see ED596512.]
Published: 2017

36. Ask, and shall you receive?: Understanding Desire Fulfillment in Natural Language Text

Author: Chaturvedi, Snigdha, Goldwasser, Dan, and Daume III, Hal
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: The ability to comprehend wishes or desires and their fulfillment is important to Natural Language Understanding. This paper introduces the task of identifying if a desire expressed by a subject in a given short piece of text was fulfilled. We propose various unstructured and structured models that capture fulfillment cues such as the subject's emotional state and actions. Our experiments with two different datasets demonstrate the importance of understanding the narrative and discourse structure to address this task.
Published: 2015

37. Modeling Dynamic Relationships Between Characters in Literary Novels

Author: Chaturvedi, Snigdha, Srivastava, Shashank, Daume III, Hal, and Dyer, Chris
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Studying characters plays a vital role in computationally representing and interpreting narratives. Unlike previous work, which has focused on inferring character roles, we focus on the problem of modeling their relationships. Rather than assuming a fixed relationship for a character pair, we hypothesize that relationships are dynamic and temporally evolve with the progress of the narrative, and formulate the problem of relationship modeling as a structured prediction problem. We propose a semi-supervised framework to learn relationship sequences from fully as well as partially labeled data. We present a Markovian model capable of accumulating historical beliefs about the relationship and status changes. We use a set of rich linguistic and semantically motivated features that incorporate world knowledge to investigate the textual content of narrative. We empirically demonstrate that such a framework outperforms competitive baselines.
Comment: 9 pages, 1 figure. Accepted at AAAI 2016
Published: 2015

38. Inferring Interpersonal Relations in Narrative Summaries

Author: Srivastava, Shashank, Chaturvedi, Snigdha, and Mitchell, Tom
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Social and Information Networks
Abstract: Characterizing relationships between people is fundamental for the understanding of narratives. In this work, we address the problem of inferring the polarity of relationships between people in narrative summaries. We formulate the problem as a joint structured prediction for each narrative, and present a model that combines evidence from linguistic and semantic features, as well as features based on the structure of the social community in the text. We also provide a clustering-based approach that can exploit regularities in narrative types, e.g., learn an affinity for love-triangles in romantic stories. On a dataset of movie summaries from Wikipedia, our structured models provide more than a 30% error-reduction over a competitive baseline that considers pairs of characters in isolation.
Published: 2015

39. Two Feedback PID Controllers Tuned with Teaching–Learning-Based Optimization Algorithm for Ball and Beam System.

Author: Chaturvedi, Snigdha, Kumar, Narendra, and Kumar, Rajesh
Subjects: *ARTIFICIAL intelligence, *OPTIMIZATION algorithms, *CASCADE control, *PID controllers, *NONLINEAR systems
Abstract: Despite the development of various artificial intelligence-based controllers, the proportional integral derivative (PID) controller continues to be the most popular and widely used controller in industry due to its simplicity and ease of use. Setting PID parameters, especially in non-linear systems, is still a significant problem. This study presents a cascade PID tuning technique based on the teaching–learning-based optimization (TLBO) algorithm. The proposed method is tested for the position control of a non-linear ball and beam system. A comparative analysis of the proposed method is done with the conventional tuning method and a particle swarm optimization-tuned cascade PID controller. The optimization was carried out using integral time absolute error, integral time error, and integral square error as objective functions. It was observed that the evolutionary algorithm-based controller tuned by TLBO gives a much better response regarding rise time, settling time, and overshoot. To test the effectiveness and validity of the proposed controller, a robustness analysis is carried out with a step disturbance applied at t = 3 s. The comparative study shows that the TLBO-tuned response is better than that of the other controllers. Sensitivity analysis of the proposed controller is also performed by varying system parameters. [ABSTRACT FROM AUTHOR]
Published: 2024

40. Two Feedback PID Controllers Tuned with Teaching–Learning-Based Optimization Algorithm for Ball and Beam System

Author: Chaturvedi, Snigdha, Kumar, Narendra, and Kumar, Rajesh
Published: 2023

41. A PSO-optimized novel PID neural network model for temperature control of jacketed CSTR: design, simulation, and a comparative study

Author: Chaturvedi, Snigdha, Kumar, Narendra, and Kumar, Rajesh
Published: 2023

42. Sustaining Fairness via Incremental Learning

Author: Basu Roy Chowdhury, Somnath and Chaturvedi, Snigdha
Published: 2023

43. Design and Implementation of an Optimized PID Controller for the Adaptive Cruise Control System.

Author: Chaturvedi, Snigdha and Kumar, Narendra
Subjects: *ADAPTIVE control systems, *PID controllers, *CRUISE control, *PARTICLE swarm optimization, *MATHEMATICAL optimization
Abstract: In this paper, we design and implement an optimized PID controller for an adaptive cruise control system. A mathematical model of the cruise control system is developed, and it is observed to be a nonlinear first-order model with dead time. The objective functions chosen for optimizing the PID controller are ITE, ITAE and ITSE. The design of the optimized PID controller is based on the particle swarm optimization technique and the teaching–learning-based optimization technique. The results are compared with conventionally tuned PID and fuzzy-based controllers. The optimized PID controller shows better performance than the conventional PID and fuzzy-based controllers. The overshoot of the system has been reduced to 0% from 46%, and the rise time has been reduced to 0.6150 s. This work is new in the literature and should be useful for enhancing the performance of cruise control systems. [ABSTRACT FROM AUTHOR]
Published: 2023

44. Improving Classroom Dialogue Act Recognition from Limited Labeled Data with Self-Supervised Contrastive Learning Classifiers

Author: Kumaran, Vikram, Rowe, Jonathan, Mott, Bradford, Chaturvedi, Snigdha, and Lester, James
Published: 2023

45. PARROT: Zero-Shot Narrative Reading Comprehension via Parallel Reading

Author: Zhao, Chao, Vijjini, Anvesh, and Chaturvedi, Snigdha
Published: 2023

46. Unsupervised Opinion Summarization Using Approximate Geodesics

Author: Basu Roy Chowdhury, Somnath, Monath, Nicholas, Dubey, Kumar, Ahmed, Amr, and Chaturvedi, Snigdha
Published: 2023

47. Affective and Dynamic Beam Search for Story Generation

Author: Huang, Tenghao, Qasemi, Ehsan, Li, Bangzheng, Wang, He, Brahman, Faeze, Chen, Muhao, and Chaturvedi, Snigdha
Published: 2023

48. Aspect-aware Unsupervised Extractive Opinion Summarization

Author: Li, Haoyuan, Basu Roy Chowdhury, Somnath, and Chaturvedi, Snigdha
Published: 2023

49. Read Top News First: A Document Reordering Approach for Multi-Document News Summarization

Author: Zhao, Chao, Huang, Tenghao, Basu Roy Chowdhury, Somnath, Chandrasekaran, Muthu Kumar, McKeown, Kathleen, and Chaturvedi, Snigdha
Published: 2022

50. NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization

Author: Zhao, Chao, Brahman, Faeze, Song, Kaiqiang, Yao, Wenlin, Yu, Dian, and Chaturvedi, Snigdha
Published: 2022
