Author: "Mehrabi, Ninareh" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Mehrabi, Ninareh"' showing total 30 results

1. Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification

Author: Meng, Tao, Mehrabi, Ninareh, Goyal, Palash, Ramakrishna, Anil, Galstyan, Aram, Zemel, Richard, Chang, Kai-Wei, Gupta, Rahul, and Peris, Charith
Subjects: Computer Science - Computation and Language
Abstract: We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control. Given a training corpus and control criteria formulated as a sequence-level constraint on model outputs, our method fine-tunes the LLM on the training corpus while enhancing constraint satisfaction with minimal impact on its utility and generation quality. Specifically, our approach regularizes the LLM training by penalizing the KL divergence between the desired output distribution, which satisfies the constraints, and the LLM's posterior. This regularization term can be approximated by an auxiliary model trained to decompose the sequence-level constraints into token-level guidance, allowing the term to be measured by a closed-form formulation. To further improve efficiency, we design a parallel scheme for concurrently updating both the LLM and the auxiliary model. We evaluate the empirical performance of our approach by controlling the toxicity when training an LLM. We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
Comment: Accepted to EMNLP Findings
Published: 2024
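
The regularization idea in the abstract above can be illustrated with a toy calculation. The sketch below is not the authors' method: it assumes two small discrete distributions over the same tokens, assumes the direction KL(desired || model), and uses hypothetical names (`kl_divergence`, `regularized_loss`, `weight`) that do not come from the paper.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same tokens.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def regularized_loss(task_loss, desired, model, weight=0.1):
    """Task loss plus a weighted KL penalty (weight is an illustrative value)."""
    return task_loss + weight * kl_divergence(desired, model)


# Toy example: the desired distribution puts all mass on the first token,
# while the model splits its mass evenly across two tokens.
penalty = kl_divergence([1.0, 0.0], [0.5, 0.5])
```

In this toy case the penalty equals log 2, and it falls to zero once the model distribution matches the desired one.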

2. Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models

Author: Wang, Fei, Mehrabi, Ninareh, Goyal, Palash, Gupta, Rahul, Chang, Kai-Wei, and Galstyan, Aram
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Data is a crucial element in large language model (LLM) alignment. Recent studies have explored using LLMs for efficient data collection. However, LLM-generated data often suffers from quality issues, with underrepresented or absent aspects and low-quality datapoints. To address these problems, we propose Data Advisor, an enhanced LLM-based method for generating data that takes into account the characteristics of the desired dataset. Starting from a set of pre-defined principles in hand, Data Advisor monitors the status of the generated data, identifies weaknesses in the current dataset, and advises the next iteration of data generation accordingly. Data Advisor can be easily integrated into existing data generation methods to enhance data quality and coverage. Experiments on safety alignment of three representative LLMs (i.e., Mistral, Llama2, and Falcon) demonstrate the effectiveness of Data Advisor in enhancing model safety against various fine-grained safety issues without sacrificing model utility.
Comment: Accepted to EMNLP 2024 Main Conference. Project website: https://feiwang96.github.io/DataAdvisor/
Published: 2024

3. Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs

Author: Markowitz, Elan, Ramakrishna, Anil, Dhamala, Jwala, Mehrabi, Ninareh, Peris, Charith, Gupta, Rahul, Chang, Kai-Wei, and Galstyan, Aram
Subjects: Computer Science - Artificial Intelligence
Abstract: Knowledge graphs (KGs) complement Large Language Models (LLMs) by providing reliable, structured, domain-specific, and up-to-date external knowledge. However, KGs and LLMs are often developed separately and must be integrated after training. We introduce Tree-of-Traversals, a novel zero-shot reasoning algorithm that enables augmentation of black-box LLMs with one or more KGs. The algorithm equips an LLM with actions for interfacing with a KG and enables the LLM to perform tree search over possible thoughts and actions to find high confidence reasoning paths. We evaluate on two popular benchmark datasets. Our results show that Tree-of-Traversals significantly improves performance on question answering and KG question answering tasks. Code is available at https://github.com/amazon-science/tree-of-traversals
Comment: Accepted for publication at the ACL 2024 Conference
Published: 2024

4. Prompt Perturbation Consistency Learning for Robust Language Models

Author: Qiang, Yao, Nandi, Subhrangshu, Mehrabi, Ninareh, Steeg, Greg Ver, Kumar, Anoop, Rumshisky, Anna, and Galstyan, Aram
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Large language models (LLMs) have demonstrated impressive performance on a number of natural language processing tasks, such as question answering and text summarization. However, their performance on sequence labeling tasks such as intent classification and slot filling (IC-SF), which is a central component in personal assistant systems, lags significantly behind discriminative models. Furthermore, there is a lack of substantive research on the robustness of LLMs to various perturbations in the input prompts. The contributions of this paper are three-fold. First, we show that fine-tuning sufficiently large LLMs can produce IC-SF performance comparable to discriminative models. Next, we systematically analyze the performance deterioration of those fine-tuned models due to three distinct yet relevant types of input perturbations - oronyms, synonyms, and paraphrasing. Finally, we propose an efficient mitigation approach, Prompt Perturbation Consistency Learning (PPCL), which works by regularizing the divergence between losses from clean and perturbed samples. Our experiments demonstrate that PPCL can recover on average 59% and 69% of the performance drop for IC and SF tasks, respectively. Furthermore, PPCL beats the data augmentation approach while using ten times fewer augmented data samples.
Published: 2024

5. Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies

Author: Ovalle, Anaelia, Mehrabi, Ninareh, Goyal, Palash, Dhamala, Jwala, Chang, Kai-Wei, Zemel, Richard, Galstyan, Aram, Pinter, Yuval, and Gupta, Rahul
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Gender-inclusive NLP research has documented the harmful limitations of gender binary-centric large language models (LLMs), such as the inability to correctly use gender-diverse English neopronouns (e.g., xe, zir, fae). While data scarcity is a known culprit, the precise mechanisms through which scarcity affects this behavior remain underexplored. We discover LLM misgendering is significantly influenced by Byte-Pair Encoding (BPE) tokenization, the tokenizer powering many popular LLMs. Unlike binary pronouns, BPE overfragments neopronouns, a direct consequence of data scarcity during tokenizer training. This disparate tokenization mirrors tokenizer limitations observed in multilingual and low-resource NLP, unlocking new misgendering mitigation strategies. We propose two techniques: (1) pronoun tokenization parity, a method to enforce consistent tokenization across gendered pronouns, and (2) utilizing pre-existing LLM pronoun knowledge to improve neopronoun proficiency. Our proposed methods outperform finetuning with standard BPE, improving neopronoun accuracy from 14.1% to 58.4%. Our paper is the first to link LLM misgendering to tokenization and deficient neopronoun grammar, indicating that LLMs unable to correctly treat neopronouns as pronouns are more prone to misgender.
Comment: Accepted to NAACL 2024 Findings
Published: 2023

6. JAB: Joint Adversarial Prompting and Belief Augmentation

Author: Mehrabi, Ninareh, Goyal, Palash, Ramakrishna, Anil, Dhamala, Jwala, Ghosh, Shalini, Zemel, Richard, Chang, Kai-Wei, Galstyan, Aram, and Gupta, Rahul
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: With the recent surge of language models in different applications, attention to safety and robustness of these models has gained significant importance. Here we introduce a joint framework in which we simultaneously probe and improve the robustness of a black-box target model via adversarial prompting and belief augmentation using iterative feedback loops. This framework utilizes an automated red teaming approach to probe the target model, along with a belief augmenter to generate instructions for the target model to improve its robustness to those adversarial probes. Importantly, the adversarial model and the belief generator leverage the feedback from past interactions to improve the effectiveness of the adversarial prompts and beliefs, respectively. In our experiments, we demonstrate that such a framework can reduce toxic content generation both in dynamic cases where an adversary directly interacts with a target model and static cases where we use a static benchmark dataset to evaluate our model.
Published: 2023

7. On the steerability of large language models toward data-driven personas

Author: Li, Junyi, Mehrabi, Ninareh, Peris, Charith, Goyal, Palash, Chang, Kai-Wei, Galstyan, Aram, Zemel, Richard, and Gupta, Rahul
Subjects: Computer Science - Computation and Language
Abstract: Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented. Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs that can be leveraged to produce multiple perspectives and to reflect diverse opinions. Moving beyond the traditional reliance on demographics like age, gender, or party affiliation, we introduce a data-driven notion of persona grounded in collaborative filtering, which is defined as either a single individual or a cohort of individuals manifesting similar views across specific inquiries. As individuals in the same demographic group may have different personas, our data-driven persona definition allows for a more nuanced understanding of different (latent) social groups present in the population. In addition to this, we also explore an efficient method to steer LLMs toward the personas that we define. We show that our data-driven personas significantly enhance model steerability, with improvements of between 57% and 77% over our best performing baselines.
Published: 2023

8. FLIRT: Feedback Loop In-context Red Teaming

Author: Mehrabi, Ninareh, Goyal, Palash, Dupuy, Christophe, Hu, Qian, Ghosh, Shalini, Zemel, Richard, Chang, Kai-Wei, Galstyan, Aram, and Gupta, Rahul
Subjects: Computer Science - Artificial Intelligence
Abstract: Warning: this paper contains content that may be inappropriate or offensive. As generative models become available for public use in various applications, testing and analyzing vulnerabilities of these models has become a priority. Here we propose an automatic red teaming framework that evaluates a given model and exposes its vulnerabilities against unsafe and inappropriate content generation. Our framework uses in-context learning in a feedback loop to red team models and trigger them into unsafe content generation. We propose different in-context attack strategies to automatically learn effective and diverse adversarial prompts for text-to-image models. Our experiments demonstrate that compared to baseline approaches, our proposed strategy is significantly more effective in exposing vulnerabilities in Stable Diffusion (SD) model, even when the latter is enhanced with safety features. Furthermore, we demonstrate that the proposed framework is effective for red teaming text-to-text models, resulting in significantly higher toxic response generation rate compared to previously reported numbers.
Published: 2023

9. Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models

Author: Mehrabi, Ninareh, Goyal, Palash, Verma, Apurv, Dhamala, Jwala, Kumar, Varun, Hu, Qian, Chang, Kai-Wei, Zemel, Richard, Galstyan, Aram, and Gupta, Rahul
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Multimedia
Abstract: Natural language often contains ambiguities that can lead to misinterpretation and miscommunication. While humans can handle ambiguities effectively by asking clarifying questions and/or relying on contextual cues and common-sense knowledge, resolving ambiguities can be notoriously hard for machines. In this work, we study ambiguities that arise in text-to-image generative models. We curate a benchmark dataset covering different types of ambiguities that occur in these systems. We then propose a framework to mitigate ambiguities in the prompts given to the systems by soliciting clarifications from the user. Through automatic and human evaluations, we show the effectiveness of our framework in generating more faithful images aligned with human intention in the presence of ambiguities.
Published: 2022

10. Robust Conversational Agents against Imperceptible Toxicity Triggers

Author: Mehrabi, Ninareh, Beirami, Ahmad, Morstatter, Fred, and Galstyan, Aram
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Warning: this paper contains content that may be offensive or upsetting. Recent research in Natural Language Processing (NLP) has advanced the development of various toxicity detection models with the intention of identifying and mitigating toxic language from existing systems. Despite the abundance of research in this area, less attention has been given to adversarial attacks that force the system to generate toxic language and the defense against them. Existing work to generate such attacks is either based on human-generated attacks, which are costly and not scalable, or, in the case of automatic attacks, the attack vector does not conform to human-like language, which can be detected using a language model loss. In this work, we propose attacks against conversational agents that are imperceptible, i.e., they fit the conversation in terms of coherency, relevancy, and fluency, while they are effective and scalable, i.e., they can automatically trigger the system into generating toxic language. We then propose a defense mechanism against such attacks which not only mitigates the attack but also attempts to maintain the conversational flow. Through automatic and human evaluations, we show that our defense is effective at avoiding toxic language generation even against imperceptible toxicity triggers while the generated language fits the conversation in terms of coherency and relevancy. Lastly, we establish the generalizability of such a defense mechanism on language generation models beyond conversational agents.
Published: 2022

11. Towards Multi-Objective Statistically Fair Federated Learning

Author: Mehrabi, Ninareh, de Lichy, Cyprien, McKay, John, He, Cynthia, and Campbell, William
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Federated Learning (FL) has emerged as a result of data ownership and privacy concerns to prevent data from being shared between multiple parties included in a training procedure. Although issues, such as privacy, have gained significant attention in this domain, not much attention has been given to satisfying statistical fairness measures in the FL setting. With this goal in mind, we conduct studies to show that FL is able to satisfy different fairness metrics under different data regimes consisting of different types of clients. More specifically, uncooperative or adversarial clients might contaminate the global FL model by injecting biased or poisoned models due to existing biases in their training datasets. Those biases might be a result of an imbalanced training set (Zhang and Zhou 2019), historical biases (Mehrabi et al. 2021a), or poisoned data-points from data poisoning attacks against fairness (Mehrabi et al. 2021b; Solans, Biggio, and Castillo 2020). Thus, we propose a new FL framework that is able to satisfy multiple objectives including various statistical fairness metrics. Through experimentation, we then show the effectiveness of this method comparing it with various baselines, its ability in satisfying different objectives collectively and individually, and its ability in identifying uncooperative or adversarial clients and down-weighing their effect.
Published: 2022

12. Attributing Fair Decisions with Attention Interventions

Author: Mehrabi, Ninareh, Gupta, Umang, Morstatter, Fred, Steeg, Greg Ver, and Galstyan, Aram
Subjects: Computer Science - Artificial Intelligence
Abstract: The widespread use of Artificial Intelligence (AI) in consequential domains, such as healthcare and parole decision-making systems, has drawn intense scrutiny on the fairness of these methods. However, ensuring fairness is often insufficient as the rationale for a contentious decision needs to be audited, understood, and defended. We propose that the attention mechanism can be used to ensure fair outcomes while simultaneously providing feature attributions to account for how a decision was made. Toward this goal, we design an attention-based model that can be leveraged as an attribution framework. It can identify features responsible for both performance and fairness of the model through attention interventions and attention weight manipulation. Using this attribution framework, we then design a post-processing bias mitigation strategy and compare it with a suite of baselines. We demonstrate the versatility of our approach by conducting experiments on two distinct data types, tabular and textual.
Published: 2021

13. Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources

Author: Mehrabi, Ninareh, Zhou, Pei, Morstatter, Fred, Pujara, Jay, Ren, Xiang, and Galstyan, Aram
Subjects: Computer Science - Computation and Language
Abstract: Warning: this paper contains content that may be offensive or upsetting. Numerous natural language processing models have tried injecting commonsense by using the ConceptNet knowledge base to improve performance on different tasks. ConceptNet, however, is mostly crowdsourced from humans and may reflect human biases such as "lawyers are dishonest." It is important that these biases are not conflated with the notion of commonsense. We study this missing yet important problem by first defining and quantifying biases in ConceptNet as two types of representational harms: overgeneralization of polarized perceptions and representation disparity. We find that ConceptNet contains severe biases and disparities across four demographic categories. In addition, we analyze two downstream models that use ConceptNet as a source for commonsense knowledge and find the existence of biases in those models as well. We further propose a filter-based bias-mitigation approach and examine its effectiveness. We show that our mitigation approach can reduce the issues in both resource and models but leads to a performance drop, leaving room for future work to build fairer and stronger commonsense models.
Published: 2021

14. Exacerbating Algorithmic Bias through Fairness Attacks

Author: Mehrabi, Ninareh, Naveed, Muhammad, Morstatter, Fred, and Galstyan, Aram
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security
Abstract: Algorithmic fairness has attracted significant attention in recent years, with many quantitative measures suggested for characterizing the fairness of different machine learning algorithms. Despite this interest, the robustness of those fairness measures with respect to an intentional adversarial attack has not been properly addressed. Indeed, most adversarial machine learning has focused on the impact of malicious attacks on the accuracy of the system, without any regard to the system's fairness. We propose new types of data poisoning attacks where an adversary intentionally targets the fairness of a system. Specifically, we propose two families of attacks that target fairness measures. In the anchoring attack, we skew the decision boundary by placing poisoned points near specific target points to bias the outcome. In the influence attack on fairness, we aim to maximize the covariance between the sensitive attributes and the decision outcome and affect the fairness of the model. We conduct extensive experiments that indicate the effectiveness of our proposed attacks.
Published: 2020
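
The quantity named in the abstract above, the covariance between a sensitive attribute and the decision outcome, can be computed on toy data as follows. This is only a sketch of that statistic, not the attack itself; the binary encoding of both variables and the function name `covariance` are assumptions made for illustration.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Toy data: sensitive attribute (0/1) and decision outcome (0/1) per individual.
sensitive = [0, 0, 1, 1]
aligned_decisions = [0, 0, 1, 1]      # decisions track the sensitive attribute
unrelated_decisions = [0, 1, 0, 1]    # decisions ignore the sensitive attribute

cov_aligned = covariance(sensitive, aligned_decisions)
cov_unrelated = covariance(sensitive, unrelated_decisions)
```

A covariance near zero means the decisions carry no linear signal about the sensitive attribute; a larger magnitude means the two move together.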

15. The Leaky Pipeline in Physics Publishing

Author: Ross, Clara O, Gupta, Aditya, Mehrabi, Ninareh, Muric, Goran, and Lerman, Kristina
Subjects: Physics - Physics and Society, Computer Science - Digital Libraries
Abstract: Women make up a shrinking portion of physics faculty in senior positions, a phenomenon known as a "leaky pipeline." While fixing this problem has been a priority in academic institutions, efforts have been stymied by the diverse sources of leaks. In this paper we identify a bias potentially contributing to the leaky pipeline. We analyze bibliographic data provided by the American Physical Society (APS), a leading publisher of physics research. By inferring the gender of authors from names, we are able to measure the fraction of women authors over past decades. We show that the more selective, higher impact APS journals have lower fractions of women authors compared to other APS journals. Correcting this bias may help more women publish in prestigious APS journals, and in turn help improve their academic promotion cases.
Published: 2020

16. Statistical Equity: A Fairness Classification Objective

Author: Mehrabi, Ninareh, Huang, Yuzhong, and Morstatter, Fred
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Machine learning systems have been shown to propagate the societal errors of the past. In light of this, a wealth of research focuses on designing solutions that are "fair." Even with this abundance of work, there is no singular definition of fairness, mainly because fairness is subjective and context dependent. We propose a new fairness definition, motivated by the principle of equity, that considers existing biases in the data and attempts to make equitable decisions that account for these previous historical biases. We formalize our definition of fairness, and motivate it with its appropriate contexts. Next, we operationalize it for equitable classification. We perform multiple automatic and human evaluations to show the effectiveness of our definition and demonstrate its utility for aspects of fairness, such as the feedback loop.
Published: 2020

17. Man is to Person as Woman is to Location: Measuring Gender Bias in Named Entity Recognition

Author: Mehrabi, Ninareh, Gowda, Thamme, Morstatter, Fred, Peng, Nanyun, and Galstyan, Aram
Subjects: Computer Science - Information Retrieval, Computer Science - Computation and Language
Abstract: We study the bias in several state-of-the-art named entity recognition (NER) models, specifically a difference in the ability to recognize male and female names as PERSON entity types. We evaluate NER models on a dataset containing 139 years of U.S. census baby names and find that relatively more female names, as opposed to male names, are not recognized as PERSON entities. We study the extent of this bias in several NER systems that are used prominently in industry and academia. In addition, we also report a bias in the datasets on which these models were trained. The result of this analysis yields a new benchmark for gender bias evaluation in named entity recognition systems. The data and code for the application of this benchmark will be publicly available for researchers to use.
Published: 2019
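
The kind of measurement described in the abstract above, comparing how often names from two groups fail to be tagged as PERSON, can be sketched as below. Everything here is hypothetical: `toy_tagger` stands in for a real NER model, the names and labels are made-up toy data, and the paper's actual metrics may be defined differently.

```python
def person_miss_rate(names, tagger):
    """Fraction of names that the tagger does not label as PERSON."""
    missed = sum(1 for name in names if tagger(name) != "PERSON")
    return missed / len(names)


# Hypothetical stand-in for an NER model: a fixed lookup table of toy labels.
TOY_LABELS = {
    "Alex": "PERSON",
    "Sam": "PERSON",
    "Charlotte": "LOCATION",
    "Maria": "PERSON",
}


def toy_tagger(name):
    """Return the toy label for a name, or "O" if the name is unknown."""
    return TOY_LABELS.get(name, "O")


group_a = ["Alex", "Sam"]
group_b = ["Charlotte", "Maria"]

# Difference in miss rates between the two groups of names.
gap = person_miss_rate(group_b, toy_tagger) - person_miss_rate(group_a, toy_tagger)
```

With a real model, `toy_tagger` would be replaced by a call that returns the entity type the model assigns to the name.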

18. A Survey on Bias and Fairness in Machine Learning

Author: Mehrabi, Ninareh, Morstatter, Fred, Saxena, Nripsuta, Lerman, Kristina, and Galstyan, Aram
Subjects: Computer Science - Machine Learning
Abstract: With the widespread use of AI systems and applications in our everyday lives, it is important to take fairness issues into consideration while designing and engineering these types of systems. Such systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that the decisions do not reflect discriminatory behavior toward certain groups or populations. We have recently seen work in machine learning, natural language processing, and deep learning that addresses such challenges in different subdomains. With the commercialization of these systems, researchers are becoming aware of the biases that these applications can contain and have attempted to address them. In this survey we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined in order to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and how they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.
Published: 2019

19. Debiasing Community Detection: The Importance of Lowly-Connected Nodes

Author: Mehrabi, Ninareh, Morstatter, Fred, Peng, Nanyun, and Galstyan, Aram
Subjects: Computer Science - Social and Information Networks, Physics - Physics and Society
Abstract: Community detection is an important task in social network analysis, allowing us to identify and understand the communities within the social structures. However, many community detection approaches either fail to assign low degree (or lowly-connected) users to communities, or assign them to trivially small communities that prevent them from being included in analysis. In this work, we investigate how excluding these users can bias analysis results. We then introduce an approach that is more inclusive for lowly-connected users by incorporating them into larger groups. Experiments show that our approach outperforms the existing state-of-the-art in terms of F1 and Jaccard similarity scores while reducing the bias towards low-degree users.
Published: 2019
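
The abstract above reports Jaccard similarity scores. For readers unfamiliar with the measure, a minimal sketch for two communities represented as sets of node identifiers follows; how the paper matches detected communities to ground-truth ones is not stated here and is not modeled.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of node ids."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


detected = {1, 2, 3}
ground_truth = {2, 3, 4}
score = jaccard(detected, ground_truth)
```

Here the two communities share two of four distinct nodes, so the score is 0.5.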

20. DynamicGEM: A Library for Dynamic Graph Embedding Methods

Author: Goyal, Palash, Chhetri, Sujit Rokka, Mehrabi, Ninareh, Ferrara, Emilio, and Canedo, Arquimedes
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Social and Information Networks, Statistics - Machine Learning
Abstract: DynamicGEM is an open-source Python library for learning node representations of dynamic graphs. It consists of state-of-the-art algorithms for defining embeddings of nodes whose connections evolve over time. The library also contains the evaluation framework for four downstream tasks on the network: graph reconstruction, static and temporal link prediction, node classification, and temporal visualization. We have implemented various metrics to evaluate the state-of-the-art methods, and examples of evolving networks from various domains. We have easy-to-use functions to call and evaluate the methods and have extensive usage documentation. Furthermore, DynamicGEM provides a template to add new algorithms with ease to facilitate further research on the topic.
Published: 2018

21. Resolving Ambiguities in Text-to-Image Generative Models

Author: Mehrabi, Ninareh, primary, Goyal, Palash, additional, Verma, Apurv, additional, Dhamala, Jwala, additional, Kumar, Varun, additional, Hu, Qian, additional, Chang, Kai-Wei, additional, Zemel, Richard, additional, Galstyan, Aram, additional, and Gupta, Rahul, additional
Published: 2023
Full Text: View/download PDF

22. Where Does Bias in Common Sense Knowledge Models Come From?

Author: Melotte, Sara, primary, Ilievski, Filip, additional, Zhang, Linglan, additional, Malte, Aditya, additional, Mutha, Namita, additional, Morstatter, Fred, additional, and Mehrabi, Ninareh, additional
Published: 2022
Full Text: View/download PDF

23. Attributing Fair Decisions with Attention Interventions

Author: Mehrabi, Ninareh, primary, Gupta, Umang, additional, Morstatter, Fred, additional, Steeg, Greg Ver, additional, and Galstyan, Aram, additional
Published: 2022
Full Text: View/download PDF

24. Robust Conversational Agents against Imperceptible Toxicity Triggers

Author: Mehrabi, Ninareh, primary, Beirami, Ahmad, additional, Morstatter, Fred, additional, and Galstyan, Aram, additional
Published: 2022
Full Text: View/download PDF

25. A Survey on Bias and Fairness in Machine Learning

Author: Mehrabi, Ninareh, primary, Morstatter, Fred, additional, Saxena, Nripsuta, additional, Lerman, Kristina, additional, and Galstyan, Aram, additional
Published: 2021
Full Text: View/download PDF

26. Exacerbating Algorithmic Bias through Fairness Attacks

Author: Mehrabi, Ninareh, primary, Naveed, Muhammad, additional, Morstatter, Fred, additional, and Galstyan, Aram, additional
Published: 2021
Full Text: View/download PDF

27. A Survey on Bias and Fairness in Machine Learning.

Author: MEHRABI, NINAREH, MORSTATTER, FRED, SAXENA, NRIPSUTA, LERMAN, KRISTINA, and GALSTYAN, ARAM
Subjects: *MACHINE learning, *DEEP learning, *ARTIFICIAL intelligence, *ENGINEERING design, *NATURAL language processing, *FAIRNESS
Abstract: With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in designing and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently some work has been developed in traditional machine learning and deep learning that address such challenges in different subdomains. With the commercialization of these systems, researchers are becoming more aware of the biases that these applications can contain and are attempting to address them. In this survey, we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and ways they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

28. Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources

Author: Mehrabi, Ninareh, primary, Zhou, Pei, additional, Morstatter, Fred, additional, Pujara, Jay, additional, Ren, Xiang, additional, and Galstyan, Aram, additional
Published: 2021
Full Text: View/download PDF

29. Man is to Person as Woman is to Location

Author: Mehrabi, Ninareh, primary, Gowda, Thamme, additional, Morstatter, Fred, additional, Peng, Nanyun, additional, and Galstyan, Aram, additional
Published: 2020
Full Text: View/download PDF

30. Debiasing community detection

Author: Mehrabi, Ninareh, primary, Morstatter, Fred, additional, Peng, Nanyun, additional, and Galstyan, Aram, additional
Published: 2019
Full Text: View/download PDF
