12 results for "Wang, Yizhen"
Search Results
2. Developing IncidentUI -- A Ride Comfort and Disengagement Evaluation Application for Autonomous Vehicles
- Author
- Mehta, Manas, Chkhaidze, Nugzar, and Wang, Yizhen
- Abstract
This report details the design, development, and implementation of IncidentUI, an Android tablet application that measures user-experienced ride comfort and records disengagement data for autonomous vehicles (AVs) during test drives. The goal of our project was to develop an Android application to run on a peripheral tablet and communicate with the Drive Pegasus AGX, the AI Computing Platform for Nvidia's AV Level 2 Autonomy Solution Architecture [1], to detect AV disengagements and report ride comfort. We designed and developed an intuitive, Android XML-based user interface for IncidentUI. Development required a redesign of the system architecture: we redeveloped the system communications protocol in Java and implemented the Protocol Buffers (Protobufs) in Java using the existing system Protobuf definitions. The final iteration of IncidentUI delivered the desired functionality during testing on an AV test drive. We also received positive feedback from Nvidia's AV Platform Team during our final IncidentUI demonstration.
Comment: Previously embargoed by Nvidia. Nvidia owns the rights.
- Published
- 2023
3. Privacy-Preserving Financial Anomaly Detection via Federated Learning & Multi-Party Computation
- Author
- Arora, Sunpreet, Beams, Andrew, Chatzigiannis, Panagiotis, Meiser, Sebastian, Patel, Karan, Raghuraman, Srinivasan, Rindal, Peter, Shah, Harshal, Wang, Yizhen, Wu, Yuhang, Yang, Hao, and Zamani, Mahdi
- Abstract
One of the main goals of financial institutions (FIs) today is combating fraud and financial crime. To this end, FIs use sophisticated machine-learning models trained on data collected from their customers. The output of these models may be manually reviewed for critical use cases, e.g., determining the likelihood that a transaction is anomalous and the subsequent course of action. While advanced machine learning models greatly aid an FI in anomaly detection, model performance could be significantly improved with additional customer data from other FIs. In practice, however, an FI may not have appropriate consent from customers to share their data with other FIs, and data privacy regulations may prohibit FIs from sharing clients' sensitive data in certain geographies. Combining customer data to jointly train highly accurate anomaly detection models is therefore challenging for FIs in operational settings. In this paper, we describe a privacy-preserving framework that allows FIs to jointly train highly accurate anomaly detection models. The framework combines federated learning with efficient multi-party computation and noisy aggregates inspired by differential privacy; a minimal sketch of this combination follows this entry. The framework was submitted as a winning entry to the financial crime detection track of the US/UK PETs Challenge, which considered an architecture where banks hold customer data and execute transactions through a central network. We show that our solution enables the network to train a highly accurate anomaly detection model while preserving the privacy of customer data. Experimental results demonstrate that using additional customer data via the proposed approach improves our anomaly detection model's AUPRC from 0.6 to 0.7. We discuss how our framework can be generalized to other similar scenarios.
Comment: 12 pages
- Published
- 2023
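As a rough illustration of how federated learning, multi-party computation, and differentially private noise can fit together, here is a minimal Python sketch of masked secure aggregation with a noisy sum. The abstract names the combination but not its details, so everything below (function names, mask construction, noise scale) is an illustrative assumption, not the paper's protocol.

```python
import numpy as np

# Minimal sketch of secure aggregation with a noisy sum. Each pair of
# "banks" agrees on a random mask; the masks cancel when the server sums
# the submissions, so the server only learns the noisy aggregate, never
# an individual bank's update.

rng = np.random.default_rng(0)

def pairwise_masks(n_clients, dim, seed=42):
    """Generate masks with masks[i, j] = -masks[j, i], so they cancel in the sum."""
    mask_rng = np.random.default_rng(seed)
    masks = np.zeros((n_clients, n_clients, dim))
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            m = mask_rng.normal(size=dim)
            masks[i, j] = m
            masks[j, i] = -m
    return masks

def client_submit(update, i, masks):
    """Client i hides its true update under the sum of its pairwise masks."""
    return update + masks[i].sum(axis=0)

def server_aggregate(submissions, noise_scale):
    """Masks cancel across clients; Gaussian noise blurs the exact sum."""
    total = np.sum(submissions, axis=0)
    return total + rng.normal(scale=noise_scale, size=total.shape)

# Toy federated round over 4 banks, each holding a local gradient.
updates = [rng.normal(size=8) for _ in range(4)]
masks = pairwise_masks(n_clients=4, dim=8)
submissions = [client_submit(u, i, masks) for i, u in enumerate(updates)]
noisy_sum = server_aggregate(submissions, noise_scale=0.1)
print(np.allclose(noisy_sum, np.sum(updates, axis=0), atol=1.0))  # close, but noised
```

The point of the antisymmetric pairwise masks is that each individual submission looks random, yet the masks cancel exactly in aggregate, so only the (noised) sum is ever revealed.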
4. Burning the Adversarial Bridges: Robust Windows Malware Detection Against Binary-level Mutations
- Author
- Abusnaina, Ahmed, Wang, Yizhen, Arora, Sunpreet, Wang, Ke, Christodorescu, Mihai, and Mohaisen, David
- Abstract
Toward robust malware detection, we explore the attack surface of existing malware detection systems. We conduct root-cause analyses of practical binary-level black-box adversarial malware examples. Additionally, we uncover the sensitivity of volatile features within the detection engines and demonstrate their exploitability. Highlighting volatile information channels within the software, we introduce three software pre-processing steps to eliminate this attack surface: padding removal, software stripping, and inter-section information resetting (sketched after this entry). Further, to counter emerging section-injection attacks, we propose a graph-based, section-dependent information extraction scheme for software representation. The proposed scheme leverages aggregated information within the various sections of the software to enable robust malware detection and mitigate adversarial manipulation. Our experimental results show that traditional malware detection models are ineffective against adversarial threats, but that the attack surface can be largely reduced by eliminating the volatile information. We therefore propose simple yet effective methods to mitigate the impact of binary manipulation attacks. Overall, our graph-based malware detection scheme accurately detects malware with an area-under-the-curve score of 88.32%, and a score of 88.19% under a combination of binary manipulation attacks, demonstrating the effectiveness of the proposed scheme.
Comment: 12 pages
- Published
- 2023
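To make the three pre-processing steps concrete, here is a toy Python sketch operating on a simplified in-memory view of a binary's sections. It illustrates the idea under stated assumptions, not the paper's implementation; every helper name and the "essential sections" list are hypothetical, and real code would parse the PE format with a proper library.

```python
from dataclasses import dataclass

# Toy model of the three steps named in the abstract: padding removal,
# software stripping, and inter-section information resetting.

@dataclass
class Section:
    name: str
    data: bytes

def remove_padding(sections):
    """Drop trailing zero bytes that an adversary can freely append to."""
    return [Section(s.name, s.data.rstrip(b"\x00")) for s in sections]

def strip_nonessential(sections, essential=(".text", ".data", ".rdata")):
    """Keep only sections the program needs to run; injected sections vanish."""
    return [s for s in sections if s.name in essential]

def reset_inter_section_info(sections):
    """Normalize section names and order so cross-section layout carries no signal."""
    ordered = sorted(sections, key=lambda s: s.name)
    return [Section(f"sec{i}", s.data) for i, s in enumerate(ordered)]

binary = [Section(".text", b"\x55\x89\xe5\x00\x00"), Section(".evil", b"payload")]
clean = reset_inter_section_info(strip_nonessential(remove_padding(binary)))
print([s.name for s in clean])  # ['sec0'] -- injected section and padding are gone
```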
5. Robust Machine Learning in Adversarial Setting with Provable Guarantee
- Author
- Wang, Yizhen and Chaudhuri, Kamalika
- Abstract
Over the last decade, machine learning systems have achieved state-of-the-art performance in many fields and are now used in an increasing number of applications. However, recent research has revealed multiple attacks on machine learning systems that significantly reduce performance by manipulating the training or test data. As machine learning is increasingly involved in high-stakes decision-making processes, the robustness of machine learning systems in adversarial environments becomes a major concern. This dissertation attempts to build machine learning systems robust to such adversarial manipulation, with an emphasis on providing theoretical performance guarantees. We consider adversaries at both test and training time, and make the following contributions. First, we study the robustness of machine learning algorithms and models to test-time adversarial examples. We analyze the distributional and finite-sample robustness of nearest neighbor classification, and propose a modified 1-nearest-neighbor classifier that has both a theoretical robustness guarantee and empirical improvements in robustness. Second, we examine the robustness of malware detectors to program transformation. We propose novel attacks that evade existing detectors using program transformation, and then show program normalization to be a provably robust defense against such transformation. Finally, we investigate data poisoning attacks and defenses for online learning, in which models update and predict over a data stream in real time. We show efficient attacks for general adversarial objectives, analyze the conditions under which filtering-based defenses are effective, and provide practical guidance on choosing defense mechanisms and parameters.
- Published
- 2020
7. Robust and Accurate Authorship Attribution via Program Normalization
- Author
- Wang, Yizhen, Alhanahnah, Mohannad, Wang, Ke, Christodorescu, Mihai, and Jha, Somesh
- Abstract
Source code attribution approaches have achieved remarkable accuracy thanks to rapid advances in deep learning. However, recent studies shed light on their vulnerability to adversarial attacks. In particular, they can be easily deceived by adversaries who attempt either to create a forgery of another author or to mask the original author. To address these emerging issues, we formulate this security challenge into a general threat model, the relational adversary, which allows an arbitrary number of semantics-preserving transformations to be applied to an input in any problem space. Our theoretical investigation characterizes the conditions for robustness and the trade-off between robustness and accuracy in depth. Motivated by these insights, we present a novel learning framework, normalize-and-predict (N&P), that in theory guarantees the robustness of any authorship-attribution approach; a toy sketch of the idea follows this entry. We conduct an extensive evaluation of N&P in defending two of the latest authorship-attribution approaches against state-of-the-art attack methods. Our evaluation demonstrates that N&P improves accuracy on adversarial inputs by as much as 70% over the vanilla models. More importantly, N&P also achieves robust accuracy as much as 45% higher than adversarial training while running over 40 times faster.
- Published
- 2020
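A toy sketch of the normalize-and-predict idea: map every program in an equivalence class of semantics-preserving transformations to one canonical form before classification, so the classifier's output cannot change under those transformations. The normalizer below is deliberately simplistic (comment stripping, positional variable renaming, whitespace removal) and is an assumption for illustration; the paper's normalizer is far more sophisticated.

```python
import re

def normalize(source: str) -> str:
    """Reduce a code snippet to a canonical form shared by its transformed variants."""
    source = re.sub(r"//[^\n]*|/\*.*?\*/", "", source, flags=re.S)  # drop comments
    names = {}
    def rename(match):
        word = match.group(0)
        if word in ("int", "return", "for", "if", "while"):  # keep keywords
            return word
        return names.setdefault(word, f"v{len(names)}")       # positional renaming
    source = re.sub(r"[A-Za-z_]\w*", rename, source)
    return re.sub(r"\s+", "", source)                          # canonical spacing

def predict(source: str, classifier):
    return classifier(normalize(source))  # classifier only ever sees the canonical form

# Two semantically identical snippets normalize to the same string:
a = "int total = x + y; // sum"
b = "int   result=x+y;"
print(normalize(a) == normalize(b))  # True
```

Because the classifier is only ever applied to the canonical form, any attack restricted to the normalized-away transformations provably cannot change the prediction.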
8. Robustness for Non-Parametric Classification: A Generic Attack and Defense
- Author
- Yang, Yao-Yuan, Rashtchian, Cyrus, Wang, Yizhen, and Chaudhuri, Kamalika
- Abstract
Adversarially robust machine learning has received much recent attention. However, prior attacks and defenses for non-parametric classifiers have been developed on an ad-hoc, classifier-specific basis. In this work, we take a holistic look at adversarial examples for non-parametric classifiers, including nearest neighbors, decision trees, and random forests. We provide a general defense method, adversarial pruning, which preprocesses the dataset so that it becomes well-separated (sketched after this entry). To test our defense, we provide a novel attack that applies to a wide range of non-parametric classifiers. Theoretically, we derive an optimally robust classifier, analogous to the Bayes optimal classifier, and show that adversarial pruning can be viewed as a finite-sample approximation to it. We empirically show that our defense and attack are either better than or competitive with prior work on non-parametric classifiers. Overall, our results provide a strong and broadly applicable baseline for future work on robust non-parametrics. Code is available at https://github.com/yangarbiter/adversarial-nonparametrics/.
Comment: AISTATS 2020
- Published
- 2019
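A hedged sketch of adversarial pruning as described in the abstract: remove conflicting, oppositely labeled points that lie close together, so every surviving point has a ball around it free of the opposite label, then fit ordinary nearest neighbors on the well-separated remainder. The greedy pair removal below is a simplification of the paper's computation.

```python
import numpy as np

def adversarial_prune(X, y, r):
    """Greedily drop oppositely labeled pairs within distance 2*r of each other."""
    keep = np.ones(len(X), dtype=bool)
    changed = True
    while changed:
        changed = False
        idx = np.where(keep)[0]
        for a in idx:
            if not keep[a]:
                continue
            for b in idx:
                if keep[b] and y[a] != y[b] and np.linalg.norm(X[a] - X[b]) < 2 * r:
                    keep[a] = keep[b] = False   # drop the conflicting pair
                    changed = True
                    break
    return X[keep], y[keep]

# Toy data: two clusters plus one oppositely labeled point inside the first cluster.
X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [0.15]])
y = np.array([0, 0, 0, 1, 1, 1])
Xp, yp = adversarial_prune(X, y, r=0.2)
print(len(Xp), "points survive pruning")   # the conflicting points near 0.15 are removed
```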
9. An Investigation of Data Poisoning Defenses for Online Learning
- Author
- Wang, Yizhen, Jha, Somesh, and Chaudhuri, Kamalika
- Abstract
Data poisoning attacks -- where an adversary can modify a small fraction of training data with the goal of forcing the trained classifier to incur high loss -- are an important threat to machine learning in many applications. While a body of prior work has developed attacks and defenses, there is not much general understanding of when various attacks and defenses are effective. In this work, we undertake a rigorous study of defenses against data poisoning for online learning. First, we study four standard defenses in a powerful threat model and provide conditions under which they can allow or resist rapid poisoning; a sketch of one such filtering defense follows this entry. We then consider a weaker and more realistic threat model, and show that the adversary's success in the presence of data poisoning defenses depends on the "ease" of the learning problem.
- Published
- 2019
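As an illustration of the kind of filtering defense the paper analyzes, consider online logistic regression that drops incoming points whose loss under the current model is suspiciously high, on the theory that well-fit clean data rarely triggers this while poisoned points pushing toward high loss often do. The thresholding rule and all constants below are assumptions for the sketch, not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def filtered_online_logreg(stream, dim, lr=0.1, loss_threshold=3.0):
    """Online SGD on logistic loss, skipping points whose current loss is too high."""
    w = np.zeros(dim)
    for x, label in stream:                      # label in {0, 1}
        p = sigmoid(w @ x)
        loss = -(label * np.log(p + 1e-12) + (1 - label) * np.log(1 - p + 1e-12))
        if loss > loss_threshold:
            continue                             # filter: discard the suspicious point
        w -= lr * (p - label) * x                # standard SGD step otherwise
    return w

rng = np.random.default_rng(1)
clean = [(rng.normal(loc=(1 if b else -1), size=2), b) for b in rng.integers(0, 2, 200)]
poison = [(np.array([8.0, 8.0]), 0)] * 20       # extreme, mislabeled points
w = filtered_online_logreg(clean + poison, dim=2)
print(w)  # the poison arrives once w fits the clean data, so it is filtered out
```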
10. Data Poisoning Attacks against Online Learning
- Author
- Wang, Yizhen and Chaudhuri, Kamalika
- Abstract
We consider data poisoning attacks, a class of adversarial attacks on machine learning in which an adversary has the power to alter a small fraction of the training data in order to make the trained classifier satisfy certain objectives. While there has been much prior work on data poisoning, most of it is in the offline setting, and attacks for online learning, where training data arrives in a streaming manner, are not well understood. In this work, we initiate a systematic investigation of data poisoning attacks for online learning. We formalize the problem in two settings and propose a general attack strategy, formulated as an optimization problem, that applies to both with some modifications; a toy version is sketched after this entry. We propose three solution strategies and perform an extensive experimental evaluation. Finally, we discuss the implications of our findings for building successful defenses.
- Published
- 2018
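A toy rendering of the paper's framing: online gradient descent on a stream, an attacker who controls a small fraction of the arriving points, and attack points chosen by approximately optimizing an adversarial objective. The greedy candidate search, the objective (push the weights away from the clean solution), and all constants below are illustrative assumptions, not the paper's optimization.

```python
import numpy as np

def sgd_step(w, x, y, lr=0.1):
    """Online least-squares update on a single (x, y) pair."""
    return w - lr * (w @ x - y) * x

def greedy_poison_point(w, w_clean, candidates):
    """Pick the candidate (x, y) whose update moves w furthest from w_clean."""
    return max(candidates, key=lambda p: np.linalg.norm(sgd_step(w, *p) - w_clean))

rng = np.random.default_rng(0)
w_true = np.array([1.0, -1.0])
stream = [(x, w_true @ x) for x in rng.normal(size=(200, 2))]

# Clean run, to know where the learner would end up without the attacker.
w_clean = np.zeros(2)
for x, y in stream:
    w_clean = sgd_step(w_clean, x, y)

# Poisoned run: the attacker controls every 10th slot, choosing from a bounded set.
candidates = [(np.array([b1, b2]), t) for b1 in (-2, 2) for b2 in (-2, 2) for t in (-4, 4)]
w = np.zeros(2)
for i, (x, y) in enumerate(stream):
    if i % 10 == 0:
        x, y = greedy_poison_point(w, w_clean, candidates)
    w = sgd_step(w, x, y)

print("clean:", w_clean, "poisoned:", w)  # small fraction of points, visible drift
```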
11. Analyzing the Robustness of Nearest Neighbors to Adversarial Examples
- Author
- Wang, Yizhen, Jha, Somesh, and Chaudhuri, Kamalika
- Abstract
Motivated by safety-critical applications, test-time attacks on classifiers via adversarial examples have recently received a great deal of attention. However, there is a general lack of understanding of why adversarial examples arise; whether they originate due to inherent properties of the data or due to a lack of training samples remains ill-understood. In this work, we introduce a theoretical framework analogous to bias-variance theory for understanding these effects. We use our framework to analyze the robustness of a canonical non-parametric classifier: k-nearest neighbors. Our analysis shows that its robustness properties depend critically on the value of k; the classifier may be inherently non-robust for small k, but its robustness approaches that of the Bayes optimal classifier when k grows sufficiently fast with the sample size. We propose a novel modified 1-nearest-neighbor classifier and guarantee its robustness in the large-sample limit; a sketch of the underlying idea follows this entry. Our experiments suggest that this classifier may have good robustness properties even for reasonable dataset sizes.
- Published
- 2017
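To convey the spirit of the modified 1-nearest-neighbor classifier, here is a short sketch that keeps only training points whose local neighborhood is label-consistent before applying 1-NN, so a test point's nearest surviving neighbor sits far from the opposite class and small perturbations cannot flip the prediction. The majority-purity test below is a stand-in assumption; the paper's construction uses confidence-based pruning with large-sample guarantees.

```python
import numpy as np

def prune_inconsistent(X, y, m=5, purity=1.0):
    """Keep only points whose m nearest neighbors agree with their own label."""
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(d)[1:m + 1]            # m nearest neighbors, excluding self
        if np.mean(y[nbrs] == y[i]) >= purity:   # retain only label-pure regions
            keep.append(i)
    return X[keep], y[keep]

def one_nn(Xtr, ytr, x):
    """Ordinary 1-NN prediction on the pruned training set."""
    return ytr[np.argmin(np.linalg.norm(Xtr - x, axis=1))]

rng = np.random.default_rng(0)
X0 = rng.normal(loc=-1.5, size=(100, 2)); X1 = rng.normal(loc=1.5, size=(100, 2))
X = np.vstack([X0, X1]); y = np.array([0] * 100 + [1] * 100)
Xp, yp = prune_inconsistent(X, y)
print(len(Xp), "of", len(X), "points kept;",
      "prediction near the boundary:", one_nn(Xp, yp, np.array([0.2, 0.2])))
```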
12. Pufferfish Privacy Mechanisms for Correlated Data
- Author
- Song, Shuang, Wang, Yizhen, and Chaudhuri, Kamalika
- Abstract
Many modern databases include personal and sensitive correlated data, such as private information about users connected together in a social network and measurements of the physical activity of single subjects across time. However, differential privacy, the current gold standard in data privacy, does not adequately address privacy issues in this kind of data. This work looks at a recent generalization of differential privacy, called Pufferfish, that can be used to address privacy in correlated data. The main challenge in applying Pufferfish is a lack of suitable mechanisms. We provide the first mechanism -- the Wasserstein Mechanism -- which applies to any general Pufferfish framework; its shape is sketched after this entry. Since this mechanism may be computationally inefficient, we provide an additional mechanism that applies to some practical cases, such as physical activity measurements across time, and is computationally efficient. Our experimental evaluations indicate that this mechanism provides privacy and utility for synthetic as well as real data in two separate domains.
- Published
- 2016
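As a rough sketch of the Wasserstein Mechanism's shape, reconstructed from the abstract and standard accounts of Pufferfish privacy (so the details should be checked against the paper itself): for a query F, the mechanism releases F(X) plus Laplace noise whose scale is set by the worst-case infinity-Wasserstein distance between the distributions of F(X) under the two sides of any secret pair.

```latex
% Sketch: for secret pairs (s_i, s_j) and scenario distributions \theta, let
% \mu_{k,\theta} be the distribution of F(X) conditioned on s_k under \theta.
% The mechanism releases
%   M(X) = F(X) + Z, \qquad Z \sim \mathrm{Lap}(W / \epsilon),
% with the noise scale set by the worst-case infinity-Wasserstein distance
W \;=\; \sup_{(s_i, s_j),\, \theta} \; W_{\infty}\!\bigl(\mu_{i,\theta},\, \mu_{j,\theta}\bigr).
```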