432 results for "Kamath, P"
Search Results
2. Tracing Chemical Depletion in Evolved Binaries Hosting Second-Generation Transition Discs
- Author
-
Mohorian, Maksym, Kamath, Devika, Menon, Meghna, Amarsi, Anish M., Van Winckel, Hans, Fava, Claudia, and Andrych, Kateryna
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
The mechanisms responsible for chemical depletion across diverse astrophysical environments are not yet fully understood. In this paper, we investigate chemical depletion in post-AGB/post-RGB binary stars hosting second-generation transition discs using high-resolution optical spectra from HERMES/Mercator and UVES/VLT. We performed a detailed chemical abundance analysis of 6 post-AGB/post-RGB stars and 6 post-AGB/post-RGB candidates with transition discs in the Galaxy and in the Large Magellanic Cloud. The atmospheric parameters and elemental abundances were obtained through 1D LTE analysis of chemical elements from C to Eu, and 1D NLTE corrections were incorporated for elements from C to Fe. Our results confirmed that depletion efficiency, traced by the [S/Ti] abundance ratio, is higher in post-AGB/post-RGB binaries with transition discs compared to the overall sample of post-AGB/post-RGB binaries. We also examined correlations between derived abundances and binary system parameters (astrometric, photometric, orbital, pulsational). Additionally, we compared the depletion patterns in our sample to those observed in young stars with transition discs and in the interstellar medium. We confirmed that the depletion is significantly stronger in post-AGB/post-RGB binaries with transition discs than in young stars with transition discs. Furthermore, we found that [X/Zn] abundance ratio trends of volatile and refractory elements in post-AGB/post-RGB binaries with transition discs generally resemble similar trends in the interstellar medium (except for trends of [Si/Zn] and [Mg/Zn] ratios). These findings, although based on a limited sample, provide indirect constraints for the depletion mechanism in circumbinary discs around post-AGB/post-RGB stars., Comment: 31 pages, 11 figures, 11 tables, accepted to MNRAS
- Published
- 2025
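For reference, the bracketed abundance ratios quoted in the abstract above ([S/Ti], [X/Zn], [Si/Zn], [Mg/Zn], and similar quantities in later entries) follow the standard solar-normalized logarithmic convention:

$[\mathrm{X}/\mathrm{Y}] = \log_{10}\left(\frac{N_{\mathrm{X}}}{N_{\mathrm{Y}}}\right)_{\star} - \log_{10}\left(\frac{N_{\mathrm{X}}}{N_{\mathrm{Y}}}\right)_{\odot}$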
3. PREM: Privately Answering Statistical Queries with Relative Error
- Author
-
Ghazi, Badih, Guzmán, Cristóbal, Kamath, Pritish, Knop, Alexander, Kumar, Ravi, Manurangsi, Pasin, and Sachdeva, Sushant
- Subjects
Computer Science - Machine Learning - Abstract
We introduce $\mathsf{PREM}$ (Private Relative Error Multiplicative weight update), a new framework for generating synthetic data that achieves a relative error guarantee for statistical queries under $(\varepsilon, \delta)$ differential privacy (DP). Namely, for a domain ${\cal X}$, a family ${\cal F}$ of queries $f : {\cal X} \to \{0, 1\}$, and $\zeta > 0$, our framework yields a mechanism that on input dataset $D \in {\cal X}^n$ outputs a synthetic dataset $\widehat{D} \in {\cal X}^n$ such that all statistical queries in ${\cal F}$ on $D$, namely $\sum_{x \in D} f(x)$ for $f \in {\cal F}$, are within a $1 \pm \zeta$ multiplicative factor of the corresponding value on $\widehat{D}$ up to an additive error that is polynomial in $\log |{\cal F}|$, $\log |{\cal X}|$, $\log n$, $\log(1/\delta)$, $1/\varepsilon$, and $1/\zeta$. In contrast, any $(\varepsilon, \delta)$-DP mechanism is known to require worst-case additive error that is polynomial in at least one of $n, |{\cal F}|$, or $|{\cal X}|$. We complement our algorithm with nearly matching lower bounds.
- Published
- 2025
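As a rough illustration of the guarantee stated in the PREM abstract above (not of the mechanism itself), the sketch below checks whether a candidate synthetic dataset answers a toy family of threshold counting queries within a $1 \pm \zeta$ multiplicative factor plus an additive slack; the domain, query family, and the value of the additive term are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy domain X = {0, ..., 99}; an input dataset D and a candidate synthetic dataset D_hat.
D = rng.integers(0, 100, size=1000)
D_hat = rng.integers(0, 100, size=1000)

# A small family F of counting queries f_t : X -> {0, 1}, here f_t(x) = 1[x >= t].
thresholds = range(0, 100, 10)

def query(data, t):
    return int(np.sum(data >= t))          # the statistical query sum_{x in D} f_t(x)

zeta = 0.1            # multiplicative slack
additive_slack = 25   # stand-in for the polylogarithmic additive error term

def satisfies_relative_error(D, D_hat):
    for t in thresholds:
        true_val, synth_val = query(D, t), query(D_hat, t)
        lo = (1 - zeta) * true_val - additive_slack
        hi = (1 + zeta) * true_val + additive_slack
        if not (lo <= synth_val <= hi):
            return False
    return True

print(satisfies_relative_error(D, D_hat))
```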
4. Continual Learning Should Move Beyond Incremental Classification
- Author
-
Mitchell, Rupert, Alliegro, Antonio, Camoriano, Raffaello, Carrión-Ojeda, Dustin, Carta, Antonio, Chalvatzaki, Georgia, Churamani, Nikhil, D'Eramo, Carlo, Hamidi, Samin, Hesse, Robin, Hinder, Fabian, Kamath, Roshni Ramanna, Lomonaco, Vincenzo, Paul, Subarnaduti, Pistilli, Francesca, Tuytelaars, Tinne, van de Ven, Gido M, Kersting, Kristian, Schaub-Meyer, Simone, and Mundt, Martin
- Subjects
Computer Science - Machine Learning - Abstract
Continual learning (CL) is the sub-field of machine learning concerned with accumulating knowledge in dynamic environments. So far, CL research has mainly focused on incremental classification tasks, where models learn to classify new categories while retaining knowledge of previously learned ones. Here, we argue that maintaining such a focus limits both theoretical development and practical applicability of CL methods. Through a detailed analysis of concrete examples - including multi-target classification, robotics with constrained output spaces, learning in continuous task domains, and higher-level concept memorization - we demonstrate how current CL approaches often fail when applied beyond standard classification. We identify three fundamental challenges: (C1) the nature of continuity in learning problems, (C2) the choice of appropriate spaces and metrics for measuring similarity, and (C3) the role of learning objectives beyond classification. For each challenge, we provide specific recommendations to help move the field forward, including formalizing temporal dynamics through distribution processes, developing principled approaches for continuous task spaces, and incorporating density estimation and generative objectives. In so doing, this position paper aims to broaden the scope of CL research while strengthening its theoretical foundations, making it more applicable to real-world problems.
- Published
- 2025
5. SyntheticPop: Attacking Speaker Verification Systems With Synthetic VoicePops
- Author
-
Jamdar, Eshaq and Belman, Amith Kamath
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Voice Authentication (VA), also known as Automatic Speaker Verification (ASV), is a widely adopted authentication method, particularly in automated systems like banking services, where it serves as a secondary layer of user authentication. Despite its popularity, VA systems are vulnerable to various attacks, including replay, impersonation, and the emerging threat of deepfake audio that mimics the voice of legitimate users. To mitigate these risks, several defense mechanisms have been proposed. One such solution, VoicePop, aims to distinguish an individual's unique phoneme pronunciations during the enrollment process. While promising, the effectiveness of VA+VoicePop against a broader range of attacks, particularly logical or adversarial attacks, remains insufficiently explored. We propose a novel attack method, which we refer to as SyntheticPop, designed to target the phoneme recognition capabilities of the VA+VoicePop system. The SyntheticPop attack involves embedding synthetic "pop" noises into spoofed audio samples, significantly degrading the model's performance. We achieve an attack success rate of over 95% while poisoning 20% of the training dataset. Our experiments demonstrate that VA+VoicePop achieves 69% accuracy under normal conditions, 37% accuracy when subjected to a baseline label flipping attack, and just 14% accuracy under our proposed SyntheticPop attack, emphasizing the effectiveness of our method.
- Published
- 2025
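A minimal sketch of the kind of manipulation the SyntheticPop abstract above describes: injecting a short synthetic "pop" burst into a spoofed waveform before it is used for training or verification. The burst shape, length, amplitude, and placement here are illustrative assumptions, not the authors' exact recipe.

```python
import numpy as np

def add_synthetic_pop(waveform, sample_rate, pop_ms=15.0, amplitude=0.8, position=0.25, seed=0):
    """Insert a plosive-like noise burst into a mono waveform in [-1, 1]."""
    rng = np.random.default_rng(seed)
    out = waveform.astype(float).copy()
    pop_len = int(sample_rate * pop_ms / 1000.0)
    # White noise shaped by a fast exponential decay, mimicking a "pop".
    burst = amplitude * rng.standard_normal(pop_len) * np.exp(-np.linspace(0.0, 6.0, pop_len))
    start = int(position * (len(out) - pop_len))
    out[start:start + pop_len] += burst
    return np.clip(out, -1.0, 1.0)

# Example: poison a 1-second spoofed sample at 16 kHz.
sr = 16000
spoofed = 0.1 * np.random.default_rng(1).standard_normal(sr)
poisoned = add_synthetic_pop(spoofed, sr)
```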
6. Language Models Largely Exhibit Human-like Constituent Ordering Preferences
- Author
-
Tur, Ada Defne, Kamath, Gaurav, and Reddy, Siva
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Though English sentences are typically inflexible vis-à-vis word order, constituents often show far more variability in ordering. One prominent theory presents the notion that constituent ordering is directly correlated with constituent weight: a measure of the constituent's length or complexity. Such theories are interesting in the context of natural language processing (NLP), because while recent advances in NLP have led to significant gains in the performance of large language models (LLMs), much remains unclear about how these models process language, and how this compares to human language processing. In particular, the question remains whether LLMs display the same patterns with constituent movement, and whether they may provide insights into existing theories on when and how the shift occurs in human language. We compare a variety of LLMs with diverse properties to evaluate broad LLM performance on four types of constituent movement: heavy NP shift, particle movement, dative alternation, and multiple PPs. Despite performing unexpectedly around particle movement, LLMs generally align with human preferences around constituent ordering., Comment: NAACL 2025 Main Conference
- Published
- 2025
7. Modelling the impact of circumbinary disk accretion on post-AGB binary evolution and surface chemistry
- Author
-
Martin, Kayla, De Marco, Orsola, Kamath, Devika, Oomen, Glenn-Michael, and Van Winckel, Hans
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
Post-asymptotic giant branch (post-AGB) binaries are surrounded by dusty circumbinary disks, and exhibit unexpected orbital properties resulting from poorly understood binary interaction processes. Re-accreted gas from the circumbinary disk alters the photospheric chemistry of the post-AGB star, producing a characteristic underabundance of refractory elements that correlates with condensation temperature, a phenomenon known as chemical depletion. This work investigates how re-accretion from a disk drives chemical depletion, and the impact accreted matter has on post-AGB evolution. We used the MESA code to evolve 0.55 and 0.60 M$_{\odot}$ post-AGB stars with the accretion of refractory element-depleted gas from a circumbinary disk. Our study adopts observationally-constrained initial accretion rates and disk masses to reproduce the chemical depletion patterns of six well-studied post-AGB binary stars: EP Lyr, HP Lyr, IRAS 17038-4815, IRAS 09144-4933, HD 131356, and SX Cen. We find high accretion rates ($>\,$10$^{-7}$ M$_{\odot}$yr$^{-1}$) and large disk masses ($\geq\,$10$^{-2}$ M$_{\odot}$) necessary to reproduce observed depletion, particularly in higher-mass, hotter post-AGB stars (T$_{\textrm{eff}}\geq$ 6000 K). A slower evolution (lower core mass) is required to reproduce cooler (T$_{\textrm{eff}}\leq$ 5000 K) depleted post-AGB stars. Rapid accretion significantly impacts post-AGB evolution, stalling stars at cooler effective temperatures and extending post-AGB lifetimes by factors of around 3 to 10. Despite this, extended post-AGB timescales remain within or below the planetary nebula (PN) visibility timescale, suggesting accretion cannot account for the observed lack of ionised PNe in post-AGB binaries. Our findings constrain accretion-flow parameters and advance our understanding of disk-binary interactions in post-AGB systems., Comment: 14 pages, 8 figures, 5 tables
- Published
- 2025
8. Enhancing Near Real Time AI-NWP Hurricane Forecasts: Improving Explainability and Performance Through Physics-Based Models and Land Surface Feedback
- Author
-
Sudharsan, Naveen, Singh, Manmeet, Talukdar, Sasanka, Mohanty, Shyama, Kamath, Harsh, Osuri, Krishna K., Dashtian, Hassan, Young, Michael, Yang, Zong-Liang, Dawson, Clint, Leung, L. Ruby, Gopalakrishnan, Sundararaman, Mehra, Avichal, Tallapragada, Vijay, and Niyogi, Dev
- Subjects
Physics - Atmospheric and Oceanic Physics ,Physics - Geophysics - Abstract
Hurricane track forecasting remains a significant challenge due to the complex interactions between the atmosphere, land, and ocean. Although AI-based numerical weather prediction models, such as Google's GraphCast, have significantly improved hurricane track forecasts, they currently function as atmosphere-only models, omitting critical land and ocean interactions. To investigate the impact of land feedback, we conducted independent simulations using the physics-based Hurricane WRF experimental model to assess how soil moisture variations influence storm trajectories. Our results show that land surface conditions significantly alter storm paths, demonstrating the importance of land-atmosphere coupling in hurricane prediction. Although recent advances have introduced AI-based atmosphere-ocean coupled models, a fully functional AI-driven atmosphere-land-ocean model does not yet exist. Our findings suggest that AI-NWP models could be further improved by incorporating land surface interactions, improving both forecast accuracy and explainability. Developing a fully coupled AI-based weather model would mark a critical step toward more reliable and physically consistent hurricane forecasting, with direct applications for disaster preparedness and risk mitigation.
- Published
- 2025
9. Scaling Embedding Layers in Language Models
- Author
-
Yu, Da, Cohen, Edith, Ghazi, Badih, Huang, Yangsibo, Kamath, Pritish, Kumar, Ravi, Liu, Daogao, and Zhang, Chiyuan
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
We propose SCONE ($\textbf{S}$calable, $\textbf{C}$ontextualized, $\textbf{O}$ffloaded, $\textbf{N}$-gram $\textbf{E}$mbedding), a method for extending input embedding layers to enhance language model performance as layer size scales. To avoid increased decoding costs, SCONE retains the original vocabulary while introducing embeddings for a set of frequent $n$-grams. These embeddings provide contextualized representation for each input token and are learned with a separate model during training. During inference, they are precomputed and stored in off-accelerator memory with minimal impact on inference speed. SCONE enables two new scaling strategies: increasing the number of cached $n$-gram embeddings and scaling the model used to learn them, all while maintaining fixed inference-time FLOPS. We show that scaling both aspects allows SCONE to outperform a 1.9B parameter baseline across diverse corpora, while using only half the inference-time FLOPS.
- Published
- 2025
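The inference-time idea in the SCONE abstract above can be pictured with a toy lookup: precomputed embeddings for frequent n-grams live in host ("off-accelerator") memory and are added to the ordinary token embeddings on a cache hit. The vocabulary, cache contents, and dimensions below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 16
vocab = ["the", "cat", "sat", "on", "mat"]
token_emb = {tok: rng.standard_normal(d_model) for tok in vocab}

# Off-accelerator cache of precomputed, contextualized embeddings for frequent bigrams.
ngram_cache = {("the", "cat"): 0.1 * np.ones(d_model),
               ("sat", "on"): -0.2 * np.ones(d_model)}

def embed(tokens):
    vecs = []
    for i, tok in enumerate(tokens):
        v = token_emb[tok].copy()
        bigram = (tokens[i - 1], tok) if i > 0 else None
        if bigram in ngram_cache:          # cache hit: add the n-gram embedding
            v = v + ngram_cache[bigram]
        vecs.append(v)
    return np.stack(vecs)

print(embed(["the", "cat", "sat", "on", "mat"]).shape)   # (5, 16)
```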
10. Deception in LLMs: Self-Preservation and Autonomous Goals in Large Language Models
- Author
-
Barkur, Sudarshan Kamath, Schacht, Sigurd, and Scholl, Johannes
- Subjects
Computer Science - Computation and Language - Abstract
Recent advances in Large Language Models (LLMs) have incorporated planning and reasoning capabilities, enabling models to outline steps before execution and provide transparent reasoning paths. This enhancement has reduced errors in mathematical and logical tasks while improving accuracy. These developments have facilitated LLMs' use as agents that can interact with tools and adapt their responses based on new information. Our study examines DeepSeek R1, a model trained to output reasoning tokens similar to OpenAI's o1. Testing revealed concerning behaviors: the model exhibited deceptive tendencies and demonstrated self-preservation instincts, including attempts of self-replication, despite these traits not being explicitly programmed (or prompted). These findings raise concerns about LLMs potentially masking their true objectives behind a facade of alignment. When integrating such LLMs into robotic systems, the risks become tangible - a physically embodied AI exhibiting deceptive behaviors and self-preservation instincts could pursue its hidden objectives through real-world actions. This highlights the critical need for robust goal specification and safety frameworks before any physical implementation., Comment: Corrected Version - Solved Some Issues with reference compilation by latex
- Published
- 2025
11. Optimal Preconditioning for Online Quadratic Cone Programming
- Author
-
Kamath, Abhinav G., Elango, Purnanand, and Açıkmeşe, Behçet
- Subjects
Mathematics - Optimization and Control - Abstract
First-order conic optimization solvers are sensitive to problem conditioning and typically perform poorly in the face of ill-conditioned problem data. To mitigate this, we propose an approach to preconditioning for a class of quadratic cone programs (QCPs), i.e., conic optimization problems with a quadratic objective function, wherein the objective function is strongly convex and possesses a certain structure. This approach lends itself to factorization-free, customizable, first-order conic optimization for online applications wherein the solver is called repeatedly to solve problems of the same size/structure, but with changing problem data. One of the steps in the proposed preconditioning procedure is to scale the objective function: in addition to deriving an analytical expression for the optimal objective function scaling factor, we establish the relationship between the objective function scaling factor and the primal-dual step-size ratio for a first-order method, the proportional-integral projected gradient method (PIPG), which applies to the general class of QCPs, including quadratic programs (QPs), second-order cone programs (SOCPs), and semidefinite programs (SDPs). We demonstrate the efficacy of our approach on a numerical nonconvex trajectory optimization example, using sequential conic optimization (SeCO).
- Published
- 2025
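As background for the abstract above, preconditioning a quadratic program typically combines a diagonal scaling of the decision variables with a scalar scaling of the objective. The sketch below is a generic diagonal-plus-scalar scaling for a QP min 0.5 x'Px + q'x s.t. Ax = b; it is only a common baseline, not the optimal objective scaling factor derived in the paper.

```python
import numpy as np

def precondition_qp(P, q, A):
    """Generic preconditioning: diagonal scaling D built from the columns of [P; A],
    followed by a scalar objective scaling c (a simple heuristic choice)."""
    M = np.vstack([P, A])
    col_norms = np.maximum(np.linalg.norm(M, ord=np.inf, axis=0), 1e-12)
    D = np.diag(1.0 / np.sqrt(col_norms))
    P_s, q_s, A_s = D @ P @ D, D @ q, A @ D     # scaled problem data
    c = 1.0 / max(np.abs(P_s).max(), 1e-12)     # scalar objective scaling factor
    return c * P_s, c * q_s, A_s, D, c

P = np.array([[2000.0, 0.0], [0.0, 0.002]])     # badly conditioned Hessian
q = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0]])
P_s, q_s, A_s, D, c = precondition_qp(P, q, A)
```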
12. Exploring LLMs for Automated Pre-Testing of Cross-Cultural Surveys
- Author
-
Adhikari, Divya Mani, Cannanure, Vikram Kamath, Hartland, Alexander, and Weber, Ingmar
- Subjects
Computer Science - Human-Computer Interaction ,Computer Science - Computers and Society - Abstract
Designing culturally relevant questionnaires for ICTD research is challenging, particularly when adapting surveys for populations in non-Western contexts. Prior work adapted questionnaires through expert reviews and pilot studies, which are resource-intensive and time-consuming. To address these challenges, we propose using large language models (LLMs) to automate the questionnaire pretesting process in cross-cultural settings. Our study used LLMs to adapt a U.S.-focused climate opinion survey for a South African audience. We then tested the adapted questionnaire with 116 South African participants via Prolific, asking them to provide feedback on both versions. Participants perceived the LLM-adapted questions as slightly more favorable than the traditional version. Our note opens discussions on the potential role of LLMs in adapting surveys and facilitating cross-cultural questionnaire design., Comment: Accepted to ICTD 2024 (Notes)
- Published
- 2025
13. Applying Think-Aloud in ICTD: A Case Study of Chatbot Use by Teachers in Rural Côte d'Ivoire
- Author
-
Cannanure, Vikram Kamath, Wolf, Sharon, Jasińska, Kaja, Brown, Timothy X, and Ogan, Amy
- Subjects
Computer Science - Human-Computer Interaction ,H.5.2 ,K.3.1 ,K.4.2 - Abstract
Think-alouds are a common HCI usability method where participants verbalize their thoughts while using interfaces. However, their utility in cross-cultural settings, particularly in the Global South, where cultural differences impact user interactions, is unclear. This paper investigates the usability challenges teachers in rural Côte d'Ivoire faced when using a chatbot designed to support an educational program. We conducted think-aloud sessions with 20 teachers two weeks after a chatbot deployment, analyzing their navigation, errors, and time spent on tasks. We discuss our approach and findings that helped us identify usability issues and challenging features for improving the chatbot designs. Our note summarizes our reflections on using think-aloud and contributes to discussions on its culturally sensitive adaptation in the Global South., Comment: ICTD 24, Notes track. International Conference on Information & Communication Technologies and Development 2024
- Published
- 2025
14. Planetary Nebulae of the Large Magellanic Cloud II: the connection with the progenitors' properties
- Author
-
Ventura, P., Tosi, S., García-Hernández, D. A., Dell'Agli, F., Kamath, D., Stanghellini, L., Bianchi, S., Tailo, M., and Gómez-Muñoz, M. A.
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
The study of planetary nebulae (PNe) offers the opportunity of evaluating the efficiency of the dust production mechanism during the very late asymptotic giant branch (AGB) phases. We study the relationship between the properties of PNe, particularly the gas and dust content, with the mass and metallicity of the progenitor stars, to understand how dust production works in the late AGB phases, and to shed new light on the physical processes occurring to the stars and the material in their surroundings since the departure from the AGB until the PN phase. We consider a sample of 9 PNe in the Large Magellanic Cloud, 7 of which are characterized by the presence of carbonaceous dust, the remaining 2 with silicates. For these stars the masses and the metallicity of the progenitor stars were estimated. We combine results from stellar evolution and dust formation modelling with those coming from the analysis of the spectral energy distribution, to find the relation between the dust and gas mass of the PNe considered and the mass and metallicity of the progenitor stars. The physical properties of carbon-rich PNe are influenced by the mass of the progenitor star. Specifically, the dust-to-gas ratio in the nebula increases from 5x10^{-4} to 6x10^{-3} as the progenitor star's mass increases from approximately 0.9 Msun to 2 Msun. This change is partly influenced by the effective temperature of the PNe, and it occurs because higher-mass carbon stars are more efficient at producing dust. Consequently, as the progenitor's mass increases, the gas mass of the PNe decreases, since the larger amounts of dust lead to greater effects from radiation pressure, which pushes the gas outwards. No meaningful conclusions can be drawn from the study of the PNe with silicate-type dust, because the sub-sample is made up of 2 PNe only, one of which is almost dust-free., Comment: accepted for publication on Astronomy and Astrophysics. 9 pages, 6 figures
- Published
- 2025
15. CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving
- Author
-
Uppuluri, Bhargava, Patel, Anjel, Mehta, Neil, Kamath, Sridhar, and Chakraborty, Pratyush
- Subjects
Computer Science - Robotics ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
In autonomous driving, traditional Computer Vision (CV) agents often struggle in unfamiliar situations due to biases in the training data. Deep Reinforcement Learning (DRL) agents address this by learning from experience and maximizing rewards, which helps them adapt to dynamic environments. However, ensuring their generalization remains challenging, especially with static training environments. Additionally, DRL models lack transparency, making it difficult to guarantee safety in all scenarios, particularly those not seen during training. To tackle these issues, we propose a method that combines DRL with Curriculum Learning for autonomous driving. Our approach uses a Proximal Policy Optimization (PPO) agent and a Variational Autoencoder (VAE) to learn safe driving in the CARLA simulator. The agent is trained using two-fold curriculum learning, progressively increasing environment difficulty and incorporating a collision penalty in the reward function to promote safety. This method improves the agent's adaptability and reliability in complex environments, and helps it understand the nuances of balancing multiple reward components from different feedback signals in a single scalar reward function. Keywords: Computer Vision, Deep Reinforcement Learning, Variational Autoencoder, Proximal Policy Optimization, Curriculum Learning, Autonomous Driving., Comment: To be published in the 17th International Conference on Agents and Artificial Intelligence (ICAART), Feb 2025
- Published
- 2025
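The "collision penalty in the reward function" mentioned in the CuRLA abstract above amounts to folding a large negative term into a single scalar driving reward. The toy reward below illustrates that balancing act; the terms and weights are assumptions for illustration, not the paper's reward.

```python
def driving_reward(speed_mps, lane_center_offset_m, collided,
                   w_speed=0.05, w_center=0.5, collision_penalty=100.0):
    """Combine progress, lane keeping, and safety into one scalar reward."""
    reward = w_speed * speed_mps - w_center * abs(lane_center_offset_m)
    if collided:
        reward -= collision_penalty   # strongly discourage unsafe behaviour
    return reward

print(driving_reward(10.0, 0.3, collided=False))   #  0.35
print(driving_reward(10.0, 0.3, collided=True))    # -99.65
```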
16. BridgePure: Revealing the Fragility of Black-box Data Protection
- Author
-
Wang, Yihan, Lu, Yiwei, Gao, Xiao-Shan, Kamath, Gautam, and Yu, Yaoliang
- Subjects
Computer Science - Machine Learning - Abstract
Availability attacks, or unlearnable examples, are defensive techniques that allow data owners to modify their datasets in ways that prevent unauthorized machine learning models from learning effectively while maintaining the data's intended functionality. These techniques have led to the release of popular black-box tools for users to upload personal data and receive protected counterparts. In this work, we show such black-box protections can be substantially bypassed if a small set of unprotected in-distribution data is available. Specifically, an adversary can (1) easily acquire (unprotected, protected) pairs by querying the black-box protections with the unprotected dataset; and (2) train a diffusion bridge model to build a mapping. This mapping, termed BridgePure, can effectively remove the protection from any previously unseen data within the same distribution. Under this threat model, our method demonstrates superior purification performance on classification and style mimicry tasks, exposing critical vulnerabilities in black-box data protection., Comment: 26 pages, 13 figures
- Published
- 2024
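Step (1) of the threat model in the BridgePure abstract above, collecting (unprotected, protected) pairs by querying the black-box protection, can be sketched as follows; `protect` is a stand-in for the real black-box tool, and the data and perturbation are placeholders.

```python
import numpy as np

def protect(image):
    """Placeholder for the black-box data-protection service being queried."""
    rng = np.random.default_rng(0)
    return np.clip(image + 0.03 * rng.standard_normal(image.shape), 0.0, 1.0)

# Unprotected in-distribution images held by the adversary (toy 32x32 RGB data).
unprotected = [np.random.default_rng(i).random((32, 32, 3)) for i in range(100)]

# (unprotected, protected) pairs: training data for a diffusion bridge model.
pairs = [(x, protect(x)) for x in unprotected]
```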
17. On the Differential Privacy and Interactivity of Privacy Sandbox Reports
- Author
-
Ghazi, Badih, Harrison, Charlie, Hosabettu, Arpana, Kamath, Pritish, Knop, Alexander, Kumar, Ravi, Leeman, Ethan, Manurangsi, Pasin, Raykova, Mariana, Sahu, Vikas, and Schoppmann, Phillipp
- Subjects
Computer Science - Cryptography and Security - Abstract
The Privacy Sandbox initiative from Google includes APIs for enabling privacy-preserving advertising functionalities as part of the effort around limiting third-party cookies. In particular, the Private Aggregation API (PAA) and the Attribution Reporting API (ARA) can be used for ad measurement while providing different guardrails for safeguarding user privacy, including a framework for satisfying differential privacy (DP). In this work, we provide an abstract model for analyzing the privacy of these APIs and show that they satisfy a formal DP guarantee under certain assumptions. Our analysis handles the case where both the queries and database can change interactively based on previous responses from the API., Comment: To appear in Proceedings of Privacy Enhancing Technologies 2025
- Published
- 2024
18. Balls-and-Bins Sampling for DP-SGD
- Author
-
Chua, Lynn, Ghazi, Badih, Harrison, Charlie, Leeman, Ethan, Kamath, Pritish, Kumar, Ravi, Manurangsi, Pasin, Sinha, Amer, and Zhang, Chiyuan
- Subjects
Computer Science - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Data Structures and Algorithms ,Statistics - Machine Learning - Abstract
We introduce the Balls-and-Bins sampling for differentially private (DP) optimization methods such as DP-SGD. While it has been common practice to use some form of shuffling in DP-SGD implementations, privacy accounting algorithms have typically assumed that Poisson subsampling is used instead. Recent work by Chua et al. (ICML 2024) however pointed out that shuffling-based DP-SGD can have a much larger privacy cost in practical regimes of parameters. We show that Balls-and-Bins sampling achieves the "best of both" samplers: its implementation is similar to that of Shuffling, models trained using DP-SGD with Balls-and-Bins sampling achieve utility comparable to those trained using DP-SGD with Shuffling at the same noise multiplier, and yet Balls-and-Bins sampling enjoys similar or better privacy amplification compared to Poisson subsampling in practical regimes.
- Published
- 2024
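A minimal sketch of the sampler named in the abstract above, under the natural reading that each example (a "ball") is assigned independently and uniformly at random to one of the per-epoch batches (the "bins"), so batch sizes fluctuate around n / num_batches:

```python
import numpy as np

def balls_and_bins_batches(n, num_batches, seed=0):
    rng = np.random.default_rng(seed)
    assignment = rng.integers(0, num_batches, size=n)    # bin index per example
    return [np.flatnonzero(assignment == b) for b in range(num_batches)]

batches = balls_and_bins_batches(n=10_000, num_batches=100)
print([len(b) for b in batches[:5]])   # roughly 100 each, but not exactly
```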
19. The Broader Landscape of Robustness in Algorithmic Statistics
- Author
-
Kamath, Gautam
- Subjects
Statistics - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Data Structures and Algorithms ,Computer Science - Information Theory ,Mathematics - Statistics Theory - Abstract
The last decade has seen a number of advances in computationally efficient algorithms for statistical methods subject to robustness constraints. An estimator may be robust in a number of different ways: to contamination of the dataset, to heavy-tailed data, or in the sense that it preserves privacy of the dataset. We survey recent results in these areas with a focus on the problem of mean estimation, drawing technical and conceptual connections between the various forms of robustness, showing that the same underlying algorithmic ideas lead to computationally efficient estimators in all these settings.
- Published
- 2024
20. Machine Learned Potential for High-Throughput Phonon Calculations of Metal-Organic Frameworks
- Author
-
Elena, Alin Marin, Kamath, Prathami Divakar, Inizan, Théo Jaffrelot, Rosen, Andrew S., Zanca, Federica, and Persson, Kristin A.
- Subjects
Condensed Matter - Materials Science ,Physics - Chemical Physics - Abstract
Metal-organic frameworks (MOFs) are highly porous and versatile materials studied extensively for applications such as carbon capture and water harvesting. However, computing phonon-mediated properties in MOFs, like thermal expansion and mechanical stability, remains challenging due to the large number of atoms per unit cell, making traditional Density Functional Theory (DFT) methods impractical for high-throughput screening. Recent advances in machine learning potentials have led to foundation atomistic models, such as MACE-MP-0, that accurately predict equilibrium structures but struggle with phonon properties of MOFs. In this work, we developed a workflow for computing phonons in MOFs within the quasi-harmonic approximation with a fine-tuned MACE model, MACE-MP-MOF0. The model was trained on a curated dataset of 127 representative and diverse MOFs. The fine-tuned MACE-MP-MOF0 improves the accuracy of phonon density of states and corrects the imaginary phonon modes of MACE-MP-0, enabling high-throughput phonon calculations with state-of-the-art precision. The model successfully predicts thermal expansion and bulk moduli in agreement with DFT and experimental data for several well-known MOFs. These results highlight the potential of MACE-MP-MOF0 in guiding MOF design for applications in energy storage and thermoelectrics., Comment: 17 pages, 8 Figures
- Published
- 2024
21. Using Binary Population Synthesis to Examine the Impact of Binary Evolution on the C, N, O, and $s$-Process Yields of Solar-Metallicity Low- and Intermediate-Mass Stars
- Author
-
Osborn, Zara, Karakas, Amanda I., Kemp, Alex J., Izzard, Robert, Kamath, Devika, and Lugaro, Maria
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
Asymptotic giant branch (AGB) stars play a significant role in our understanding of the origin of the elements. They contribute to the abundances of C, N, and approximately $50\%$ of the abundances of the elements heavier than iron. An aspect often neglected in studies of AGB stars is the impact of a stellar companion on AGB stellar evolution and nucleosynthesis. In this study, we update the stellar abundances of AGB stars in the binary population synthesis code \textsc{binary\_c} and calibrate our treatment of the third dredge-up using observations of Galactic carbon stars. We model stellar populations of low- to intermediate-mass stars at solar-metallicity and examine the stellar wind contributions to C, N, O, Sr, Ba, and Pb yields at binary fractions between 0 and 1. For a stellar population with a binary fraction of 0.7, we find $\sim 20-25\%$ less C and $s$-process elements ejected than from a population composed of only single stars, and we find little change in the N and O yields. We also compare our models with observed abundances from Ba stars and find our models can reproduce most Ba star abundances, but our population estimates a higher frequency of Ba stars with a surface [Ce/Y] > $+0.2\,$dex. Our models also predict the rare existence of Ba stars with masses $> 10 \text{M}\,_\odot$., Comment: 22 pages, 16 figures, 4 tables, this paper has been accepted for publication in PASA
- Published
- 2024
22. Multiwavelength high-resolution polarimetric imaging of second-generation disc around post-AGB binary IRAS 08544-4431 with SPHERE
- Author
-
Andrych, Kateryna, Kamath, Devika, Van Winckel, Hans, Kluska, Jacques, Schmid, Hans Martin, Corporaal, Akke, and Milli, Julien
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Earth and Planetary Astrophysics - Abstract
Exploring the formation and evolution of second-generation circumbinary discs around evolved binary stars, such as post-Asymptotic Giant Branch (post-AGB) and post-Red Giant Branch (post-RGB) binaries, provides valuable insights into the complex binary interaction process that concludes the red-giant phase of evolution in these systems. Additionally, it offers a novel opportunity to investigate the formation of second-generation planets within dusty discs surrounding evolved stars. We present a pilot multi-wavelength polarimetric imaging study of the post-AGB binary system IRAS 08544-4431 using the European Southern Observatory-Very Large Telescope/SPHERE instrument. This study is focused on optical V- and I'-band ZIMPOL data to complement near-infrared H-band IRDIS data presented previously. The study aims to investigate the dust scattering properties and surface morphology of the post-AGB circumbinary disc as a function of wavelength. We successfully resolved the extended disc structure of IRAS 08544-4431, revealing a complex disc morphology, high polarimetric disc brightness (up to ~1.5%), and significant forward scattering at optical wavelengths. Additionally, we found that the disc shows a grey polarimetric colour in both optical and near-infrared. The findings highlight similarities between post-AGB circumbinary discs and protoplanetary discs, suggesting submicron-size porous aggregates as the dominant surface dust composition, and indicating potential warping within the disc. However, further expansion of the multi-wavelength analysis to a larger sample of post-AGB binary systems, as well as high-resolution observations of dust continuum and gas emission, is necessary to fully explore the underlying structure of post-AGB circumbinary discs and associated physical mechanisms., Comment: 15 pages, 15 figures
- Published
- 2024
23. Privacy Protection in Personalized Diffusion Models via Targeted Cross-Attention Adversarial Attack
- Author
-
Xu, Xide, Butt, Muhammad Atif, Kamath, Sandesh, and Raducanu, Bogdan
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
The growing demand for customized visual content has led to the rise of personalized text-to-image (T2I) diffusion models. Despite their remarkable potential, they pose significant privacy risk when misused for malicious purposes. In this paper, we propose a novel and efficient adversarial attack method, Concept Protection by Selective Attention Manipulation (CoPSAM) which targets only the cross-attention layers of a T2I diffusion model. For this purpose, we carefully construct an imperceptible noise to be added to clean samples to get their adversarial counterparts. This is obtained during the fine-tuning process by maximizing the discrepancy between the corresponding cross-attention maps of the user-specific token and the class-specific token, respectively. Experimental validation on a subset of CelebA-HQ face images dataset demonstrates that our approach outperforms existing methods. Besides this, our method presents two important advantages derived from the qualitative evaluation: (i) we obtain better protection results for lower noise levels than our competitors; and (ii) we protect the content from unauthorized use thereby protecting the individual's identity from potential misuse., Comment: Accepted at Safe Generative AI Workshop (NeurIPS 2024)
- Published
- 2024
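The core objective in the CoPSAM abstract above, perturbing a clean input so that the cross-attention response to a user-specific token diverges from that of a class-specific token, can be sketched with a toy attention function. Everything below (the tiny attention map, the tokens, the step size, the perturbation budget) is a stand-in for the text-to-image diffusion internals used in the paper.

```python
import torch

torch.manual_seed(0)
d = 8
W_q, W_k = torch.randn(d, d), torch.randn(d, d)
user_tok, class_tok = torch.randn(d), torch.randn(d)

def cross_attention_map(image_feats, token):
    q = image_feats @ W_q                        # (n, d) image-side queries
    k = token @ W_k                              # (d,)  text-side key
    return torch.softmax(q @ k / d ** 0.5, dim=0)

clean = torch.randn(16, d)                       # toy "clean sample" features
delta = torch.zeros_like(clean, requires_grad=True)
opt = torch.optim.SGD([delta], lr=0.05)

for _ in range(50):
    map_user = cross_attention_map(clean + delta, user_tok)
    map_class = cross_attention_map(clean + delta, class_tok)
    loss = -torch.norm(map_user - map_class)     # maximize the attention-map discrepancy
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-0.03, 0.03)                # keep the perturbation imperceptible
```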
24. General Geospatial Inference with a Population Dynamics Foundation Model
- Author
-
Agarwal, Mohit, Sun, Mimi, Kamath, Chaitanya, Muslim, Arbaaz, Sarker, Prithul, Paul, Joydeep, Yee, Hector, Sieniek, Marcin, Jablonski, Kim, Mayer, Yael, Fork, David, de Guia, Sheila, McPike, Jamie, Boulanger, Adam, Shekel, Tomer, Schottlander, David, Xiao, Yao, Manukonda, Manjit Chakravarthy, Liu, Yun, Bulut, Neslihan, Abu-el-haija, Sami, Perozzi, Bryan, Bharel, Monica, Nguyen, Von, Barrington, Luke, Efron, Niv, Matias, Yossi, Corrado, Greg, Eswaran, Krish, Prabhakara, Shruthi, Shetty, Shravya, and Prasad, Gautam
- Subjects
Computer Science - Machine Learning ,Computer Science - Computers and Society - Abstract
Supporting the health and well-being of dynamic populations around the world requires governmental agencies, organizations and researchers to understand and reason over complex relationships between human behavior and local contexts in order to identify high-risk groups and strategically allocate limited resources. Traditional approaches to these classes of problems often entail developing manually curated, task-specific features and models to represent human behavior and the natural and built environment, which can be challenging to adapt to new, or even, related tasks. To address this, we introduce a Population Dynamics Foundation Model (PDFM) that aims to capture the relationships between diverse data modalities and is applicable to a broad range of geospatial tasks. We first construct a geo-indexed dataset for postal codes and counties across the United States, capturing rich aggregated information on human behavior from maps, busyness, and aggregated search trends, and environmental factors such as weather and air quality. We then model this data and the complex relationships between locations using a graph neural network, producing embeddings that can be adapted to a wide range of downstream tasks using relatively simple models. We evaluate the effectiveness of our approach by benchmarking it on 27 downstream tasks spanning three distinct domains: health indicators, socioeconomic factors, and environmental measurements. The approach achieves state-of-the-art performance on all 27 geospatial interpolation tasks, and on 25 out of the 27 extrapolation and super-resolution tasks. We combined the PDFM with a state-of-the-art forecasting foundation model, TimesFM, to predict unemployment and poverty, achieving performance that surpasses fully supervised forecasting. The full set of embeddings and sample code are publicly available for researchers., Comment: 28 pages, 16 figures, preprint; v4: updated authors
- Published
- 2024
25. Scalable DP-SGD: Shuffling vs. Poisson Subsampling
- Author
-
Chua, Lynn, Ghazi, Badih, Kamath, Pritish, Kumar, Ravi, Manurangsi, Pasin, Sinha, Amer, and Zhang, Chiyuan
- Subjects
Computer Science - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Data Structures and Algorithms - Abstract
We provide new lower bounds on the privacy guarantee of the multi-epoch Adaptive Batch Linear Queries (ABLQ) mechanism with shuffled batch sampling, demonstrating substantial gaps when compared to Poisson subsampling; prior analysis was limited to a single epoch. Since the privacy analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) is obtained by analyzing the ABLQ mechanism, this brings into serious question the common practice of implementing shuffling-based DP-SGD, but reporting privacy parameters as if Poisson subsampling was used. To understand the impact of this gap on the utility of trained machine learning models, we introduce a practical approach to implement Poisson subsampling at scale using massively parallel computation, and efficiently train models with the same. We compare the utility of models trained with Poisson-subsampling-based DP-SGD, and the optimistic estimates of utility when using shuffling, via our new lower bounds on the privacy guarantee of ABLQ with shuffling., Comment: To appear at NeurIPS 2024
- Published
- 2024
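For contrast with shuffling, the Poisson subsampling that the privacy accounting discussed above assumes includes each example in a batch independently with probability q, so batch sizes are Binomial(n, q) rather than fixed; a minimal sketch (parameters chosen arbitrarily):

```python
import numpy as np

def poisson_subsample(n, q, seed=0):
    rng = np.random.default_rng(seed)
    return np.flatnonzero(rng.random(n) < q)   # indices included in this batch

batch = poisson_subsample(n=10_000, q=0.01)
print(len(batch))   # around 100, but varies from step to step
```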
26. MuCol Milestone Report No. 5: Preliminary Parameters
- Author
-
Accettura, Carlotta, Adrian, Simon, Agarwal, Rohit, Ahdida, Claudia, Aimé, Chiara, Aksoy, Avni, Alberghi, Gian Luigi, Alden, Siobhan, Alfonso, Luca, Amapane, Nicola, Amorim, David, Andreetto, Paolo, Anulli, Fabio, Appleby, Rob, Apresyan, Artur, Asadi, Pouya, Mahmoud, Mohammed Attia, Auchmann, Bernhard, Back, John, Badea, Anthony, Bae, Kyu Jung, Bahng, E. J., Balconi, Lorenzo, Balli, Fabrice, Bandiera, Laura, Barbagallo, Carmelo, Barlow, Roger, Bartoli, Camilla, Bartosik, Nazar, Barzi, Emanuela, Batsch, Fabian, Bauce, Matteo, Begel, Michael, Berg, J. Scott, Bersani, Andrea, Bertarelli, Alessandro, Bertinelli, Francesco, Bertolin, Alessandro, Bhat, Pushpalatha, Bianchi, Clarissa, Bianco, Michele, Bishop, William, Black, Kevin, Boattini, Fulvio, Bogacz, Alex, Bonesini, Maurizio, Bordini, Bernardo, de Sousa, Patricia Borges, Bottaro, Salvatore, Bottura, Luca, Boyd, Steven, Breschi, Marco, Broggi, Francesco, Brunoldi, Matteo, Buffat, Xavier, Buonincontri, Laura, Burrows, Philip Nicholas, Burt, Graeme Campbell, Buttazzo, Dario, Caiffi, Barbara, Calatroni, Sergio, Calviani, Marco, Calzaferri, Simone, Calzolari, Daniele, Cantone, Claudio, Capdevilla, Rodolfo, Carli, Christian, Carrelli, Carlo, Casaburo, Fausto, Casarsa, Massimo, Castelli, Luca, Catanesi, Maria Gabriella, Cavallucci, Lorenzo, Cavoto, Gianluca, Celiberto, Francesco Giovanni, Celona, Luigi, Cemmi, Alessia, Ceravolo, Sergio, Cerri, Alessandro, Cerutti, Francesco, Cesarini, Gianmario, Cesarotti, Cari, Chancé, Antoine, Charitonidis, Nikolaos, Chiesa, Mauro, Chiggiato, Paolo, Ciccarella, Vittoria Ludovica, Puviani, Pietro Cioli, Colaleo, Anna, Colao, Francesco, Collamati, Francesco, Costa, Marco, Craig, Nathaniel, Curtin, David, Damerau, Heiko, Da Molin, Giacomo, D'Angelo, Laura, Dasu, Sridhara, de Blas, Jorge, De Curtis, Stefania, De Gersem, Herbert, Delahaye, Jean-Pierre, Del Moro, Tommaso, Denisov, Dmitri, Denizli, Haluk, Dermisek, Radovan, Valdor, Paula Desiré, Desponds, Charlotte, Di Luzio, Luca, Di Meco, Elisa, Diociaiuti, Eleonora, Di Petrillo, Karri Folan, Di Sarcina, Ilaria, Dorigo, Tommaso, Dreimanis, Karlis, Pree, Tristan du, Yildiz, Hatice Duran, Edgecock, Thomas, Fabbri, Siara, Fabbrichesi, Marco, Farinon, Stefania, Ferrand, Guillaume, Somoza, Jose Antonio Ferreira, Fieg, Max, Filthaut, Frank, Fox, Patrick, Franceschini, Roberto, Ximenes, Rui Franqueira, Gallinaro, Michele, Garcia-Sciveres, Maurice, Garcia-Tabares, Luis, Gargiulo, Ruben, Garion, Cedric, Garzelli, Maria Vittoria, Gast, Marco, Generoso, Lisa, Gerber, Cecilia E., Giambastiani, Luca, Gianelle, Alessio, Gianfelice-Wendt, Eliana, Gibson, Stephen, Gilardoni, Simone, Giove, Dario Augusto, Giovinco, Valentina, Giraldin, Carlo, Glioti, Alfredo, Gorzawski, Arkadiusz, Greco, Mario, Grojean, Christophe, Grudiev, Alexej, Gschwendtner, Edda, Gueli, Emanuele, Guilhaudin, Nicolas, Han, Chengcheng, Han, Tao, Hauptman, John Michael, Herndon, Matthew, Hillier, Adrian D, Hillman, Micah, Holmes, Tova Ray, Homiller, Samuel, Jana, Sudip, Jindariani, Sergo, Johannesson, Sofia, Johnson, Benjamin, Jones, Owain Rhodri, Jurj, Paul-Bogdan, Kahn, Yonatan, Kamath, Rohan, Kario, Anna, Karpov, Ivan, Kelliher, David, Kilian, Wolfgang, Kitano, Ryuichiro, Kling, Felix, Kolehmainen, Antti, Kong, K. 
C., Kosse, Jaap, Krintiras, Georgios, Krizka, Karol, Kumar, Nilanjana, Kvikne, Erik, Kyle, Robert, Laface, Emanuele, Lane, Kenneth, Latina, Andrea, Lechner, Anton, Lee, Junghyun, Lee, Lawrence, Lee, Seh Wook, Lefevre, Thibaut, Leonardi, Emanuele, Lerner, Giuseppe, Li, Peiran, Li, Qiang, Li, Tong, Li, Wei, Lindroos, Mats, Lipton, Ronald, Liu, Da, Liu, Miaoyuan, Liu, Zhen, Voti, Roberto Li, Lombardi, Alessandra, Lomte, Shivani, Long, Kenneth, Longo, Luigi, Lorenzo, José, Losito, Roberto, Low, Ian, Lu, Xianguo, Lucchesi, Donatella, Luo, Tianhuan, Lupato, Anna, Ma, Yang, Machida, Shinji, Madlener, Thomas, Magaletti, Lorenzo, Maggi, Marcello, Durand, Helene Mainaud, Maltoni, Fabio, Manczak, Jerzy Mikolaj, Mandurrino, Marco, Marchand, Claude, Mariani, Francesco, Marin, Stefano, Mariotto, Samuele, Martin-Haugh, Stewart, Masullo, Maria Rosaria, Mauro, Giorgio Sebastiano, Mazzolari, Andrea, Mękała, Krzysztof, Mele, Barbara, Meloni, Federico, Meng, Xiangwei, Mentink, Matthias, Métral, Elias, Miceli, Rebecca, Milas, Natalia, Mohammadi, Abdollah, Moll, Dominik, Montella, Alessandro, Morandin, Mauro, Morrone, Marco, Mulder, Tim, Musenich, Riccardo, Nardecchia, Marco, Nardi, Federico, Nenna, Felice, Neuffer, David, Newbold, David, Novelli, Daniel, Olvegård, Maja, Onel, Yasar, Orestano, Domizia, Osborne, John, Otten, Simon, Torres, Yohan Mauricio Oviedo, Paesani, Daniele, Griso, Simone Pagan, Pagani, Davide, Pal, Kincso, Palmer, Mark, Pampaloni, Alessandra, Panci, Paolo, Pani, Priscilla, Papaphilippou, Yannis, Paparella, Rocco, Paradisi, Paride, Passeri, Antonio, Pasternak, Jaroslaw, Pastrone, Nadia, Pellecchia, Antonello, Piccinini, Fulvio, Piekarz, Henryk, Pieloni, Tatiana, Plouin, Juliette, Portone, Alfredo, Potamianos, Karolos, Potdevin, Joséphine, Prestemon, Soren, Puig, Teresa, Qiang, Ji, Quettier, Lionel, Rabemananjara, Tanjona Radonirina, Radicioni, Emilio, Radogna, Raffaella, Rago, Ilaria Carmela, Ratkus, Andris, Resseguie, Elodie, Reuter, Juergen, Ribani, Pier Luigi, Riccardi, Cristina, Ricciardi, Stefania, Robens, Tania, Robert, Youri, Rogers, Chris, Rojo, Juan, Romagnoni, Marco, Ronald, Kevin, Rosser, Benjamin, Rossi, Carlo, Rossi, Lucio, Rozanov, Leo, Ruhdorfer, Maximilian, Ruiz, Richard, Saini, Saurabh, Sala, Filippo, Salierno, Claudia, Salmi, Tiina, Salvini, Paola, Salvioni, Ennio, Sammut, Nicholas, Santini, Carlo, Saputi, Alessandro, Sarra, Ivano, Scarantino, Giuseppe, Schneider-Muntau, Hans, Schulte, Daniel, Scifo, Jessica, Sen, Tanaji, Senatore, Carmine, Senol, Abdulkadir, Sertore, Daniele, Sestini, Lorenzo, Rêgo, Ricardo César Silva, Simone, Federica Maria, Skoufaris, Kyriacos, Sorbello, Gino, Sorbi, Massimo, Sorti, Stefano, Soubirou, Lisa, Spataro, David, Queiroz, Farinaldo S., Stamerra, Anna, Stapnes, Steinar, Stark, Giordon, Statera, Marco, Stechauner, Bernd Michael, Su, Shufang, Su, Wei, Sun, Xiaohu, Sytov, Alexei, Tang, Jian, Tang, Jingyu, Taylor, Rebecca, Kate, Herman Ten, Testoni, Pietro, Thiele, Leonard Sebastian, Garcia, Rogelio Tomas, Topp-Mugglestone, Max, Torims, Toms, Torre, Riccardo, Tortora, Luca, Tortora, Ludovico, Trifinopoulos, Sokratis, Udongwo, Sosoho-Abasi, Vai, Ilaria, Valente, Riccardo Umberto, van Rienen, Ursula, Van Weelderen, Rob, Vanwelde, Marion, Velev, Gueorgui, Venditti, Rosamaria, Vendrasco, Adam, Verna, Adriano, Vernassa, Gianluca, Verweij, Arjan, Verwilligen, Piet, Villamizar, Yoxara, Vittorio, Ludovico, Vitulo, Paolo, Vojskovic, Isabella, Wang, Dayong, Wang, Lian-Tao, Wang, Xing, Wendt, Manfred, Widorski, Markus, Wozniak, Mariusz, Wu, Yongcheng, 
Wulzer, Andrea, Xie, Keping, Yang, Yifeng, Yap, Yee Chinn, Yonehara, Katsuya, Yoo, Hwi Dong, You, Zhengyun, Zanetti, Marco, Zaza, Angela, Zhang, Liang, Zhu, Ruihu, Zlobin, Alexander, Zuliani, Davide, and Zurita, José Francisco
- Subjects
Physics - Accelerator Physics - Abstract
This document comprises a collection of updated preliminary parameters for the key parts of the muon collider. The updated preliminary parameters follow on from the October 2023 Tentative Parameters Report. Particular attention has been given to regions of the facility that are believed to hold greater technical uncertainty in their design and that have a strong impact on the cost and power consumption of the facility. The data is collected from a collaborative spreadsheet and transferred to Overleaf.
- Published
- 2024
27. Community search signatures as foundation features for human-centered geospatial modeling
- Author
-
Sun, Mimi, Kamath, Chaitanya, Agarwal, Mohit, Muslim, Arbaaz, Yee, Hector, Schottlander, David, Bavadekar, Shailesh, Efron, Niv, Shetty, Shravya, and Prasad, Gautam
- Subjects
Computer Science - Machine Learning - Abstract
Aggregated relative search frequencies offer a unique composite signal reflecting people's habits, concerns, interests, intents, and general information needs, which are not found in other readily available datasets. Temporal search trends have been successfully used in time series modeling across a variety of domains such as infectious diseases, unemployment rates, and retail sales. However, most existing applications require curating specialized datasets of individual keywords, queries, or query clusters, and the search data need to be temporally aligned with the outcome variable of interest. We propose a novel approach for generating an aggregated and anonymized representation of search interest as foundation features at the community level for geospatial modeling. We benchmark these features using spatial datasets across multiple domains. In zip codes with a population greater than 3000 that cover over 95% of the contiguous US population, our models for predicting missing values in a 20% set of holdout counties achieve an average $R^2$ score of 0.74 across 21 health variables, and 0.80 across 6 demographic and environmental variables. Our results demonstrate that these search features can be used for spatial predictions without strict temporal alignment, and that the resulting models outperform spatial interpolation and state of the art methods using satellite imagery features., Comment: 8 pages, 8 figures, presented at the DMLR workshop at ICML 2024
- Published
- 2024
28. POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference
- Author
-
Kamath, Aditya K, Prabhu, Ramya, Mohan, Jayashree, Peter, Simon, Ramjee, Ramachandran, and Panwar, Ashish
- Subjects
Computer Science - Machine Learning ,Computer Science - Distributed, Parallel, and Cluster Computing ,I.2.7 ,C.1.4 - Abstract
Each request in LLM inference goes through two phases: compute-bound prefill and memory-bandwidth-bound decode. To improve GPU utilization, recent systems use hybrid batching that combines the prefill and decode phases of different requests into the same batch. This approach optimizes linear operations but remains inefficient for attention computation because existing attention kernels specialize execution independently for the prefill and decode phases. In this paper, we present POD-Attention - the first GPU kernel that efficiently computes attention for hybrid batches. POD-Attention aims to maximize the utilization of both compute and memory bandwidth by carefully allocating the GPU's resources such that prefill and decode operations happen concurrently on the same multiprocessor. POD-Attention speeds up attention computation by up to $59\%$ (mean $28\%$), enabling higher throughput and lower latency LLM inference compared to the use of independently optimized prefill and decode attention kernels., Comment: Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 (ASPLOS '25), March 30 - April 3, 2025, Rotterdam, Netherlands
- Published
- 2024
29. Improving Pinterest Search Relevance Using Large Language Models
- Author
-
Wang, Han, Sundararaman, Mukuntha Narayanan, Gungor, Onur, Xu, Yu, Kamath, Krishna, Chalasani, Rakesh, Hazra, Kurchi Subhra, and Rao, Jinfeng
- Subjects
Computer Science - Information Retrieval ,Computer Science - Computation and Language - Abstract
To improve relevance scoring on Pinterest Search, we integrate Large Language Models (LLMs) into our search relevance model, leveraging carefully designed text representations to predict the relevance of Pins effectively. Our approach uses search queries alongside content representations that include captions extracted from a generative visual language model. These are further enriched with link-based text data, historically high-quality engaged queries, user-curated boards, Pin titles and Pin descriptions, creating robust models for predicting search relevance. We use a semi-supervised learning approach to efficiently scale up the amount of training data, expanding beyond the expensive human labeled data available. By utilizing multilingual LLMs, our system extends training data to include unseen languages and domains, despite initial data and annotator expertise being confined to English. Furthermore, we distill from the LLM-based model into real-time servable model architectures and features. We provide comprehensive offline experimental validation for our proposed techniques and demonstrate the gains achieved through the final deployed system at scale., Comment: CIKM 2024 Workshop on Industrial Recommendation Systems
- Published
- 2024
30. Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus
- Author
-
Joshi, Raviraj, Singla, Kanishk, Kamath, Anusha, Kalani, Raunak, Paul, Rakesh, Vaidya, Utkarsh, Chauhan, Sanjay Singh, Wartikar, Niranjan, and Long, Eileen
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Multilingual LLMs support a variety of languages; however, their performance is suboptimal for low-resource languages. In this work, we emphasize the importance of continued pre-training of multilingual LLMs and the use of translation-based synthetic pre-training corpora for improving LLMs in low-resource languages. We conduct our study in the context of the low-resource Indic language Hindi. We introduce Nemotron-Mini-Hindi 4B, a bilingual SLM supporting both Hindi and English, based on Nemotron-Mini 4B. The model is trained using a mix of real and synthetic Hindi + English tokens, with continuous pre-training performed on 400B tokens. We demonstrate that both the base and instruct models achieve state-of-the-art results on Hindi benchmarks while remaining competitive on English tasks. Additionally, we observe that the continued pre-training approach enhances the model's overall factual accuracy.
- Published
- 2024
31. Developing Gridded Emission Inventory from High-Resolution Satellite Object Detection for Improved Air Quality Forecasts
- Author
-
Ghosal, Shubham, Singh, Manmeet, Ghude, Sachin, Kamath, Harsh, SB, Vaisakh, Wasekar, Subodh, Mahajan, Anoop, Dashtian, Hassan, Yang, Zong-Liang, Young, Michael, and Niyogi, Dev
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
This study presents an innovative approach to creating a dynamic, AI based emission inventory system for use with the Weather Research and Forecasting model coupled with Chemistry (WRF Chem), designed to simulate vehicular and other anthropogenic emissions at satellite detectable resolution. The methodology leverages state of the art deep learning based computer vision models, primarily employing YOLO (You Only Look Once) architectures (v8 to v10) and T Rex, for high precision object detection. Through extensive data collection, model training, and finetuning, the system achieved significant improvements in detection accuracy, with F1 scores increasing from an initial 0.15 at 0.131 confidence to 0.72 at 0.414 confidence. A custom pipeline converts model outputs into netCDF files storing latitude, longitude, and vehicular count data, enabling real time processing and visualization of emission patterns. The resulting system offers unprecedented temporal and spatial resolution in emission estimates, facilitating more accurate short term air quality forecasts and deeper insights into urban emission dynamics. This research not only enhances WRF Chem simulations but also bridges the gap between AI technologies and atmospheric science methodologies, potentially improving urban air quality management and environmental policymaking. Future work will focus on expanding the system's capabilities to non vehicular sources and further improving detection accuracy in challenging environmental conditions.
- Published
- 2024
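The output step described in the abstract above, storing per-grid-cell vehicle counts with latitude and longitude coordinates in netCDF, might look roughly like the sketch below (using xarray with a netCDF backend installed); the variable names, grid, and counts are assumptions, not the study's schema.

```python
import numpy as np
import xarray as xr

lats = np.arange(18.0, 19.0, 0.01)
lons = np.arange(72.0, 73.0, 0.01)
counts = np.random.default_rng(0).integers(0, 50, size=(lats.size, lons.size))

ds = xr.Dataset(
    {"vehicle_count": (("lat", "lon"), counts)},   # detections aggregated per grid cell
    coords={"lat": lats, "lon": lons},
)
ds.to_netcdf("vehicle_counts.nc")                  # write the gridded counts to a netCDF file
```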
32. Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy
- Author
-
Huang, Yangsibo, Liu, Daogao, Chua, Lynn, Ghazi, Badih, Kamath, Pritish, Kumar, Ravi, Manurangsi, Pasin, Nasr, Milad, Sinha, Amer, and Zhang, Chiyuan
- Subjects
Computer Science - Cryptography and Security - Abstract
Machine unlearning algorithms, designed for selective removal of training data from models, have emerged as a promising approach to growing privacy concerns. In this work, we expose a critical yet underexplored vulnerability in the deployment of unlearning systems: the assumption that the data requested for removal is always part of the original training set. We present a threat model where an attacker can degrade model accuracy by submitting adversarial unlearning requests for data not present in the training set. We propose white-box and black-box attack algorithms and evaluate them through a case study on image classification tasks using the CIFAR-10 and ImageNet datasets, targeting a family of widely used unlearning methods. Our results show extremely poor test accuracy following the attack: 3.6% on CIFAR-10 and 0.4% on ImageNet for white-box attacks, and 8.5% on CIFAR-10 and 1.3% on ImageNet for black-box attacks. Additionally, we evaluate various verification mechanisms to detect the legitimacy of unlearning requests and reveal the challenges in verification, as most of the mechanisms fail to detect stealthy attacks without severely impairing their ability to process valid requests. These findings underscore the urgent need for research on more robust request verification methods and unlearning protocols, should the deployment of machine unlearning systems become more prevalent in the future.
- Published
- 2024
33. How Unique is Whose Web Browser? The role of demographics in browser fingerprinting among US users
- Author
-
Berke, Alex, Bacis, Enrico, Ghazi, Badih, Kamath, Pritish, Kumar, Ravi, Lassonde, Robin, Manurangsi, Pasin, and Syed, Umar
- Subjects
Computer Science - Computers and Society - Abstract
Browser fingerprinting can be used to identify and track users across the Web, even without cookies, by collecting attributes from users' devices to create unique "fingerprints". This technique and resulting privacy risks have been studied for over a decade. Yet further research is limited because prior studies used data not publicly available. Additionally, data in prior studies lacked user demographics. Here we provide a first-of-its-kind dataset to enable further research. It includes browser attributes with users' demographics and survey responses, collected with informed consent from 8,400 US study participants. We use this dataset to demonstrate how fingerprinting risks differ across demographic groups. For example, we find lower income users are more at risk, and find that as users' age increases, they are both more likely to be concerned about fingerprinting and at real risk of fingerprinting. Furthermore, we demonstrate an overlooked risk: user demographics, such as gender, age, income level and race, can be inferred from browser attributes commonly used for fingerprinting, and we identify which browser attributes most contribute to this risk. Our data collection process also conducted an experiment to study what impacts users' likelihood to share browser data for open research, in order to inform future data collection efforts, with responses from 12,461 total participants. Female participants were significantly less likely to share their browser data, as were participants who were shown the browser data we asked to collect. Overall, we show the important role of user demographics in the ongoing work that intends to assess fingerprinting risks and improve user privacy, with findings to inform future privacy enhancing browser developments. The dataset and data collection tool we provide can be used to further study research questions not addressed in this work., Comment: In Proceedings on Privacy Enhancing Technologies 2025(1)
- Published
- 2024
- Full Text
- View/download PDF
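The uniqueness analysis in the entry above can be illustrated with a toy computation of anonymity-set sizes, i.e. how many participants share an identical combination of browser attributes. This is a generic sketch with made-up attribute columns, not the released dataset's schema or the paper's analysis code.

    # Minimal sketch: per-row anonymity-set size from a few fingerprinting attributes.
    import pandas as pd

    df = pd.DataFrame({
        "user_agent":  ["UA1", "UA1", "UA2", "UA3"],
        "screen_size": ["1920x1080", "1920x1080", "1440x900", "1920x1080"],
        "timezone":    ["UTC-5", "UTC-5", "UTC-8", "UTC-5"],
    })

    attrs = ["user_agent", "screen_size", "timezone"]
    anonymity_set = df.groupby(attrs)[attrs[0]].transform("size")
    df["is_unique"] = anonymity_set == 1

    print(df)
    print(f"share of unique fingerprints: {df['is_unique'].mean():.2f}")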
34. Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data
- Author
-
Zhang, Jie, Das, Debeshee, Kamath, Gautam, and Tramèr, Florian
- Subjects
Computer Science - Machine Learning ,Computer Science - Cryptography and Security - Abstract
We consider the problem of a training data proof, where a data creator or owner wants to demonstrate to a third party that some machine learning model was trained on their data. Training data proofs play a key role in recent lawsuits against foundation models trained on web-scale data. Many prior works suggest instantiating training data proofs using membership inference attacks. We argue that this approach is fundamentally unsound: to provide convincing evidence, the data creator needs to demonstrate that their attack has a low false positive rate, i.e., that the attack's output is unlikely under the null hypothesis that the model was not trained on the target data. Yet, sampling from this null hypothesis is impossible, as we do not know the exact contents of the training set, nor can we (efficiently) retrain a large foundation model. We conclude by offering two paths forward, by showing that data extraction attacks and membership inference on special canary data can be used to create sound training data proofs., Comment: position paper at IEEE SaTML 2025
- Published
- 2024
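The soundness argument in the entry above hinges on calibrating a false positive rate under the null hypothesis. A minimal sketch of the canary-based route the authors point to is given below, with entirely synthetic loss values: compare the model's losses on canaries that were inserted into training against held-out canaries from the same distribution, and compute a permutation-test p-value.

    # Minimal sketch with synthetic numbers (not the paper's procedure).
    import numpy as np

    rng = np.random.default_rng(0)
    loss_inserted = rng.normal(1.8, 0.3, size=50)   # hypothetical losses on inserted canaries
    loss_heldout  = rng.normal(2.1, 0.3, size=50)   # hypothetical losses on held-out canaries

    observed = loss_heldout.mean() - loss_inserted.mean()

    pooled = np.concatenate([loss_inserted, loss_heldout])
    n = len(loss_inserted)
    null_stats = []
    for _ in range(10_000):
        rng.shuffle(pooled)
        null_stats.append(pooled[n:].mean() - pooled[:n].mean())

    # Fraction of permutations at least as extreme as the observed gap.
    p_value = (np.sum(np.array(null_stats) >= observed) + 1) / (len(null_stats) + 1)
    print(f"one-sided p-value: {p_value:.4f}")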
35. The Hard Positive Truth about Vision-Language Compositionality
- Author
-
Kamath, Amita, Hsieh, Cheng-Yu, Chang, Kai-Wei, and Krishna, Ranjay
- Subjects
Computer Science - Computation and Language ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Several benchmarks have concluded that our best vision-language models (e.g., CLIP) are lacking in compositionality. Given an image, these benchmarks probe a model's ability to identify its associated caption amongst a set of compositional distractors. In response, a surge of recent proposals show improvements by finetuning CLIP with distractors as hard negatives. Our investigations reveal that these improvements have, in fact, been significantly overstated -- because existing benchmarks do not probe whether finetuned vision-language models remain invariant to hard positives. By curating an evaluation dataset with 112,382 hard negatives and hard positives, we uncover that including hard positives decreases CLIP's performance by 12.9%, while humans perform effortlessly at 99%. CLIP finetuned with hard negatives results in an even larger decrease, up to 38.7%. With this finding, we then produce a 1,775,259 image-text training set with both hard negative and hard positive captions. By training with both, we see improvements on existing benchmarks while simultaneously improving performance on hard positives, indicating a more robust improvement in compositionality. Our work suggests the need for future research to rigorously test and improve CLIP's understanding of semantic relationships between related "positive" concepts., Comment: ECCV 2024
- Published
- 2024
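The evaluation idea in the entry above (a model should not prefer a hard negative over a semantically equivalent hard positive) can be sketched with a placeholder scorer. The score function below stands in for any image-text similarity model such as CLIP; the captions and scores are invented for illustration, and this is not the authors' evaluation harness.

    # Minimal sketch of a hard-positive/hard-negative check (placeholder scorer).
    from typing import Callable

    def passes_example(score: Callable[[str, str], float],
                       image_id: str,
                       original: str,
                       hard_positive: str,
                       hard_negative: str) -> bool:
        """Credit the model only if it ranks both the original caption and the
        hard positive above the hard negative for the same image."""
        return (score(image_id, original) > score(image_id, hard_negative)
                and score(image_id, hard_positive) > score(image_id, hard_negative))

    # Toy scores standing in for an image-text similarity model.
    toy_scores = {
        ("img_1", "a dog to the left of a cat"): 0.71,
        ("img_1", "a dog left of a cat"):        0.69,  # hard positive (same meaning)
        ("img_1", "a cat to the left of a dog"): 0.70,  # hard negative (swapped roles)
    }
    score = lambda img, cap: toy_scores[(img, cap)]

    print(passes_example(score, "img_1",
                         "a dog to the left of a cat",
                         "a dog left of a cat",
                         "a cat to the left of a dog"))   # False: hard positive scored below hard negative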
36. Calibration of Spectropolarimetry channel of Visible Emission Line Coronagraph onboard Aditya-L1
- Author
-
Narra, Venkata Suresh, Raja, K. Sasikumar, B, Raghavendra Prasad, Singh, Jagdev, Mishra, Shalabh, U, Sanal Krishnan V, S, Bhavana Hegde, D., Utkarsha, V, Natarajan, S, Pawan Kumar, Priyal V, Muthu, P, Savarimuthu, Gavshinde, Priya, and P, Umesh Kamath
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Instrumentation and Methods for Astrophysics - Abstract
The magnetic field strength and its topology play an important role in understanding the formation, evolution, and dynamics of the solar corona. They also play a significant role in addressing long-standing mysteries such as the coronal heating problem, the origin and propagation of coronal mass ejections, the drivers of space weather, and the origin and acceleration of the solar wind. Despite having photospheric magnetograms for decades, we do not have reliable observations of coronal magnetic field strengths today. To measure the coronal magnetic field precisely, the spectropolarimetry channel of the Visible Emission Line Coronagraph (VELC) on board the Aditya-L1 mission is designed. Using observations of the coronal emission line Fe XIII [10747 \AA], it is possible to generate full Stokes maps (I, Q, U, and V) that help in estimating the Line-of-Sight (LOS) magnetic field strength and in deriving the magnetic field topology maps of the solar corona in the Field of View (FOV) (1.05 -- 1.5~R$_{\odot}$). In this article, we summarize the instrumental details of the spectropolarimetry channel and the detailed calibration procedures adopted to derive the modulation and demodulation matrices. Furthermore, we have applied the derived demodulation matrices to data observed in the laboratory and studied their performance., Comment: 12 pages, 5 Figures, Published in Journal of Experimental Astronomy
- Published
- 2024
- Full Text
- View/download PDF
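The modulation/demodulation step described in the entry above reduces, per pixel, to linear algebra: the measured intensities are I_meas = M S for a modulation matrix M, and the demodulation matrix D = pinv(M) recovers the Stokes vector S = (I, Q, U, V). The sketch below uses an invented 4-state modulation matrix, not VELC's calibrated matrices.

    # Minimal sketch of Stokes demodulation with illustrative numbers only.
    import numpy as np

    M = np.array([           # hypothetical 4-state modulation scheme
        [1.0,  1.0,  0.0,  0.0],
        [1.0, -1.0,  0.0,  0.0],
        [1.0,  0.0,  1.0,  0.0],
        [1.0,  0.0,  0.0,  1.0],
    ])

    S_true = np.array([1.0, 0.02, -0.01, 0.005])   # assumed Stokes vector (I, Q, U, V)
    I_meas = M @ S_true                            # what the detector records

    D = np.linalg.pinv(M)                          # demodulation matrix
    S_recovered = D @ I_meas
    print(S_recovered)                             # ~[1.0, 0.02, -0.01, 0.005]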
37. Parameter constraints for accreting millisecond pulsars with synthetic NICER data
- Author
-
Dorsman, Bas, Salmi, Tuomo, Watts, Anna L., Ng, Mason, Kamath, Satish, Bobrikova, Anna, Poutanen, Juri, Loktev, Vladislav, Kini, Yves, Choudhury, Devarshi, Vinciguerra, Serena, Bogdanov, Slavko, and Chakrabarty, Deepto
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
Pulse profile modelling (PPM) is a technique for inferring mass, radius and hotspot properties of millisecond pulsars. PPM is now regularly used for analysis of rotation-powered millisecond pulsars (RMPs) with data from the Neutron Star Interior Composition ExploreR (NICER). Extending PPM to accreting millisecond pulsars (AMPs) is attractive, because they are a different source class featuring bright X-ray radiation from hotspots powered by accretion. In this paper, we present a modification of one of the PPM codes, X-PSI, so that it can be used for AMPs. In particular, we implement an accretion disc model and an atmosphere model appropriate for the hotspots of AMPs, and improve the overall computational efficiency. We then test parameter recovery with synthetic NICER data in two scenarios with reasonable parameters for AMPs. We find in the first scenario, where the hotspot is large, that we are able to tightly and accurately constrain all parameters including mass and radius. In the second scenario, which is a high-inclination system with a smaller hotspot, we find degeneracy between a subset of model parameters and a slight bias in the inferred mass and radius. This analysis of synthetic data lays the groundwork for future analysis of AMPs with NICER data. Such an analysis could be complemented by future (joint) analysis of polarization data from the Imaging X-ray Polarimetry Explorer (IXPE)., Comment: 14 pages, 5 figures, 3 tables. Paper submitted to MNRAS
- Published
- 2024
38. The Blending ToolKit: A simulation framework for evaluation of galaxy detection and deblending
- Author
-
Mendoza, Ismael, Torchylo, Andrii, Sainrat, Thomas, Guinot, Axel, Boucaud, Alexandre, Paillassa, Maxime, Avestruz, Camille, Adari, Prakruth, Aubourg, Eric, Biswas, Biswajit, Buchanan, James, Burchat, Patricia, Doux, Cyrille, Joseph, Remy, Kamath, Sowmya, Malz, Alex I., Merz, Grant, Miyatake, Hironao, Roucelle, Cécile, Zhang, Tianqing, and Collaboration, the LSST Dark Energy Science
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,Astrophysics - Cosmology and Nongalactic Astrophysics - Abstract
We present an open source Python library for simulating overlapping (i.e., blended) images of galaxies and performing self-consistent comparisons of detection and deblending algorithms based on a suite of metrics. The package, named Blending Toolkit (BTK), serves as a modular, flexible, easy-to-install, and simple-to-use interface for exploring and analyzing systematic effects related to blended galaxies in cosmological surveys such as the Vera Rubin Observatory Legacy Survey of Space and Time (LSST). BTK has three main components: (1) a set of modules that perform fast image simulations of blended galaxies, using the open source image simulation package GalSim; (2) a module that standardizes the inputs and outputs of existing deblending algorithms; (3) a library of deblending metrics commonly defined in the galaxy deblending literature. In combination, these modules allow researchers to explore the impacts of galaxy blending in cosmological surveys. Additionally, BTK provides researchers who are developing a new deblending algorithm a framework to evaluate algorithm performance and make principled comparisons with existing deblenders. BTK includes a suite of tutorials and comprehensive documentation. The source code is publicly available on GitHub at https://github.com/LSSTDESC/BlendingToolKit., Comment: 15 pages, 9 figures, 2 tables, accepted to The Open Journal of Astrophysics
- Published
- 2024
- Full Text
- View/download PDF
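The blending problem that the toolkit in the entry above targets can be pictured with a numpy-only toy example (this does not use BTK or GalSim): render two overlapping Gaussian galaxy profiles and compute a simple blendedness measure for one of them, i.e. how much of the flux inside its footprint comes from its neighbour.

    # Minimal numpy-only sketch of galaxy blending (not BTK's API).
    import numpy as np

    def gaussian_profile(nx, ny, x0, y0, sigma, flux):
        y, x = np.mgrid[0:ny, 0:nx]
        img = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
        return flux * img / img.sum()

    g1 = gaussian_profile(64, 64, x0=30, y0=32, sigma=3.0, flux=1000.0)
    g2 = gaussian_profile(64, 64, x0=38, y0=32, sigma=4.0, flux=2500.0)
    blend = g1 + g2

    footprint = g1 > 0.01 * g1.max()                  # crude isophotal footprint of galaxy 1
    blendedness = g2[footprint].sum() / blend[footprint].sum()
    print(f"blendedness of galaxy 1: {blendedness:.2f}")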
39. MorphFader: Enabling Fine-grained Controllable Morphing with Text-to-Audio Models
- Author
-
Kamath, Purnima, Gupta, Chitralekha, and Nanayakkara, Suranga
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Sound morphing is the process of gradually and smoothly transforming one sound into another to generate novel and perceptually hybrid sounds that simultaneously resemble both. Recently, diffusion-based text-to-audio models have produced high-quality sounds using text prompts. However, granularly controlling the semantics of the sound, which is necessary for morphing, can be challenging using text. In this paper, we propose \textit{MorphFader}, a controllable method for morphing sounds generated by disparate prompts using text-to-audio models. By intercepting and interpolating the components of the cross-attention layers within the diffusion process, we can create smooth morphs between sounds generated by different text prompts. Using both objective metrics and perceptual listening tests, we demonstrate the ability of our method to granularly control the semantics in the sound and generate smooth morphs., Comment: Under Review
- Published
- 2024
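The core mechanism in the entry above, interpolating components of the cross-attention layers, can be sketched as a simple blend of attention maps obtained from two prompts at the same diffusion step. The tensor shapes and the linear blend below are stand-ins for illustration, not the MorphFader implementation.

    # Minimal sketch of cross-attention interpolation (illustrative only).
    import torch

    def interpolate_cross_attention(attn_a: torch.Tensor,
                                    attn_b: torch.Tensor,
                                    alpha: float) -> torch.Tensor:
        """attn_a, attn_b: assumed shape (heads, audio_tokens, text_tokens),
        computed for prompt A and prompt B at the same diffusion step."""
        assert attn_a.shape == attn_b.shape
        return (1.0 - alpha) * attn_a + alpha * attn_b

    attn_a = torch.softmax(torch.randn(8, 256, 12), dim=-1)
    attn_b = torch.softmax(torch.randn(8, 256, 12), dim=-1)
    morphed = interpolate_cross_attention(attn_a, attn_b, alpha=0.25)
    print(morphed.shape)   # torch.Size([8, 256, 12])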
40. Component Matching as a Graph Matching Problem
- Author
-
Kamath, Suresh
- Subjects
Computer Science - Software Engineering - Abstract
Developing an IT strategy and ensuring that it is the best possible one for the business is a key problem many organizations face. This problem is that of linking business architecture to IT architecture in general and to application architecture specifically. In our earlier work we proposed Category theory as the formal language to unify the business and IT worlds, with the ability to represent the concepts and relations between the two in a unified way. We used rCOS as the underlying model for the specification of interfaces, contracts, and components. The concept of a pseudo-category was then used to represent the business and application architecture specifications and the relationships contained within them. Contracts are used for the specification of both IT and business architecture components. The linkages between them are established by matching the business component contracts with the application component contracts. Previously, this matching was a manual process; in this paper we extend the work by considering an automated component matching process and provide an implementation of it using graph matching., Comment: 7 pages, 5 figures. arXiv admin note: substantial text overlap with arXiv:2406.05483
- Published
- 2024
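The automated matching step in the entry above can be phrased as labelled (sub)graph isomorphism. The sketch below uses networkx on toy contract graphs (the node labels and graphs are invented, not rCOS specifications): a business component contract matches if it can be mapped onto a subgraph of an application component contract with compatible node kinds.

    # Minimal sketch of component matching as a graph matching problem.
    import networkx as nx
    from networkx.algorithms import isomorphism

    business = nx.DiGraph()
    business.add_node("openAccount", kind="operation")
    business.add_node("Customer", kind="data")
    business.add_edge("openAccount", "Customer")

    application = nx.DiGraph()
    application.add_node("createAccount", kind="operation")
    application.add_node("CustomerRecord", kind="data")
    application.add_node("AuditLog", kind="data")
    application.add_edge("createAccount", "CustomerRecord")
    application.add_edge("createAccount", "AuditLog")

    matcher = isomorphism.DiGraphMatcher(
        application, business,
        node_match=isomorphism.categorical_node_match("kind", None),
    )
    print(matcher.subgraph_is_isomorphic())             # True: a structural match exists
    print(next(matcher.subgraph_isomorphisms_iter()))   # one concrete node mapping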
41. MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains
- Author
-
Yin, Guoli, Bai, Haoping, Ma, Shuang, Nan, Feng, Sun, Yanchao, Xu, Zhaoyang, Ma, Shen, Lu, Jiarui, Kong, Xiang, Zhang, Aonan, Yap, Dian Ang, Zhang, Yizhe, Ahnert, Karsten, Kamath, Vik, Berglund, Mathias, Walsh, Dominic, Gindele, Tobias, Wiest, Juergen, Lai, Zhengfeng, Wang, Xiaoming, Shan, Jiulong, Cao, Meng, Pang, Ruoming, and Wang, Zirui
- Subjects
Computer Science - Artificial Intelligence - Abstract
Recent advances in large language models (LLMs) have increased the demand for comprehensive benchmarks to evaluate their capabilities as human-like agents. Existing benchmarks, while useful, often focus on specific application scenarios, emphasizing task completion but failing to dissect the underlying skills that drive these outcomes. This lack of granularity makes it difficult to deeply discern where failures stem from. Additionally, setting up these environments requires considerable effort, and issues of unreliability and reproducibility sometimes arise, especially in interactive tasks. To address these limitations, we introduce the Massive Multitask Agent Understanding (MMAU) benchmark, featuring comprehensive offline tasks that eliminate the need for complex environment setups. It evaluates models across five domains, including Tool-use, Directed Acyclic Graph (DAG) QA, Data Science and Machine Learning coding, Contest-level programming and Mathematics, and covers five essential capabilities: Understanding, Reasoning, Planning, Problem-solving, and Self-correction. With a total of 20 meticulously designed tasks encompassing over 3K distinct prompts, MMAU provides a comprehensive framework for evaluating the strengths and limitations of LLM agents. By testing 18 representative models on MMAU, we provide deep and insightful analyses. Ultimately, MMAU not only sheds light on the capabilities and limitations of LLM agents but also enhances the interpretability of their performance. Datasets and evaluation scripts of MMAU are released at https://github.com/apple/axlearn/tree/main/docs/research/mmau.
- Published
- 2024
42. Interim report for the International Muon Collider Collaboration (IMCC)
- Author
-
Accettura, C., Adrian, S., Agarwal, R., Ahdida, C., Aimé, C., Aksoy, A., Alberghi, G. L., Alden, S., Amapane, N., Amorim, D., Andreetto, P., Anulli, F., Appleby, R., Apresyan, A., Asadi, P., Mahmoud, M. Attia, Auchmann, B., Back, J., Badea, A., Bae, K. J., Bahng, E. J., Balconi, L., Balli, F., Bandiera, L., Barbagallo, C., Barlow, R., Bartoli, C., Bartosik, N., Barzi, E., Batsch, F., Bauce, M., Begel, M., Berg, J. S., Bersani, A., Bertarelli, A., Bertinelli, F., Bertolin, A., Bhat, P., Bianchi, C., Bianco, M., Bishop, W., Black, K., Boattini, F., Bogacz, A., Bonesini, M., Bordini, B., de Sousa, P. Borges, Bottaro, S., Bottura, L., Boyd, S., Breschi, M., Broggi, F., Brunoldi, M., Buffat, X., Buonincontri, L., Burrows, P. N., Burt, G. C., Buttazzo, D., Caiffi, B., Calatroni, S., Calviani, M., Calzaferri, S., Calzolari, D., Cantone, C., Capdevilla, R., Carli, C., Carrelli, C., Casaburo, F., Casarsa, M., Castelli, L., Catanesi, M. G., Cavallucci, L., Cavoto, G., Celiberto, F. G., Celona, L., Cemmi, A., Ceravolo, S., Cerri, A., Cerutti, F., Cesarini, G., Cesarotti, C., Chancé, A., Charitonidis, N., Chiesa, M., Chiggiato, P., Ciccarella, V. L., Puviani, P. Cioli, Colaleo, A., Colao, F., Collamati, F., Costa, M., Craig, N., Curtin, D., D'Angelo, L., Da Molin, G., Damerau, H., Dasu, S., de Blas, J., De Curtis, S., De Gersem, H., Del Moro, T., Delahaye, J. -P., Denisov, D., Denizli, H., Dermisek, R., Valdor, P. Desiré, Desponds, C., Di Luzio, L., Di Meco, E., Di Petrillo, K. F., Di Sarcina, I., Diociaiuti, E., Dorigo, T., Dreimanis, K., Pree, T. du, Edgecock, T., Fabbri, S., Fabbrichesi, M., Farinon, S., Ferrand, G., Somoza, J. A. Ferreira, Fieg, M., Filthaut, F., Fox, P., Franceschini, R., Ximenes, R. Franqueira, Gallinaro, M., Garcia-Sciveres, M., Garcia-Tabares, L., Gargiulo, R., Garion, C., Garzelli, M. V., Gast, M., Gerber, C. E., Giambastiani, L., Gianelle, A., Gianfelice-Wendt, E., Gibson, S., Gilardoni, S., Giove, D. A., Giovinco, V., Giraldin, C., Glioti, A., Gorzawski, A., Greco, M., Grojean, C., Grudiev, A., Gschwendtner, E., Gueli, E., Guilhaudin, N., Han, C., Han, T., Hauptman, J. M., Herndon, M., Hillier, A. D., Hillman, M., Holmes, T. R., Homiller, S., Jana, S., Jindariani, S., Johannesson, S., Johnson, B., Jones, O. R., Jurj, P. -B., Kahn, Y., Kamath, R., Kario, A., Karpov, I., Kelliher, D., Kilian, W., Kitano, R., Kling, F., Kolehmainen, A., Kong, K. C., Kosse, J., Krintiras, G., Krizka, K., Kumar, N., Kvikne, E., Kyle, R., Laface, E., Lane, K., Latina, A., Lechner, A., Lee, J., Lee, L., Lee, S. W., Lefevre, T., Leonardi, E., Lerner, G., Li, P., Li, Q., Li, T., Li, W., Voti, R. Li, Lindroos, M., Lipton, R., Liu, D., Liu, M., Liu, Z., Lombardi, A., Lomte, S., Long, K., Longo, L., Lorenzo, J., Losito, R., Low, I., Lu, X., Lucchesi, D., Luo, T., Lupato, A., Métral, E., Mękała, K., Ma, Y., Mańczak, J. M., Machida, S., Madlener, T., Magaletti, L., Maggi, M., Durand, H. Mainaud, Maltoni, F., Mandurrino, M., Marchand, C., Mariani, F., Marin, S., Mariotto, S., Martin-Haugh, S., Masullo, M. R., Mauro, G. S., Mazzolari, A., Mele, B., Meloni, F., Meng, X., Mentink, M., Miceli, R., Milas, N., Mohammadi, A., Moll, D., Montella, A., Morandin, M., Morrone, M., Mulder, T., Musenich, R., Nardecchia, M., Nardi, F., Neuffer, D., Newbold, D., Novelli, D., Olvegård, M., Onel, Y., Orestano, D., Osborne, J., Otten, S., Torres, Y. M. Oviedo, Paesani, D., Griso, S. 
Pagan, Pagani, D., Pal, K., Palmer, M., Pampaloni, A., Panci, P., Pani, P., Papaphilippou, Y., Paparella, R., Paradisi, P., Passeri, A., Pastrone, N., Pellecchia, A., Piccinini, F., Piekarz, H., Pieloni, T., Plouin, J., Portone, A., Potamianos, K., Potdevin, J., Prestemon, S., Puig, T., Qiang, J., Quettier, L., Rabemananjara, T. R., Radicioni, E., Radogna, R., Rago, I. C., Ratkus, A., Resseguie, E., Reuter, J., Ribani, P. L., Riccardi, C., Ricciardi, S., Robens, T., Robert, Y., Roger, C., Rojo, J., Romagnoni, M., Ronald, K., Rosser, B., Rossi, C., Rossi, L., Rozanov, L., Ruhdorfer, M., Ruiz, R., Queiroz, F. S., Saini, S., Sala, F., Salierno, C., Salmi, T., Salvini, P., Salvioni, E., Sammut, N., Santini, C., Saputi, A., Sarra, I., Scarantino, G., Schneider-Muntau, H., Schulte, D., Scifo, J., Sen, T., Senatore, C., Senol, A., Sertore, D., Sestini, L., Rêgo, R. C. Silva, Simone, F. M., Skoufaris, K., Sorbello, G., Sorbi, M., Sorti, S., Soubirou, L., Spataro, D., Stamerra, A., Stapnes, S., Stark, G., Statera, M., Stechauner, B. M., Su, S., Su, W., Sun, X., Sytov, A., Tang, J., Taylor, R., Kate, H. Ten, Testoni, P., Thiele, L. S., Garcia, R. Tomas, Mugglestone, M. Topp, Torims, T., Torre, R., Tortora, L. T., Trifinopoulos, S., Udongwo, S. -A., Vai, I., Valente, R. U., van Rienen, U., van Weelderen, R., Vanwelde, M., Velev, G., Venditti, R., Vendrasco, A., Verna, A., Verweij, A., Verwilligen, P., Villamzar, Y., Vittorio, L., Vitulo, P., Vojskovic, I., Wang, D., Wang, L. -T., Wang, X., Wendt, M., Widorski, M., Wozniak, M., Wu, Y., Wulzer, A., Xie, K., Yang, Y., Yap, Y. C., Yonehara, K., Yoo, H. D., You, Z., Zanetti, M., Zaza, A., Zhang, L., Zhu, R., Zlobin, A., Zuliani, D., and Zurita, J. F.
- Subjects
Physics - Accelerator Physics ,High Energy Physics - Experiment - Abstract
The International Muon Collider Collaboration (IMCC) [1] was established in 2020 following the recommendations of the European Strategy for Particle Physics (ESPP) and the implementation of the European Strategy for Particle Physics-Accelerator R&D Roadmap by the Laboratory Directors Group [2], hereinafter referred to as the European LDG roadmap. The Muon Collider Study (MuC) covers the accelerator complex, detectors and physics for a future muon collider. In 2023, European Commission support was obtained for a design study of a muon collider (MuCol) [3]. This project started on 1st March 2023, with work-packages aligned with the overall muon collider studies. In preparation for and during the 2021-22 U.S. Snowmass process, the muon collider project parameters, technical studies and physics performance studies were performed and presented in great detail. Recently, the P5 panel [4] in the U.S. recommended muon collider R&D, proposed joining the IMCC, and envisages that the U.S. should prepare to host a muon collider, calling this their "muon shot". In the past, the U.S. Muon Accelerator Programme (MAP) [5] was instrumental in studies of concepts and technologies for a muon collider., Comment: This document summarises the International Muon Collider Collaboration (IMCC) progress and status of the Muon Collider R&D programme
- Published
- 2024
43. On Convex Optimization with Semi-Sensitive Features
- Author
-
Ghazi, Badih, Kamath, Pritish, Kumar, Ravi, Manurangsi, Pasin, Meka, Raghu, and Zhang, Chiyuan
- Subjects
Computer Science - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Data Structures and Algorithms - Abstract
We study the differentially private (DP) empirical risk minimization (ERM) problem under the semi-sensitive DP setting where only some features are sensitive. This generalizes the Label DP setting where only the label is sensitive. We give improved upper and lower bounds on the excess risk for DP-ERM. In particular, we show that the error only scales polylogarithmically in terms of the sensitive domain size, improving upon previous results that scale polynomially in the sensitive domain size (Ghazi et al., 2021)., Comment: To appear in COLT 2024
- Published
- 2024
44. Learning Neural Networks with Sparse Activations
- Author
-
Awasthi, Pranjal, Dikkala, Nishanth, Kamath, Pritish, and Meka, Raghu
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
A core component present in many successful neural network architectures is an MLP block of two fully connected layers with a non-linear activation in between. An intriguing phenomenon observed empirically, including in transformer architectures, is that, after training, the activations in the hidden layer of this MLP block tend to be extremely sparse on any given input. Unlike traditional forms of sparsity, where there are neurons/weights which can be deleted from the network, this form of {\em dynamic} activation sparsity appears to be harder to exploit to get more efficient networks. Motivated by this, we initiate a formal study of PAC learnability of MLP layers that exhibit activation sparsity. We present a variety of results showing that such classes of functions do lead to provable computational and statistical advantages over their non-sparse counterparts. Our hope is that a better theoretical understanding of {\em sparsely activated} networks would lead to methods that can exploit activation sparsity in practice., Comment: Proceedings of the 37th Conference on Learning Theory (COLT 2024), 20 pages
- Published
- 2024
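The dynamic activation sparsity discussed in the entry above is easy to observe directly: in a two-layer MLP block, count the fraction of post-ReLU hidden activations that are exactly zero on a batch of inputs. The model and data below are toy stand-ins.

    # Minimal sketch: measure activation sparsity in the hidden layer of an MLP block.
    import torch
    import torch.nn as nn

    mlp = nn.Sequential(
        nn.Linear(512, 2048),
        nn.ReLU(),
        nn.Linear(2048, 512),
    )

    x = torch.randn(32, 512)
    with torch.no_grad():
        hidden = mlp[1](mlp[0](x))                      # post-activation hidden states
        sparsity = (hidden == 0).float().mean().item()

    print(f"fraction of zero activations: {sparsity:.2f}")   # ~0.5 at random init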
45. Distribution Learnability and Robustness
- Author
-
Ben-David, Shai, Bie, Alex, Kamath, Gautam, and Lechner, Tosca
- Subjects
Statistics - Machine Learning ,Computer Science - Data Structures and Algorithms ,Computer Science - Information Theory ,Computer Science - Machine Learning ,Mathematics - Statistics Theory - Abstract
We examine the relationship between learnability and robust (or agnostic) learnability for the problem of distribution learning. We show that, contrary to other learning settings (e.g., PAC learning of function classes), realizable learnability of a class of probability distributions does not imply its agnostic learnability. We go on to examine what types of data corruption can disrupt the learnability of a distribution class and what such learnability is robust against. We show that realizable learnability of a class of distributions implies its robust learnability with respect to only additive corruption, but not against subtractive corruption. We also explore related implications in the context of compression schemes and differentially private learnability., Comment: In NeurIPS 2023
- Published
- 2024
46. On Computing Pairwise Statistics with Local Differential Privacy
- Author
-
Ghazi, Badih, Kamath, Pritish, Kumar, Ravi, Manurangsi, Pasin, and Sealfon, Adam
- Subjects
Computer Science - Data Structures and Algorithms ,Computer Science - Cryptography and Security - Abstract
We study the problem of computing pairwise statistics, i.e., ones of the form $\binom{n}{2}^{-1} \sum_{i \ne j} f(x_i, x_j)$, where $x_i$ denotes the input to the $i$th user, with differential privacy (DP) in the local model. This formulation captures important metrics such as Kendall's $\tau$ coefficient, Area Under Curve, Gini's mean difference, Gini's entropy, etc. We give several novel and generic algorithms for the problem, leveraging techniques from DP algorithms for linear queries., Comment: Published in NeurIPS 2023
- Published
- 2024
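The target quantity in the entry above is the pairwise statistic $\binom{n}{2}^{-1} \sum_{i \ne j} f(x_i, x_j)$. The sketch below only computes this quantity non-privately for $f(x_i, x_j) = |x_i - x_j|$ (Gini's mean difference, up to the normalisation convention); the paper's local-DP protocols are more involved than this worked example.

    # Minimal non-private sketch of a pairwise statistic (hypothetical per-user values).
    import numpy as np
    from itertools import combinations

    x = np.array([1.0, 3.0, 7.0, 8.0])                  # one value per user (assumed)

    pairs = list(combinations(range(len(x)), 2))
    pairwise_stat = sum(abs(x[i] - x[j]) for i, j in pairs) / len(pairs)
    print(pairwise_stat)                                # 25 / 6 ~= 4.17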
47. Machine Unlearning Fails to Remove Data Poisoning Attacks
- Author
-
Pawelczyk, Martin, Di, Jimmy Z., Lu, Yiwei, Kamath, Gautam, Sekhari, Ayush, and Neel, Seth
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Cryptography and Security ,Computer Science - Computers and Society - Abstract
We revisit the efficacy of several practical methods for approximate machine unlearning developed for large-scale deep learning. In addition to complying with data deletion requests, one often-cited potential application for unlearning methods is to remove the effects of training on poisoned data. We experimentally demonstrate that, while existing unlearning methods have been demonstrated to be effective in a number of evaluation settings (e.g., alleviating membership inference attacks), they fail to remove the effects of data poisoning, across a variety of types of poisoning attacks (indiscriminate, targeted, and a newly-introduced Gaussian poisoning attack) and models (image classifiers and LLMs); even when granted a relatively large compute budget. In order to precisely characterize unlearning efficacy, we introduce new evaluation metrics for unlearning based on data poisoning. Our results suggest that a broader perspective, including a wider variety of evaluations, is required to avoid a false sense of confidence in machine unlearning procedures for deep learning without provable guarantees. Moreover, while unlearning methods show some signs of being useful to efficiently remove poisoned datapoints without having to retrain, our work suggests that these methods are not yet "ready for prime time", and currently provide limited benefit over retraining.
- Published
- 2024
48. Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
- Author
-
Chua, Lynn, Ghazi, Badih, Huang, Yangsibo, Kamath, Pritish, Kumar, Ravi, Manurangsi, Pasin, Sinha, Amer, Xie, Chulin, and Zhang, Chiyuan
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora. But can these models relate corresponding concepts across languages, i.e., be crosslingual? This study evaluates state-of-the-art LLMs on inherently crosslingual tasks. We observe that while these models show promising surface-level crosslingual abilities on machine translation and embedding space analyses, they struggle with deeper crosslingual knowledge transfer, revealing a crosslingual knowledge barrier in both general (MMLU benchmark) and domain-specific (Harry Potter quiz and TOFU benchmark) contexts. Since simple inference-time mitigation methods offer only limited improvement, we propose fine-tuning of LLMs on mixed-language data, which effectively reduces these gaps, even when using out-of-domain datasets like WikiText. Our findings suggest the need for explicit optimization to unlock the full crosslingual potential of LLMs. Our code is publicly available at https://github.com/google-research/crosslingual-knowledge-barriers.
- Published
- 2024
49. Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning
- Author
-
Chua, Lynn, Ghazi, Badih, Huang, Yangsibo, Kamath, Pritish, Kumar, Ravi, Liu, Daogao, Manurangsi, Pasin, Sinha, Amer, and Zhang, Chiyuan
- Subjects
Computer Science - Computation and Language ,Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Large language models (LLMs) have emerged as powerful tools for tackling complex tasks across diverse domains, but they also raise privacy concerns when fine-tuned on sensitive data due to potential memorization. While differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit, current evaluations on LLMs mostly treat each example (text record) as the privacy unit. This leads to uneven user privacy guarantees when contributions per user vary. We therefore study user-level DP motivated by applications where it is necessary to ensure uniform privacy protection across users. We present a systematic evaluation of user-level DP for LLM fine-tuning on natural language generation tasks. Focusing on two mechanisms for achieving user-level DP guarantees, Group Privacy and User-wise DP-SGD, we investigate design choices like data selection strategies and parameter tuning for the best privacy-utility tradeoff., Comment: Published as a conference paper at COLM 2024
- Published
- 2024
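One of the two mechanisms named in the entry above, user-wise DP-SGD, rests on clipping at the user level rather than the example level. The numpy-only sketch below shows that clipping-and-noising step with arbitrary parameters; it is not the paper's training code, and in practice the noise scale would be derived from the target (epsilon, delta).

    # Minimal sketch of user-level clipping for a user-wise DP-SGD step.
    import numpy as np

    rng = np.random.default_rng(0)
    C = 1.0                      # per-user clipping norm (assumed)
    noise_multiplier = 1.1       # assumed; set from the privacy budget in practice

    # Hypothetical per-example gradients, grouped by user.
    per_user_grads = [rng.normal(size=(n_examples, 10)) for n_examples in (3, 7, 2)]

    clipped = []
    for g in per_user_grads:
        user_grad = g.mean(axis=0)                      # one aggregated gradient per user
        norm = np.linalg.norm(user_grad)
        clipped.append(user_grad * min(1.0, C / norm))  # clip to norm at most C

    noisy_sum = np.sum(clipped, axis=0) + rng.normal(scale=noise_multiplier * C, size=10)
    update = noisy_sum / len(per_user_grads)            # averaged noisy update
    print(update.shape)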
50. Regularising Spectral Curves for Homogeneous Yang-Baxter strings
- Author
-
Driezen, Sibylle and Kamath, Niranjan
- Subjects
High Energy Physics - Theory - Abstract
In this Letter, we study the semi-classical spectrum of integrable worldsheet $\sigma$-models using the Spectral Curve. We consider a Homogeneous Yang-Baxter deformation of the $AdS_5\times S^5$ superstring, understood as the composition of a Jordanian with a "non-diagonal" TsT deformation. We derive its type IIB supergravity solution, whose isometry algebra features zero supercharges and a non-relativistic conformal algebra in $0+1$ dimensions. While the Spectral Curves of non-diagonal TsT models are ill-defined, we demonstrate that the composition with a Jordanian model regularises this issue. From the regularised Curve, we derive the one-loop shift of the classical energy and the semi-classical spectrum of excitations of a point-like string. In the TsT limit, the one-loop shift vanishes despite the loss of supersymmetry. Our results suggest that it may be possible to use standard Bethe Ansatze on spin chain pictures of deformed $N=4$ Super-Yang-Mills theory dual to non-diagonal TsT models., Comment: 8 pages; v2: published version, added references
- Published
- 2024