Author: "Shetty A." - Searchworks@Jio Institute Digital Library Search Results

Your search for author "Shetty A." returned 89,982 results

1. Audio-Based Classification of Insect Species Using Machine Learning Models: Cicada, Beetle, Termite, and Cricket

Author: Shetty, Manas V and Kumar, Yoga Disha Sendhil
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This project addresses the challenge of classifying insect species: Cicada, Beetle, Termite, and Cricket using sound recordings. Accurate species identification is crucial for ecological monitoring and pest management. We employ machine learning models such as XGBoost, Random Forest, and K Nearest Neighbors (KNN) to analyze audio features, including Mel Frequency Cepstral Coefficients (MFCC). The potential novelty of this work lies in the combination of diverse audio features and machine learning models to tackle insect classification, specifically focusing on capturing subtle acoustic variations between species that have not been fully leveraged in previous research. The dataset is compiled from various open sources, and we anticipate achieving high classification accuracy, contributing to improved automated insect detection systems.
Published: 2025

2. VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare

Author: Shetty, Anudeex, Beheshti, Amin, Dras, Mark, and Naseem, Usman
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Alignment techniques have become central to ensuring that Large Language Models (LLMs) generate outputs consistent with human values. However, existing alignment paradigms often model an averaged or monolithic preference, failing to account for the diversity of perspectives across cultures, demographics, and communities. This limitation is particularly critical in health-related scenarios, where plurality is essential due to the influence of culture, religion, personal values, and conflicting opinions. Despite progress in pluralistic alignment, no prior work has focused on health, likely due to the lack of publicly available datasets. To address this gap, we introduce VITAL, a new benchmark dataset comprising 13.1K value-laden situations and 5.4K multiple-choice questions focused on health, designed to assess and benchmark pluralistic alignment methodologies. Through extensive evaluation of eight LLMs of varying sizes, we demonstrate that existing pluralistic alignment techniques fall short in effectively accommodating diverse healthcare beliefs, underscoring the need for tailored AI alignment in specific domains. This work highlights the limitations of current approaches and lays the groundwork for developing health-specific alignment solutions., Comment: Under review
Published: 2025

3. Low-Rank Thinning

Author: Carrell, Annabelle Michael, Gong, Albert, Shetty, Abhishek, Dwivedi, Raaz, and Mackey, Lester
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Mathematics - Optimization and Control, Mathematics - Statistics Theory, Statistics - Methodology
Abstract: The goal in thinning is to summarize a dataset using a small set of representative points. Remarkably, sub-Gaussian thinning algorithms like Kernel Halving and Compress can match the quality of uniform subsampling while substantially reducing the number of summary points. However, existing guarantees cover only a restricted range of distributions and kernel-based quality measures and suffer from pessimistic dimension dependence. To address these deficiencies, we introduce a new low-rank analysis of sub-Gaussian thinning that applies to any distribution and any kernel, guaranteeing high-quality compression whenever the kernel or data matrix is approximately low-rank. To demonstrate the broad applicability of the techniques, we design practical sub-Gaussian thinning approaches that improve upon the best known guarantees for approximating attention in transformers, accelerating stochastic gradient training through reordering, and distinguishing distributions in near-linear time.
Published: 2025

4. PolyPath: Adapting a Large Multimodal Model for Multi-slide Pathology Report Generation

Author: Ahmed, Faruk, Yang, Lin, Jaroensri, Tiam, Sellergren, Andrew, Matias, Yossi, Hassidim, Avinatan, Corrado, Greg S., Webster, Dale R., Shetty, Shravya, Prabhakara, Shruthi, Liu, Yun, Golden, Daniel, Wulczyn, Ellery, and Steiner, David F.
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The interpretation of histopathology cases underlies many important diagnostic and treatment decisions in medicine. Notably, this process typically requires pathologists to integrate and summarize findings across multiple slides per case. Existing vision-language capabilities in computational pathology have so far been largely limited to small regions of interest, larger regions at low magnification, or single whole-slide images (WSIs). This limits interpretation of findings that span multiple high-magnification regions across multiple WSIs. By making use of Gemini 1.5 Flash, a large multimodal model (LMM) with a 1-million token context window, we demonstrate the ability to generate bottom-line diagnoses from up to 40,000 768x768 pixel image patches from multiple WSIs at 10X magnification. This is the equivalent of up to 11 hours of video at 1 fps. Expert pathologist evaluations demonstrate that the generated report text is clinically accurate and equivalent to or preferred over the original reporting for 68% (95% CI: [60%, 76%]) of multi-slide examples with up to 5 slides. While performance decreased for examples with 6 or more slides, this study demonstrates the promise of leveraging the long-context capabilities of modern LMMs for the uniquely challenging task of medical report generation where each case can contain thousands of image patches., Comment: 8 main pages, 21 pages in total
Published: 2025

5. Enhancing Age-Related Robustness in Children Speaker Verification

Author: Shetty, Vishwas M., Zheng, Jiusi, Lulich, Steven M., and Alwan, Abeer
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: One of the main challenges in children's speaker verification (C-SV) is the significant change in children's voices as they grow. In this paper, we propose two approaches to improve age-related robustness in C-SV. We first introduce a Feature Transform Adapter (FTA) module that integrates local patterns into higher-level global representations, reducing overfitting to specific local features and improving the inter-year SV performance of the system. We then employ Synthetic Audio Augmentation (SAA) to increase data diversity and size, thereby improving robustness against age-related changes. Since the lack of longitudinal speech datasets makes it difficult to measure age-related robustness of C-SV systems, we introduce a longitudinal dataset to assess inter-year verification robustness of C-SV systems. By integrating both of our proposed methods, the average equal error rate was reduced by 19.4%, 13.0%, and 6.1% in the one-year, two-year, and three-year gap inter-year evaluation sets, respectively, compared to the baseline., Comment: Accepted to ICASSP 2025
Published: 2025

6. Small Loss Bounds for Online Learning Separated Function Classes: A Gaussian Process Perspective

Author: Block, Adam and Shetty, Abhishek
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In order to develop practical and efficient algorithms while circumventing overly pessimistic computational lower bounds, recent work has been interested in developing oracle-efficient algorithms in a variety of learning settings. Two such settings of particular interest are online and differentially private learning. While seemingly different, these two fields are fundamentally connected by the requirement that successful algorithms in each case satisfy stability guarantees; in particular, recent work has demonstrated that algorithms for online learning whose performance adapts to beneficial problem instances, attaining the so-called small-loss bounds, require a form of stability similar to that of differential privacy. In this work, we identify the crucial role that separation plays in allowing oracle-efficient algorithms to achieve this strong stability. Our notion, which we term $\rho$-separation, generalizes and unifies several previous approaches to enforcing this strong stability, including the existence of small-separator sets and the recent notion of $\gamma$-approximability. We present an oracle-efficient algorithm that is capable of achieving small-loss bounds with improved rates in greater generality than previous work, as well as a variant for differentially private learning that attains optimal rates, again under our separation condition. In so doing, we prove a new stability result for minimizers of a Gaussian process that strengthens and generalizes previous work.
Published: 2025

7. Motion Control in Multi-Rotor Aerial Robots Using Deep Reinforcement Learning

Author: Shetty, Gaurav, Ramezani, Mahya, Habibi, Hamed, Voos, Holger, and Sanchez-Lopez, Jose Luis
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence
Abstract: This paper investigates the application of Deep Reinforcement Learning (DRL) to address motion control challenges in drones for additive manufacturing (AM). Drone-based additive manufacturing promises flexible and autonomous material deposition in large-scale or hazardous environments. However, achieving robust real-time control of a multi-rotor aerial robot under varying payloads and potential disturbances remains challenging. Traditional controllers like PID often require frequent parameter re-tuning, limiting their applicability in dynamic scenarios. We propose a DRL framework that learns adaptable control policies for multi-rotor drones performing waypoint navigation in AM tasks. We compare Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3) within a curriculum learning scheme designed to handle increasing complexity. Our experiments show TD3 consistently balances training stability, accuracy, and success, particularly when mass variability is introduced. These findings provide a scalable path toward robust, autonomous drone control in additive manufacturing.
Published: 2025

8. Real-Time Brain Tumor Detection in Intraoperative Ultrasound Using YOLO11: From Model Training to Deployment in the Operating Room

Author: Cepeda, Santiago, Esteban-Sinovas, Olga, Romero, Roberto, Singh, Vikas, Shetty, Prakash, Moiyadi, Aliasgar, Zemmoura, Ilyess, Giammalva, Giuseppe Roberto, Del Bene, Massimiliano, Barbotti, Arianna, DiMeco, Francesco, West, Timothy R., Nahed, Brian V., Arrese, Ignacio, Hornero, Roberto, and Sarabia, Rosario
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Intraoperative ultrasound (ioUS) is a valuable tool in brain tumor surgery due to its versatility, affordability, and seamless integration into the surgical workflow. However, its adoption remains limited, primarily because of the challenges associated with image interpretation and the steep learning curve required for effective use. This study aimed to enhance the interpretability of ioUS images by developing a real-time brain tumor detection system deployable in the operating room. We collected 2D ioUS images from the Brain Tumor Intraoperative Database (BraTioUS) and the public ReMIND dataset, annotated with expert-refined tumor labels. Using the YOLO11 architecture and its variants, we trained object detection models to identify brain tumors. The dataset included 1,732 images from 192 patients, divided into training, validation, and test sets. Data augmentation expanded the training set to 11,570 images. In the test dataset, YOLO11s achieved the best balance of precision and computational efficiency, with a mAP@50 of 0.95, mAP@50-95 of 0.65, and a processing speed of 34.16 frames per second. The proposed solution was prospectively validated in a cohort of 15 consecutively operated patients diagnosed with brain tumors. Neurosurgeons confirmed its seamless integration into the surgical workflow, with real-time predictions accurately delineating tumor regions. These findings highlight the potential of real-time object detection algorithms to enhance ioUS-guided brain tumor surgery, addressing key challenges in interpretation and providing a foundation for future development of computer vision-based tools for neuro-oncological surgery.
Published: 2025

9. Dynamic Metal-Support Interaction Dictates Cu Nanoparticle Sintering on Al$_2$O$_3$ Surfaces

Author: Xu, Jiayan, Das, Shreeja, Pathak, Amar Deep, Patra, Abhirup, Shetty, Sharan, Hohl, Detlef, and Car, Roberto
Subjects: Condensed Matter - Materials Science, Physics - Chemical Physics
Abstract: Nanoparticle sintering remains a critical challenge in heterogeneous catalysis. In this work, we present a unified deep potential (DP) model for Cu nanoparticles on three Al$_2$O$_3$ surfaces ($\gamma$-Al$_2$O$_3$(100), $\gamma$-Al$_2$O$_3$(110), and $\alpha$-Al$_2$O$_3$(0001)). Using DP-accelerated simulations, we reveal striking facet-dependent nanoparticle stability and mobility patterns across the three surfaces. The nanoparticles diffuse several times faster on $\alpha$-Al$_2$O$_3$(0001) than on $\gamma$-Al$_2$O$_3$(100) at 800 K, although they would be expected to be more sluggish based on their larger binding energy at 0 K. Diffusion is facilitated by dynamic metal-support interaction (MSI), where the Al atoms switch out of the surface plane to optimize contact with the nanoparticle and relax back to the plane as the nanoparticle moves away. In contrast, the MSI on $\gamma$-Al$_2$O$_3$(100) and on $\gamma$-Al$_2$O$_3$(110) is dominated by more stable and directional Cu-O bonds, consistent with the limited diffusion observed on these surfaces. Our extended long-time MD simulations provide quantitative insights into the sintering processes, showing that the dispersity of nanoparticles (the initial inter-nanoparticle distance) strongly influences coalescence driven by nanoparticle diffusion. We observed that the coalescence of Cu$_{13}$ nanoparticles on $\alpha$-Al$_2$O$_3$(0001) can occur in a short time (10 ns) at 800 K even with an initial inter-nanoparticle distance increased to 30 Å, while the coalescence on $\gamma$-Al$_2$O$_3$(100) is inhibited significantly by increasing the initial inter-nanoparticle distance from 15 Å to 30 Å. These findings demonstrate that the dynamics of the supporting surface is crucial to understanding the sintering mechanism and offer guidance for designing sinter-resistant catalysts by engineering the support morphology., Comment: 33 pages, 5 figures; update plot style of fig. 3
Published: 2025

10. AI Guide Dog: Egocentric Path Prediction on Smartphone

Author: Jadhav, Aishwarya, Cao, Jeffery, Shetty, Abhishree, Kumar, Urvashi Priyam, Sharma, Aditi, Sukboontip, Ben, Tamarapalli, Jayant Sravan, Zhang, Jingyi, and Koul, Anirudh
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
Abstract: This paper presents AI Guide Dog (AIGD), a lightweight egocentric (first-person) navigation system for visually impaired users, designed for real-time deployment on smartphones. AIGD employs a vision-only multi-label classification approach to predict directional commands, ensuring safe navigation across diverse environments. We introduce a novel technique for goal-based outdoor navigation by integrating GPS signals and high-level directions, while also handling uncertain multi-path predictions for destination-free indoor navigation. As the first navigation assistance system to handle both goal-oriented and exploratory navigation across indoor and outdoor settings, AIGD establishes a new benchmark in blind navigation. We present methods, datasets, evaluations, and deployment insights to encourage further innovations in assistive navigation systems., Comment: Accepted at the AAAI 2025 Spring Symposium on Human-Compatible AI for Well-being: Harnessing Potential of GenAI for AI-Powered Science
Published: 2025

11. AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds

Author: Chen, Yinfang, Shetty, Manish, Somashekar, Gagan, Ma, Minghua, Simmhan, Yogesh, Mace, Jonathan, Bansal, Chetan, Wang, Rujia, and Rajmohan, Saravan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Multiagent Systems, Computer Science - Software Engineering
Abstract: AI for IT Operations (AIOps) aims to automate complex operational tasks, such as fault localization and root cause analysis, to reduce human workload and minimize customer impact. While traditional DevOps tools and AIOps algorithms often focus on addressing isolated operational tasks, recent advances in Large Language Models (LLMs) and AI agents are revolutionizing AIOps by enabling end-to-end and multitask automation. This paper envisions a future where AI agents autonomously manage operational tasks throughout the entire incident lifecycle, leading to self-healing cloud systems, a paradigm we term AgentOps. Realizing this vision requires a comprehensive framework to guide the design, development, and evaluation of these agents. To this end, we present AIOPSLAB, a framework that not only deploys microservice cloud environments, injects faults, generates workloads, and exports telemetry data but also orchestrates these components and provides interfaces for interacting with and evaluating agents. We discuss the key requirements for such a holistic framework and demonstrate how AIOPSLAB can facilitate the evaluation of next-generation AIOps agents. Through evaluations of state-of-the-art LLM agents within the benchmark created by AIOPSLAB, we provide insights into their capabilities and limitations in handling complex operational tasks in cloud environments.
Published: 2025

12. Culturally Responsive Teaching through Spatial Justice in Urban Neighborhoods

Author: Schlemper, M. Beth, Shetty, Sujata, Yamoah, Owusua, Czajkowski, Kevin, and Stewart, Victoria
Abstract: As part of a National Science Foundation (NSF)-funded project to create a culturally responsive curriculum that uses critical spatial thinking and geospatial technologies to address spatial justice in urban neighborhoods, students from a predominantly African American inner-city public high school in Toledo, Ohio participated in two summer workshops in 2015 and 2016. Using a range of data sources including sketch maps, interviews, and neighborhood-based student projects, this paper addresses two research questions: (a) How did students' spatial narratives of their community change as a result of using a culturally responsive teaching approach to explore neighborhood challenges? (b) How do a culturally responsive teaching approach and a critical geography perspective support spatial justice among youth?
Published: 2025

13. Deliberate Design: Creating Electricity Rates with Purpose

Author: Hledik, Ryan, Sergici, Sanem, Shetty, Sai, and Cappers, Peter
Abstract: Today’s electricity rates often are legacy designs that do not reflect the dynamics of an evolving power grid or align with current policy objectives. Four steps will assist utilities, regulators, and industry stakeholders in modernizing outdated electricity rate designs. (1) Understand the context for rate design change: The power system is changing at a pace that the industry has not experienced for decades. It is essential to understand the implications of these changes so rates can evolve to remain consistent with changes to the underlying cost profile, customer preferences, and power system requirements. (2) Establish ratemaking objectives: Rates can do more than recover utility costs. They can be a tool for promoting desired outcomes such as improved energy affordability, flexible and efficient electricity consumption, or promoting technology adoption. First, these objectives must be clearly defined and prioritized. (3) Account for tradeoffs when designing new rates: Rate design is the art of balancing tradeoffs. It is essential to understand these tradeoffs when designing new rates, particularly if the rates are being used as a tool for accomplishing policy objectives that extend beyond the basic goal of cost reflectivity. (4) Transition to the new rates with a plan: The move to well-designed rates requires a transition plan. This will ensure that rate design changes do not happen in isolation and are consistent with a long-term, holistic vision. The report, published as an interactive web tool for which the content can be separately downloaded as a standalone document, is intended to allow state energy regulators, utility rates staff, and other industry stakeholders with an interest in rate design to selectively “drill down” on content that is relevant to their interests and situation.
Published: 2025

14. Sampling-Based Constrained Motion Planning with Products of Experts

Author: Razmjoo, Amirreza, Xue, Teng, Shetty, Suhan, and Calinon, Sylvain
Subjects: Computer Science - Robotics
Abstract: We present a novel approach to enhance the performance of sampling-based Model Predictive Control (MPC) in constrained optimization by leveraging products of experts. Our methodology divides the main problem into two components: one focused on optimality and the other on feasibility. By combining the solutions from each component, represented as distributions, we apply products of experts to implement a project-then-sample strategy. In this strategy, the optimality distribution is projected into the feasible area, allowing for more efficient sampling. This approach contrasts with the traditional sample-then-project method, leading to more diverse exploration and reducing the accumulation of samples on the boundaries. We demonstrate an effective implementation of this principle using a tensor train-based distribution model, which is characterized by its non-parametric nature, ease of combination with other distributions at the task level, and straightforward sampling technique. We adapt existing tensor train models to suit this purpose and validate the efficacy of our approach through experiments in various tasks, including obstacle avoidance, non-prehensile manipulation, and tasks involving staying on manifolds. Our experimental results demonstrate that the proposed method consistently outperforms known baselines, providing strong empirical support for its effectiveness.
Published: 2024

15. Syzygy: Dual Code-Test C to (safe) Rust Translation using LLMs and Dynamic Analysis

Author: Shetty, Manish, Jain, Naman, Godbole, Adwait, Seshia, Sanjit A., and Sen, Koushik
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Programming Languages, I.2, D.2, D.3
Abstract: Despite extensive usage in high-performance, low-level systems programming applications, C is susceptible to vulnerabilities due to manual memory management and unsafe pointer operations. Rust, a modern systems programming language, offers a compelling alternative. Its unique ownership model and type system ensure memory safety without sacrificing performance. In this paper, we present Syzygy, an automated approach to translate C to safe Rust. Our technique uses a synergistic combination of LLM-driven code and test translation guided by dynamic-analysis-generated execution information. This paired translation runs incrementally in a loop over the program in dependency order of the code elements while maintaining per-step correctness. Our approach exposes novel insights on combining the strengths of LLMs and dynamic analysis in the context of scaling and combining code generation with testing. We apply our approach to successfully translate Zopfli, a high-performance compression library with ~3000 lines of code and 98 functions. We validate the translation by testing equivalence with the source C program on a set of inputs. To our knowledge, this is the largest automated and test-validated C to safe Rust code translation achieved so far., Comment: Project webpage at https://syzygy-project.github.io/. Preliminary version accepted at LLM4Code 2025, 34 pages
Published: 2024

16. On the Use of Abundant Road Speed Data for Travel Demand Calibration of Urban Traffic Simulators

Author: Vishnoi, Suyash, Shetty, Akhil, Tsogsuren, Iveel, Arora, Neha, and Osorio, Carolina
Subjects: Computer Science - Multiagent Systems
Abstract: This work develops a compute-efficient algorithm to tackle a fundamental problem in transportation: that of urban travel demand estimation. It focuses on the calibration of origin-destination travel demand input parameters for high-resolution traffic simulation models. It considers the use of abundant traffic road speed data. The travel demand calibration problem is formulated as a continuous, high-dimensional, simulation-based optimization (SO) problem with bound constraints. There is a lack of compute-efficient algorithms to tackle this problem. We propose the use of an SO algorithm that relies on an efficient, analytical, differentiable, physics-based traffic model, known as a metamodel or surrogate model. We formulate a metamodel that enables the use of road speed data. Tests are performed on a Salt Lake City network. We study how the amount of data, as well as the congestion levels, impact both in-sample and out-of-sample performance. The proposed method outperforms the benchmark for both in-sample and out-of-sample performance by 84.4% and 72.2% in terms of speeds and counts, respectively. Most importantly, the proposed method yields the highest compute efficiency, identifying solutions with good performance within a few simulation function evaluations (i.e., with small samples)., Comment: 4 pages
Published: 2024

17. Robust Contact-rich Manipulation through Implicit Motor Adaptation

Author: Xue, Teng, Razmjoo, Amirreza, Shetty, Suhan, and Calinon, Sylvain
Subjects: Computer Science - Robotics
Abstract: Contact-rich manipulation plays a vital role in daily human activities, yet uncertain physical parameters pose significant challenges for both model-based and model-free planning and control. A promising approach to address this challenge is to develop policies robust to a wide range of parameters. Domain adaptation and domain randomization are commonly used to achieve such policies but often compromise generalization to new instances or perform conservatively due to neglecting instance-specific information. Explicit motor adaptation addresses these issues by estimating system parameters online and then retrieving the parameter-conditioned policy from a parameter-augmented base policy. However, it typically relies on precise system identification or additional high-quality policy retraining, presenting substantial challenges for contact-rich tasks with diverse physical parameters. In this work, we propose implicit motor adaptation, which leverages tensor factorization as an implicit representation of the base policy. Given a roughly estimated parameter distribution, the parameter-conditioned policy can be efficiently derived by exploiting the separable structure of tensor cores from the base policy. This framework eliminates the need for precise system estimation and policy retraining while preserving optimal behavior and strong generalization. We provide a theoretical analysis validating this method, supported by numerical evaluations on three contact-rich manipulation primitives. Both simulation and real-world experiments demonstrate its ability to generate robust policies for diverse instances.
Published: 2024

18. A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

Author: Singh, Aditi, Shetty, Akash, Ehtesham, Abul, Kumar, Saket, and Khoei, Tala Talaei
Subjects: Computer Science - Artificial Intelligence, Computer Science - Databases
Abstract: Text-to-SQL systems facilitate smooth interaction with databases by translating natural language queries into Structured Query Language (SQL), bridging the gap between non-technical users and complex database management systems. This survey provides a comprehensive overview of the evolution of AI-driven text-to-SQL systems, highlighting their foundational components, advancements in large language model (LLM) architectures, and the critical role of datasets such as Spider, WikiSQL, and CoSQL in driving progress. We examine the applications of text-to-SQL in domains like healthcare, education, and finance, emphasizing their transformative potential for improving data accessibility. Additionally, we analyze persistent challenges, including domain generalization, query optimization, support for multi-turn conversational interactions, and the limited availability of datasets tailored for NoSQL databases and dynamic real-world scenarios. To address these challenges, we outline future research directions, such as extending text-to-SQL capabilities to support NoSQL databases, designing datasets for dynamic multi-turn interactions, and optimizing systems for real-world scalability and robustness. By surveying current advancements and identifying key gaps, this paper aims to guide the next generation of research and applications in LLM-based text-to-SQL systems.
Published: 2024

19. Privacy Drift: Evolving Privacy Concerns in Incremental Learning

Author: Ahamed, Sayyed Farid, Banerjee, Soumya, Roy, Sandip, Kapoor, Aayush, Vucovich, Marc, Choi, Kevin, Rahman, Abdul, Bowen, Edward, and Shetty, Sachin
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: In the evolving landscape of machine learning (ML), Federated Learning (FL) presents a paradigm shift towards decentralized model training while preserving user data privacy. This paper introduces the concept of "privacy drift", an innovative framework that parallels the well-known phenomenon of concept drift. While concept drift addresses the variability in model accuracy over time due to changes in the data, privacy drift encapsulates the variation in the leakage of private information as models undergo incremental training. By defining and examining privacy drift, this study aims to unveil the nuanced relationship between the evolution of model performance and the integrity of data privacy. Through rigorous experimentation, we investigate the dynamics of privacy drift in FL systems, focusing on how model updates and data distribution shifts influence the susceptibility of models to privacy attacks, such as membership inference attacks (MIA). Our results highlight a complex interplay between model accuracy and privacy safeguards, revealing that enhancements in model performance can lead to increased privacy risks. We provide empirical evidence from experiments on customized datasets derived from CIFAR-100 (Canadian Institute for Advanced Research, 100 classes), showcasing the impact of data and concept drift on privacy. This work lays the groundwork for future research on privacy-aware machine learning, aiming to achieve a delicate balance between model accuracy and data privacy in decentralized environments., Comment: 6 pages, 7 figures, Accepted in IEEE ICNC 25
Published: 2024

20. Designing a Secure, Scalable, and Cost-Effective Cloud Storage Solution: A Novel Approach to Data Management using NextCloud, TrueNAS, and QEMU/KVM

Author: Aryan, Prakash and Shetty, Sujala Deepak
Subjects: Computer Science - Databases, Computer Science - Cryptography and Security
Abstract: This paper presents a novel approach to cloud storage challenges by integrating NextCloud, TrueNAS, and QEMU/KVM. Our research demonstrates how this combination creates a robust, flexible, and economical cloud storage system suitable for various applications. We detail the architecture, highlighting TrueNAS's ZFS-based storage, QEMU/KVM's virtualization, and NextCloud's user interface. Extensive testing shows superior data integrity and protection compared to traditional solutions. Performance benchmarks reveal high read/write speeds (up to 1.22 GB/s for sequential reads and 620 MB/s for writes) and efficient small file handling. We demonstrate the solution's scalability under increasing workloads. Security analysis showcases effective jail isolation techniques in TrueNAS. Cost analysis indicates a potential 50% reduction in total ownership cost over five years compared to commercial alternatives. This research contributes a practical, high-performance, cost-effective alternative to proprietary solutions, paving new ways for organizations to implement secure, scalable cloud storage while maintaining data control. Future work will focus on improving automated scaling and integration with emerging technologies like containerization and serverless computing.
Published: 2024

21. ProPLIKS: Probablistic 3D human body pose estimation

Author: Shetty, Karthik, Birkhold, Annette, Egger, Bernhard, Jaganathan, Srikrishna, Strobel, Norbert, Kowarschik, Markus, and Maier, Andreas
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present a novel approach for 3D human pose estimation by employing probabilistic modeling. This approach leverages the advantages of normalizing flows in non-Euclidean geometries to address uncertain poses. Specifically, our method employs a normalizing flow tailored to the SO(3) rotational group, incorporating a coupling mechanism based on the Möbius transformation. This enables the framework to accurately represent any distribution on SO(3), effectively addressing issues related to discontinuities. Additionally, we reinterpret the challenge of reconstructing 3D human figures from 2D pixel-aligned inputs as the task of mapping these inputs to a range of probable poses. This perspective acknowledges the intrinsic ambiguity of the task and facilitates a straightforward integration method for multi-view scenarios. The combination of these strategies showcases the effectiveness of probabilistic models in complex scenarios for human pose estimation techniques. Our approach notably surpasses existing methods in the field of pose estimation. We also validate our methodology on human pose estimation from RGB images as well as medical X-Ray datasets.
Published: 2024

22. Understanding the World's Museums through Vision-Language Reasoning

Author: Balauca, Ada-Astrid, Garai, Sanjana, Balauca, Stefan, Shetty, Rasesh Udayakumar, Agrawal, Naitik, Shah, Dhwanil Subhashbhai, Fu, Yuqian, Wang, Xi, Toutanova, Kristina, Paudel, Danda Pani, and Van Gool, Luc
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Museums serve as vital repositories of cultural heritage and historical artifacts spanning diverse epochs, civilizations, and regions, preserving well-documented collections. Data reveal key attributes such as age, origin, material, and cultural significance. Understanding museum exhibits from their images requires reasoning beyond visual features. In this work, we facilitate such reasoning by (a) collecting and curating a large-scale dataset of 65M images and 200M question-answer pairs in the standard museum catalog format for exhibits from all around the world; (b) training large vision-language models on the collected dataset; (c) benchmarking their ability on five visual question answering tasks. The complete dataset is labeled by museum experts, ensuring the quality as well as the practical significance of the labels. We train two VLMs from different categories: the BLIP model, with vision-language aligned embeddings, but lacking the expressive power of large language models, and the LLaVA model, a powerful instruction-tuned LLM enriched with vision-language reasoning capabilities. Through exhaustive experiments, we provide several insights on the complex and fine-grained understanding of museum exhibits. In particular, we show that some questions whose answers can often be derived directly from visual features are well answered by both types of models. On the other hand, questions that require the grounding of the visual features in repositories of human knowledge are better answered by the large vision-language models, thus demonstrating their superior capacity to perform the desired reasoning. Find our dataset, benchmarks, and source code at: https://github.com/insait-institute/Museum-65
Published: 2024

23. Health AI Developer Foundations

Author: Kiraly, Atilla P., Baur, Sebastien, Philbrick, Kenneth, Mahvar, Fereshteh, Yatziv, Liron, Chen, Tiffany, Sterling, Bram, George, Nick, Jamil, Fayaz, Tang, Jing, Bailey, Kai, Ahmed, Faruk, Goel, Akshay, Ward, Abbi, Yang, Lin, Sellergren, Andrew, Matias, Yossi, Hassidim, Avinatan, Shetty, Shravya, Golden, Daniel, Azizi, Shekoofeh, Steiner, David F., Liu, Yun, Thelin, Tim, Pilgrim, Rory, and Kirmizibayrak, Can
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Robust medical Machine Learning (ML) models have the potential to revolutionize healthcare by accelerating clinical research, improving workflows and outcomes, and producing novel insights or capabilities. Developing such ML models from scratch is cost prohibitive and requires substantial compute, data, and time (e.g., expert labeling). To address these challenges, we introduce Health AI Developer Foundations (HAI-DEF), a suite of pre-trained, domain-specific foundation models, tools, and recipes to accelerate building ML for health applications. The models cover various modalities and domains, including radiology (X-rays and computed tomography), histopathology, dermatological imaging, and audio. These models provide domain specific embeddings that facilitate AI development with less labeled data, shorter training times, and reduced computational costs compared to traditional approaches. In addition, we utilize a common interface and style across these models, and prioritize usability to enable developers to integrate HAI-DEF efficiently. We present model evaluations across various tasks and conclude with a discussion of their application and evaluation, covering the importance of ensuring efficacy, fairness, and equity. Finally, while HAI-DEF and specifically the foundation models lower the barrier to entry for ML in healthcare, we emphasize the importance of validation with problem- and population-specific data for each desired usage setting. This technical report will be updated over time as more modalities and features are added., Comment: 16 pages, 8 figures
Published: 2024

24. General Geospatial Inference with a Population Dynamics Foundation Model

Author: Agarwal, Mohit, Sun, Mimi, Kamath, Chaitanya, Muslim, Arbaaz, Sarker, Prithul, Paul, Joydeep, Yee, Hector, Sieniek, Marcin, Jablonski, Kim, Mayer, Yael, Fork, David, de Guia, Sheila, McPike, Jamie, Boulanger, Adam, Shekel, Tomer, Schottlander, David, Xiao, Yao, Manukonda, Manjit Chakravarthy, Liu, Yun, Bulut, Neslihan, Abu-el-haija, Sami, Perozzi, Bryan, Bharel, Monica, Nguyen, Von, Barrington, Luke, Efron, Niv, Matias, Yossi, Corrado, Greg, Eswaran, Krish, Prabhakara, Shruthi, Shetty, Shravya, and Prasad, Gautam
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society
Abstract: Supporting the health and well-being of dynamic populations around the world requires governmental agencies, organizations and researchers to understand and reason over complex relationships between human behavior and local contexts in order to identify high-risk groups and strategically allocate limited resources. Traditional approaches to these classes of problems often entail developing manually curated, task-specific features and models to represent human behavior and the natural and built environment, which can be challenging to adapt to new, or even, related tasks. To address this, we introduce a Population Dynamics Foundation Model (PDFM) that aims to capture the relationships between diverse data modalities and is applicable to a broad range of geospatial tasks. We first construct a geo-indexed dataset for postal codes and counties across the United States, capturing rich aggregated information on human behavior from maps, busyness, and aggregated search trends, and environmental factors such as weather and air quality. We then model this data and the complex relationships between locations using a graph neural network, producing embeddings that can be adapted to a wide range of downstream tasks using relatively simple models. We evaluate the effectiveness of our approach by benchmarking it on 27 downstream tasks spanning three distinct domains: health indicators, socioeconomic factors, and environmental measurements. The approach achieves state-of-the-art performance on all 27 geospatial interpolation tasks, and on 25 out of the 27 extrapolation and super-resolution tasks. We combined the PDFM with a state-of-the-art forecasting foundation model, TimesFM, to predict unemployment and poverty, achieving performance that surpasses fully supervised forecasting. The full set of embeddings and sample code are publicly available for researchers., Comment: 28 pages, 16 figures, preprint; v4: updated authors
Published: 2024

25. Community search signatures as foundation features for human-centered geospatial modeling

Author: Sun, Mimi, Kamath, Chaitanya, Agarwal, Mohit, Muslim, Arbaaz, Yee, Hector, Schottlander, David, Bavadekar, Shailesh, Efron, Niv, Shetty, Shravya, and Prasad, Gautam
Subjects: Computer Science - Machine Learning
Abstract: Aggregated relative search frequencies offer a unique composite signal reflecting people's habits, concerns, interests, intents, and general information needs, which are not found in other readily available datasets. Temporal search trends have been successfully used in time series modeling across a variety of domains such as infectious diseases, unemployment rates, and retail sales. However, most existing applications require curating specialized datasets of individual keywords, queries, or query clusters, and the search data need to be temporally aligned with the outcome variable of interest. We propose a novel approach for generating an aggregated and anonymized representation of search interest as foundation features at the community level for geospatial modeling. We benchmark these features using spatial datasets across multiple domains. In zip codes with a population greater than 3000 that cover over 95% of the contiguous US population, our models for predicting missing values in a 20% set of holdout counties achieve an average $R^2$ score of 0.74 across 21 health variables, and 0.80 across 6 demographic and environmental variables. Our results demonstrate that these search features can be used for spatial predictions without strict temporal alignment, and that the resulting models outperform spatial interpolation and state of the art methods using satellite imagery features., Comment: 8 pages, 8 figures, presented at the DMLR workshop at ICML 2024
Published: 2024

26. GPT-4o System Card

Author: OpenAI, Hurst, Aaron, Lerer, Adam, Goucher, Adam P., Perelman, Adam, Ramesh, Aditya, Clark, Aidan, Ostrow, AJ, Welihinda, Akila, Hayes, Alan, Radford, Alec, Mądry, Aleksander, Baker-Whitcomb, Alex, Beutel, Alex, Borzunov, Alex, Carney, Alex, Chow, Alex, Kirillov, Alex, Nichol, Alex, Paino, Alex, Renzin, Alex, Passos, Alex Tachard, Kirillov, Alexander, Christakis, Alexi, Conneau, Alexis, Kamali, Ali, Jabri, Allan, Moyer, Allison, Tam, Allison, Crookes, Amadou, Tootoochian, Amin, Tootoonchian, Amin, Kumar, Ananya, Vallone, Andrea, Karpathy, Andrej, Braunstein, Andrew, Cann, Andrew, Codispoti, Andrew, Galu, Andrew, Kondrich, Andrew, Tulloch, Andrew, Mishchenko, Andrey, Baek, Angela, Jiang, Angela, Pelisse, Antoine, Woodford, Antonia, Gosalia, Anuj, Dhar, Arka, Pantuliano, Ashley, Nayak, Avi, Oliver, Avital, Zoph, Barret, Ghorbani, Behrooz, Leimberger, Ben, Rossen, Ben, Sokolowsky, Ben, Wang, Ben, Zweig, Benjamin, Hoover, Beth, Samic, Blake, McGrew, Bob, Spero, Bobby, Giertler, Bogo, Cheng, Bowen, Lightcap, Brad, Walkin, Brandon, Quinn, Brendan, Guarraci, Brian, Hsu, Brian, Kellogg, Bright, Eastman, Brydon, Lugaresi, Camillo, Wainwright, Carroll, Bassin, Cary, Hudson, Cary, Chu, Casey, Nelson, Chad, Li, Chak, Shern, Chan Jun, Conger, Channing, Barette, Charlotte, Voss, Chelsea, Ding, Chen, Lu, Cheng, Zhang, Chong, Beaumont, Chris, Hallacy, Chris, Koch, Chris, Gibson, Christian, Kim, Christina, Choi, Christine, McLeavey, Christine, Hesse, Christopher, Fischer, Claudia, Winter, Clemens, Czarnecki, Coley, Jarvis, Colin, Wei, Colin, Koumouzelis, Constantin, Sherburn, Dane, Kappler, Daniel, Levin, Daniel, Levy, Daniel, Carr, David, Farhi, David, Mely, David, Robinson, David, Sasaki, David, Jin, Denny, Valladares, Dev, Tsipras, Dimitris, Li, Doug, Nguyen, Duc Phong, Findlay, Duncan, Oiwoh, Edede, Wong, Edmund, Asdar, Ehsan, Proehl, Elizabeth, Yang, Elizabeth, Antonow, Eric, Kramer, Eric, Peterson, Eric, Sigler, Eric, Wallace, Eric, Brevdo, Eugene, Mays, Evan, 
Khorasani, Farzad, Such, Felipe Petroski, Raso, Filippo, Zhang, Francis, von Lohmann, Fred, Sulit, Freddie, Goh, Gabriel, Oden, Gene, Salmon, Geoff, Starace, Giulio, Brockman, Greg, Salman, Hadi, Bao, Haiming, Hu, Haitang, Wong, Hannah, Wang, Haoyu, Schmidt, Heather, Whitney, Heather, Jun, Heewoo, Kirchner, Hendrik, Pinto, Henrique Ponde de Oliveira, Ren, Hongyu, Chang, Huiwen, Chung, Hyung Won, Kivlichan, Ian, O'Connell, Ian, Osband, Ian, Silber, Ian, Sohl, Ian, Okuyucu, Ibrahim, Lan, Ikai, Kostrikov, Ilya, Sutskever, Ilya, Kanitscheider, Ingmar, Gulrajani, Ishaan, Coxon, Jacob, Menick, Jacob, Pachocki, Jakub, Aung, James, Betker, James, Crooks, James, Lennon, James, Kiros, Jamie, Leike, Jan, Park, Jane, Kwon, Jason, Phang, Jason, Teplitz, Jason, Wei, Jason, Wolfe, Jason, Chen, Jay, Harris, Jeff, Varavva, Jenia, Lee, Jessica Gan, Shieh, Jessica, Lin, Ji, Yu, Jiahui, Weng, Jiayi, Tang, Jie, Yu, Jieqi, Jang, Joanne, Candela, Joaquin Quinonero, Beutler, Joe, Landers, Joe, Parish, Joel, Heidecke, Johannes, Schulman, John, Lachman, Jonathan, McKay, Jonathan, Uesato, Jonathan, Ward, Jonathan, Kim, Jong Wook, Huizinga, Joost, Sitkin, Jordan, Kraaijeveld, Jos, Gross, Josh, Kaplan, Josh, Snyder, Josh, Achiam, Joshua, Jiao, Joy, Lee, Joyce, Zhuang, Juntang, Harriman, Justyn, Fricke, Kai, Hayashi, Kai, Singhal, Karan, Shi, Katy, Karthik, Kavin, Wood, Kayla, Rimbach, Kendra, Hsu, Kenny, Nguyen, Kenny, Gu-Lemberg, Keren, Button, Kevin, Liu, Kevin, Howe, Kiel, Muthukumar, Krithika, Luther, Kyle, Ahmad, Lama, Kai, Larry, Itow, Lauren, Workman, Lauren, Pathak, Leher, Chen, Leo, Jing, Li, Guy, Lia, Fedus, Liam, Zhou, Liang, Mamitsuka, Lien, Weng, Lilian, McCallum, Lindsay, Held, Lindsey, Ouyang, Long, Feuvrier, Louis, Zhang, Lu, Kondraciuk, Lukas, Kaiser, Lukasz, Hewitt, Luke, Metz, Luke, Doshi, Lyric, Aflak, Mada, Simens, Maddie, Boyd, Madelaine, Thompson, Madeleine, Dukhan, Marat, Chen, Mark, Gray, Mark, Hudnall, Mark, Zhang, Marvin, Aljubeh, Marwan, Litwin, Mateusz, Zeng, 
Matthew, Johnson, Max, Shetty, Maya, Gupta, Mayank, Shah, Meghan, Yatbaz, Mehmet, Yang, Meng Jia, Zhong, Mengchao, Glaese, Mia, Chen, Mianna, Janner, Michael, Lampe, Michael, Petrov, Michael, Wu, Michael, Wang, Michele, Fradin, Michelle, Pokrass, Michelle, Castro, Miguel, de Castro, Miguel Oom Temudo, Pavlov, Mikhail, Brundage, Miles, Wang, Miles, Khan, Minal, Murati, Mira, Bavarian, Mo, Lin, Molly, Yesildal, Murat, Soto, Nacho, Gimelshein, Natalia, Cone, Natalie, Staudacher, Natalie, Summers, Natalie, LaFontaine, Natan, Chowdhury, Neil, Ryder, Nick, Stathas, Nick, Turley, Nick, Tezak, Nik, Felix, Niko, Kudige, Nithanth, Keskar, Nitish, Deutsch, Noah, Bundick, Noel, Puckett, Nora, Nachum, Ofir, Okelola, Ola, Boiko, Oleg, Murk, Oleg, Jaffe, Oliver, Watkins, Olivia, Godement, Olivier, Campbell-Moore, Owen, Chao, Patrick, McMillan, Paul, Belov, Pavel, Su, Peng, Bak, Peter, Bakkum, Peter, Deng, Peter, Dolan, Peter, Hoeschele, Peter, Welinder, Peter, Tillet, Phil, Pronin, Philip, Tillet, Philippe, Dhariwal, Prafulla, Yuan, Qiming, Dias, Rachel, Lim, Rachel, Arora, Rahul, Troll, Rajan, Lin, Randall, Lopes, Rapha Gontijo, Puri, Raul, Miyara, Reah, Leike, Reimar, Gaubert, Renaud, Zamani, Reza, Wang, Ricky, Donnelly, Rob, Honsby, Rob, Smith, Rocky, Sahai, Rohan, Ramchandani, Rohit, Huet, Romain, Carmichael, Rory, Zellers, Rowan, Chen, Roy, Chen, Ruby, Nigmatullin, Ruslan, Cheu, Ryan, Jain, Saachi, Altman, Sam, Schoenholz, Sam, Toizer, Sam, Miserendino, Samuel, Agarwal, Sandhini, Culver, Sara, Ethersmith, Scott, Gray, Scott, Grove, Sean, Metzger, Sean, Hermani, Shamez, Jain, Shantanu, Zhao, Shengjia, Wu, Sherwin, Jomoto, Shino, Wu, Shirong, Shuaiqi, Xia, Phene, Sonia, Papay, Spencer, Narayanan, Srinivas, Coffey, Steve, Lee, Steve, Hall, Stewart, Balaji, Suchir, Broda, Tal, Stramer, Tal, Xu, Tao, Gogineni, Tarun, Christianson, Taya, Sanders, Ted, Patwardhan, Tejal, Cunninghman, Thomas, Degry, Thomas, Dimson, Thomas, Raoux, Thomas, Shadwell, Thomas, Zheng, Tianhao, Underwood, 
Todd, Markov, Todor, Sherbakov, Toki, Rubin, Tom, Stasi, Tom, Kaftan, Tomer, Heywood, Tristan, Peterson, Troy, Walters, Tyce, Eloundou, Tyna, Qi, Valerie, Moeller, Veit, Monaco, Vinnie, Kuo, Vishal, Fomenko, Vlad, Chang, Wayne, Zheng, Weiyi, Zhou, Wenda, Manassra, Wesam, Sheu, Will, Zaremba, Wojciech, Patil, Yash, Qian, Yilei, Kim, Yongjik, Cheng, Youlong, Zhang, Yu, He, Yuchen, Zhang, Yuchen, Jin, Yujia, Dai, Yunxing, and Malkov, Yury
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computers and Society, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models. In line with our commitment to building AI safely and consistent with our voluntary commitments to the White House, we are sharing the GPT-4o System Card, which includes our Preparedness Framework evaluations. In this System Card, we provide a detailed look at GPT-4o's capabilities, limitations, and safety evaluations across multiple categories, focusing on speech-to-speech while also evaluating text and image capabilities, and measures we've implemented to ensure the model is safe and aligned. We also include third-party assessments on dangerous capabilities, as well as discussion of potential societal impacts of GPT-4o's text and vision capabilities.
Published: 2024

27. Explainable News Summarization -- Analysis and mitigation of Disagreement Problem

Author: Aswani, Seema and Shetty, Sujala D.
Subjects: Computer Science - Artificial Intelligence
Abstract: Explainable AI (XAI) techniques for text summarization provide valuable understanding of how the summaries are generated. Recent studies have highlighted a major challenge in this area, known as the disagreement problem. This problem occurs when different XAI methods offer contradictory explanations for the summary generated from the same input article. This inconsistency across XAI methods has been evaluated using predefined metrics designed to quantify agreement levels between them, revealing significant disagreement. This impedes the reliability and interpretability of XAI in this area. To address this challenge, we propose a novel approach that utilizes sentence transformers and the k-means clustering algorithm to first segment the input article and then generate the explanation of the summary generated for each segment. By producing regional or segmented explanations rather than comprehensive ones, a decrease in the observed disagreement between XAI methods is hypothesized. This segmentation-based approach was used on two news summarization datasets, namely Extreme Summarization (XSum) and CNN-DailyMail, and the experiment was conducted using multiple disagreement metrics. Our experiments validate the hypothesis by showing a significant reduction in disagreement among different XAI methods. Additionally, an easy-to-use JavaScript visualization tool is developed that allows users to interactively explore the color-coded visualization of the input article and the machine-generated summary based on the attribution scores of each sentence.
Published: 2024

28. 'What is the value of {templates}?' Rethinking Document Information Extraction Datasets for LLMs

Author: Zmigrod, Ran, Shetty, Pranav, Sibue, Mathieu, Ma, Zhiqiang, Nourbakhsh, Armineh, Liu, Xiaomo, and Veloso, Manuela
Subjects: Computer Science - Computation and Language
Abstract: The rise of large language models (LLMs) for visually rich document understanding (VRDU) has kindled a need for prompt-response, document-based datasets. As annotating new datasets from scratch is labor-intensive, the existing literature has generated prompt-response datasets from available resources using simple templates. For the case of key information extraction (KIE), one of the most common VRDU tasks, past work has typically employed the template "What is the value for the {key}?". However, given the variety of questions encountered in the wild, simple and uniform templates are insufficient for creating robust models in research and industrial contexts. In this work, we present K2Q, a diverse collection of five datasets converted from KIE to a prompt-response format using a plethora of bespoke templates. The questions in K2Q can span multiple entities and be extractive or boolean. We empirically compare the performance of seven baseline generative models on K2Q with zero-shot prompting. We further compare three of these models when training on K2Q versus training on simpler templates to motivate the need of our work. We find that creating diverse and intricate KIE questions enhances the performance and robustness of VRDU models. We hope this work encourages future studies on data quality for generative model training., Comment: Accepted to EMNLP Findings 2024
Published: 2024

29. Advancing Healthcare: Innovative ML Approaches for Improved Medical Imaging in Data-Constrained Environments

Author: Amin, Al, Hasan, Kamrul, Zein-Sabatto, Saleh, Hong, Liang, Shetty, Sachin, Ahmed, Imtiaz, and Islam, Tariqul
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Healthcare industries face challenges when dealing with rare diseases due to limited samples. Artificial Intelligence (AI) communities address this situation by creating synthetic data, which raises ethical and privacy issues in the medical domain. This research introduces the CAT-U-Net framework as a new approach to overcome these limitations, which enhances feature extraction from medical images without the need for large datasets. The proposed framework adds an extra concatenation layer with downsampling parts, thereby improving its ability to learn from limited data while maintaining patient privacy. To validate the proposed framework's robustness, different medical conditioning datasets were utilized, including COVID-19, brain tumors, and wrist fractures. The framework achieved nearly 98% reconstruction accuracy, with a Dice coefficient close to 0.946. The proposed CAT-U-Net has the potential to make a big difference in medical image diagnostics in settings with limited data., Comment: 7 pages, 7 figures
Published: 2024

30. Robust Manipulation Primitive Learning via Domain Contraction

Author: Xue, Teng, Razmjoo, Amirreza, Shetty, Suhan, and Calinon, Sylvain
Subjects: Computer Science - Robotics
Abstract: Contact-rich manipulation plays an important role in human daily activities, but uncertain parameters pose significant challenges for robots to achieve comparable performance through planning and control. To address this issue, domain adaptation and domain randomization have been proposed for robust policy learning. However, they either lose the generalization ability across diverse instances or perform conservatively due to neglecting instance-specific information. In this paper, we propose a bi-level approach to learn robust manipulation primitives, including parameter-augmented policy learning using multiple models, and parameter-conditioned policy retrieval through domain contraction. This approach unifies domain randomization and domain adaptation, providing optimal behaviors while keeping generalization ability. We validate the proposed method on three contact-rich manipulation primitives: hitting, pushing, and reorientation. The experimental results showcase the superior performance of our approach in generating robust policies for instances with diverse physical parameters., Comment: Conference on Robot Learning (CoRL), 2024
Published: 2024

31. Public Quantum Network: The First Node

Author: Kapoor, K., Hoseini, S., Choi, J., Nussbaum, B. E., Zhang, Y., Shetty, K., Skaar, C., Ward, M., Wilson, L., Shinbrough, K., Edwards, E., Wiltfong, R., Lualdi, C. P., Cohen, Offir, Kwiat, P. G., and Lorenz, V. O.
Subjects: Quantum Physics, Physics - Applied Physics, Physics - Physics Education, Physics - Optics, Physics - Physics and Society
Abstract: We present a quantum network that distributes entangled photons between the University of Illinois Urbana-Champaign and a public library in Urbana. The network allows members of the public to perform measurements on the photons. We describe its design and implementation and outreach based on the network. Over 400 instances of public interaction have been logged with the system since it was launched in November 2023., Comment: 7 Pages, 8 Figures
Published: 2024

32. Plots Unlock Time-Series Understanding in Multimodal Models

Author: Daswani, Mayank, Bellaiche, Mathias M. J., Wilson, Marc, Ivanov, Desislav, Papkov, Mikhail, Schnider, Eva, Tang, Jing, Lamerigts, Kay, Botea, Gabriela, Sanchez, Michael A., Patel, Yojan, Prabhakara, Shruthi, Shetty, Shravya, and Telang, Umesh
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: While multimodal foundation models can now natively work with data beyond text, they remain underutilized in analyzing the considerable amounts of multi-dimensional time-series data in fields like healthcare, finance, and social sciences, representing a missed opportunity for richer, data-driven insights. This paper proposes a simple but effective method that leverages the existing vision encoders of these models to "see" time-series data via plots, avoiding the need for additional, potentially costly, model training. Our empirical evaluations show that this approach outperforms providing the raw time-series data as text, with the additional benefit that visual time-series representations demonstrate up to a 90% reduction in model API costs. We validate our hypothesis through synthetic data tasks of increasing complexity, progressing from simple functional form identification on clean data, to extracting trends from noisy scatter plots. To demonstrate generalizability from synthetic tasks with clear reasoning steps to more complex, real-world scenarios, we apply our approach to consumer health tasks - specifically fall detection, activity recognition, and readiness assessment - which involve heterogeneous, noisy data and multi-step reasoning. The overall success in plot performance over text performance (up to a 120% performance increase on zero-shot synthetic tasks, and up to a 150% performance increase on real-world tasks), across both GPT and Gemini model families, highlights our approach's potential for making the best use of the native capabilities of foundation models., Comment: 57 pages
Published: 2024

33. Influence of hydrotherapy on change in weight: a narrative review

Author: Manju, M. Y., Shetty, Geetha B., Sujatha, K. J., and Shetty, Prashanth
Published: 2025

34. Hypertension among reproductive women in India: A study of interaction between tobacco and diabetes

Author: Mahagaonkar, Rasika S., Prasad, Jang Bahadur, Biradar, Rajeshwari A., Hegde, Sadashiva, Shetty, Vishaka S., Shetty, Rachana R., and Sabhahit, Ganapati Y.
Published: 2025

35. Integrated yoga and naturopathy treatments in enhancing motor and sensory recovery in Guillain Barre syndrome- a case study

Author: Poojary, Geethashree, Shetty, Shivaprasad, and Shetty, Prashanth
Published: 2025

36. The content validity of an instrument that measures health-seeking behavior for tuberculosis among people living with HIV in India

Author: Jacob, Ankeeta Menona, Jacob, Jeni, Peersman, Wim, and Shetty, Avinash K
Published: 2024

37. Revolutionizing Dental Treatment Through IoT Integrated Force Sensors: Design and Calibration

Author: Salian, Chaithra, Shetty, Prakyath, Shetty, Charishma, Prasad, Durga, Pradyumna, G. R., Bommegowda, K. B., Ravi, M. S., Murali, P. S., Li, Gang, Series Editor, Filipe, Joaquim, Series Editor, Ghosh, Ashish, Series Editor, Xu, Zhiwei, Series Editor, T., Shreekumar, editor, L., Dinesha, editor, and Rajesh, Sreeja, editor
Published: 2025

38. False-Positive Phosphatidylethanol Results Due to Blood Transfusion and Implications in the Process of Liver Transplantation Selection.

Author: Wang, Jessica AS, Cruz Cruz, Giovani V, Shetty, Akshay, Esquivel, Darlene, Saab, Sammy, Shoptaw, Steven, and Meza, Julio
Subjects: Public Health, Health Sciences, Chronic Liver Disease and Cirrhosis, Transplantation, Liver Disease, Substance Misuse, Alcoholism, Alcohol Use and Health, Clinical Research, Digestive Diseases, Organ Transplantation, 6.4 Surgery, Oral and gastrointestinal, Good Health and Well Being, Public Health and Health Services, Substance Abuse, Public health, Clinical and health psychology
Abstract: Phosphatidylethanol (PEth) testing is becoming increasingly common as a tool to assess for alcohol consumption in the practice of addiction medicine. Its potential to be an objective measure of ethanol exposure is appealing; however, the field has yet to develop a complete understanding of the factors that can influence a PEth level. Here we describe 3 patient cases in which blood transfusion within the preceding 28 days was the reason that PEth studies were positive in patients undergoing liver transplant evaluation. These patients all had in-depth evaluations by physicians on an addiction medicine consult service and were believed abstinent from alcohol. In the field of liver transplant, even a mildly elevated PEth level can result in listing delay or even liver transplant candidacy denial. Further study is needed to understand how PEth is impacted by medical procedures and events such as blood transfusion if we are to maintain a just and ethical practice in the setting of addiction and transplant medicine.
Published: 2024

39. A perspective from the National Eye Institute Extracellular Vesicle Workshop: Gaps, needs, and opportunities for studies of extracellular vesicles in vision research.

Author: Lee, Sun, Klingeborn, Mikael, Bulte, Jeff, Chiu, Daniel, Chopp, Michael, Cutler, Christopher, Das, Saumya, Egwuagu, Charles, Fowler, Christie, Hamm-Alvarez, Sarah, Lee, Hakho, Liu, Yutao, Mead, Ben, Moore, Tara, Ravindran, Sriram, Shetty, Ashok, Skog, Johan, Witwer, Kenneth, Djalilian, Ali, and Weaver, Alissa
Subjects: EVs, Eye, diagnosis, exosomes, ocular, prognosis, therapy, vision, Extracellular Vesicles, Humans, United States, National Eye Institute (U.S.), Biomedical Research, Eye Diseases, Vision, Ocular, Animals
Abstract: With an evolving understanding and new discoveries in extracellular vesicle (EV) biology and their implications in health and disease, the significant diagnostic and therapeutic potential of EVs for vision research has gained recognition. In 2021, the National Eye Institute (NEI) unveiled its Strategic Plan titled Vision for the Future (2021-2025), which listed EV research as a priority within the domain of Regenerative Medicine, a pivotal area outlined in the Plan. In alignment with this prioritization, NEI organized a workshop inviting twenty experts from within and beyond the visual system. The workshop aimed to review current knowledge in EV research and explore gaps, needs and opportunities for EV research in the eye, including EV biology and applications of EVs in diagnosis, therapy and prognosis within the visual system. This perspective encapsulates the workshop's deliberations, highlighting the current landscape and potential implications of EV research in advancing eye health and addressing visual diseases.
Published: 2024

40. Crowd-sourced machine learning prediction of long COVID using data from the National COVID Cohort Collaborative.

Author: Bergquist, Timothy, Loomba, Johanna, Pfaff, Emily, Xia, Fangfang, Zhao, Zixuan, Zhu, Yitan, Mitchell, Elliot, Bhattacharya, Biplab, Shetty, Gaurav, Munia, Tamanna, Delong, Grant, Tariq, Adbul, Butzin-Dozier, Zachary, Ji, Yunwen, Li, Haodong, Coyle, Jeremy, Shi, Seraphina, Philips, Rachael, Mertens, Andrew, Pirracchio, Romain, van der Laan, Mark, Colford, John, Hubbard, Alan, Gao, Jifan, Chen, Guanhua, Velingker, Neelay, Li, Ziyang, Wu, Yinjun, Stein, Adam, Huang, Jiani, Dai, Zongyu, Long, Qi, Naik, Mayur, Holmes, John, Mowery, Danielle, Wong, Eric, Parekh, Ravi, Getzen, Emily, Hightower, Jake, and Blase, Jennifer
Subjects: COVID-19, Community challenge, Evaluation, Long COVID, Machine learning, PASC, Humans, COVID-19, Machine Learning, SARS-CoV-2, United States, Algorithms, Post-Acute COVID-19 Syndrome, Cohort Studies, Crowdsourcing
Abstract: BACKGROUND: While many patients seem to recover from SARS-CoV-2 infections, many patients report experiencing SARS-CoV-2 symptoms for weeks or months after their acute COVID-19 ends, even developing new symptoms weeks after infection. These long-term effects are called post-acute sequelae of SARS-CoV-2 (PASC) or, more commonly, Long COVID. The overall prevalence of Long COVID is currently unknown, and tools are needed to help identify patients at risk for developing long COVID. METHODS: A working group of the Rapid Acceleration of Diagnostics-radical (RADx-rad) program, comprised of individuals from various NIH institutes and centers, in collaboration with REsearching COVID to Enhance Recovery (RECOVER) developed and organized the Long COVID Computational Challenge (L3C), a community challenge aimed at incentivizing the broader scientific community to develop interpretable and accurate methods for identifying patients at risk of developing Long COVID. From August 2022 to December 2022, participants developed Long COVID risk prediction algorithms using the National COVID Cohort Collaborative (N3C) data enclave, a harmonized data repository from over 75 healthcare institutions from across the United States (U.S.). FINDINGS: Over the course of the challenge, 74 teams designed and built 35 Long COVID prediction models using the N3C data enclave. The top 10 teams all scored above a 0.80 Area Under the Receiver Operator Curve (AUROC) with the highest scoring model achieving a mean AUROC of 0.895. Included in the top submission was a visualization dashboard that built timelines for each patient, updating the risk of a patient developing Long COVID in response to clinical events. INTERPRETATION: As a result of L3C, federal reviewers identified multiple machine learning models that can be used to identify patients at risk for developing Long COVID. Many of the teams used approaches in their submissions which can be applied to future clinical prediction questions. 
FUNDING: Research reported in this RADx® Rad publication was supported by the National Institutes of Health. Timothy Bergquist, Johanna Loomba, and Emily Pfaff were supported by Axle Subcontract: NCATS-STSS-P00438.
Published: 2024

41. LOLA -- An Open-Source Massively Multilingual Large Language Model

Author: Srivastava, Nikit, Kuchelev, Denis, Ngoli, Tatiana Moteu, Shetty, Kshitij, Röder, Michael, Zahera, Hamada, Moussallem, Diego, and Ngomo, Axel-Cyrille Ngonga
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: This paper presents LOLA, a massively multilingual large language model trained on more than 160 languages using a sparse Mixture-of-Experts Transformer architecture. Our architectural and implementation choices address the challenge of harnessing linguistic diversity while maintaining efficiency and avoiding the common pitfalls of multilinguality. Our analysis of the evaluation results shows competitive performance in natural language generation and understanding tasks. Additionally, we demonstrate how the learned expert-routing mechanism exploits implicit phylogenetic linguistic patterns to potentially alleviate the curse of multilinguality. We provide an in-depth look at the training process, an analysis of the datasets, and a balanced exploration of the model's strengths and limitations. As an open-source model, LOLA promotes reproducibility and serves as a robust foundation for future research. Our findings enable the development of compute-efficient multilingual models with strong, scalable performance across languages.
Published: 2024

42. Gate-tunable negative differential resistance in multifunctional van der Waals heterostructure

Author: Mitra, Richa, Iordanidou, Konstantina, Shetty, Naveen, Hoque, Md Anamul, Datta, Anushree, Kalaboukhov, Alexei, Wiktor, Julia, Kubatkin, Sergey, Dash, Saroj Prasad, and Lara-Avila, Samuel
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Two-dimensional (2D) semiconductors have emerged as leading candidates for the development of low-power and multifunctional computing applications, thanks to their qualities such as layer-dependent band gap tunability, high carrier mobility, and excellent electrostatic control. Here, we explore a pair of 2D semiconductors with broken-gap (Type III) band alignment and demonstrate a highly gate-tunable p-MoTe$_{2}$/n-SnS$_{2}$ heterojunction tunnel field-effect transistor with multifunctional behavior. Employing a dual-gated asymmetric device geometry, we unveil its functionality as both a forward and backward rectifying device. Consequently, we observe a highly gate-tunable negative differential resistance (NDR), with a gate-coupling efficiency of $\eta \simeq 0.5$ and a peak-to-valley ratio of $\sim$ 3 down to 150K. By employing density functional theory and exploring the density of states, we determine that interband tunneling within the valence bands is the cause of the observed NDR characteristics. The combination of band-to-band tunneling and gate controllability of NDR signal opens the pathway for realizing gate-tunable 2D material-based neuromorphic and energy-efficient electronics.
Comment: 22 pages, 5 figures
Published: 2024

43. On the $\mathcal{ABS}$ spectrum and energy of graphs

Author: Shetty, Swathi, Rakshith, B. R., and N. V, Sayinath Udupa
Subjects: Mathematics - Combinatorics, 05C50, 05C09, 05C35, 05C92
Abstract: Let $\eta_{1}\ge \eta_{2}\ge\cdots\ge \eta_{n}$ be the eigenvalues of $\mathcal{ABS}$ matrix. In this paper, we characterize connected graphs with $\mathcal{ABS}$ eigenvalue $\eta_{n}>-1$. As a result, we determine all connected graphs with exactly two distinct $\mathcal{ABS}$ eigenvalues. We show that a connected bipartite graph has three distinct $\mathcal{ABS}$ eigenvalues if and only if it is a complete bipartite graph. Furthermore, we present some bounds for the $\mathcal{ABS}$ spectral radius (resp. $\mathcal{ABS}$ energy) and characterize extremal graphs. Also, we obtain a relation between $\mathcal{ABC}$ energy and $\mathcal{ABS}$ energy. Finally, the chemical importance of $\mathcal{ABS}$ energy is investigated and it is shown that the $\mathcal{ABS}$ energy is useful in predicting certain properties of molecules.
Published: 2024

44. A Deployed Online Reinforcement Learning Algorithm In An Oral Health Clinical Trial

Author: Trella, Anna L., Zhang, Kelly W., Jajal, Hinal, Nahum-Shani, Inbal, Shetty, Vivek, Doshi-Velez, Finale, and Murphy, Susan A.
Subjects: Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
Abstract: Dental disease is a prevalent chronic condition associated with substantial financial burden, personal suffering, and increased risk of systemic diseases. Despite widespread recommendations for twice-daily tooth brushing, adherence to recommended oral self-care behaviors remains sub-optimal due to factors such as forgetfulness and disengagement. To address this, we developed Oralytics, a mHealth intervention system designed to complement clinician-delivered preventative care for marginalized individuals at risk for dental disease. Oralytics incorporates an online reinforcement learning algorithm to determine optimal times to deliver intervention prompts that encourage oral self-care behaviors. We have deployed Oralytics in a registered clinical trial. The deployment required careful design to manage challenges specific to the clinical trials setting in the U.S. In this paper, we (1) highlight key design decisions of the RL algorithm that address these challenges and (2) conduct a re-sampling analysis to evaluate algorithm design decisions. A second phase (randomized control trial) of Oralytics is planned to start in spring 2025.
Published: 2024

45. Effective Monitoring of Online Decision-Making Algorithms in Digital Intervention Implementation

Author: Trella, Anna L., Ghosh, Susobhan, Bonar, Erin E., Coughlin, Lara, Doshi-Velez, Finale, Guo, Yongyi, Hung, Pei-Yao, Nahum-Shani, Inbal, Shetty, Vivek, Walton, Maureen, Yan, Iris, Zhang, Kelly W., and Murphy, Susan A.
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence
Abstract: Online AI decision-making algorithms are increasingly used by digital interventions to dynamically personalize treatment to individuals. These algorithms determine, in real-time, the delivery of treatment based on accruing data. The objective of this paper is to provide guidelines for enabling effective monitoring of online decision-making algorithms with the goal of (1) safeguarding individuals and (2) ensuring data quality. We elucidate guidelines and discuss our experience in monitoring online decision-making algorithms in two digital intervention clinical trials (Oralytics and MiWaves). Our guidelines include (1) developing fallback methods, pre-specified procedures executed when an issue occurs, and (2) identifying potential issues categorizing them by severity (red, yellow, and green). Across both trials, the monitoring systems detected real-time issues such as out-of-memory issues, database timeout, and failed communication with an external source. Fallback methods prevented participants from not receiving any treatment during the trial and also prevented the use of incorrect data in statistical analyses. These trials provide case studies for how health scientists can build monitoring systems for their digital intervention. Without these algorithm monitoring systems, critical issues would have gone undetected and unresolved. Instead, these monitoring systems safeguarded participants and ensured the quality of the resulting data for updating the intervention and facilitating scientific discovery. These monitoring guidelines and findings give digital intervention teams the confidence to include online decision-making algorithms in digital interventions.
Published: 2024

46. WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks

Author: Shetty, Anudeex, Xu, Qiongkai, and Lau, Jey Han
Subjects: Computer Science - Cryptography and Security, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Embeddings-as-a-Service (EaaS) is a service offered by large language model (LLM) developers to supply embeddings generated by LLMs. Previous research suggests that EaaS is prone to imitation attacks -- attacks that clone the underlying EaaS model by training another model on the queried embeddings. As a result, EaaS watermarks are introduced to protect the intellectual property of EaaS providers. In this paper, we first show that existing EaaS watermarks can be removed by paraphrasing when attackers clone the model. Subsequently, we propose a novel watermarking technique that involves linearly transforming the embeddings, and show that it is empirically and theoretically robust against paraphrasing.
Comment: Work in Progress
Published: 2024

47. Enhancement of Photoresponse for InGaAs Infrared Photodetectors Using Plasmonic WO3-x/CsyWO3-x Nanocrystals

Author: Merino, Zach D., Jaics, Gyorgy, Jordan, Andrew W. M., Shetty, Arjun, Yin, Penghui, Tam, Man C., Wang, Xinning, Wasilewski, Zbig. R., Radovanovic, Pavle V., and Baugh, Jonathan
Subjects: Physics - Applied Physics
Abstract: Fast and accurate detection of light in the near-infrared (NIR) spectral range plays a crucial role in modern society, from alleviating speed and capacity bottlenecks in optical communications to enhancing the control and safety of autonomous vehicles through NIR imaging systems. Several technological platforms are currently under investigation to improve NIR photodetection, aiming to surpass the performance of established III-V semiconductor p-i-n (PIN) junction technology. These platforms include in situ-grown inorganic nanocrystals and nanowire arrays, as well as hybrid organic-inorganic materials such as graphene-perovskite heterostructures. However, challenges remain in nanocrystal and nanowire growth, large-area fabrication of high-quality 2D materials, and the fabrication of devices for practical applications. Here, we explore the potential for tailored semiconductor nanocrystals to enhance the responsivity of planar metal-semiconductor-metal (MSM) photodetectors. MSM technology offers ease of fabrication and fast response times compared to PIN detectors. We observe enhancement of the optical-to-electric conversion efficiency by up to a factor of ~2.5 through the application of plasmonically-active semiconductor nanorods and nanocrystals. We present a protocol for synthesizing and rapidly testing the performance of non-stoichiometric tungsten oxide (WO$_{3-x}$) nanorods and cesium-doped tungsten oxide (Cs$_y$WO$_{3-x}$) hexagonal nanoprisms prepared in colloidal suspensions and drop-cast onto photodetector surfaces. The results demonstrate the potential for a cost-effective and scalable method exploiting tailored nanocrystals to improve the performance of NIR optoelectronic devices.
Published: 2024

48. ServerFi: A New Symbiotic Relationship Between Games and Players

Author: Shetty, Pavun
Subjects: Computer Science - Cryptography and Security
Abstract: Blockchain-based games have introduced novel economic models that blend traditional gaming with decentralized ownership and financial incentives, leading to the rapid emergence of the GameFi sector. However, despite their innovative appeal, these games face significant challenges, particularly in terms of market stability, player retention, and the sustainability of token value. This paper explores the evolution of blockchain games and identifies key shortcomings in current tokenomics models using entropy increase theory. We propose two new models - ServerFi, which emphasizes Privatization through Asset Synthesis, and a model focused on Continuous Rewards for High-Retention Players. These models are formalized into mathematical frameworks and validated through group behavior simulation experiments. Our findings indicate that the ServerFi is particularly effective in maintaining player engagement and ensuring the long-term viability of the gaming ecosystem, offering a promising direction for future blockchain game development.
Published: 2024

49. Accuracy-Privacy Trade-off in the Mitigation of Membership Inference Attack in Federated Learning

Author: Ahamed, Sayyed Farid, Banerjee, Soumya, Roy, Sandip, Quinn, Devin, Vucovich, Marc, Choi, Kevin, Rahman, Abdul, Hu, Alison, Bowen, Edward, and Shetty, Sachin
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security
Abstract: Over the last few years, federated learning (FL) has emerged as a prominent method in machine learning, emphasizing privacy preservation by allowing multiple clients to collaboratively build a model while keeping their training data private. Despite this focus on privacy, FL models are susceptible to various attacks, including membership inference attacks (MIAs), posing a serious threat to data confidentiality. In a recent study, Rezaei \textit{et al.} revealed the existence of an accuracy-privacy trade-off in deep ensembles and proposed a few fusion strategies to overcome it. In this paper, we aim to explore the relationship between deep ensembles and FL. Specifically, we investigate whether confidence-based metrics derived from deep ensembles apply to FL and whether there is a trade-off between accuracy and privacy in FL with respect to MIA. Empirical investigations illustrate a lack of a non-monotonic correlation between the number of clients and the accuracy-privacy trade-off. By experimenting with different numbers of federated clients, datasets, and confidence-metric-based fusion strategies, we identify and analytically justify the clear existence of the accuracy-privacy trade-off.
Published: 2024

50. Building AI Agents for Autonomous Clouds: Challenges and Design Principles

Author: Shetty, Manish, Chen, Yinfang, Somashekar, Gagan, Ma, Minghua, Simmhan, Yogesh, Zhang, Xuchao, Mace, Jonathan, Vandevoorde, Dax, Las-Casas, Pedro, Gupta, Shachee Mishra, Nath, Suman, Bansal, Chetan, and Rajmohan, Saravan
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: The rapid growth in the use of Large Language Models (LLMs) and AI Agents as part of software development and deployment is revolutionizing the information technology landscape. While code generation receives significant attention, a higher-impact application lies in using AI agents for operational resilience of cloud services, which currently require significant human effort and domain knowledge. There is a growing interest in AI for IT Operations (AIOps) which aims to automate complex operational tasks, like fault localization and root cause analysis, thereby reducing human intervention and customer impact. However, achieving the vision of autonomous and self-healing clouds through AIOps is hampered by the lack of standardized frameworks for building, evaluating, and improving AIOps agents. This vision paper lays the groundwork for such a framework by first framing the requirements and then discussing design decisions that satisfy them. We also propose AIOpsLab, a prototype implementation leveraging agent-cloud-interface that orchestrates an application, injects real-time faults using chaos engineering, and interfaces with an agent to localize and resolve the faults. We report promising results and lay the groundwork to build a modular and robust framework for building, evaluating, and improving agents for autonomous clouds.
Published: 2024
