Author: "Rishi, A." - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Rishi, A."' showing total 62,793 results

1. Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding

Author: Jiang, He, Wang, Yutong, Veerapaneni, Rishi, Duhan, Tanishq, Sartoretti, Guillaume, and Li, Jiaoyang
Subjects: Computer Science - Multiagent Systems, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Robotics
Abstract: Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals, necessitating frequent re-planning to accommodate these dynamic changes. Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations. However, it is still challenging for them to match the performance of the best search-based algorithms, especially in large-scale settings. This work proposes an imitation-learning-based LMAPF solver that introduces a novel communication module and systematic single-step collision resolution and global guidance techniques. Our proposed solver, Scalable Imitation Learning for LMAPF (SILLM), inherits the fast reasoning speed of learning-based methods and the high solution quality of search-based methods with the help of modern GPUs. Across six large-scale maps with up to 10,000 agents and varying obstacle structures, SILLM surpasses the best learning- and search-based baselines, achieving average throughput improvements of 137.7% and 16.0%, respectively. Furthermore, SILLM also beats the winning solution of the 2023 League of Robot Runners, an international LMAPF competition sponsored by Amazon Robotics. Finally, we validated SILLM with 10 real robots and 100 virtual robots in a mockup warehouse environment., Comment: Submitted to ICRA 2025
Published: 2024

2. Multi-Class Abnormality Classification Task in Video Capsule Endoscopy

Author: Verma, Dev Rishi, Saxena, Vibhor, Sharma, Dhruv, and Gupta, Arpan
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: In this work we addressed the challenge of multi-class anomaly classification in Video Capsule Endoscopy (VCE)[1] with a variety of deep learning models, ranging from custom CNNs to advanced transformer architectures. The purpose is to correctly classify diverse gastrointestinal disorders, which is critical for increasing diagnostic efficiency in clinical settings. We started with a proprietary CNN and improved performance with ResNet[7] for better feature extraction, followed by Vision Transformer (ViT)[2] to capture global dependencies. Multiscale Vision Transformer (MViT)[6] improved hierarchical feature extraction, while Dual Attention Vision Transformer (DaViT)[4] delivered cutting-edge results by combining spatial and channel attention methods. This methodology enabled us to improve model accuracy across a wide range of criteria, greatly surpassing older methods., Comment: Video Capsule Endoscopy Challenge
Published: 2024

3. ANAVI: Audio Noise Awareness using Visuals of Indoor environments for NAVIgation

Author: Jain, Vidhi, Veerapaneni, Rishi, and Bisk, Yonatan
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: We propose Audio Noise Awareness using Visuals of Indoors for NAVIgation for quieter robot path planning. While humans are naturally aware of the noise they make and its impact on those around them, robots currently lack this awareness. A key challenge in achieving audio awareness for robots is estimating how loud the robot's actions will be at a listener's location. Since sound depends upon the geometry and material composition of rooms, we train the robot to passively perceive loudness using visual observations of indoor environments. To this end, we generate data on how loud an 'impulse' sounds at different listener locations in simulated homes, and train our Acoustic Noise Predictor (ANP). Next, we collect acoustic profiles corresponding to different actions for navigation. Unifying ANP with action acoustics, we demonstrate experiments with wheeled (Hello Robot Stretch) and legged (Unitree Go2) robots so that these robots adhere to the noise constraints of the environment. See code and data at https://anavi-corl24.github.io/, Comment: 8th Conference on Robot Learning (CoRL) 2024
Published: 2024

4. Medical Imaging Complexity and its Effects on GAN Performance

Author: Cagas, William, Ko, Chan, Hsiao, Blake, Grandhi, Shryuk, Bhattacharya, Rishi, Zhu, Kevin, and Lam, Michael
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: The proliferation of machine learning models in diverse clinical applications has led to a growing need for high-fidelity, medical image training data. Such data is often scarce due to cost constraints and privacy concerns. Alleviating this burden, medical image synthesis via generative adversarial networks (GANs) emerged as a powerful method for synthetically generating photo-realistic images based on existing sets of real medical images. However, the exact image set size required to efficiently train such a GAN is unclear. In this work, we experimentally establish benchmarks that measure the relationship between a sample dataset size and the fidelity of the generated images, given the dataset's distribution of image complexities. We analyze statistical metrics based on delentropy, an image complexity measure rooted in Shannon's entropy in information theory. For our pipeline, we conduct experiments with two state-of-the-art GANs, StyleGAN 3 and SPADE-GAN, trained on multiple medical imaging datasets with variable sample sizes. Across both GANs, general performance improved with increasing training set size but suffered with increasing complexity., Comment: Accepted to ACCV, Workshop on Generative AI for Synthetic Medical Data
Published: 2024

5. Hydrodynamical properties of baryon rich thermal plasma with flavour quarks

Author: Pokhrel, Rishi and Dey, Tanay K.
Subjects: High Energy Physics - Theory
Abstract: In this work, we holographically study the hydrodynamical properties of strongly coupled $\mathcal{N} = 4$ SYM baryon rich thermal plasma with a large number of flavour quarks. Specifically, we study the drag force acting on the moving heavy probe quark and the corresponding energy loss. We also study the jet quenching parameter, screening length and binding energy of the quark-antiquark pair. Due to the presence of finite baryon density and flavour quarks, the drag force, energy loss, jet quenching parameter and binding energy of the quark-antiquark pair are enhanced as the temperature increases. However, the screening length of the quark-antiquark pair is reduced, leading to the thermal plasma phase being achieved at a lower temperature, which is consistent with the thermal phase diagram of the quark-gluon plasma. We observed that the perpendicular orientation of the quark-antiquark pair with respect to the direction of motion deconfines earlier than the parallel orientation as the temperature rises., Comment: 28 pages, 56 figures
Published: 2024

6. Generalization for Least Squares Regression With Simple Spiked Covariances

Author: Li, Jiping and Sonthalia, Rishi
Subjects: Mathematics - Statistics Theory, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Random matrix theory has proven to be a valuable tool in analyzing the generalization of linear models. However, the generalization properties of even two-layer neural networks trained by gradient descent remain poorly understood. To understand the generalization performance of such networks, it is crucial to characterize the spectrum of the feature matrix at the hidden layer. Recent work has made progress in this direction by describing the spectrum after a single gradient step, revealing a spiked covariance structure. Yet, the generalization error for linear models with spiked covariances has not been previously determined. This paper addresses this gap by examining two simple models exhibiting spiked covariances. We derive their generalization error in the asymptotic proportional regime. Our analysis demonstrates that the eigenvector and eigenvalue corresponding to the spike significantly influence the generalization error.
Published: 2024

7. Boosting Asynchronous Decentralized Learning with Model Fragmentation

Author: Biswas, Sayan, Kermarrec, Anne-Marie, Marouani, Alexis, Pires, Rafael, Sharma, Rishi, and De Vos, Martijn
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Artificial Intelligence
Abstract: Decentralized learning (DL) is an emerging technique that allows nodes on the web to collaboratively train machine learning models without sharing raw data. Dealing with stragglers, i.e., nodes with slower compute or communication than others, is a key challenge in DL. We present DivShare, a novel asynchronous DL algorithm that achieves fast model convergence in the presence of communication stragglers. DivShare achieves this by having nodes fragment their models into parameter subsets and send, in parallel to computation, each subset to a random sample of other nodes instead of sequentially exchanging full models. The transfer of smaller fragments allows more efficient usage of the collective bandwidth and enables nodes with slow network links to quickly contribute with at least some of their model parameters. By theoretically proving the convergence of DivShare, we provide, to the best of our knowledge, the first formal proof of convergence for a DL algorithm that accounts for the effects of asynchronous communication with delays. We experimentally evaluate DivShare against two state-of-the-art DL baselines, AD-PSGD and Swift, and with two standard datasets, CIFAR-10 and MovieLens. We find that DivShare with communication stragglers lowers time-to-accuracy by up to 3.9x compared to AD-PSGD on the CIFAR-10 dataset. Compared to baselines, DivShare also achieves up to 19.4% better accuracy and 9.5% lower test loss on the CIFAR-10 and MovieLens datasets, respectively.
Published: 2024

8. Large Language Models for Energy-Efficient Code: Emerging Results and Future Directions

Author: Peng, Huiyun, Gupte, Arjun, Eliopoulos, Nicholas John, Ho, Chien Chou, Mantri, Rishi, Deng, Leo, Jiang, Wenxin, Lu, Yung-Hsiang, Läufer, Konstantin, Thiruvathukal, George K., and Davis, James C.
Subjects: Computer Science - Software Engineering
Abstract: Energy-efficient software helps improve mobile device experiences and reduce the carbon footprint of data centers. However, energy goals are often de-prioritized in order to meet other requirements. We take inspiration from recent work exploring the use of large language models (LLMs) for different software engineering activities. We propose a novel application of LLMs: as code optimizers for energy efficiency. We describe and evaluate a prototype, finding that over 6 small programs our system can improve energy efficiency in 3 of them, up to 2x better than compiler optimizations alone. From our experience, we identify some of the challenges of energy-efficient LLM code optimization and propose a research agenda.
Published: 2024

9. Language model developers should report train-test overlap

Author: Zhang, Andy K, Klyman, Kevin, Mai, Yifan, Levine, Yoav, Zhang, Yian, Bommasani, Rishi, and Liang, Percy
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, Computer Science - Software Engineering
Abstract: Language models are extensively evaluated, but correctly interpreting evaluation results requires knowledge of train-test overlap which refers to the extent to which the language model is trained on the very data it is being tested on. The public currently lacks adequate information about train-test overlap: most models have no public train-test overlap statistics, and third parties cannot directly measure train-test overlap since they do not have access to the training data. To make this clear, we document the practices of 30 model developers, finding that just 9 developers report train-test overlap: 4 developers release training data under open-source licenses, enabling the community to directly measure train-test overlap, and 5 developers publish their train-test overlap methodology and statistics. By engaging with language model developers, we provide novel information about train-test overlap for three additional developers. Overall, we take the position that language model developers should publish train-test overlap statistics and/or training data whenever they report evaluation results on public test sets. We hope our work increases transparency into train-test overlap to increase the community-wide trust in model evaluations., Comment: 18 pages
Published: 2024

10. Autonomous Robotic System with Optical Coherence Tomography Guidance for Vascular Anastomosis

Author: Haworth, Jesse, Biswas, Rishi, Opfermann, Justin, Kam, Michael, Wang, Yaning, Pantalone, Desire, Creighton, Francis X., Yang, Robin, Kang, Jin U., and Krieger, Axel
Subjects: Computer Science - Robotics, Electrical Engineering and Systems Science - Systems and Control, 68T40: Robotics
Abstract: Vascular anastomosis, the surgical connection of blood vessels, is essential in procedures such as organ transplants and reconstructive surgeries. The precision required limits accessibility due to the extensive training needed, with manual suturing leading to variable outcomes and revision rates up to 7.9%. Existing robotic systems, while promising, are either fully teleoperated or lack the capabilities necessary for autonomous vascular anastomosis. We present the Micro Smart Tissue Autonomous Robot (micro-STAR), an autonomous robotic system designed to perform vascular anastomosis on small-diameter vessels. The micro-STAR system integrates a novel suturing tool equipped with Optical Coherence Tomography (OCT) fiber-optic sensor and a microcamera, enabling real-time tissue detection and classification. Our system autonomously places sutures and manipulates tissue with minimal human intervention. In an ex vivo study, micro-STAR achieved outcomes competitive with experienced surgeons in terms of leak pressure, lumen reduction, and suture placement variation, completing 90% of sutures without human intervention. This represents the first instance of a robotic system autonomously performing vascular anastomosis on real tissue, offering significant potential for improving surgical precision and expanding access to high-quality care., Comment: This paper was submitted to IEEE TMRB and is currently under review. There are 9 pages, 9 figures, and 2 tables
Published: 2024

11. Multiwavelength Campaign Observations of a Young Solar-type Star, EK Draconis. II. Understanding Prominence Eruption through Data-Driven Modeling and Observed Magnetic Environment

Author: Namekata, Kosuke, Ikuta, Kai, Petit, Pascal, Airapetian, Vladimir S., Vidotto, Aline A., Heinzel, Petr, Wollmann, Jiří, Maehara, Hiroyuki, Notsu, Yuta, Inoue, Shun, Marsden, Stephen, Morin, Julien, Jeffers, Sandra V., Neiner, Coralie, Paudel, Rishi R., Avramova-Boncheva, Antoaneta A., Gendreau, Keith, and Shibata, Kazunari
Subjects: Astrophysics - Solar and Stellar Astrophysics, Astrophysics - Earth and Planetary Astrophysics
Abstract: EK Draconis, a nearby young solar-type star (G1.5V, 50-120 Myr), is known as one of the best proxies for inferring the environmental conditions of the young Sun. The star frequently produces superflares and Paper I presented the first evidence of an associated gigantic prominence eruption observed as a blueshifted H$\alpha$ Balmer line emission. In this paper, we present the results of dynamical modeling of the stellar eruption and examine its relationship to the surface starspots and large-scale magnetic fields observed concurrently with the event. By performing a one-dimensional free-fall dynamical model and a one dimensional hydrodynamic simulation of the flow along the expanding magnetic loop, we found that the prominence eruption likely occurred near the stellar limb (12$^{+5}_{-5}$-16$^{+7}_{-7}$ degrees from the limb) and was ejected at an angle of 15$^{+6}_{-5}$-24$^{+6}_{-6}$ degrees relative to the line of sight, and the magnetic structures can expand into a coronal mass ejection (CME). The observed prominence displayed a terminal velocity of $\sim$0 km s$^{-1}$ prior to disappearance, complicating the interpretation of its dynamics in Paper I. The models in this paper suggest that prominence's H$\alpha$ intensity diminishes at around or before its expected maximum height, explaining the puzzling time evolution in observations. The TESS light curve modeling and (Zeeman) Doppler Imaging revealed large mid-latitude spots with polarity inversion lines and one polar spot with dominant single polarity, all near the stellar limb during the eruption. This suggests that mid-latitude spots could be the source of the pre-existing gigantic prominence we reported in Paper I. These results provide valuable insights into the dynamic processes that likely influenced the environments of early Earth, Mars, Venus, and young exoplanets., Comment: 25 pages, 14 figures, 5 tables. Accepted for publication in The Astrophysical Journal
Published: 2024

12. Identification of Mean-Field Dynamics using Transformers

Author: Biswal, Shiba, Elamvazhuthi, Karthik, and Sonthalia, Rishi
Subjects: Physics - Computational Physics, Condensed Matter - Disordered Systems and Neural Networks, Condensed Matter - Statistical Mechanics, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: This paper investigates the use of transformer architectures to approximate the mean-field dynamics of interacting particle systems exhibiting collective behavior. Such systems are fundamental in modeling phenomena across physics, biology, and engineering, including gas dynamics, opinion formation, biological networks, and swarm robotics. The key characteristic of these systems is that the particles are indistinguishable, leading to permutation-equivariant dynamics. We demonstrate that transformers, which inherently possess permutation equivariance, are well-suited for approximating these dynamics. Specifically, we prove that if a finite-dimensional transformer can effectively approximate the finite-dimensional vector field governing the particle system, then the expected output of this transformer provides a good approximation for the infinite-dimensional mean-field vector field. Leveraging this result, we establish theoretical bounds on the distance between the true mean-field dynamics and those obtained using the transformer. We validate our theoretical findings through numerical simulations on the Cucker-Smale model for flocking, and the mean-field system for training two-layer neural networks.
Published: 2024

13. Fair Decentralized Learning

Author: Biswas, Sayan, Kermarrec, Anne-Marie, Sharma, Rishi, Trinca, Thibaud, and de Vos, Martijn
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Decentralized learning (DL) is an emerging approach that enables nodes to collaboratively train a machine learning model without sharing raw data. In many application domains, such as healthcare, this approach faces challenges due to the high level of heterogeneity in the training data's feature space. Such feature heterogeneity lowers model utility and negatively impacts fairness, particularly for nodes with under-represented training data. In this paper, we introduce \textsc{Facade}, a clustering-based DL algorithm specifically designed for fair model training when the training data exhibits several distinct features. The challenge of \textsc{Facade} is to assign nodes to clusters, one for each feature, based on the similarity in the features of their local data, without requiring individual nodes to know a priori which cluster they belong to. \textsc{Facade} (1) dynamically assigns nodes to their appropriate clusters over time, and (2) enables nodes to collaboratively train a specialized model for each cluster in a fully decentralized manner. We theoretically prove the convergence of \textsc{Facade}, implement our algorithm, and compare it against three state-of-the-art baselines. Our experimental results on three datasets demonstrate the superiority of our approach in terms of model accuracy and fairness compared to all three competitors. Compared to the best-performing baseline, \textsc{Facade} on the CIFAR-10 dataset also reduces communication costs by 32.3\% to reach a target accuracy when cluster sizes are imbalanced.
Published: 2024

14. A combinatorial introduction to Adinkras

Author: Donley Jr, Robert W., Gates Jr, S. James, Hübsch, Tristan, and Nath, Rishi
Subjects: Mathematics - History and Overview, Mathematics - Combinatorics, 05C22 05C70 05B15 81Q60 81V72
Abstract: We survey the combinatorics of the Adinkra, a graphical device for solving differential equations in supersymmetry. These graphs represent an exceptional class of 1-factorizations with further augmentations. As a new feature, we characterize Adinkras using Latin rectangles., Comment: 19 pages, 11 figures
Published: 2024

15. Windowed MAPF with Completeness Guarantees

Author: Veerapaneni, Rishi, Saleem, Muhammad Suhail, Li, Jiaoyang, and Likhachev, Maxim
Subjects: Computer Science - Multiagent Systems, Computer Science - Artificial Intelligence, Computer Science - Robotics
Abstract: Traditional multi-agent path finding (MAPF) methods try to compute entire start-goal paths which are collision free. However, computing an entire path can take too long for MAPF systems where agents need to replan fast. Methods that address this typically employ a "windowed" approach and only try to find collision free paths for a small windowed timestep horizon. This adaptation comes at the cost of incompleteness; all current windowed approaches can become stuck in deadlock or livelock. Our main contribution is to introduce our framework, WinC-MAPF, for Windowed MAPF that enables completeness. Our framework uses heuristic update insights from single-agent real-time heuristic search algorithms as well as agent independence ideas from MAPF algorithms. We also develop Single-Step CBS (SS-CBS), an instantiation of this framework using a novel modification to CBS. We show how SS-CBS, which only plans a single step and updates heuristics, can effectively solve tough scenarios where existing windowed approaches fail.
Published: 2024

16. A POMDP-based hierarchical planning framework for manipulation under pose uncertainty

Author: Saleem, Muhammad Suhail, Veerapaneni, Rishi, and Likhachev, Maxim
Subjects: Computer Science - Robotics
Abstract: Robots often face challenges in domestic environments where visual feedback is ineffective, such as retrieving objects obstructed by occlusions or finding a light switch in the dark. In these cases, utilizing contacts to localize the target object can be effective. We propose an online planning framework using binary contact signals for manipulation tasks with pose uncertainty, formulated as a Partially Observable Markov Decision Process (POMDP). Naively representing the belief as a particle set makes planning infeasible due to the large uncertainties in domestic settings, as identifying the best sequence of actions requires rolling out thousands of actions across millions of particles, taking significant compute time. To address this, we propose a hierarchical belief representation. Initially, we represent the uncertainty coarsely in a 3D volumetric space. Policies that refine uncertainty in this space are computed and executed, and once uncertainty is sufficiently reduced, the problem is translated back into the particle space for further refinement before task completion. We utilize a closed-loop planning and execution framework with a heuristic-search-based anytime solver that computes partial policies within a limited time budget. The performance of the framework is demonstrated both in real world and in simulation on the high-precision task of inserting a plug into a port using a UR10e manipulator, resolving positional uncertainties up to 50 centimeters and angular uncertainties close to $2\pi$. Experimental results highlight the framework's effectiveness, achieving a 93\% success rate in the real world and over 50\% improvement in solution quality compared to greedy baselines, significantly accelerating planning and enabling real-time solutions for complex problems., Comment: Under review (2025 IEEE International Conference on Robotics & Automation)
Published: 2024

17. Preserving phase coherence and linearity in cat qubits with exponential bit-flip suppression

Author: Putterman, Harald, Noh, Kyungjoo, Patel, Rishi N., Peairs, Gregory A., MacCabe, Gregory S., Lee, Menyoung, Aghaeimeibodi, Shahriar, Hann, Connor T., Jarrige, Ignace, Marcaud, Guillaume, He, Yuan, Moradinejad, Hesam, Owens, John Clai, Scaffidi, Thomas, Arrangoiz-Arriola, Patricio, Iverson, Joe, Levine, Harry, Brandão, Fernando G. S. L., Matheny, Matthew H., and Painter, Oskar
Subjects: Quantum Physics
Abstract: Cat qubits, a type of bosonic qubit encoded in a harmonic oscillator, can exhibit an exponential noise bias against bit-flip errors with increasing mean photon number. Here, we focus on cat qubits stabilized by two-photon dissipation, where pairs of photons are added and removed from a harmonic oscillator by an auxiliary, lossy buffer mode. This process requires a large loss rate and strong nonlinearities of the buffer mode that must not degrade the coherence and linearity of the oscillator. In this work, we show how to overcome this challenge by coloring the loss environment of the buffer mode with a multi-pole filter and optimizing the circuit to take into account additional inductances in the buffer mode. Using these techniques, we achieve near-ideal enhancement of cat-qubit bit-flip times with increasing photon number, reaching over $0.1$ seconds with a mean photon number of only $4$. Concurrently, our cat qubit remains highly phase coherent, with phase-flip times corresponding to an effective lifetime of $T_{1,\text{eff}} \simeq 70$ $\mu$s, comparable with the bare oscillator lifetime. We achieve this performance even in the presence of an ancilla transmon, used for reading out the cat qubit states, by engineering a tunable oscillator-ancilla dispersive coupling. Furthermore, the low nonlinearity of the harmonic oscillator mode allows us to perform pulsed cat-qubit stabilization, an important control primitive, where the stabilization can remain off for a significant fraction (e.g., two thirds) of a $3~\mathrm{\mu s}$ cycle without degrading bit-flip times. These advances are important for the realization of scalable error-correction with cat qubits, where large noise bias and low phase-flip error rate enable the use of hardware-efficient outer error-correcting codes., Comment: Comments welcome!
Published: 2024

18. Work Smarter Not Harder: Simple Imitation Learning with CS-PIBT Outperforms Large Scale Imitation Learning for MAPF

Author: Veerapaneni, Rishi, Jakobsson, Arthur, Ren, Kevin, Kim, Samuel, Li, Jiaoyang, and Likhachev, Maxim
Subjects: Computer Science - Multiagent Systems, Computer Science - Robotics
Abstract: Multi-Agent Path Finding (MAPF) is the problem of effectively finding efficient collision-free paths for a group of agents in a shared workspace. The MAPF community has largely focused on developing high-performance heuristic search methods. Recently, several works have applied various machine learning (ML) techniques to solve MAPF, usually involving sophisticated architectures, reinforcement learning techniques, and set-ups, but none using large amounts of high-quality supervised data. Our initial objective in this work was to show how simple large scale imitation learning of high-quality heuristic search methods can lead to state-of-the-art ML MAPF performance. However, we find that, at least with our model architecture, simple large scale (700k examples with hundreds of agents per example) imitation learning does \textit{not} produce impressive results. Instead, we find that by using prior work that post-processes MAPF model predictions to resolve 1-step collisions (CS-PIBT), we can train a simple ML MAPF model in minutes that dramatically outperforms existing ML MAPF policies. This has serious implications for all future ML MAPF policies (with local communication) which currently struggle to scale. In particular, this finding implies that future learnt policies should (1) always use smart 1-step collision shields (e.g. CS-PIBT), (2) always include the collision shield with greedy actions as a baseline (e.g. PIBT) and (3) motivates future models to focus on longer horizon / more complex planning as 1-step collisions can be efficiently resolved.
Published: 2024

19. Hardware-efficient quantum error correction using concatenated bosonic qubits

Author: Putterman, Harald, Noh, Kyungjoo, Hann, Connor T., MacCabe, Gregory S., Aghaeimeibodi, Shahriar, Patel, Rishi N., Lee, Menyoung, Jones, William M., Moradinejad, Hesam, Rodriguez, Roberto, Mahuli, Neha, Rose, Jefferson, Owens, John Clai, Levine, Harry, Rosenfeld, Emma, Reinhold, Philip, Moncelsi, Lorenzo, Alcid, Joshua Ari, Alidoust, Nasser, Arrangoiz-Arriola, Patricio, Barnett, James, Bienias, Przemyslaw, Carson, Hugh A., Chen, Cliff, Chen, Li, Chinkezian, Harutiun, Chisholm, Eric M., Chou, Ming-Han, Clerk, Aashish, Clifford, Andrew, Cosmic, R., Curiel, Ana Valdes, Davis, Erik, DeLorenzo, Laura, D'Ewart, J. Mitchell, Diky, Art, D'Souza, Nathan, Dumitrescu, Philipp T., Eisenmann, Shmuel, Elkhouly, Essam, Evenbly, Glen, Fang, Michael T., Fang, Yawen, Fling, Matthew J., Fon, Warren, Garcia, Gabriel, Gorshkov, Alexey V., Grant, Julia A., Gray, Mason J., Grimberg, Sebastian, Grimsmo, Arne L., Haim, Arbel, Hand, Justin, He, Yuan, Hernandez, Mike, Hover, David, Hung, Jimmy S. C., Hunt, Matthew, Iverson, Joe, Jarrige, Ignace, Jaskula, Jean-Christophe, Jiang, Liang, Kalaee, Mahmoud, Karabalin, Rassul, Karalekas, Peter J., Keller, Andrew J., Khalajhedayati, Amirhossein, Kubica, Aleksander, Lee, Hanho, Leroux, Catherine, Lieu, Simon, Ly, Victor, Madrigal, Keven Villegas, Marcaud, Guillaume, McCabe, Gavin, Miles, Cody, Milsted, Ashley, Minguzzi, Joaquin, Mishra, Anurag, Mukherjee, Biswaroop, Naghiloo, Mahdi, Oblepias, Eric, Ortuno, Gerson, Pagdilao, Jason, Pancotti, Nicola, Panduro, Ashley, Paquette, JP, Park, Minje, Peairs, Gregory A., Perello, David, Peterson, Eric C., Ponte, Sophia, Preskill, John, Qiao, Johnson, Refael, Gil, Resnick, Rachel, Retzker, Alex, Reyna, Omar A., Runyan, Marc, Ryan, Colm A., Sahmoud, Abdulrahman, Sanchez, Ernesto, Sanil, Rohan, Sankar, Krishanu, Sato, Yuki, Scaffidi, Thomas, Siavoshi, Salome, Sivarajah, Prasahnt, Skogland, Trenton, Su, Chun-Ju, Swenson, Loren J., Teo, Stephanie M., Tomada, Astrid, Torlai, Giacomo, Wollack, E. Alex, Ye, Yufeng, Zerrudo, Jessica A., Zhang, Kailing, Brandão, Fernando G. S. L., Matheny, Matthew H., and Painter, Oskar
Subjects: Quantum Physics
Abstract: In order to solve problems of practical importance, quantum computers will likely need to incorporate quantum error correction, where a logical qubit is redundantly encoded in many noisy physical qubits. The large physical-qubit overhead typically associated with error correction motivates the search for more hardware-efficient approaches. Here, using a microfabricated superconducting quantum circuit, we realize a logical qubit memory formed from the concatenation of encoded bosonic cat qubits with an outer repetition code of distance $d=5$. The bosonic cat qubits are passively protected against bit flips using a stabilizing circuit. Cat-qubit phase-flip errors are corrected by the repetition code which uses ancilla transmons for syndrome measurement. We realize a noise-biased CX gate which ensures bit-flip error suppression is maintained during error correction. We study the performance and scaling of the logical qubit memory, finding that the phase-flip correcting repetition code operates below threshold, with logical phase-flip error decreasing with code distance from $d=3$ to $d=5$. Concurrently, the logical bit-flip error is suppressed with increasing cat-qubit mean photon number. The minimum measured logical error per cycle is on average $1.75(2)\%$ for the distance-3 code sections, and $1.65(3)\%$ for the longer distance-5 code, demonstrating the effectiveness of bit-flip error suppression throughout the error correction cycle. These results, where the intrinsic error suppression of the bosonic encodings allows us to use a hardware-efficient outer error correcting code, indicate that concatenated bosonic codes are a compelling paradigm for reaching fault-tolerant quantum computation., Comment: Comments on the manuscript welcome!
Published: 2024

20. FB-HyDON: Parameter-Efficient Physics-Informed Operator Learning of Complex PDEs via Hypernetwork and Finite Basis Domain Decomposition

Author: Ramezankhani, Milad, Parekh, Rishi Yash, Deodhar, Anirudh, and Birru, Dagnachew
Subjects: Computer Science - Machine Learning, Mathematics - Numerical Analysis, Physics - Applied Physics
Abstract: Deep operator networks (DeepONet) and neural operators have gained significant attention for their ability to map infinite-dimensional function spaces and perform zero-shot super-resolution. However, these models often require large datasets for effective training. While physics-informed operators offer a data-agnostic learning approach, they introduce additional training complexities and convergence issues, especially in highly nonlinear systems. To overcome these challenges, we introduce Finite Basis Physics-Informed HyperDeepONet (FB-HyDON), an advanced operator architecture featuring intrinsic domain decomposition. By leveraging hypernetworks and finite basis functions, FB-HyDON effectively mitigates the training limitations associated with existing physics-informed operator learning methods. We validated our approach on the high-frequency harmonic oscillator, Burgers' equation at different viscosity levels, and Allen-Cahn equation demonstrating substantial improvements over other operator learning models.
Published: 2024

21. Structural and electronic transformations in TiO2 induced by electric current

Author: Sterling, Tyler C., Ye, Feng, Jo, Seohyeon, Parulekar, Anish, Zhang, Yu, Cao, Gang, Raj, Rishi, and Reznik, Dmitry
Subjects: Condensed Matter - Materials Science
Abstract: In-situ diffuse neutron scattering experiments revealed that when electric current is passed through single crystals of rutile TiO2 under conditions conducive to flash sintering, it induces the formation of parallel planes of oxygen vacancies. Specifically, a current perpendicular to the c-axis generates planes normal to the (132) reciprocal lattice vector, whereas currents aligned with the c-axis form planes normal to the (132) and to the (225) vector. The concentration of defects increases with increasing current. The structural modifications are linked to the appearance of signatures of interacting Ti3+ moments in magnetic susceptibility, signifying a structural collapse around the vacancy planes. Electrical conductivity measurements of the modified material reveal several electronic transitions between semiconducting states (via a metal-like intermediate state) with the smallest gap being 27 meV. Pristine TiO2 can be restored by heating followed by slow cooling in air. Our work suggests a novel paradigm for achieving switching of electrical conductivity related to the flash phenomenon.
Published: 2024

22. A Physics-Enforced Neural Network to Predict Polymer Melt Viscosity

Author: Jain, Ayush, Gurnani, Rishi, Rajan, Arunkumar, Qi, H. Jerry, and Ramprasad, Rampi
Subjects: Computer Science - Computational Engineering, Finance, and Science, Condensed Matter - Materials Science
Abstract: Achieving superior polymeric components through additive manufacturing (AM) relies on precise control of rheology. One key rheological property particularly relevant to AM is melt viscosity ($\eta$). Melt viscosity is influenced by polymer chemistry, molecular weight ($M_w$), polydispersity, induced shear rate ($\dot\gamma$), and processing temperature ($T$). The relationship of $\eta$ with $M_w$, $\dot\gamma$, and $T$ may be captured by parameterized equations. Several physical experiments are required to fit the parameters, so predicting $\eta$ of a new polymer material in unexplored physical domains is a laborious process. Here, we develop a Physics-Enforced Neural Network (PENN) model that predicts the empirical parameters and encodes the parametrized equations to calculate $\eta$ as a function of polymer chemistry, $M_w$, polydispersity, $\dot\gamma$, and $T$. We benchmark our PENN against physics-unaware Artificial Neural Network (ANN) and Gaussian Process Regression (GPR) models. Finally, we demonstrate that the PENN offers superior values of $\eta$ when extrapolating to unseen values of $M_w$, $\dot\gamma$, and $T$ for sparsely seen polymers.
Published: 2024

23. Structurable equivalence relations and $\mathcal{L}_{\omega_1\omega}$ interpretations

Author: Banerjee, Rishi and Chen, Ruiyuan
Subjects: Mathematics - Logic, 03E15, 03C15
Abstract: We show that the category of countable Borel equivalence relations (CBERs) is dually equivalent to the category of countable $\mathcal{L}_{\omega_1\omega}$ theories which admit a one-sorted interpretation of a particular theory we call $\mathcal{T}_\mathsf{LN} \sqcup \mathcal{T}_\mathsf{sep}$ that witnesses embeddability into $2^\mathbb{N}$ and the Lusin--Novikov uniformization theorem. This allows problems about Borel combinatorial structures on CBERs to be translated into syntactic definability problems in $\mathcal{L}_{\omega_1\omega}$, modulo the extra structure provided by $\mathcal{T}_\mathsf{LN} \sqcup \mathcal{T}_\mathsf{sep}$, thereby formalizing a folklore intuition in locally countable Borel combinatorics. We illustrate this with a catalogue of the precise interpretability relations between several standard classes of structures commonly used in Borel combinatorics, such as Feldman--Moore $\omega$-colorings and the Slaman--Steel marker lemma. We also generalize this correspondence to locally countable Borel groupoids and theories interpreting $\mathcal{T}_\mathsf{LN}$, which admit a characterization analogous to that of Hjorth--Kechris for essentially countable isomorphism relations., Comment: 55 pages
Published: 2024

24. Exceptional Points for Density Modulo 1

Author: Berend, Daniel, Boshernitzan, Michael D., Kolesnik, Grigori, and Kumar, Rishi
Subjects: Mathematics - Number Theory, 11B05, 11J71, 11K06, 11K55
Abstract: It is well known that almost every dilation of a sequence of real numbers, that diverges to $\infty$, is dense modulo 1. This paper studies the exceptional set of points -- those for which the dilation is not dense. Specifically, we consider the Hausdorff and modified box dimensions of the set of exceptional points. In particular, we show that the dimension of this set may be any number between 0 and 1. Similar results are obtained for two "natural" subsets of the set of exceptional points. Furthermore, the paper calculates the dimension of several sets of points, defined by certain constraints on their binary expansion.
Published: 2024

25. HyPA-RAG: A Hybrid Parameter Adaptive Retrieval-Augmented Generation System for AI Legal and Policy Applications

Author: Kalra, Rishi, Wu, Zekun, Gulley, Ayesha, Hilliard, Airlie, Guan, Xin, Koshiyama, Adriano, and Treleaven, Philip
Subjects: Computer Science - Information Retrieval, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: While Large Language Models (LLMs) excel in text generation and question-answering, their effectiveness in AI legal and policy is limited by outdated knowledge, hallucinations, and inadequate reasoning in complex contexts. Retrieval-Augmented Generation (RAG) systems improve response accuracy by integrating external knowledge but struggle with retrieval errors, poor context integration, and high costs, particularly in interpreting qualitative and quantitative AI legal texts. This paper introduces a Hybrid Parameter-Adaptive RAG (HyPA-RAG) system tailored for AI legal and policy, exemplified by NYC Local Law 144 (LL144). HyPA-RAG uses a query complexity classifier for adaptive parameter tuning, a hybrid retrieval strategy combining dense, sparse, and knowledge graph methods, and an evaluation framework with specific question types and metrics. By dynamically adjusting parameters, HyPA-RAG significantly improves retrieval accuracy and response fidelity. Testing on LL144 shows enhanced correctness, faithfulness, and contextual precision, addressing the need for adaptable NLP systems in complex, high-stakes AI legal and policy applications., Comment: Under review for the EMNLP 2024 Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual
Published: 2024

26. Decoupled Gravitational Wave Equations in Spherical Symmetry from Curvature Wave Equations

Author: Mukkamala, Gowtham Rishi and Pereñiguez, David
Subjects: General Relativity and Quantum Cosmology, High Energy Physics - Theory
Abstract: Black hole perturbation theory on spherically symmetric backgrounds has been instrumental in establishing various aspects about the gravitational dynamics close to black holes, and continues to be an interesting avenue to confront current challenges in gravitational physics. In this paper, we present an approach to perturbation theory in spherical symmetry that addresses simultaneously some conceivably inconvenient aspects of the traditional methods. In particular, focusing on Schwarzschild's background we are able to derive a decoupled wave equation, for a single complex variable, by simply computing one component of the curvature wave equation satisfied by a complex self-dual version of the Riemann tensor. The real and imaginary parts of the variable consist only of even and odd pieces of the metric fluctuation, respectively, and both satisfy the Regge-Wheeler equation. Besides providing a systematic derivation of decoupled equations, an immediate corollary of our results is the isospectrality between even and odd sectors. We conclude by discussing potential extensions of our formalism to include matter and higher orders in perturbation theory., Comment: 12 pages + refs., no figures
Published: 2024

27. Bidirectional Intent Communication: A Role for Large Foundation Models

Author: Schreiter, Tim, Hazra, Rishi, Rüppel, Jens, and Rudenko, Andrey
Subjects: Computer Science - Robotics, Computer Science - Human-Computer Interaction
Abstract: Integrating multimodal foundation models has significantly enhanced autonomous agents' language comprehension, perception, and planning capabilities. However, while existing works adopt a task-centric approach with minimal human interaction, applying these models to developing assistive user-centric robots that can interact and cooperate with humans remains underexplored. This paper introduces "Bident", a framework designed to integrate robots seamlessly into shared spaces with humans. Bident enhances the interactive experience by incorporating multimodal inputs like speech and user gaze dynamics. Furthermore, Bident supports verbal utterances and physical actions like gestures, making it versatile for bidirectional human-robot interactions. Potential applications include personalized education, where robots can adapt to individual learning styles and paces, and healthcare, where robots can offer personalized support, companionship, and everyday assistance in the home and workplace environments., Comment: 2024 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), Workshop: Large Language Models in the RoMan Age
Published: 2024

28. Probabilistic Medical Predictions of Large Language Models

Author: Gu, Bowen, Desai, Rishi J., Lin, Kueiyu Joshua, and Yang, Jie
Subjects: Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) have demonstrated significant potential in clinical applications through prompt engineering, which enables the generation of flexible and diverse clinical predictions. However, they pose challenges in producing prediction probabilities, which are essential for transparency and allowing clinicians to apply flexible probability thresholds in decision-making. While explicit prompt instructions can lead LLMs to provide prediction probability numbers through text generation, LLMs' limitations in numerical reasoning raise concerns about the reliability of these text-generated probabilities. To assess this reliability, we compared explicit probabilities derived from text generation to implicit probabilities calculated based on the likelihood of predicting the correct label token. Experimenting with six advanced open-source LLMs across five medical datasets, we found that the performance of explicit probabilities was consistently lower than implicit probabilities with respect to discrimination, precision, and recall. Moreover, these differences were enlarged on small LLMs and imbalanced datasets, emphasizing the need for cautious interpretation and applications, as well as further research into robust probability estimation methods for LLMs in clinical contexts., Comment: 58 pages, 3 figures, 3 tables, Submitted to Nature Communication
Published: 2024

29. Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models

Author: Zhang, Andy K., Perry, Neil, Dulepet, Riya, Ji, Joey, Lin, Justin W., Jones, Eliot, Menders, Celeste, Hussein, Gashon, Liu, Samantha, Jasper, Donovan, Peetathawatchai, Pura, Glenn, Ari, Sivashankar, Vikram, Zamoshchin, Daniel, Glikbarg, Leo, Askaryar, Derek, Yang, Mike, Zhang, Teddy, Alluri, Rishi, Tran, Nathan, Sangpisit, Rinnara, Yiorkadjis, Polycarpos, Osele, Kenny, Raghupathi, Gautham, Boneh, Dan, Ho, Daniel E., and Liang, Percy
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Machine Learning
Abstract: Language Model (LM) agents for cybersecurity that are capable of autonomously identifying vulnerabilities and executing exploits have the potential to cause real-world impact. Policymakers, model providers, and other researchers in the AI and cybersecurity communities are interested in quantifying the capabilities of such agents to help mitigate cyberrisk and investigate opportunities for penetration testing. Toward that end, we introduce Cybench, a framework for specifying cybersecurity tasks and evaluating agents on those tasks. We include 40 professional-level Capture the Flag (CTF) tasks from 4 distinct CTF competitions, chosen to be recent, meaningful, and spanning a wide range of difficulties. Each task includes its own description, starter files, and is initialized in an environment where an agent can execute bash commands and observe outputs. Since many tasks are beyond the capabilities of existing LM agents, we introduce subtasks for each task, which break down a task into intermediary steps for a more detailed evaluation. To evaluate agent capabilities, we construct a cybersecurity agent and evaluate 8 models: GPT-4o, OpenAI o1-preview, Claude 3 Opus, Claude 3.5 Sonnet, Mixtral 8x22b Instruct, Gemini 1.5 Pro, Llama 3 70B Chat, and Llama 3.1 405B Instruct. Without subtask guidance, agents leveraging Claude 3.5 Sonnet, GPT-4o, OpenAI o1-preview, and Claude 3 Opus successfully solved complete tasks that took human teams up to 11 minutes to solve. In comparison, the most difficult task took human teams 24 hours and 54 minutes to solve. All code and data are publicly available at https://cybench.github.io, Comment: 78 pages, 6 figures
Published: 2024

30. Can Large Language Models Reason? A Characterization via 3-SAT

Author: Hazra, Rishi, Venturato, Gabriele, Martires, Pedro Zuidberg Dos, and De Raedt, Luc
Subjects: Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) have been touted as AI models possessing advanced reasoning abilities. However, recent works have shown that LLMs often bypass true reasoning using shortcuts, sparking skepticism. To study the reasoning capabilities in a principled fashion, we adopt a computational theory perspective and propose an experimental protocol centered on 3-SAT -- the prototypical NP-complete problem lying at the core of logical reasoning and constraint satisfaction tasks. Specifically, we examine the phase transitions in random 3-SAT and characterize the reasoning abilities of LLMs by varying the inherent hardness of the problem instances. Our experimental evidence shows that LLMs are incapable of performing true reasoning, as required for solving 3-SAT problems. Moreover, we observe significant performance variation based on the inherent hardness of the problems -- performing poorly on harder instances and vice versa. Importantly, we show that integrating external reasoners can considerably enhance LLM performance. By following a principled experimental protocol, our study draws concrete conclusions and moves beyond the anecdotal evidence often found in LLM reasoning research.
Published: 2024

31. Decentralized Fair Division

Author: Miller, Joel, Advani, Rishi, Kash, Ian, Kanich, Chris, and Zuck, Lenore
Subjects: Computer Science - Computer Science and Game Theory
Abstract: Fair division is typically framed from a centralized perspective. We study a decentralized variant of fair division inspired by the dynamics observed in community-based targeting, mutual aid networks, and community resource management paradigms. We develop an approach for decentralized fair division and compare it with a centralized approach with respect to fairness and social welfare guarantees. In the context of the existing literature, our decentralized model can be viewed as a relaxation of previous models of sequential exchange in light of impossibility results concerning the inability of those models to achieve desirable outcomes. We find that in settings representative of many real world situations, the two models of resource allocation offer contrasting fairness and social welfare guarantees. In particular, we show that under appropriate conditions, our model of decentralized allocation can ensure high-quality allocative decisions in an efficient fashion.
Published: 2024

32. Estimators for the cross-pairwise kSZ effect and forecasts for the dark energy and modified gravity parameters with CMB-S4

Author: Gon, Aritra Kumar and Khatri, Rishi
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics
Abstract: We present and study a new cross-pairwise estimator to extract the kinetic Sunyaev Zeldovich (kSZ) signal from galaxy clusters. The existing pairwise kSZ method involves pairing clusters with other clusters and stacking them. In the cross-pairwise method, we propose to pair clusters with galaxies from a spectroscopic survey and then do the stacking. Cross-pairing decreases the measurement, instrumentation, and statistical noise, thus boosting the signal-to-noise ratio. However, we also need data from a galaxy survey in addition to the CMB temperature maps and a cluster catalog in order to use this method. We do a Fisher matrix analysis for the optimised pairwise and cross-pairwise estimators and forecast the ability of future Cosmic Microwave Background (CMB) experiments and galaxy surveys to measure cosmological parameters with the kSZ effect when combined with primary CMB and Baryon Acoustic Oscillation (BAO) data. We show that using the cross-pairwise kSZ estimator (CMB-S4 clusters with DESI galaxies) leads to a factor of 3 improvement in the $1-\sigma$ error of the dark energy parameters $w_0$ and $w_a$ and a factor of 6 improvement for the growth rate index $\gamma$ compared to the pairwise estimator for the same CMB dataset and cluster catalog.
Published: 2024

33. Exact expressions for the maximal probability that all $k$-wise independent bits are 1

Author: Berend, Daniel, Ernst, Philip A., Kontorovich, Aryeh, and Kumar, Rishi
Subjects: Mathematics - Probability
Abstract: Let $M(n, k, p)$ denote the maximum probability of the event $X_1 = X_2 = \cdots = X_n=1$ under a $k$-wise independent distribution whose marginals are Bernoulli random variables with mean $p$. A long-standing question is to calculate $M(n, k, p)$ for all values of $n,k,p$. This question has been partially addressed by several authors, primarily with the goal of answering asymptotic questions. The present paper focuses on obtaining exact expressions for this probability. To this end, we provide closed-form formulas of $M(n,k,p)$ for $p$ near 0 as well as $p$ near 1.
Published: 2024

34. Signatures of composite dark matter in the Cosmic Microwave Background spectral distortions

Author: Ganguly, Anoma, Khatri, Rishi, and Roy, Tuhin S.
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics, High Energy Physics - Phenomenology
Abstract: We compute the spectral distortions of the Cosmic Microwave Background (CMB) created by an exotic process that extracts or injects photons of a particular frequency into the CMB. Such signatures are a natural prediction of a class of composite dark matter models characterized by electrically neutral states but with non-zero higher order electromagnetic moments. We consider a simplified model where dark matter exists as a two state system separated by a fixed transition frequency, which can range from radio waves to gamma rays. The electromagnetic transitions between the two states due to CMB photons give rise to thermal distortions, namely, the $\mu$-type distortion in the redshift range $10^5\lesssim z \lesssim 2\times 10^6$ and the $y$-type distortion as well as non-thermal distortions at redshifts $z \lesssim 10^5$. The nature of spectral distortions depends sensitively on the dark matter transition frequency and the strength of couplings of dark matter with visible sector particles as well as its self-interactions, thus opening a new window to probe the nature of dark matter. Non-thermal distortions have unique spectral shapes making them distinguishable from the standard $\mu$ and $y$-type distortions and potentially detectable in the next-generation experiments such as Primordial Inflation Explorer (PIXIE). We also find that the spectral distortion limits from the COsmic Background Explorer/Far-Infrared Absolute Spectrophotometer (COBE/FIRAS) already give a constraint on the electromagnetic coupling of dark matter which is three orders of magnitude stronger compared to the current direct detection limits for $\sim$ MeV mass dark matter with transition energy in $\sim 1$-$10$ eV range.
Published: 2024

35. QCD phase diagram in the $T-eB$ plane for varying pion mass

Author: Ali, Mahammad Sabir, Islam, Chowdhury Aminul, and Sharma, Rishi
Subjects: High Energy Physics - Phenomenology, High Energy Physics - Lattice, Nuclear Theory
Abstract: We study the effect of a varying pion mass on the quantum chromodynamics (QCD) phase diagram in the presence of an external magnetic field, aiming to understand it, for the first time, using Nambu–Jona-Lasinio like effective models. We compare results from both its local and nonlocal versions. In both cases, we find that the inverse magnetic catalysis (IMC) near the crossover is eliminated with increasing pion mass, while the decreasing trend of crossover temperature with increasing magnetic field persists for pion mass values at least up to $440$ MeV. Thus, the models are capable of capturing qualitatively the results found by lattice QCD (LQCD) for heavy (unphysical) pions. The key feature in the models is the incorporation of the effect of a reduction in the coupling constant with increasing energy. Along with reproducing the IMC effect, it enables models to describe the effects of heavier current quark masses without introducing additional parameters. For the local NJL model, this agreement depends on how the parameters of the model are fit at the physical point. In this respect, the nonlocal version, which, due to its formulation, automatically exhibits the IMC effect around the crossover region, captures the physics more naturally. We further use the nonlocal framework to determine the pion mass beyond which the IMC effect around the transition region does not exist anymore., Comment: 22 pages, 15 captioned figures, pion mass value beyond which the IMC effect disappears calculated, published version
Published: 2024

36. The Foundation Model Transparency Index v1.1: May 2024

Author: Bommasani, Rishi, Klyman, Kevin, Kapoor, Sayash, Longpre, Shayne, Xiong, Betty, Maslej, Nestor, and Liang, Percy
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: Foundation models are increasingly consequential yet extremely opaque. To characterize the status quo, the Foundation Model Transparency Index was launched in October 2023 to measure the transparency of leading foundation model developers. The October 2023 Index (v1.0) assessed 10 major foundation model developers (e.g. OpenAI, Google) on 100 transparency indicators (e.g. does the developer disclose the wages it pays for data labor?). At the time, developers publicly disclosed very limited information with the average score being 37 out of 100. To understand how the status quo has changed, we conduct a follow-up study (v1.1) after 6 months: we score 14 developers against the same 100 indicators. While in v1.0 we searched for publicly available information, in v1.1 developers submit reports on the 100 transparency indicators, potentially including information that was not previously public. We find that developers now score 58 out of 100 on average, a 21 point improvement over v1.0. Much of this increase is driven by developers disclosing information during the v1.1 process: on average, developers disclosed information related to 16.6 indicators that was not previously public. We observe regions of sustained (i.e. across v1.0 and v1.1) and systemic (i.e. across most or all developers) opacity such as on copyright status, data access, data labor, and downstream impact. We publish transparency reports for each developer that consolidate information disclosures: these reports are based on the information disclosed to us via developers. Our findings demonstrate that transparency can be improved in this nascent ecosystem, the Foundation Model Transparency Index likely contributes to these improvements, and policymakers should consider interventions in areas where transparency has not improved., Comment: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Project page: https://crfm.stanford.edu/fmti
Published: 2024

37. Differentially Private Multiway and $k$-Cut

Author: Chandra, Rishi, Dinitz, Michael, Fan, Chenglin, and Zou, Zongrui
Subjects: Computer Science - Cryptography and Security, Computer Science - Data Structures and Algorithms
Abstract: In this paper, we address the challenge of differential privacy in the context of graph cuts, specifically focusing on the minimum $k$-cut and multiway cut problems. We introduce edge-differentially private algorithms that achieve nearly optimal performance for these problems. For the multiway cut problem, we first provide a private algorithm with a multiplicative approximation ratio that matches the state-of-the-art non-private algorithm. We then present a tight information-theoretic lower bound on the additive error, demonstrating that our algorithm on weighted graphs is near-optimal for constant $k$. For the minimum $k$-cut problem, our algorithms leverage a known bound on the number of approximate $k$-cuts, resulting in a private algorithm with optimal additive error $O(k\log n)$ for fixed privacy parameter. We also establish an information-theoretic lower bound that matches this additive error. Additionally, we give an efficient private algorithm for $k$-cut even for non-constant $k$, including a polynomial-time 2-approximation with an additive error of $\widetilde{O}(k^{1.5})$., Comment: 39 pages
Published: 2024

38. Energy-Aware Decentralized Learning with Intermittent Model Training

Author: Dhasade, Akash, Dini, Paolo, Guerra, Elia, Kermarrec, Anne-Marie, Miozzo, Marco, Pires, Rafael, Sharma, Rishi, and de Vos, Martijn
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Decentralized learning (DL) offers a powerful framework where nodes collaboratively train models without sharing raw data and without the coordination of a central server. In the iterative rounds of DL, models are trained locally, shared with neighbors in the topology, and aggregated with other models received from neighbors. Sharing and merging models contribute to convergence towards a consensus model that generalizes better across the collective data captured at training time. In addition, the energy consumption while sharing and merging model parameters is negligible compared to the energy spent during the training phase. Leveraging this fact, we present SkipTrain, a novel DL algorithm, which minimizes energy consumption in decentralized learning by strategically skipping some training rounds and substituting them with synchronization rounds. These training-silent periods, besides saving energy, also allow models to better mix and finally produce models with superior accuracy than typical DL algorithms that train at every round. Our empirical evaluations with 256 nodes demonstrate that SkipTrain reduces energy consumption by 50% and increases model accuracy by up to 12% compared to D-PSGD, the conventional DL algorithm.
Published: 2024

39. Explainable Image Captioning using CNN-CNN architecture and Hierarchical Attention

Author: Mohan, Rishi Kesav, Sureshkumar, Sanjay, and Sivasubramaniam, Vignesh
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, 68T50, I.2.7
Abstract: Image captioning is a technology that produces text-based descriptions for an image. Deep learning-based solutions built on top of feature recognition may very well serve the purpose. But as with any other machine learning solution, the user understanding in the process of caption generation is poor and the model does not provide any explanation for its predictions and hence the conventional methods are also referred to as Black-Box methods. Thus, an approach where the model's predictions are trusted by the user is needed to appreciate interoperability. Explainable AI is an approach where a conventional method is approached in a way that the model or the algorithm's predictions can be explainable and justifiable. Thus, this article tries to approach image captioning using Explainable AI such that the resulting captions generated by the model can be explained and visualized. A newer architecture with a CNN decoder and hierarchical attention concept has been used to increase speed and accuracy of caption generation. Also, incorporating explainability to a model makes it more trustable when used in an application. The model is trained and evaluated using MSCOCO dataset and both quantitative and qualitative results are presented in this article., Comment: 23 pages, 9 figures
Published: 2024

40. Multimodal Learning and Cognitive Processes in Radiology: MedGaze for Chest X-ray Scanpath Prediction

Author: Awasthi, Akash, Le, Ngan, Deng, Zhigang, Agrawal, Rishi, Wu, Carol C., and Van Nguyen, Hien
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
Abstract: Predicting human gaze behavior within computer vision is integral for developing interactive systems that can anticipate user attention, address fundamental questions in cognitive science, and hold implications for fields like human-computer interaction (HCI) and augmented/virtual reality (AR/VR) systems. Despite methodologies introduced for modeling human eye gaze behavior, applying these models to medical imaging for scanpath prediction remains unexplored. Our proposed system aims to predict eye gaze sequences from radiology reports and CXR images, potentially streamlining data collection and enhancing AI systems using larger datasets. However, predicting human scanpaths on medical images presents unique challenges due to the diverse nature of abnormal regions. Our model predicts fixation coordinates and durations critical for medical scanpath prediction, outperforming existing models in the computer vision community. Utilizing a two-stage training process and large publicly available datasets, our approach generates static heatmaps and eye gaze videos aligned with radiology reports, facilitating comprehensive analysis. We validate our approach by comparing its performance with state-of-the-art methods and assessing its generalizability among different radiologists, introducing novel strategies to model radiologists' search patterns during CXR image diagnosis. Based on the radiologist's evaluation, MedGaze can generate human-like gaze sequences with a high focus on relevant regions over the CXR images. It sometimes also outperforms humans in terms of redundancy and randomness in the scanpaths., Comment: Submitted to the Journal
Published: 2024

41. The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

Author: Longpre, Shayne, Biderman, Stella, Albalak, Alon, Schoelkopf, Hailey, McDuff, Daniel, Kapoor, Sayash, Klyman, Kevin, Lo, Kyle, Ilharco, Gabriel, San, Nay, Rauh, Maribeth, Skowron, Aviya, Vidgen, Bertie, Weidinger, Laura, Narayanan, Arvind, Sanh, Victor, Adelani, David, Liang, Percy, Bommasani, Rishi, Henderson, Peter, Luccioni, Sasha, Jernite, Yacine, and Soldaini, Luca
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation, frameworks, guides, and practical tools) that support informed data selection, processing, and understanding, precise and limitation-aware artifact documentation, efficient model training, advance awareness of the environmental impact from training, careful model evaluation of capabilities, risks, and claims, as well as responsible model release, licensing and deployment practices. We hope this curated collection of resources helps guide more responsible development. The process of curating this list enabled us to review the AI development ecosystem, revealing what tools are critically missing, misused, or over-used in existing practices. We find that (i) tools for data sourcing, model evaluation, and monitoring are critically under-serving ethical and real-world needs, (ii) evaluations for model safety, capabilities, and environmental impact all lack reproducibility and transparency, (iii) text and particularly English-centric analyses continue to dominate over multilingual and multi-modal analyses, and (iv) evaluation of systems, rather than just models, is needed so that capabilities and impact are assessed in context.
Published: 2024

42. Multimodal Segmentation for Vocal Tract Modeling

Author: Jain, Rishi, Yu, Bohan, Wu, Peter, Prabhune, Tejas, and Anumanchipalli, Gopala
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Accurate modeling of the vocal tract is necessary to construct articulatory representations for interpretable speech processing and linguistics. However, vocal tract modeling is challenging because many internal articulators are occluded from external motion capture technologies. Real-time magnetic resonance imaging (RT-MRI) allows measuring precise movements of internal articulators during speech, but annotated datasets of MRI are limited in size due to time-consuming and computationally expensive labeling methods. We first present a deep labeling strategy for the RT-MRI video using a vision-only segmentation approach. We then introduce a multimodal algorithm using audio to improve segmentation of vocal articulators. Together, we set a new benchmark for vocal tract modeling in MRI video segmentation and use this to release labels for a 75-speaker RT-MRI dataset, increasing the amount of labeled public RT-MRI data of the vocal tract by over a factor of 9. The code and dataset labels can be found at rishiraij.github.io/multimodal-mri-avatar/., Comment: Interspeech 2024
Published: 2024

43. Gas permeability, diffusivity, and solubility in polymers: Simulation-experiment data fusion and multi-task machine learning

Author: Phan, Brandon K., Shen, Kuan-Hsuan, Gurnani, Rishi, Tran, Huan, Lively, Ryan, and Ramprasad, Rampi
Subjects: Condensed Matter - Materials Science
Abstract: Machine learning (ML) models for predicting gas permeability through polymers have traditionally relied on experimental data. While these models exhibit robustness within familiar chemical domains, reliability wanes when applied to new spaces. To address this challenge, we present a multi-tiered multi-task learning framework empowered with advanced machine-crafted polymer fingerprinting algorithms and data fusion techniques. This framework combines scarce "high-fidelity" experimental data with abundant diverse "low-fidelity" simulation or synthetic data, resulting in predictive models that display a high level of generalizability across novel chemical spaces. Additionally, this multi-task scheme capitalizes on known physics and interrelated properties, such as gas diffusivity and solubility, both of which are closely tied to permeability. By amalgamating high-throughput generated simulation data with available experimental data for gas permeability, diffusivity, and solubility for various gases, we construct multi-task deep learning models. These models can simultaneously predict all three properties for all gases under consideration, with markedly enhanced predictive accuracy, particularly compared to traditional models reliant solely on experimental data for a singular property. This strategy underscores the potential of coupling high-throughput classical simulations with data fusion methodologies to yield state-of-the-art property predictors, especially when experimental data for targeted properties is scarce., Comment: Submitted to npj Computational Materials
Published: 2024

44. An Advanced Physics-Informed Neural Operator for Comprehensive Design Optimization of Highly-Nonlinear Systems: An Aerospace Composites Processing Case Study

Author: Ramezankhani, Milad, Deodhar, Anirudh, Parekh, Rishi Yash, and Birru, Dagnachew
Subjects: Computer Science - Machine Learning
Abstract: Deep Operator Networks (DeepONets) and their physics-informed variants have shown significant promise in learning mappings between function spaces of partial differential equations, enhancing the generalization of traditional neural networks. However, for highly nonlinear real-world applications like aerospace composites processing, existing models often fail to capture underlying solutions accurately and are typically limited to single input functions, constraining rapid process design development. This paper introduces an advanced physics-informed DeepONet tailored for such complex systems with multiple input functions. Equipped with architectural enhancements like nonlinear decoders and effective training strategies such as curriculum learning and domain decomposition, the proposed model handles high-dimensional design spaces with significantly improved accuracy, outperforming the vanilla physics-informed DeepONet by two orders of magnitude. Its zero-shot prediction capability across a broad design space makes it a powerful tool for accelerating composites process design and optimization, with potential applications in other engineering fields characterized by strong nonlinearity., Comment: Accepted at the ICML 2024 Workshop on AI for Science: Scaling in AI for Scientific Discovery
Published: 2024

45. Low-mass stellar and substellar content of the young cluster Berkeley 59

Author: Panwar, Neelam, C., Rishi, Sharma, Saurabh, Ojha, Devendra K., Samal, Manash R., Singh, H. P., and Yadav, Ram Kesh
Subjects: Astrophysics - Astrophysics of Galaxies, Astrophysics - Solar and Stellar Astrophysics
Abstract: We present a multi-wavelength analysis of the young star cluster Berkeley 59 (Be 59) based on the $Gaia$ data and deep infrared (IR) observations with the 3.58-m Telescopio Nazionale Galileo and $Spitzer$ space telescope. The mean proper motion of the cluster is found to be $\mu_\alpha \cos\delta \sim -0.63$ mas yr$^{-1}$ and $\mu_\delta \sim -1.83$ mas yr$^{-1}$, and the kinematic distance of the cluster, $\sim 1$ kpc, is in agreement with previous photometric studies. The present data are the deepest near-IR observations available for the cluster so far, reaching below 0.03 M$_\odot$. The mass function of the cluster region is calculated using the statistically cleaned color-magnitude diagram and is similar to the Salpeter value for the member stars above 0.4 M$_\odot$. In contrast, the slope becomes shallower ($\Gamma \sim 0.01 \pm 0.18$) in the mass range 0.04 - 0.4 M$_\odot$, comparable to other nearby clusters. The spatial distribution of young brown dwarfs (BDs) and stellar candidates shows a non-homogeneous distribution. This suggests that the radiation feedback from massive stars may be a prominent factor contributing to the BD population in the cluster Be 59. We also estimated the star-to-BD ratio for the cluster, which is found to be $\sim 3.6$. The Kolmogorov-Smirnov test shows that stellar and BD populations significantly differ, and stellar candidates lie nearer the cluster center than the BDs, suggesting mass segregation in the cluster toward the substellar mass regime., Comment: 18 pages, 11 Figures, accepted for publication in the AJ
Published: 2024

46. Grover's algorithm on two-way quantum computer

Author: Czelusta, Grzegorz, Verma, Dev Rishi, and Wanjalkar, Govind
Subjects: Physics - General Physics
Abstract: Two-way quantum computing (2WQC) represents a novel approach to quantum computing that introduces a CPT version of state preparation. This paper analyses the influence of this approach on Grover's algorithm and compares the behaviour of typical Grover and its 2WQC version in the presence of noise in the system. Our findings indicate that, in an ideal scenario without noise, the 2WQC Grover algorithm exhibits a constant complexity of $\mathcal{O}(1)$. In the presence of noise, the 2WQC Grover algorithm demonstrates greater resilience to different noise models than the standard Grover's algorithm., Comment: Preliminary version of the article, we welcome your comments and suggestions
Published: 2024

47. Simultaneous visibility in the integer lattice

Author: Berend, Daniel, Kumar, Rishi, and Pollington, Andrew
Subjects: Mathematics - Number Theory, Primary: 11P21, 11N36, Secondary: 37A44
Abstract: Two lattice points are visible from one another if there is no lattice point on the open line segment joining them. Let $S$ be a finite subset of $\mathbb{Z}^k$. The asymptotic density of the set of lattice points visible from all points of $S$ has been studied by several authors. Our main result is an improved upper bound on the error term. We also find the Schnirelmann density of the set of visible points from some sets $S$. Finally, we discuss these questions from the point of view of ergodic theory., Comment: Published in Journal of Number Theory
Published: 2024

48. Scalable Private Search with Wally

Author: Asi, Hilal, Boemer, Fabian, Genise, Nicholas, Mughees, Muhammad Haris, Ogilvie, Tabitha, Rishi, Rehan, Rothblum, Guy N., Talwar, Kunal, Tarbe, Karl, Zhu, Ruiyu, and Zuliani, Marco
Subjects: Computer Science - Cryptography and Security, Computer Science - Databases
Abstract: This paper presents Wally, a private search system that supports efficient semantic and keyword search queries against large databases. When sufficiently many clients are making queries, Wally's performance is significantly better than previous systems. In previous private search systems, for each client query, the server must perform at least one expensive cryptographic operation per database entry. As a result, performance degraded proportionally with the number of entries in the database. In Wally, we get rid of this limitation. Specifically, for each query the server performs cryptographic operations only against a few database entries. We achieve these results by requiring each client to add a few fake queries and send each query via an anonymous network to the server at independently chosen random instants. Additionally, each client also uses somewhat homomorphic encryption (SHE) to hide whether a query is real or fake. Wally provides $(\epsilon, \delta)$-differential privacy guarantee, which is an accepted standard for strong privacy. The number of fake queries each client makes depends inversely on the number of clients making queries. Therefore, the fake queries' overhead vanishes as the number of clients increases, enabling scalability to millions of queries and large databases. Concretely, Wally can process eight million queries in just 117 mins. That is around four orders of magnitude less than the state of the art.
Published: 2024

49. Kepler Sets of Second-Order Linear Recurrence Sequences Over $\mathbb{Q}_p$

Author: Kumar, Rishi
Subjects: Mathematics - Number Theory, 11B39, 11K41
Abstract: Let $(a_n)_{n=0}^\infty$ be a second-order linear recurrence sequence with constant coefficients over the field of $p$-adic numbers $\mathbb{Q}_p$. We study the set of limit points of the sequence of consecutive ratios $(a_{n+1}/a_{n})_{n=0}^\infty$ in $\mathbb{Q}_p$., Comment: to appear in the International Journal of Number Theory
Published: 2024

50. A preprocessing-based planning framework for utilizing contacts in high-precision insertion tasks

Author: Saleem, Muhammad Suhail, Veerapaneni, Rishi, and Likhachev, Maxim
Subjects: Computer Science - Robotics
Abstract: In manipulation tasks like plug insertion or assembly that have low tolerance to errors in pose estimation (errors of the order of 2mm can cause task failure), the utilization of touch/contact modality can aid in accurately localizing the object of interest. Motivated by this, in this work we model high-precision insertion tasks as planning problems under pose uncertainty, where we effectively utilize the occurrence of contacts (or the lack thereof) as observations to reduce uncertainty and reliably complete the task. We present a preprocessing-based planning framework for high-precision insertion in repetitive and time-critical settings, where the set of initial pose distributions (identified by a perception system) is finite. The finite set allows us to enumerate the possible planning problems that can be encountered online and preprocess a database of policies. Due to the computational complexity of constructing this database, we propose a general experience-based POMDP solver, E-RTDP-Bel, that uses the solutions of similar planning problems as experience to speed up planning queries and use it to efficiently construct the database. We show that the developed algorithm speeds up database creation by over a factor of 100, making the process computationally tractable. We demonstrate the effectiveness of the proposed framework in a real-world plug insertion task in the presence of port position uncertainty and a pipe assembly task in simulation in the presence of pipe pose uncertainty., Comment: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Published: 2024
