Search Results
2. Development and Validation of Method for Spinetoram content in Suspension Concentrate (SC) Formulation
- Author
-
Yadav, Shubham, Alam, Samsul, Anil, Ajin S, Thakur, Lalitesh K, and Kumar, Jitendra
- Published
- 2024
3. An updated research on the antimicrobial properties of Chromolaena odorata (L). leaf and flower extracts against wound promoting pathogens: A comparative and combinatorial in vitro/in silico approach
- Author
-
Mathew, Meena, Anil, Athira, Carlin, Vibena V., Raveendran, Reeja, Ruban, Nikhil, and Jannet, Vennila J.
- Published
- 2022
4. School Dropout Causes in Turkish Education System (2009-2022): A Pareto Analysis by Grade Level
- Author
-
Anil Kadir Eranil
- Abstract
This research examines the reasons for school dropout occurring at varying education levels in the Turkish education system (TES). To this end, research on school dropouts pertaining to TES has been targeted. The systematic data analysis method was used in the research. 47 studies suitable for the purposes of the study were analyzed through content analysis. A total of 290 coding processes were carried out. The results suggest that the family factor emerges as the main factor for school dropout in primary education in TES. The inadequacy of families' financial situation and low interest in education are effective. At the high school level, students' academic failure, absenteeism, peer pressure, and indifference of families appear as the causes of school dropouts. In higher education, difficulties learners experience in adapting to novel social environments, academic failure, financial problems and the thought of being a misfit for the selected department seem to be among the causes of school dropout. In the other group, the reasons for dropping out are determined as academic failure, financial difficulties, early marriages, dislike of school, indifference of family, negative effects of friend groups, and indifference towards school.
- Published
- 2024
5. Nuclear Dependence of Beam Normal Single Spin Asymmetry in Elastic Scattering from Nuclei
- Author
-
Gal, Ciprian, Ghosh, Chandan, Park, Sanghwa, Adhikari, Devi, Armstrong, David, Beminiwattha, Rakitha, Camsonne, Alexandre, Chandrasena, Shashini, Dalton, Mark, Deshpande, Abhay, Gaskell, Dave, Higinbotham, Douglas, Horowitz, Charles J., King, Paul, Kumar, Krishna, Kutz, Tyler, Mammei, Juliette, McNulty, Dustin, Michaels, Robert, Palatchi, Caryn, Panta, Anil, Paschke, Kent, Pitt, Mark, Sen, Arindam, Simicevic, Neven, Weliyanga, Lasitha, and Wells, Steven P.
- Subjects
Nuclear Experiment
- Abstract
We propose to measure the beam normal single spin asymmetry in elastic scattering of transversely polarized electrons from target nuclei with 12 $\leq Z \leq$ 90 at Q$^2$ = 0.0092 GeV$^2$ to study its nuclear dependence. While the theoretical calculations based on two-photon exchange suggest no nuclear dependence at these kinematics, the results for 208Pb from Jefferson Lab show a striking disagreement with both theoretical predictions and light nuclei measurements. The proposed measurements will provide new data for intermediate to heavy nuclei, where no data exist for $Z \geq$ 20 in the kinematics of previous high-energy experiments. They will allow one to investigate the missing contributions that are not accounted for in the current theoretical models. Comment: Submitted to Jefferson Lab PAC52
- Published
- 2024
6. Large-Scale Cost-Effective Mid-Infrared Resonant Silicon Microstructures for Surface-Enhanced Infrared Absorption Spectroscopy
- Author
-
Sudha, Pooja, Kumar, Anil, Dhankar, Kunal, Ansari, Khalid, Hazra, Sugata, and Samanta, Arup
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
The mid-infrared (MIR) region is crucial for elucidating the unique biochemical signatures of microorganisms. MIR resonant structures have been shown to deliver exceptional performance owing to the enhanced electric field confinement in the nano-sized aperture. However, the extension of such techniques to bacteria sensing remains limited, primarily due to the micrometre size of bacteria. This work is the first demonstration of an MIR resonant structure, a gold-coated micro-structured inverted pyramid array of silicon exhibiting light-trapping capabilities, for bacteria detection across the entire MIR range. The electric-field localization within the micro-sized cavity of the inverted pyramid amplifies the light-matter interaction by harnessing surface plasmon polaritons, leading to improved detection sensitivity. The confinement of the electric field is further corroborated by electric-field simulations based on the finite element method. In particular, we observed notable enhancement in both the quantitative and qualitative detection of Escherichia coli and Staphylococcus aureus for bacteria cells at very low concentration, reflecting the efficacy of our detection method. Furthermore, the cost-effective micro-structured silicon is fabricated using a lithography-free metal-assisted chemical etching method, with the capability of wafer-scale fabrication. Moreover, our device configuration demonstrates reusability and reproducibility, offering substantial benefits over conventional detection schemes. Consequently, this CMOS technology-compatible biosensor signifies promising ways for the integration of this technology with forthcoming bio-applications.
- Published
- 2024
7. Coexistence of Hilbert space effects and orthogonality
- Author
-
Karn, Anil Kumar
- Subjects
Mathematics - Functional Analysis, Primary: 47B02, Secondary: 46L10, 47L30
- Abstract
In this paper, we show that every pair of absolutely compatible Hilbert space effects are coexistent and exhibit a partial orthogonality property. We introduce the notion of partially ortho-coexistence. We generalize absolute compatibility to obtain more examples of partially ortho-coexistent pairs and introduce the notion of generalized compatibility. In the case of $\mathbb{M}_2$, we discuss a geometric behaviour of the generalized compatibility., Comment: 17 pages
- Published
- 2024
8. A cat qubit stabilization scheme using a voltage biased Josephson junction
- Author
-
Aissaoui, Thiziri, Murani, Anil, Lescanne, Raphaël, and Sarlette, Alain
- Subjects
Quantum Physics
- Abstract
DC-voltage-biased Josephson junctions have been recently employed in superconducting circuits for Hamiltonian engineering, demonstrating microwave amplification, single photon sources and entangled photon generation. Compared to more conventional approaches based on parametric pumps, this solution typically enables larger interaction strengths. In the context of quantum information, a two-to-one photon interaction can stabilize cat qubits, where bit-flip errors are exponentially suppressed, promising significant resource savings for quantum error correction. This work investigates how the DC bias approach to Hamiltonian engineering can benefit cat qubits. We find a simple circuit design that is predicted to showcase a two-to-one photon exchange rate larger than that of the parametric pump-based implementation while dynamically averaging typically resonant parasitic terms such as Kerr and cross Kerr. In addition to addressing qubit stabilization, we propose to use injection locking with a cat-qubit adapted frequency filter to prevent long-term drifts of the cat qubit angle associated to DC voltage noise. The whole scheme is simulated without rotating-wave approximations, highlighting for the first time the amplitude of related oscillatory effects in cat-qubit stabilization schemes. This study lays the groundwork for the experimental realization of such a circuit., Comment: 30 pages, 7 figures
- Published
- 2024
9. AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
- Author
-
Kag, Anil, Coskun, Huseyin, Chen, Jierun, Cao, Junli, Menapace, Willi, Siarohin, Aliaksandr, Tulyakov, Sergey, and Ren, Jian
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
Neural network architecture design requires making many crucial decisions. A common desideratum is that similar decisions, with few modifications, can be reused in a variety of tasks and applications. To satisfy that, architectures must provide promising latency and performance trade-offs, support a variety of tasks, scale efficiently with respect to the amounts of data and compute, leverage available data from other tasks, and efficiently support various hardware. To this end, we introduce AsCAN, a hybrid architecture combining both convolutional and transformer blocks. We revisit the key design principles of hybrid architectures and propose a simple and effective asymmetric architecture, where the distribution of convolutional and transformer blocks is asymmetric, containing more convolutional blocks in the earlier stages, followed by more transformer blocks in later stages. AsCAN supports a variety of tasks: recognition, segmentation, class-conditional image generation, and features a superior trade-off between performance and latency. We then scale the same architecture to solve a large-scale text-to-image task and show state-of-the-art performance compared to the most recent public and commercial models. Notably, even without any computation optimization for transformer blocks, our models still yield faster inference speed than existing works featuring efficient attention mechanisms, highlighting the advantages and the value of our approach. Comment: NeurIPS 2024. Project Page: https://snap-research.github.io/snap_image/
- Published
- 2024
10. Exploring Large Language Models for Specialist-level Oncology Care
- Author
-
Palepu, Anil, Dhillon, Vikram, Niravath, Polly, Weng, Wei-Hung, Prasad, Preethi, Saab, Khaled, Tanno, Ryutaro, Cheng, Yong, Mai, Hanh, Burns, Ethan, Ajmal, Zainub, Kulkarni, Kavita, Mansfield, Philip, Webster, Dale, Barral, Joelle, Gottweis, Juraj, Schaekermann, Mike, Mahdavi, S. Sara, Natarajan, Vivek, Karthikesalingam, Alan, and Tu, Tao
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Computation and Language
- Abstract
Large language models (LLMs) have shown remarkable progress in encoding clinical knowledge and responding to complex medical queries with appropriate clinical reasoning. However, their applicability in subspecialist or complex medical settings remains underexplored. In this work, we probe the performance of AMIE, a research conversational diagnostic AI system, in the subspecialist domain of breast oncology care without specific fine-tuning to this challenging domain. To perform this evaluation, we curated a set of 50 synthetic breast cancer vignettes representing a range of treatment-naive and treatment-refractory cases and mirroring the key information available to a multidisciplinary tumor board for decision-making (openly released with this work). We developed a detailed clinical rubric for evaluating management plans, including axes such as the quality of case summarization, safety of the proposed care plan, and recommendations for chemotherapy, radiotherapy, surgery and hormonal therapy. To improve performance, we enhanced AMIE with the inference-time ability to perform web search retrieval to gather relevant and up-to-date clinical knowledge and refine its responses with a multi-stage self-critique pipeline. We compare response quality of AMIE with internal medicine trainees, oncology fellows, and general oncology attendings under both automated and specialist clinician evaluations. In our evaluations, AMIE outperformed trainees and fellows demonstrating the potential of the system in this challenging and important domain. We further demonstrate through qualitative examples, how systems such as AMIE might facilitate conversational interactions to assist clinicians in their decision making. However, AMIE's performance was overall inferior to attending oncologists suggesting that further research is needed prior to consideration of prospective uses.
- Published
- 2024
11. A Machine Learning based Hybrid Receiver for 5G NR PRACH
- Author
-
Singh, Rohit, Yerrapragada, Anil Kumar, and Ganti, Radha Krishna
- Subjects
Electrical Engineering and Systems Science - Signal Processing, Computer Science - Artificial Intelligence, Computer Science - Information Theory, Computer Science - Machine Learning
- Abstract
Random Access is a critical procedure using which a User Equipment (UE) identifies itself to a Base Station (BS). Random Access starts with the UE transmitting a random preamble on the Physical Random Access Channel (PRACH). In a conventional BS receiver, the UE's specific preamble is identified by correlation with all the possible preambles. The PRACH signal is also used to estimate the timing advance which is induced by propagation delay. Correlation-based receivers suffer from false peaks and missed detection in scenarios dominated by high fading and low signal-to-noise ratio. This paper describes the design of a hybrid receiver that consists of an AI/ML model for preamble detection followed by conventional peak detection for the Timing Advance estimation. The proposed receiver combines the Power Delay Profiles of correlation windows across multiple antennas and uses the combination as input to a Neural Network model. The model predicts the presence or absence of a user in a particular preamble window, after which the timing advance is estimated by peak detection. Results show superior performance of the hybrid receiver compared to conventional receivers both for simulated and real hardware-captured datasets., Comment: 6 pages, 9 figures
- Published
- 2024
12. Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM
- Author
-
Chang, Haw-Shiuan, Peng, Nanyun, Bansal, Mohit, Ramakrishna, Anil, and Chung, Tagyoung
- Subjects
Computer Science - Computation and Language
- Abstract
Contrastive decoding (CD) (Li et al., 2023) improves the next-token distribution of a large expert language model (LM) using a small amateur LM. Although CD is applied to various LMs and domains to enhance open-ended text generation, it is still unclear why CD often works well, when it could fail, and how we can make it better. To deepen our understanding of CD, we first theoretically prove that CD could be viewed as linearly extrapolating the next-token logits from a huge and hypothetical LM. We also highlight that the linear extrapolation could make CD unable to output the most obvious answers that have already been assigned high probabilities by the amateur LM. To overcome CD's limitation, we propose a new unsupervised decoding method called $\mathbf{A}$symptotic $\mathbf{P}$robability $\mathbf{D}$ecoding (APD). APD explicitly extrapolates the probability curves from the LMs of different sizes to infer the asymptotic probabilities from an infinitely large LM without inducing more inference costs than CD. In FactualityPrompts, an open-ended text generation benchmark, sampling using APD significantly boosts factuality in comparison to the CD sampling and its variants, and achieves state-of-the-art results for Pythia 6.9B and OPT 6.7B. Furthermore, in five commonsense QA datasets, APD is often significantly better than CD and achieves a similar effect of using a larger LLM. For example, the perplexity of APD on top of Pythia 6.9B is even lower than the perplexity of Pythia 12B in CommonsenseQA and LAMBADA., Comment: EMNLP 2024 Oral
- Published
- 2024
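The contrastive decoding idea summarized in entry 12 can be sketched roughly as follows. This is a minimal illustration of the commonly described CD scoring rule (expert log-probability minus amateur log-probability, restricted to tokens the expert finds plausible), not the APD method the paper proposes. The function names, the toy logits, and the `alpha` cutoff value are assumptions chosen for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_scores(expert_logits, amateur_logits, alpha=0.1):
    """Score each token by expert log-prob minus amateur log-prob.

    Tokens whose expert probability is below alpha times the expert's
    top probability are masked out (score of -inf). alpha is a
    hypothetical default, not a value taken from the listed paper.
    """
    expert_lp = log_softmax(expert_logits)
    amateur_lp = log_softmax(amateur_logits)
    cutoff = math.log(alpha) + max(expert_lp)
    return [
        e - a if e >= cutoff else float("-inf")
        for e, a in zip(expert_lp, amateur_lp)
    ]


# Toy vocabulary of three tokens (made-up logits).
scores = contrastive_scores([2.0, 1.9, -5.0], [2.0, 0.0, -5.0])
best = max(range(len(scores)), key=scores.__getitem__)
```

In this toy case the expert slightly prefers token 0, but the amateur also rates token 0 highly, so the contrastive score selects token 1. That is the behaviour the abstract flags as a limitation when the amateur has already assigned high probability to the obvious answer.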
13. Hedging and Pricing Structured Products Featuring Multiple Underlying Assets
- Author
-
Sharma, Anil, Chen, Freeman, Noh, Jaesun, DeJesus, Julio, and Schlener, Mario
- Subjects
Computer Science - Computational Engineering, Finance, and Science, Computer Science - Machine Learning, Quantitative Finance - Pricing of Securities
- Abstract
Hedging a portfolio containing autocallable notes presents unique challenges due to the complex risk profile of these financial instruments. In addition to hedging, pricing these notes, particularly when multiple underlying assets are involved, adds another layer of complexity. Pricing autocallable notes involves intricate considerations of various risk factors, including underlying assets, interest rates, and volatility. Traditional pricing methods, such as sample-based Monte Carlo simulations, are often time-consuming and impractical for long maturities, particularly when there are multiple underlying assets. In this paper, we explore autocallable structured notes with three underlying assets and propose a machine learning-based pricing method that significantly improves efficiency, computing prices 250 times faster than traditional Monte Carlo simulation-based methods. Additionally, we introduce a Distributional Reinforcement Learning (RL) algorithm to hedge a portfolio containing an autocallable structured note. Our distributional RL-based hedging strategy provides better PnL compared to traditional Delta-neutral and Delta-Gamma neutral hedging strategies. The VaR 5% (PnL value) of our RL agent-based hedging is 33.95, significantly outperforming both the Delta-neutral strategy, which has a VaR 5% of -0.04, and the Delta-Gamma neutral strategy, which has a VaR 5% of 13.05. It also provides the hedging action with better left tail PnL, such as 95% and 99% value-at-risk (VaR) and conditional value-at-risk (CVaR), highlighting its potential for front-office hedging and risk management. Comment: Workshop on Simulation of Financial Markets and Economic Systems
- Published
- 2024
14. Incremental IVF Index Maintenance for Streaming Vector Search
- Author
-
Mohoney, Jason, Pacaci, Anil, Chowdhury, Shihabur Rahman, Minhas, Umar Farooq, Pound, Jeffery, Renggli, Cedric, Reyhani, Nima, Ilyas, Ihab F., Rekatsinas, Theodoros, and Venkataraman, Shivaram
- Subjects
Computer Science - Databases, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
The prevalence of vector similarity search in modern machine learning applications and the continuously changing nature of data processed by these applications necessitate efficient and effective index maintenance techniques for vector search indexes. Designed primarily for static workloads, existing vector search indexes degrade in search quality and performance as the underlying data is updated unless costly index reconstruction is performed. To address this, we introduce Ada-IVF, an incremental indexing methodology for Inverted File (IVF) indexes. Ada-IVF consists of 1) an adaptive maintenance policy that decides which index partitions are problematic for performance and should be repartitioned and 2) a local re-clustering mechanism that determines how to repartition them. Compared with state-of-the-art dynamic IVF index maintenance strategies, Ada-IVF achieves an average of 2x and up to 5x higher update throughput across a range of benchmark workloads., Comment: 14 pages, 14 figures
- Published
- 2024
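For readers unfamiliar with the Inverted File (IVF) indexes mentioned in entry 14, the following is a minimal sketch of the basic structure only: vectors are assigned to the partition of their nearest centroid, and a query scans just the `nprobe` closest partitions. It does not implement Ada-IVF's maintenance policy or re-clustering. The class name `TinyIVF`, the fixed centroids, and the sample vectors are all hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinyIVF:
    """Toy IVF index with fixed centroids (illustrative only)."""

    def __init__(self, centroids):
        self.centroids = [tuple(c) for c in centroids]
        self.lists = [[] for _ in self.centroids]  # one posting list per partition

    def _nearest_centroids(self, v, k):
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: dist2(v, self.centroids[i]),
        )
        return order[:k]

    def insert(self, vec_id, v):
        # Streaming insert: append to the partition of the nearest centroid.
        (p,) = self._nearest_centroids(v, 1)
        self.lists[p].append((vec_id, tuple(v)))

    def search(self, q, top_k=1, nprobe=1):
        # Scan only the nprobe partitions whose centroids are closest to q.
        candidates = []
        for p in self._nearest_centroids(q, nprobe):
            candidates.extend(self.lists[p])
        candidates.sort(key=lambda item: dist2(q, item[1]))
        return [vec_id for vec_id, _ in candidates[:top_k]]


index = TinyIVF(centroids=[(0.0, 0.0), (10.0, 10.0)])
index.insert("a", (1.0, 1.0))
index.insert("b", (9.0, 9.0))
index.insert("c", (4.9, 4.9))  # lands in the first partition

query = (5.2, 5.2)
approx = index.search(query, nprobe=1)  # probes only the second partition
exact = index.search(query, nprobe=2)   # probes both partitions
```

With `nprobe=1` the query misses its true nearest neighbour because that vector sits near a partition boundary. This kind of quality loss is what makes partition maintenance matter as data is inserted over time.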
15. Demystifying the use of Compression in Virtual Production
- Author
-
Kokaram, Anil, Vibhoothi, Vibhoothi, Zouein, Julien, Pitié, François, Nash, Christopher, Bentley, James, and Coulam-Jones, Philip
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Emerging Technologies, Computer Science - Human-Computer Interaction, Electrical Engineering and Systems Science - Signal Processing
- Abstract
Virtual Production (VP) technologies have continued to improve the flexibility of on-set filming and enhance the live concert experience. The core technology of VP relies on high-resolution, high-brightness LED panels to playback/render video content. There are a number of technical challenges to effective deployment e.g. image tile synchronisation across the panels, cross panel colour balancing and compensating for colour fluctuations due to changes in camera angles. Given the complexity and potential quality degradation, the industry prefers "pristine" or lossless compressed source material for displays, which requires significant storage and bandwidth. Modern lossy compression standards like AV1 or H.265 could maintain the same quality at significantly lower bitrates and resource demands. There is yet no agreed methodology for assessing the impact of these standards on quality when the VP scene is recorded in-camera. We present a methodology to assess this impact by comparing lossless and lossy compressed footage displayed through VP screens and recorded in-camera. We assess the quality impact of HAP/NotchLC/Daniel2 and AV1/HEVC/H.264 compression bitrates from 2 Mb/s to 2000 Mb/s with various GOP sizes. Several perceptual quality metrics are then used to automatically evaluate in-camera picture quality, referencing the original uncompressed source content through the LED wall. Our results show that we can achieve the same quality with hybrid codecs as with intermediate encoders at orders of magnitude less bitrate and storage requirements., Comment: SMPTE Media Summit Paper on use of Compression in Virtual Production from TCD and Disguise
- Published
- 2024
16. Towards Understanding the Milky Way's Typicality: Assessing the Chemodynamics of M31's Bulge & Bar, Thick & Thin Discs
- Author
-
Gibson, Benjamin J., Zasowski, Gail, Seth, Anil, Gadotti, Dimitri A., Wang, Zixian, Bizyaev, Dmitry, Majewski, Steven R., Holtzmann, Jon, and Sharma, Sanjib
- Subjects
Astrophysics - Astrophysics of Galaxies
- Abstract
We describe a novel framework to model galaxy spectra with two cospatial stellar populations, such as may represent a bulge & bar or thick & thin disc, and apply it to APOGEE spectra in the inner $\sim$2 kpc of M31, as well as to stacked spectra representative of the northern and southern parts of M31's disc ($R\sim4-7$ kpc). We use a custom M31 photometric decomposition and A-LIST spectral templates to derive the radial velocity, velocity dispersion, metallicity, and $\alpha$ abundance for both components in each spectrum. In the bulge, one component exhibits little net rotation, high velocity dispersion ($\sim$170 km s$^{-1}$), near-solar metallicity, and high $\alpha$ abundance ([$\alpha$/M] = 0.28), while the second component shows structured rotation, lower velocity dispersion ($\sim$121 km s$^{-1}$), and slightly higher abundances ([M/H] = 0.09, [$\alpha$/M] = 0.3). We tentatively associate the first component with the classical bulge and the second with the bar. In the north disc we identify two distinct components: the first with hotter kinematics, lower metallicity, and higher $\alpha$ abundance than the second ([M/H] = 0.1 and 0.39, [$\alpha$/M] = 0.29 and 0.07). These discs appear comparable to the Milky Way's ''thick'' and ''thin'' discs, providing the first evidence that M31's inner disc has a similar chemodynamical structure. We do not identify two distinct components in the south, potentially due to effects from recent interactions. Such multi-population analysis is crucial to constrain galaxy evolution models that strive to recreate the complex stellar populations found in the Milky Way., Comment: 16 pages, 9 figures
- Published
- 2024
17. Exploring Topological Transitivity in Families of Functions
- Author
-
Singh, Anil and Lal, Banarsi
- Subjects
Mathematics - Complex Variables
- Abstract
We have established various criteria for the topological transitivity of families of continuous (holomorphic) functions. Furthermore, by leveraging the properties of expanding families of meromorphic functions, we offer an alternative proof of Montel's three point theorem.
- Published
- 2024
18. Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
- Author
-
Bu, Zhiqi, Jin, Xiaomeng, Vinzamuri, Bhanukiran, Ramakrishna, Anil, Chang, Kai-Wei, Cevher, Volkan, and Hong, Mingyi
- Subjects
Computer Science - Machine Learning, Computer Science - Computation and Language
- Abstract
Machine unlearning has been used to remove unwanted knowledge acquired by large language models (LLMs). In this paper, we examine machine unlearning from an optimization perspective, framing it as a regularized multi-task optimization problem, where one task optimizes a forgetting objective and another optimizes the model performance. In particular, we introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives, while integrating a new, automatic learning rate scheduler. We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets while exhibiting stable training.
- Published
- 2024
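One plausible reading of the "normalized gradient difference" named in entry 18 is sketched below: the retain-task and forget-task gradients are each scaled to unit norm before being subtracted, so neither objective dominates the update merely because its gradient is larger. The abstract does not give the exact update rule or the learning rate scheduler, so the functions here (`normalize`, `ngdiff_direction`, `step`), the fixed learning rate, and the toy gradients are assumptions.

```python
import math


def normalize(grad, eps=1e-12):
    """Scale a gradient vector to (approximately) unit Euclidean norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / (norm + eps) for g in grad]


def ngdiff_direction(grad_retain, grad_forget):
    """Unit retain gradient minus unit forget gradient (assumed form)."""
    r = normalize(grad_retain)
    f = normalize(grad_forget)
    return [a - b for a, b in zip(r, f)]


def step(params, grad_retain, grad_forget, lr):
    """One descent step along the assumed direction.

    Moving against this direction lowers the retain loss and raises the
    forget loss to first order. lr is a fixed placeholder, not the
    paper's automatic scheduler.
    """
    direction = ngdiff_direction(grad_retain, grad_forget)
    return [p - lr * d for p, d in zip(params, direction)]


# Toy gradients with very different magnitudes (made-up numbers).
new_params = step([0.0, 0.0], grad_retain=[3.0, 4.0], grad_forget=[0.0, 100.0], lr=1.0)
```

Here the forget gradient is twenty times larger than the retain gradient, yet after normalization both contribute a unit-length vector to the update.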
19. Sabotage Evaluations for Frontier Models
- Author
-
Benton, Joe, Wagner, Misha, Christiansen, Eric, Anil, Cem, Perez, Ethan, Srivastav, Jai, Durmus, Esin, Ganguli, Deep, Kravec, Shauna, Shlegeris, Buck, Kaplan, Jared, Karnofsky, Holden, Hubinger, Evan, Grosse, Roger, Bowman, Samuel R., and Duvenaud, David
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
- Abstract
Sufficiently capable models could subvert human oversight and decision-making in important contexts. For example, in the context of AI development, models could covertly sabotage efforts to evaluate their own dangerous capabilities, to monitor their behavior, or to make decisions about their deployment. We refer to this family of abilities as sabotage capabilities. We develop a set of related threat models and evaluations. These evaluations are designed to provide evidence that a given model, operating under a given set of mitigations, could not successfully sabotage a frontier model developer or other large organization's activities in any of these ways. We demonstrate these evaluations on Anthropic's Claude 3 Opus and Claude 3.5 Sonnet models. Our results suggest that for these models, minimal mitigations are currently sufficient to address sabotage risks, but that more realistic evaluations and stronger mitigations seem likely to be necessary soon as capabilities improve. We also survey related evaluations we tried and abandoned. Finally, we discuss the advantages of mitigation-aware capability evaluations, and of simulating large-scale deployments using small-scale statistics.
- Published
- 2024
20. On the Application of Deep Learning for Precise Indoor Positioning in 6G
- Author
-
Kotturi, Sai Prasanth, Yerrapragada, Anil Kumar, Prasad, Sai, and Ganti, Radha Krishna
- Subjects
Electrical Engineering and Systems Science - Signal Processing, Computer Science - Machine Learning
- Abstract
Accurate localization in indoor environments is a challenge due to the Non Line of Sight (NLoS) nature of the signaling. In this paper, we explore the use of AI/ML techniques for positioning accuracy enhancement in Indoor Factory (InF) scenarios. The proposed neural network, which we term LocNet, is trained on measurements such as Channel Impulse Response (CIR) and Reference Signal Received Power (RSRP) from multiple Transmit Receive Points (TRPs). Simulation results show that when using measurements from 18 TRPs, LocNet achieves a 9 cm positioning accuracy at the 90th percentile. Additionally, we demonstrate that the same model generalizes effectively even when measurements from some TRPs randomly become unavailable. Lastly, we provide insights on the robustness of the trained model to the errors in ground truth labels used for training., Comment: 6 Pages, 6 Figures
- Published
- 2024
21. A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
- Author
-
Rawat, Ankit Singh, Sadhanala, Veeranjaneyulu, Rostamizadeh, Afshin, Chakrabarti, Ayan, Jitkrittum, Wittawat, Feinberg, Vladimir, Kim, Seungyeon, Harutyunyan, Hrayr, Saunshi, Nikunj, Nado, Zachary, Shivanna, Rakesh, Reddi, Sashank J., Menon, Aditya Krishna, Anil, Rohan, and Kumar, Sanjiv
- Subjects
Computer Science - Machine Learning, Computer Science - Computation and Language
- Abstract
A primary challenge in large language model (LLM) development is their onerous pre-training cost. Typically, such pre-training involves optimizing a self-supervised objective (such as next-token prediction) over a large corpus. This paper explores a promising paradigm to improve LLM pre-training efficiency and quality by suitably leveraging a small language model (SLM). In particular, this paradigm relies on an SLM to both (1) provide soft labels as additional training supervision, and (2) select a small subset of valuable ("informative" and "hard") training examples. Put together, this enables an effective transfer of the SLM's predictive distribution to the LLM, while prioritizing specific regions of the training data distribution. Empirically, this leads to reduced LLM training time compared to standard training, while improving the overall quality. Theoretically, we develop a statistical framework to systematically study the utility of SLMs in enabling efficient training of high-quality LLMs. In particular, our framework characterizes how the SLM's seemingly low-quality supervision can enhance the training of a much more capable LLM. Furthermore, it also highlights the need for an adaptive utilization of such supervision, by striking a balance between the bias and variance introduced by the SLM-provided soft labels. We corroborate our theoretical framework by improving the pre-training of an LLM with 2.8B parameters by utilizing a smaller LM with 1.5B parameters on the Pile dataset.
- Published
- 2024
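The soft-label supervision described in entry 21 can be illustrated with a standard distillation-style loss, shown here as a sketch under stated assumptions rather than the paper's actual objective. The loss mixes cross-entropy against the true token with cross-entropy against the small model's predictive distribution. The function names, the mixing `weight`, and the toy logits are hypothetical; the abstract does not specify how the two terms are combined.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy H(target, predicted), skipping zero-mass targets."""
    return -sum(
        t * math.log(p)
        for t, p in zip(target_probs, predicted_probs)
        if t > 0
    )


def mixed_loss(llm_logits, slm_logits, true_index, weight=0.5):
    """(1 - weight) * hard-label loss + weight * soft-label loss.

    weight is an assumed mixing coefficient controlling how much the
    large model trusts the small model's soft labels.
    """
    predicted = softmax(llm_logits)
    one_hot = [1.0 if i == true_index else 0.0 for i in range(len(predicted))]
    soft_labels = softmax(slm_logits)
    hard_term = cross_entropy(one_hot, predicted)
    soft_term = cross_entropy(soft_labels, predicted)
    return (1 - weight) * hard_term + weight * soft_term


# Toy three-token example (made-up logits).
hard_only = mixed_loss([2.0, 0.5, 0.1], [1.5, 1.0, 0.2], true_index=0, weight=0.0)
blended = mixed_loss([2.0, 0.5, 0.1], [1.5, 1.0, 0.2], true_index=0, weight=0.5)
```

Setting `weight` to 0 recovers ordinary next-token training. Larger values lean more on the small model's supervision, which is where the bias-variance balance mentioned in the abstract comes into play.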
22. Computation of symmetries of rational surfaces
- Author
-
Alcázar, Juan Gerardo, Hermoso, Carlos, Çoban, Hüsnü Anıl, and Gözütok, Uğur
- Subjects
Computer Science - Computational Geometry, Mathematics - Algebraic Geometry
- Abstract
In this paper we provide, first, a general symbolic algorithm for computing the symmetries of a given rational surface, based on the classical differential invariants of surfaces, i.e. Gauss curvature and mean curvature. In practice, the algorithm works well for sparse parametrizations (e.g. toric surfaces) and PN surfaces. Additionally, we provide a specific, also symbolic algorithm for computing the symmetries of ruled surfaces; this algorithm works extremely well in practice, since the problem is reduced to that of rational space curves, which can be efficiently solved by using existing methods. The algorithm for ruled surfaces is based on the fact, proven in the paper, that every symmetry of a rational surface must also be a symmetry of its line of striction, which is a rational space curve. The algorithms have been implemented in the computer algebra system Maple, and the implementations have been made public; evidence of their performance is given in the paper.
- Published
- 2024
23. Predicting total time to compress a video corpus using online inference systems
- Author
-
Shu, Xin, Vibhoothi, Vibhoothi, and Kokaram, Anil
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Predicting the computational cost of compressing/transcoding clips in a video corpus is important for resource management of cloud services and VOD (Video On Demand) providers. Currently, customers of cloud video services are unaware of the cost of transcoding their files until the task is completed. Previous work concentrated on predicting per-clip compression time, and thus estimating the cost of video compression. In this work, we propose new Machine Learning (ML) systems which predict cost for the entire corpus instead. This is a more appropriate goal since users are not interested in per-clip cost but instead the cost for the whole corpus. We evaluate our systems with respect to two video codecs (x264, x265) and a novel high-quality video corpus. We find that the accuracy of aggregate time prediction for a video corpus is more than two times better than using per-clip predictions. Furthermore, we present an online inference framework in which we update the ML models as files are processed. A consideration of video compute overhead and appropriate choice of ML predictor for each fraction of corpus completed yields a prediction error of less than 5%. This is approximately two times better than previous work which proposed generalised predictors. Comment: Accepted by IEEE International Conference on Visual Communications and Image Processing (VCIP) 2024
- Published
- 2024
24. Scalable Ranked Preference Optimization for Text-to-Image Generation
- Author
-
Karthik, Shyamgopal, Coskun, Huseyin, Akata, Zeynep, Tulyakov, Sergey, Ren, Jian, and Kag, Anil
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Direct Preference Optimization (DPO) has emerged as a powerful approach to align text-to-image (T2I) models with human feedback. Unfortunately, successful application of DPO to T2I models requires a huge amount of resources to collect and label large-scale datasets, e.g., millions of generated paired images annotated with human preferences. In addition, these human preference datasets can get outdated quickly as the rapid improvements of T2I models lead to higher quality images. In this work, we investigate a scalable approach for collecting large-scale and fully synthetic datasets for DPO training. Specifically, the preferences for paired images are generated using a pre-trained reward function, eliminating the need for involving humans in the annotation process, greatly improving the dataset collection efficiency. Moreover, we demonstrate that such datasets allow averaging predictions across multiple models and collecting ranked preferences as opposed to pairwise preferences. Furthermore, we introduce RankDPO to enhance DPO-based methods using the ranking feedback. Applying RankDPO on SDXL and SD3-Medium models with our synthetically generated preference dataset "Syn-Pic" improves both prompt-following (on benchmarks like T2I-Compbench, GenEval, and DPG-Bench) and visual quality (through user studies). This pipeline presents a practical and scalable solution to develop better preference datasets to enhance the performance of text-to-image models. Comment: Project Page: https://snap-research.github.io/RankDPO/
- Published
- 2024
25. Facile One Pot Synthesis of Hybrid Core-Shell Silica-Based Sensors for Live Imaging of Dissolved Oxygen and Hypoxia Mapping in 3D cell models
- Author
-
Iuele, Helena, Forciniti, Stefania, Onesto, Valentina, Colella, Francesco, Siciliano, Anna Chiara, Chandra, Anil, Nobile, Concetta, Gigli, Giuseppe, and del Mercato, Loretta L.
- Subjects
Physics - Medical Physics, Physics - Chemical Physics
- Abstract
Fluorescence imaging allows for non-invasively visualizing and measuring key physiological parameters like pH and dissolved oxygen. In our work, we created two ratiometric fluorescent microsensors designed for accurately tracking dissolved oxygen levels in 3D cell cultures. We developed a simple and cost-effective method to produce hybrid core-shell silica microparticles that are biocompatible and versatile. These sensors incorporate oxygen-sensitive probes (Ru(dpp) or PtOEP) and reference dyes (RBITC or A647 NHS-Ester). SEM analysis confirmed efficient loading and distribution of the sensing dye on the outer shell. Fluorimetric and CLSM tests demonstrated the sensors' reversibility and high sensitivity to oxygen, even when integrated into 3D scaffolds. Aging and bleaching experiments validated the stability of our hybrid core-shell silica microsensors for 3D monitoring. The Ru(dpp)-RBITC microparticles showed the most promising performance, especially in a pancreatic cancer model using alginate microgels. By employing computational segmentation, we generated 3D oxygen maps during live cell imaging, revealing oxygen gradients in the extracellular matrix and indicating a significant decrease in oxygen levels characteristic of solid tumors. Notably, after 12 hours, the oxygen concentration dropped to a hypoxic level of PO2 2.7 ± 0.1%.
- Published
- 2024
- Full Text
- View/download PDF
26. A generalization of Lipschitz mappings
- Author
-
Karn, Anil Kumar and Mandal, Arindam
- Subjects
Mathematics - Functional Analysis, Primary 46B20, Secondary 46B28
- Abstract
Using the notion of modulus of continuity at a point of a mapping between metric spaces, we introduce the notion of extensively bounded mappings, generalizing that of Lipschitz mappings. We also introduce a metric on the set of such mappings, which becomes a norm if the codomain is a normed linear space, and study its basic properties. We also discuss a dilation of an extensively bounded mapping into a Lipschitz mapping as well as into a bounded linear mapping. Comment: 21
- Published
- 2024
27. Test smells in LLM-Generated Unit Tests
- Author
-
Ouédraogo, Wendkûuni C., Li, Yinghua, Kaboré, Kader, Tang, Xunzhu, Koyuncu, Anil, Klein, Jacques, Lo, David, and Bissyandé, Tegawendé F.
- Subjects
Computer Science - Software Engineering
- Abstract
The use of Large Language Models (LLMs) in automated test generation is gaining popularity, with much of the research focusing on metrics like compilability rate, code coverage and bug detection. However, an equally important quality metric is the presence of test smells: design flaws or anti-patterns in test code that hinder maintainability and readability. In this study, we explore the diffusion of test smells in LLM-generated unit test suites and compare them to those found in human-written ones. We analyze a benchmark of 20,500 LLM-generated test suites produced by four models (GPT-3.5, GPT-4, Mistral 7B, and Mixtral 8x7B) across five prompt engineering techniques, alongside a dataset of 780,144 human-written test suites from 34,637 projects. Leveraging TsDetect, a state-of-the-art tool capable of detecting 21 different types of test smells, we identify and analyze the prevalence and co-occurrence of various test smells in both human-written and LLM-generated test suites. Our findings reveal new insights into the strengths and limitations of LLMs in test generation. First, regarding prevalence, we observe that LLMs frequently generate tests with common test smells, such as Magic Number Test and Assertion Roulette. Second, in terms of co-occurrence, certain smells, like Long Test and Useless Test, tend to co-occur in LLM-generated suites, influenced by specific prompt techniques. Third, we find that project complexity and LLM-specific factors, including model size and context length, significantly affect the prevalence of test smells. Finally, the patterns of test smells in LLM-generated tests often mirror those in human-written tests, suggesting potential data leakage from training datasets. These insights underscore the need to refine LLM-based test generation for cleaner code and suggest improvements in both LLM capabilities and software testing practices.
- Published
- 2024
28. Advancing Experimental Platforms for UAV Communications: Insights from AERPAW'S Digital Twin
- Author
-
Moore, Joshua, Abdalla, Aly Sabri, Ueltschey, Charles, Gürses, Anıl, Özdemir, Özgür, Sichitiu, Mihail L., Güvenç, İsmail, and Marojevic, Vuk
- Subjects
Electrical Engineering and Systems Science - Systems and Control, Computer Science - Hardware Architecture, Computer Science - Networking and Internet Architecture
- Abstract
The rapid evolution of 5G and beyond has advanced space-air-terrestrial networks, with unmanned aerial vehicles (UAVs) offering enhanced coverage, flexible configurations, and cost efficiency. However, deploying UAV-based systems presents challenges including varying propagation conditions and hardware limitations. While simulators and theoretical models have been developed, real-world experimentation is critically important to validate the research. Digital twins, virtual replicas of physical systems, enable emulation that bridges theory and practice. This paper presents our experimental results from AERPAW's digital twin, showcasing its ability to simulate UAV communication scenarios and providing insights into system performance and reliability. Comment: This article has been accepted for publication in the IEEE VTC Fall 2024--UAV Communication and Experimentation Workshop
- Published
- 2024
29. PINNing Cerebral Blood Flow: Analysis of Perfusion MRI in Infants using Physics-Informed Neural Networks
- Author
-
Galazis, Christoforos, Chiu, Ching-En, Arichi, Tomoki, Bharath, Anil A., and Varela, Marta
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Arterial spin labeling (ASL) magnetic resonance imaging (MRI) enables cerebral perfusion measurement, which is crucial in detecting and managing neurological issues in infants born prematurely or after perinatal complications. However, cerebral blood flow (CBF) estimation in infants using ASL remains challenging due to the complex interplay of network physiology, involving dynamic interactions between cardiac output and cerebral perfusion, as well as issues with parameter uncertainty and data noise. We propose a new spatial uncertainty-based physics-informed neural network (PINN), SUPINN, to estimate CBF and other parameters from infant ASL data. SUPINN employs a multi-branch architecture to concurrently estimate regional and global model parameters across multiple voxels. It computes regional spatial uncertainties to weigh the signal. SUPINN can reliably estimate CBF (relative error $-0.3 \pm 71.7$), bolus arrival time (AT) ($30.5 \pm 257.8$), and blood longitudinal relaxation time ($T_{1b}$) ($-4.4 \pm 28.9$), surpassing parameter estimates performed using least squares or standard PINNs. Furthermore, SUPINN produces physiologically plausible spatially smooth CBF and AT maps. Our study demonstrates the successful modification of PINNs for accurate multi-parameter perfusion estimation from noisy and limited ASL data in infants. Frameworks like SUPINN have the potential to advance our understanding of the complex cardio-brain network physiology, aiding in the detection and management of diseases. Source code is provided at: https://github.com/cgalaz01/supinn.
- Published
- 2024
30. Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI
- Author
-
Yagiz, Muhammet Anil, MohajerAnsari, Pedram, Pese, Mert D., and Goktas, Polat
- Subjects
Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
- Abstract
In the evolving landscape of autonomous vehicles, ensuring robust in-vehicle network (IVN) security is paramount. This paper introduces an advanced intrusion detection system (IDS) called KD-XVAE that uses a Variational Autoencoder (VAE)-based knowledge distillation approach to enhance both performance and efficiency. Our model significantly reduces complexity, operating with just 1669 parameters and achieving an inference time of 0.3 ms per batch, making it highly suitable for resource-constrained automotive environments. Evaluations on the HCRL Car-Hacking dataset demonstrate exceptional capabilities, attaining perfect scores (Recall, Precision, F1 Score of 100%, and FNR of 0%) under multiple attack types, including DoS, Fuzzing, Gear Spoofing, and RPM Spoofing. Comparative analysis on the CICIoV2024 dataset further underscores its superiority over traditional machine learning models, achieving perfect detection metrics. We furthermore integrate Explainable AI (XAI) techniques to ensure transparency in the model's decisions. The VAE compresses the original feature space into a latent space, on which the distilled model is trained. SHAP (SHapley Additive exPlanations) values provide insights into the importance of each latent dimension, mapped back to original features for intuitive understanding. Our paper advances the field by integrating state-of-the-art techniques, addressing critical challenges in the deployment of efficient, trustworthy, and reliable IDSes for autonomous vehicles, ensuring enhanced protection against emerging cyber threats.
- Published
- 2024
31. Order unit property and orthogonality
- Author
-
Karn, Anil Kumar
- Subjects
Mathematics - Functional Analysis, Primary: 46B40, Secondary: 46B20
- Abstract
We characterize the order unit property of a positive element in an order unit space in terms of orthogonality. Comment: 11 pages
- Published
- 2024
32. Finite-Time Trajectory Tracking of a Four wheeled Mecanum Mobile Robot
- Author
-
B, Anil, Pandey, Mayank, and Gajbhiye, Sneha
- Subjects
Electrical Engineering and Systems Science - Systems and Control
- Abstract
A Four-Wheeled Mecanum Robot (FWMR) possesses the capability to move in any direction on a plane, making it a cornerstone system in modern industrial operations. Despite the extreme maneuverability offered by the FWMR, the practical implementation or real-time simulation of Mecanum wheel robots encounters substantial challenges in trajectory tracking control. In this research work, we present a finite-time control law using the backstepping technique to achieve stabilization and trajectory tracking objectives for an FWMR system. A rigorous stability proof is presented and an explicit computation of the finite time is provided. For the tracking objective, we demonstrate the results on an S-shaped trajectory geared towards collision avoidance applications. Real-time simulation validation using Gazebo-ROS on a Mecanum robot model is carried out, and it agrees with the theoretical results.
- Published
- 2024
33. On the Minimal Theory of Consciousness Implicit in Active Inference
- Author
-
Whyte, Christopher J., Corcoran, Andrew W., Robinson, Jonathan, Smith, Ryan, Moran, Rosalyn J., Parr, Thomas, Friston, Karl J., Seth, Anil K., and Hohwy, Jakob
- Subjects
Quantitative Biology - Neurons and Cognition
- Abstract
The multifaceted nature of experience poses a challenge to the study of consciousness. Traditional neuroscientific approaches often concentrate on isolated facets, such as perceptual awareness or the global state of consciousness, and construct a theory around the relevant empirical paradigms and findings. Theories of consciousness are, therefore, often difficult to compare; indeed, there might be little overlap in the phenomena such theories aim to explain. Here, we take a different approach: starting with active inference, a first-principles framework for modelling behaviour as (approximate) Bayesian inference, and building up to a minimal theory of consciousness, which emerges from the shared features of computational models derived under active inference. We review a body of work applying active inference models to the study of consciousness and argue that there is implicit in all these models a small set of theoretical commitments that point to a minimal (and testable) theory of consciousness.
- Published
- 2024
34. Adaptive Mesh Refinement and Error Estimation Method for Optimal Control Using Direct Collocation
- Author
-
Haman III, George V. and Rao, Anil V.
- Subjects
Mathematics - Optimization and Control
- Abstract
An adaptive mesh refinement and error estimation method for numerically solving optimal control problems is developed using Legendre-Gauss-Radau direct collocation. In regions of the solution where the desired accuracy tolerance has not been met, the mesh is refined by either increasing the degree of the approximating polynomial in a mesh interval or dividing a mesh interval into subintervals. In regions of the solution where the desired accuracy tolerance has been met, the mesh size may be reduced by either merging adjacent mesh intervals or decreasing the degree of the approximating polynomial in a mesh interval. Coupled with the mesh refinement method described in this paper is a newly developed relative error estimate that is based on the differences between solutions obtained from the collocation method and those obtained by solving initial-value and terminal-value problems in each mesh interval using an interpolated control obtained from the collocation method. Because the error estimate is based on explicit simulation, the solution obtained via collocation is in close agreement with the solution obtained via explicit simulation using the control on the final mesh, which ensures that the control is an accurate approximation of the true optimal control. The method is demonstrated on three examples from the open literature, and the results obtained show an improvement in final mesh size when compared against previously developed mesh refinement methods. Comment: 32 pages; 9 figures; 4 tables
- Published
- 2024
35. Persistent flat band splitting and strong selective band renormalization in a kagome magnet thin film
- Author
-
Ren, Zheng, Huang, Jianwei, Tan, Hengxin, Biswas, Ananya, Pulkkinen, Aki, Zhang, Yichen, Xie, Yaofeng, Yue, Ziqin, Chen, Lei, Xie, Fang, Allen, Kevin, Wu, Han, Ren, Qirui, Rajapitamahuni, Anil, Kundu, Asish, Vescovo, Elio, Kono, Junichiro, Morosan, Emilia, Dai, Pengcheng, Zhu, Jian-Xin, Si, Qimiao, Minár, Ján, Yan, Binghai, and Yi, Ming
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
Magnetic kagome materials provide a fascinating playground for exploring the interplay of magnetism, correlation and topology. Many magnetic kagome systems have been reported including the binary Fe$_m$X$_n$ (X=Sn, Ge; m:n = 3:1, 3:2, 1:1) family and the rare earth RMn$_6$Sn$_6$ (R = rare earth) family, where their kagome flat bands are calculated to be near the Fermi level in the paramagnetic phase. While partially filling a kagome flat band is predicted to give rise to a Stoner-type ferromagnetism, experimental visualization of the magnetic splitting across the ordering temperature has not been reported for any of these systems due to the high ordering temperatures, hence leaving the nature of magnetism in kagome magnets an open question. Here, we probe the electronic structure with angle-resolved photoemission spectroscopy in a kagome magnet thin film FeSn synthesized using molecular beam epitaxy. We identify the exchange-split kagome flat bands, whose splitting persists above the magnetic ordering temperature, indicative of a local moment picture. Such local moments in the presence of the topological flat band are consistent with the compact molecular orbitals predicted in theory. We further observe a large spin-orbital selective band renormalization in the Fe $d_{xy}+d_{x^2-y^2}$ spin majority channel reminiscent of the orbital selective correlation effects in the iron-based superconductors. Our discovery of the coexistence of local moments with topological flat bands in a kagome system echoes similar findings in magic-angle twisted bilayer graphene, and provides a basis for theoretical effort towards modeling correlation effects in magnetic flat band systems.
- Published
- 2024
- Full Text
- View/download PDF
36. An Optimized H5 Hysteresis Current Control with Clamped Diodes in Transformer-less Grid-PV Inverter
- Author
-
Phuyal, Sushil, Shrestha, Shashwot, Sharma, Swodesh, Subedi, Rachana, Panjiyar, Anil Kumar, and Gautam, Mukesh
- Subjects
Electrical Engineering and Systems Science - Systems and Control
- Abstract
With the rise of renewable energy penetration in the grid, photovoltaic (PV) panels are connected to the grid via inverters to supply solar energy. Transformer-less grid-tied PV inverters are gaining popularity because of their improved efficiency, reduced size, and lower costs. However, they can induce a path for leakage currents between the PV and the grid part due to the absence of galvanic isolation between them. This leads to serious electromagnetic interference, loss in efficiency and safety concerns. The leakage current is primarily influenced by the nature of the common mode voltage (CMV), which is determined by the switching techniques of the inverter. In this paper, a novel inverter topology of Hysteresis Controlled H5 with Two Clamping Diodes (HCH5-D2) has been derived. The HCH5-D2 topology helps to decouple the AC part (Grid) and DC part (PV) during the freewheeling period to make the CMV constant and, in turn, reduces the leakage current. Also, the additional diodes help to reduce the voltage spikes generated during the freewheeling period and maintain the CMV at a constant value. Finally, a MATLAB simulation of a 2.2 kW grid-connected single-phase HCH5-D2 PV inverter system has been presented, with better results when compared with a traditional H4 inverter.
- Published
- 2024
37. Noncrossing Longest Paths and Cycles
- Author
-
Aloupis, Greg, Biniaz, Ahmad, Bose, Prosenjit, De Carufel, Jean-Lou, Eppstein, David, Maheshwari, Anil, Odak, Saeed, Smid, Michiel, Tóth, Csaba D., and Valtr, Pavel
- Subjects
Computer Science - Computational Geometry
- Abstract
Edge crossings in geometric graphs are sometimes undesirable as they could lead to unwanted situations such as collisions in motion planning and inconsistency in VLSI layout. Short geometric structures such as shortest perfect matchings, shortest spanning trees, shortest spanning paths, and shortest spanning cycles on a given point set are inherently noncrossing. However, the longest such structures need not be noncrossing. In fact, it is intuitive to expect many edge crossings in various geometric graphs that are longest. Recently, Álvarez-Rebollar, Cravioto-Lagos, Marín, Solé-Pi, and Urrutia (Graphs and Combinatorics, 2024) constructed a set of points for which the longest perfect matching is noncrossing. They raised several challenging questions in this direction. In particular, they asked whether the longest spanning path, on any finite set of points in the plane, must have a pair of crossing edges. They also conjectured that the longest spanning cycle must have a pair of crossing edges. In this paper, we give a negative answer to the question and also refute the conjecture. We present a framework for constructing arbitrarily large point sets for which the longest perfect matchings, the longest spanning paths, and the longest spanning cycles are noncrossing. Comment: 19 pages, 8 figures, GD 2024
- Published
- 2024
38. Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification
- Author
-
Meng, Tao, Mehrabi, Ninareh, Goyal, Palash, Ramakrishna, Anil, Galstyan, Aram, Zemel, Richard, Chang, Kai-Wei, Gupta, Rahul, and Peris, Charith
- Subjects
Computer Science - Computation and Language
- Abstract
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control. Given a training corpus and control criteria formulated as a sequence-level constraint on model outputs, our method fine-tunes the LLM on the training corpus while enhancing constraint satisfaction with minimal impact on its utility and generation quality. Specifically, our approach regularizes the LLM training by penalizing the KL divergence between the desired output distribution, which satisfies the constraints, and the LLM's posterior. This regularization term can be approximated by an auxiliary model trained to decompose the sequence-level constraints into token-level guidance, allowing the term to be measured by a closed-form formulation. To further improve efficiency, we design a parallel scheme for concurrently updating both the LLM and the auxiliary model. We evaluate the empirical performance of our approach by controlling the toxicity when training an LLM. We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task. Comment: Accepted to EMNLP Findings
- Published
- 2024
39. DAAL: Density-Aware Adaptive Line Margin Loss for Multi-Modal Deep Metric Learning
- Author
-
Gebrerufael, Hadush Hailu, Tiwari, Anil Kumar, Neupane, Gaurav, and Hailu, Goitom Ybrah
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
Multi-modal deep metric learning is crucial for effectively capturing diverse representations in tasks such as face verification, fine-grained object recognition, and product search. Traditional approaches to metric learning, whether based on distance or margin metrics, primarily emphasize class separation, often overlooking the intra-class distribution essential for multi-modal feature learning. In this context, we propose a novel loss function called Density-Aware Adaptive Line Margin Loss (DAAL), which preserves the density distribution of embeddings while encouraging the formation of adaptive sub-clusters within each class. By employing an adaptive line strategy, DAAL not only enhances intra-class variance but also ensures robust inter-class separation, facilitating effective multi-modal representation. Comprehensive experiments on benchmark fine-grained datasets demonstrate the superior performance of DAAL, underscoring its potential in advancing retrieval applications and multi-modal deep metric learning. Comment: 13 pages, 4 figures, 2 tables
- Published
- 2024
40. Integrating Reasoning Systems for Trustworthy AI, Proceedings of the 4th Workshop on Logic and Practice of Programming (LPOP)
- Author
-
Nerode, Anil and Liu, Yanhong A.
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Logic in Computer Science, Computer Science - Programming Languages
- Abstract
This proceedings contains abstracts and position papers for the work to be presented at the fourth Logic and Practice of Programming (LPOP) Workshop. The workshop is to be held in Dallas, Texas, USA, and as a hybrid event, on October 13, 2024, in conjunction with the 40th International Conference on Logic Programming (ICLP). The focus of this workshop is integrating reasoning systems for trustworthy AI, especially including integrating diverse models of programming with rules and constraints.
- Published
- 2024
41. Towards Democratization of Subspeciality Medical Expertise
- Author
-
O'Sullivan, Jack W., Palepu, Anil, Saab, Khaled, Weng, Wei-Hung, Cheng, Yong, Chu, Emily, Desai, Yaanik, Elezaby, Aly, Kim, Daniel Seung, Lan, Roy, Tang, Wilson, Tapaskar, Natalie, Parikh, Victoria, Jain, Sneha S., Kulkarni, Kavita, Mansfield, Philip, Webster, Dale, Gottweis, Juraj, Barral, Joelle, Schaekermann, Mike, Tanno, Ryutaro, Mahdavi, S. Sara, Natarajan, Vivek, Karthikesalingam, Alan, Ashley, Euan, and Tu, Tao
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
- Abstract
The scarcity of subspecialist medical expertise, particularly in rare, complex and life-threatening diseases, poses a significant challenge for healthcare delivery. This issue is particularly acute in cardiology where timely, accurate management determines outcomes. We explored the potential of AMIE (Articulate Medical Intelligence Explorer), a large language model (LLM)-based experimental AI system optimized for diagnostic dialogue, to augment and support clinical decision-making in this challenging context. We curated a real-world dataset of 204 complex cases from a subspecialist cardiology practice, including results for electrocardiograms, echocardiograms, cardiac MRI, genetic tests, and cardiopulmonary stress tests. We developed a ten-domain evaluation rubric used by subspecialists to evaluate the quality of diagnosis and clinical management plans produced by general cardiologists or AMIE, the latter enhanced with web-search and self-critique capabilities. AMIE was rated superior to general cardiologists for 5 of the 10 domains (with preference ranging from 9% to 20%), and equivalent for the rest. Access to AMIE's response improved cardiologists' overall response quality in 63.7% of cases while lowering quality in just 3.4%. Cardiologists' responses with access to AMIE were superior to cardiologists' responses without access to AMIE for all 10 domains. Qualitative examinations suggest that AMIE and general cardiologists could complement each other, with AMIE being thorough and sensitive and general cardiologists concise and specific. Overall, our results suggest that specialized medical LLMs have the potential to augment general cardiologists' capabilities by bridging gaps in subspecialty expertise, though further research and validation are essential for wide clinical utility.
- Published
- 2024
42. A Mathematical Perspective on Neurophenomenology
- Author
-
Da Costa, Lancelot, Sandved-Smith, Lars, Friston, Karl, Ramstead, Maxwell J. D., and Seth, Anil K.
- Subjects
Quantitative Biology - Neurons and Cognition
- Abstract
In the context of consciousness studies, a key challenge is how to rigorously conceptualise first-person phenomenological descriptions of lived experience and their relation to third-person empirical measurements of the activity or dynamics of the brain and body. Since the 1990s, there has been a coordinated effort to explicitly combine first-person phenomenological methods, generating qualitative data, with neuroscientific techniques used to describe and quantify brain activity under the banner of "neurophenomenology". Here, we take on this challenge and develop an approach to neurophenomenology from a mathematical perspective. We harness recent advances in theoretical neuroscience and the physics of cognitive systems to mathematically conceptualise first-person experience and its correspondence with neural and behavioural dynamics. Throughout, we make the operating assumption that the content of first-person experience can be formalised as (or related to) a belief (i.e. a probability distribution) that encodes an organism's best guesses about the state of its external and internal world (e.g. body or brain) as well as its uncertainty. We mathematically characterise phenomenology, bringing to light a tool-set to quantify individual phenomenological differences and develop several hypotheses including on the metabolic cost of phenomenology and on the subjective experience of time. We conceptualise the form of the generative passages between first- and third-person descriptions, and the mathematical apparatus that mutually constrains them, as well as future research directions. In summary, we formalise and characterise first-person subjective experience and its correspondence with third-person empirical measurements of brain and body, offering hypotheses for quantifying various aspects of phenomenology to be tested in future work. Comment: 15 pages, 4 figures
- Published
- 2024
43. Bayesian Inference of dense matter equation of state of neutron star with antikaon condensation
- Author
-
Parmar, Vishal, Thapa, Vivek Baruah, Kumar, Anil, Bandyopadhyay, Debades, and Sinha, Monika
- Subjects
Astrophysics - High Energy Astrophysical Phenomena, High Energy Physics - Phenomenology
- Abstract
In this paper, we employ the Density Dependent Relativistic Hadron (DDRH) field theoretical Model in a Bayesian analysis to investigate the equation of state (EOS) of dense matter featuring antikaon condensation for $K^-$ and $\bar{K}^0$ inside neutron stars. The vector coupling parameters within the kaonic sector are determined through the iso-spin counting rule and quark model. Our study integrates various constraints, including $\chi$EFT calculations, nuclear saturation properties, and astrophysical observations from pulsars PSR J0030+0451 and PSR J0740+66 and from the GW170817 event. We present posterior distributions of model parameters derived from these constraints, enabling us to explore the distributions of nuclear matter properties and neutron star (NS) characteristics such as radii, tidal deformabilities, central energy densities, and speed of sound. The antikaon potential at the 68(90)\% confidence intervals is determined to be $-129.36^{+12.53(+32.617)}_{-3.837(-5.696)}$ MeV. This aligns with several studies providing estimates within the range of $-120$ to $-150$ MeV. We find that the maximum neutron star mass is constrained to around 2M$_\odot$ due to the significant softening of the EOS caused by antikaon condensation. This softening results in a considerable decrease in the speed of sound. Although antikaon condensation for $K^-$ is not feasible inside canonical neutron stars, it becomes feasible for higher NS masses. The condensation of both $K^-$ and $\bar{K}^0$ is probably present in the interior of neutron stars with mass greater than 2M$_\odot$. We also discuss the interconnections among input variables, isoscalar and isovector aspects of the EOS, and specific NS properties in the context of antikaon condensation. Comment: 19 pages, 10 figures, Accepted for publication in Physical Review C (in press)
- Published
- 2024
44. On 1-Planar Graphs with Bounded Cop-Number
- Author
-
Bose, Prosenjit, De Carufel, Jean-Lou, Maheshwari, Anil, and Murali, Karthik
- Subjects
Computer Science - Discrete Mathematics, Mathematics - Combinatorics
- Abstract
Cops and Robbers is a type of pursuit-evasion game played on a graph where a set of cops try to capture a single robber. The cops first choose their initial vertex positions, and later the robber chooses a vertex. The cops and robbers make their moves in alternate turns: in the cops' turn, every cop can either choose to move to an adjacent vertex or stay on the same vertex, and likewise the robber in his turn. If the cops can capture the robber in a finite number of rounds, the cops win; otherwise, the robber wins. The cop-number of a graph is the minimum number of cops required to catch a robber in the graph. It has long been known that graphs embedded on surfaces (such as planar graphs and toroidal graphs) have a small cop-number. Recently, Durocher et al. [Graph Drawing, 2023] investigated the problem of cop-number for the class of 1-planar graphs, which are graphs that can be embedded in the plane such that each edge is crossed at most once. They showed that, unlike planar graphs, which require just three cops, 1-planar graphs have an unbounded cop-number. On the positive side, they showed that maximal 1-planar graphs require only three cops by crucially using the fact that the endpoints of every crossing in an embedded maximal 1-planar graph induce a $K_4$. In this paper, we show that the cop-number remains bounded even under the relaxed condition that the endpoints induce at least three edges. More precisely, let an $\times$-crossing of an embedded 1-planar graph be a crossing whose endpoints induce a matching; i.e., there is no edge connecting the endpoints apart from the crossing edges themselves. We show that any 1-planar graph that can be embedded without $\times$-crossings has cop-number at most 21. Moreover, any 1-planar graph that can be embedded with at most $\gamma$ $\times$-crossings has cop-number at most $\gamma + 21$.
- Published
- 2024
45. Adversarial Watermarking for Face Recognition
- Author
-
Yao, Yuguang, Jain, Anil, and Liu, Sijia
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Watermarking is an essential technique for embedding an identifier (i.e., watermark message) within digital images to assert ownership and monitor unauthorized alterations. In face recognition systems, watermarking plays a pivotal role in ensuring data integrity and security. However, an adversary could potentially interfere with the watermarking process, significantly impairing recognition performance. We explore the interaction between watermarking and adversarial attacks on face recognition models. Our findings reveal that while watermarking or input-level perturbation alone may have a negligible effect on recognition accuracy, the combined effect of watermarking and perturbation can result in an adversarial watermarking attack, significantly degrading recognition performance. Specifically, we introduce a novel threat model, the adversarial watermarking attack, which remains stealthy in the absence of watermarking, allowing images to be correctly recognized initially. However, once watermarking is applied, the attack is activated, causing recognition failures. Our study reveals a previously unrecognized vulnerability: adversarial perturbations can exploit the watermark message to evade face recognition systems. Evaluated on the CASIA-WebFace dataset, our proposed adversarial watermarking attack reduces face matching accuracy by 67.2% with an $\ell_\infty$ norm-measured perturbation strength of ${2}/{255}$ and by 95.9% with a strength of ${4}/{255}$.
- Published
- 2024
46. A theory of generalised coordinates for stochastic differential equations
- Author
-
Da Costa, Lancelot, Da Costa, Nathaël, Heins, Conor, Medrano, Johan, Pavliotis, Grigorios A., Parr, Thomas, Meera, Ajith Anil, and Friston, Karl
- Subjects
Mathematics - Probability, Mathematics - Dynamical Systems, Statistics - Methodology
- Abstract
Stochastic differential equations are ubiquitous modelling tools in physics and the sciences. In most modelling scenarios, random fluctuations driving dynamics or motion have some non-trivial temporal correlation structure, which renders the SDE non-Markovian; a phenomenon commonly known as ``colored'' noise. Thus, an important objective is to develop effective tools for mathematically and numerically studying (possibly non-Markovian) SDEs. In this report, we formalise a mathematical theory for analysing and numerically studying SDEs based on so-called `generalised coordinates of motion'. Like the theory of rough paths, we analyse SDEs pathwise for any given realisation of the noise, not solely probabilistically. Like the established theory of Markovian realisation, we realise non-Markovian SDEs as a Markov process in an extended space. Unlike the established theory of Markovian realisation, however, the Markovian realisations here are accurate on short timescales and may be exact globally in time, when flows and fluctuations are analytic. This theory is exact for SDEs with analytic flows and fluctuations, and is approximate when flows and fluctuations are differentiable. It provides useful analysis tools, which we employ to solve linear SDEs with analytic fluctuations. It may also be useful for studying rougher SDEs, as these may be identified as the limit of smoother ones. This theory supplies effective, computationally straightforward methods for simulation, filtering and control of SDEs; amongst others, we re-derive generalised Bayesian filtering, a state-of-the-art method for time-series analysis. Looking forward, this report suggests that generalised coordinates have far-reaching applications throughout stochastic differential equations. Comment: 38 pages of main, 45 pages including TOC, Appendix and references
- Published
- 2024
47. Charge Density Waves and the Effects of Uniaxial Strain on the Electronic Structure of 2H-NbSe$_2$
- Author
-
Kundu, Asish K., Rajapitamahuni, Anil, Vescovo, Elio, Klimovskikh, Ilya I., Berger, Helmuth, and Valla, Tonica
- Subjects
Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Superconductivity
- Abstract
The interplay of superconductivity and density wave orders has been at the forefront of research on correlated electronic phases for a long time. 2H-NbSe$_2$ is considered to be a prototype system for studying this interplay, where the balance between the two orders was proven to be sensitive to band filling and pressure. However, the origin of the charge density wave in this material is still unresolved. Here, by using angle-resolved photoemission spectroscopy, we revisit the charge density wave order and study the effects of uniaxial strain on the electronic structure of 2H-NbSe$_2$. Our results indicate previously undetected signatures of charge density waves on the Fermi surface. The application of a small amount of uniaxial strain induces substantial changes in the electronic structure and lowers its symmetry. This, and the altered lattice, should affect both the charge density wave phase and superconductivity and should be observable in the macroscopic properties. Comment: 9 pages, 6 figures
- Published
- 2024
48. Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models
- Author
-
Tur, Anil Osman, Conti, Alessandro, Beyan, Cigdem, Boscaini, Davide, Larcher, Roberto, Messelodi, Stefano, Poiesi, Fabio, and Ricci, Elisa
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In smart retail applications, the large number of products and their frequent turnover necessitate reliable zero-shot object classification methods. The zero-shot assumption is essential to avoid the need for re-training the classifier every time a new product is introduced into stock or an existing product undergoes rebranding. In this paper, we make three key contributions. Firstly, we introduce the MIMEX dataset, comprising 28 distinct product categories. Unlike existing datasets in the literature, MIMEX focuses on fine-grained product classification and includes a diverse range of retail products. Secondly, we benchmark the zero-shot object classification performance of state-of-the-art vision-language models (VLMs) on the proposed MIMEX dataset. Our experiments reveal that these models achieve unsatisfactory fine-grained classification performance, highlighting the need for specialized approaches. Lastly, we propose a novel ensemble approach that integrates embeddings from CLIP and DINOv2 with dimensionality reduction techniques to enhance classification performance. By combining these components, our ensemble approach outperforms VLMs, effectively capturing visual cues crucial for fine-grained product discrimination. Additionally, we introduce a class adaptation method that utilizes visual prototyping with limited samples in scenarios with scarce labeled data, addressing a critical need in retail environments where product variety frequently changes. To encourage further research into zero-shot object classification for smart retail applications, we will release both the MIMEX dataset and benchmark to the research community. Interested researchers can contact the authors for details on the terms and conditions of use. The code is available at https://github.com/AnilOsmanTur/Zero-shot-Retail-Product-Classification. Comment: Accepted at 2024 IEEE 8th Forum on Research and Technologies for Society and Industry Innovation (RTSI) conference
- Published
- 2024
49. Data Pruning via Separability, Integrity, and Model Uncertainty-Aware Importance Sampling
- Author
-
Grosz, Steven, Zhao, Rui, Ranjan, Rajeev, Wang, Hongcheng, Aggarwal, Manoj, Medioni, Gerard, and Jain, Anil
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
This paper improves upon existing data pruning methods for image classification by introducing a novel pruning metric and pruning procedure based on importance sampling. The proposed pruning metric explicitly accounts for data separability, data integrity, and model uncertainty, while the sampling procedure is adaptive to the pruning ratio and considers both intra-class and inter-class separation to further enhance the effectiveness of pruning. Furthermore, the sampling method can readily be applied to other pruning metrics to improve their performance. Overall, the proposed approach scales well to high pruning ratios and generalizes better across different classification models, as demonstrated by experiments on four benchmark datasets, including the fine-grained classification scenario.
- Published
- 2024
50. Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries
- Author
-
Vodrahalli, Kiran, Ontanon, Santiago, Tripuraneni, Nilesh, Xu, Kelvin, Jain, Sanil, Shivanna, Rakesh, Hui, Jeffrey, Dikkala, Nishanth, Kazemi, Mehran, Fatemi, Bahare, Anil, Rohan, Dyer, Ethan, Shakeri, Siamak, Vij, Roopali, Mehta, Harsh, Ramasesh, Vinay, Le, Quoc, Chi, Ed, Lu, Yifeng, Firat, Orhan, Lazaridou, Angeliki, Lespiau, Jean-Baptiste, Attaluri, Nithya, and Olszewska, Kate
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
We introduce Michelangelo: a minimal, synthetic, and unleaked long-context reasoning evaluation for large language models which is also easy to automatically score. This evaluation is derived via a novel, unifying framework for evaluations over arbitrarily long contexts which measure the model's ability to do more than retrieve a single piece of information from its context. The central idea of the Latent Structure Queries framework (LSQ) is to construct tasks which require a model to ``chisel away'' the irrelevant information in the context, revealing a latent structure in the context. To verify a model's understanding of this latent structure, we query the model for details of the structure. Using LSQ, we produce three diagnostic long-context evaluations across code and natural-language domains intended to provide a stronger signal of long-context language model capabilities. We perform evaluations on several state-of-the-art models and demonstrate both that a) the proposed evaluations are high-signal and b) that there is significant room for improvement in synthesizing long-context information.
- Published
- 2024