1,136 results on '"Srijith"'
Search Results
2. Designing resilient and economically viable water distribution systems: a multi-dimensional approach
- Author
Beatrice Cassottana, Srijith Balakrishnan, Nazli Yonca Aydin, and Giovanni Sansavini
- Subjects
Resilience, Economic analysis, Water distribution system, Interdependency, Recovery strategy, Disasters and engineering, TA495, Cities. Urban geography, GF125
- Abstract
Enhancing the resilience of critical infrastructure systems requires substantial investment and entails trade-offs between environmental and economic benefits. To this aim, we propose a methodological framework that combines resilience and economic analyses and assesses the economic viability of alternative resilience designs for a Water Distribution System (WDS) and its interdependent power and transportation systems. Flow-based network models simulate the interdependent infrastructure systems and Global Resilience Analysis (GRA) quantifies three resilience metrics under various disruption scenarios. The economic analysis monetizes the three metrics and compares two resilience strategies involving the installation of remotely controlled shutoff valves. Using the Micropolis synthetic interdependent water-transportation network as an example, we demonstrate how our framework can guide infrastructure stakeholders and utility operators in measuring the value of resilience investments. Overall, our approach highlights the importance of economic analysis in designing resilient infrastructure systems.
- Published
- 2023
- Full Text
- View/download PDF
3. Electrical detection of RNA cancer biomarkers at the single-molecule level
- Author
Keshani G. Gunasinghe Pattiya Arachchillage, Subrata Chandra, Ajoke Williams, Patrick Piscitelli, Jennifer Pham, Aderlyn Castillo, Lily Florence, Srijith Rangan, and Juan M. Artes Vivancos
- Subjects
Medicine, Science
- Abstract
Cancer is a significant healthcare issue, and early screening methods based on biomarker analysis in liquid biopsies are promising avenues to reduce mortality rates. Electrical detection of nucleic acids at the single molecule level could enable these applications. We examine the electrical detection of RNA cancer biomarkers (KRAS mutants G12C and G12V) as a single-molecule proof-of-concept electrical biosensor for cancer screening applications. We show that the electrical conductance is highly sensitive to the sequence, allowing discrimination of the mutants from a wild-type KRAS sequence differing in just one base. In addition to this high specificity, our results also show that these biosensors are sensitive down to an individual molecule with a high signal-to-noise ratio. These results pave the way for future miniaturized single-molecule electrical biosensors that could be groundbreaking for cancer screening and other applications.
- Published
- 2023
- Full Text
- View/download PDF
4. Chemical-Vapor-Deposition-Synthesized Two-Dimensional Non-Stoichiometric Copper Selenide (β-Cu₂₋ₓSe) for Ultra-Fast Tetracycline Hydrochloride Degradation under Solar Light
- Author
Srijith, Rajashree Konar, Eti Teblum, Vivek Kumar Singh, Madina Telkhozhayeva, Michelangelo Paiardi, and Gilbert Daniel Nessim
- Subjects
2D materials, photocatalysis, chemical vapor deposition, copper selenide, antibiotic, degradation, Organic chemistry, QD241-441
- Abstract
The high concentration of antibiotics in aquatic environments is a serious environmental issue. In response, researchers have explored photocatalytic degradation as a potential solution. Through chemical vapor deposition (CVD), we synthesized copper selenide (β-Cu₂₋ₓSe) and found it an effective catalyst for degrading tetracycline hydrochloride (TC-HCl). The catalyst demonstrated an impressive degradation efficiency of approximately 98% and a reaction rate constant of 3.14 × 10⁻² min⁻¹. Its layered structure, which exposes reactive sites, contributes to excellent stability, interfacial charge transfer efficiency, and visible light absorption capacity. Our investigations confirmed that the principal active species produced by the catalyst comprises O₂⁻ radicals, which we verified through trapping experiments and electron paramagnetic resonance (EPR). We also verified the TC-HCl degradation mechanism using high-performance liquid chromatography–mass spectrometry (LC-MS). Our results provide valuable insights into developing the β-Cu₂₋ₓSe catalyst using CVD and its potential applications in environmental remediation.
- Published
- 2024
- Full Text
- View/download PDF
5. Predicting Resilience of Interdependent Urban Infrastructure Systems
- Author
Beatrice Cassottana, Partha P. Biswas, Srijith Balakrishnan, Bennet Ng, Daisuke Mashima, and Giovanni Sansavini
- Subjects
Interdependent infrastructure systems, simulation, resilience, machine learning, predictive analytics, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Climate change is increasing the frequency and the intensity of weather events, leading to large-scale disruptions to critical infrastructure systems. The high level of interdependence among these systems further aggravates the extent of disruptions. To mitigate these impacts, models and methods are needed to support rapid decision-making for optimal resource allocation in the aftermath of a disruption and to substantiate investment decisions for the structural reconfiguration of these systems. In this paper, we leverage infrastructure simulation models and Machine Learning (ML) algorithms to develop resilience prediction models. First, we employ an interdependent infrastructure simulation model to generate infrastructure disruption and recovery scenarios and compute the resilience value for each scenario. The infrastructure-, disruption-, and recovery-related attributes are recorded for each scenario and ML algorithms are employed on the synthetic dataset to develop accurate resilience prediction models. The results of the prediction models are analyzed and possible design strategies suggested based on the resilience enhancement attributes. The proposed methodology can support infrastructure agencies in the resource-allocation process for pre- and post-disaster interventions.
- Published
- 2022
- Full Text
- View/download PDF
6. A prospective study on the pattern of traumatic ocular injuries in Central Karnataka and their forensic aspects
- Author
Mahalingappa, Seema S, Banakar, Ravindra, Siddappa, Santhosh C, Muthiah, Murugan, and Srijith
- Published
- 2021
- Full Text
- View/download PDF
7. A prospective study on the profile of accidental childhood fatalities in Central Karnataka
- Author
Shibina, S, Santhosh, CS, and Srijith
- Published
- 2021
- Full Text
- View/download PDF
8. Technology-Assisted Social Reforms and Online Hate Content: Insights from India
- Author
Naganna Chetty, Srijith Alathur, and Vishal Kumar
- Subjects
Social reforms, hate content, India, Twitter, Social media, MeToo, technology, Biotechnology, TP248.13-248.65
- Abstract
In the current scenario, while everything seems digitalized, we often spend more time scrolling across various social platforms than we spend on any other real-life activity. It becomes a matter of great concern when it comes to analyzing what we see and how it is interpreted. Through this paper we aim to identify the influence of technology-assisted social-reform initiatives on gender-based hate content generation. With the help of the Twitter API, 112577 government-initiated and 58370 citizen-initiated movement(s) tweets have been extracted. The collected data is examined for hateful content in terms of emotions using software written in the R programming language; the scores for each emotion are counted and a comparison between the two movements is made. The study clearly shows that the citizen-initiated movements share comparatively more hate content than the government-initiated movements, as the scores for specific emotions like anger, disgust, and sadness are higher. This cognitive study can be helpful in policy making, promoting gender-based equality, defining strategies to rebuild citizen initiatives in a hate-free environment, and controlling hate content generation.
- Published
- 2022
- Full Text
- View/download PDF
9. FTNet: Feature Transverse Network for Thermal Image Semantic Segmentation
- Author
Karen Panetta, K. M. Shreyas Kamath, Srijith Rajeev, and Sos S. Agaian
- Subjects
Convolutional neural network, FTNet, semantic segmentation, thermal segmentation, edge-guidance, transverse network, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Thermal imaging is a process of using infrared radiation and thermal energy to collect information about objects. It is superior to visible imaging for its ability to operate in darkness and tolerate illumination variations. In addition, it has potential to penetrate smoke, aerosol, dust, and mist, which are critical inhibitors for visible imaging applications, including semantic segmentation. Unfortunately, current state-of-the-art image semantic segmentation methods (i) mainly concentrate on visible spectrum images, which do not adequately capture the context of corresponding pixels, particularly edge details in thermal images, and (ii) accept a trade-off between higher accuracy and lower speed, or vice-versa. Here, a novel end-to-end trainable convolutional neural network architecture, feature transverse network (FTNet), has been proposed to solve the aforementioned problems. FTNet captures and optimizes feature representation at the multi-scale resolution, thereby improving the capability to process high-resolution images and producing quality output with a lower computational cost. Extensive computer experimentations were conducted on publicly available benchmarking thermal datasets, including SODA, MFNet, and SCUT-Seg, to demonstrate the effectiveness of the proposed FTNet compared to state-of-the-art methods. This comparison includes multiple aspects, including the quantitative accuracy and speed of the various approaches. The source code is available at https://github.com/shreyaskamathkm/FTNet.
- Published
- 2021
- Full Text
- View/download PDF
10. Enhanced Astronomical Source Classification with Integration of Attention Mechanisms and Vision Transformers
- Author
Bhavanam, Srinadh Reddy, Channappayya, Sumohana S., Srijith, P. K., and Desai, Shantanu
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics
- Abstract
Accurate classification of celestial objects is essential for advancing our understanding of the universe. MargNet is a recently developed deep learning-based classifier applied to SDSS DR16 dataset to segregate stars, quasars, and compact galaxies using photometric data. MargNet utilizes a stacked architecture, combining a Convolutional Neural Network (CNN) for image modelling and an Artificial Neural Network (ANN) for modelling photometric parameters. In this study, we propose enhancing MargNet's performance by incorporating attention mechanisms and Vision Transformer (ViT)-based models for processing image data. The attention mechanism allows the model to focus on relevant features and capture intricate patterns within images, effectively distinguishing between different classes of celestial objects. Additionally, we leverage ViTs, a transformer-based deep learning architecture renowned for exceptional performance in image classification tasks. We enhance the model's understanding of complex astronomical images by utilizing ViT's ability to capture global dependencies and contextual information. Our approach uses a curated dataset comprising 240,000 compact and 150,000 faint objects. The models learn classification directly from the data, minimizing human intervention. Furthermore, we explore ViT as a hybrid architecture that uses photometric features and images together as input to predict astronomical objects. Our results demonstrate that the proposed attention mechanism augmented CNN in MargNet marginally outperforms the traditional MargNet and the proposed ViT-based MargNet models. Additionally, the ViT-based hybrid model emerges as the most lightweight and easy-to-train model with classification accuracy similar to that of the best-performing attention-enhanced MargNet.
Comment: 33 pages, 11 figures. Accepted for publication in APSS
- Published
- 2024
- Full Text
- View/download PDF
11. ISeeColor: Method for Advanced Visual Analytics of Eye Tracking Data
- Author
Karen Panetta, Qianwen Wan, Srijith Rajeev, Aleksandra Kaszowska, Aaron L. Gardony, Kevin Naranjo, Holly A. Taylor, and Sos Agaian
- Subjects
Eye-trackers, cognitive science, data visualization, data analysis, deep-learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Recent advances in head-mounted eye-tracking technology have allowed researchers to monitor eye movements during locomotion in real-world environments, increasing the ecological validity of research on human gaze behavior. While collecting eye-tracking data is becoming more accessible, visual analytics of eye-tracking data remains difficult and time-consuming. As such, there is a significant need for developing efficient visualization and analysis tools for large-scale eye-tracking data. This work develops a first-of-its-kind eye-tracking data visualization and analysis system that allows for automatic recognition of independent objects within field-of-vision, using deep-learning-based semantic segmentation. This system recolors the fixated objects-of-interest by integrating gaze fixation information with semantic maps. The system effectively allows researchers to automatically infer what objects users view and for how long in dynamic contexts. The contributions are 1) a data visualization and analysis system that uses deep-learning technology along with eye-tracking data to automatically recognize objects-of-interest from head-mounted eye-tracking video recordings, and 2) a graphical user interface that presents objects-of-interest annotation along with eye-tracking data information. The architecture is tested with an outdoor case study of users walking around the Tufts University campus as part of a navigation study, which was administered by a team of research psychologists.
- Published
- 2020
- Full Text
- View/download PDF
12. Unrolling Post-Mortem 3D Fingerprints Using Mosaicking Pressure Simulation Technique
- Author
Karen Panetta, Srijith Rajeev, K. M. Shreyas Kamath, and Sos S. Agaian
- Subjects
2D processing, 3D processing, ante-mortem, authentication, biometric, fingerprint, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Post-mortem fingerprints are a valuable biometric used to aid in the identification of a deceased individual. However, fingerprints from the deceased undergo decomposition leading to indefinite structure when compared to ante-mortem fingerprints. Moreover, the performance of the existing two-dimensional (2D) fingerprint recognition systems is still below the expected potential. These problems arise because fingerprints are generally captured by manipulating a finger against a plane. In post-mortem fingerprint recovery, the decedent's finger must go through several reconditioning processes to prevent the rapid onslaught of decomposition. To address these deficiencies associated with the 2D systems, three-dimensional (3D) scanning systems have been employed to capture fingerprints. The 3D technology is still in its transient phase and is limited primarily by 1) the lack of existing 3D databases; 2) the deficiency of 3D-to-2D fingerprint image mapping algorithms; 3) the incapacity to model and recreate the 2D fingerprint capturing procedure to improve 3D-2D fingerprint verification; and 4) the inability to apply traditional fingerprint unrolling techniques on post-mortem 3D fingerprints. This paper presents a novel method to perform post-mortem 3D fingerprint unrolling and pressure simulation to produce fingerprint images that are compatible with 2D fingerprint recognition systems. This paper strives to: 1) develop a correspondence between 3D touchless and contact-based 2D fingerprint images; 2) model fingerprints with deformities to provide a viable fingerprint image for matching; and 3) develop a mosaic pressure simulation (MPS) algorithm to recreate the effects of the 2D fingerprint capturing procedure.
- Published
- 2019
- Full Text
- View/download PDF
13. LQM: Localized Quality Measure for Fingerprint Image Enhancement
- Author
Karen Panetta, Shreyas K. M Kamath, Srijith Rajeev, and Sos S. Agaian
- Subjects
LQM, LQME, fingerprint, quality, enhancement, subjective, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Fingerprints are one of the most widely used biometrics in law enforcement. However, low-quality fingerprint images can drastically degrade the performance of automated fingerprint identification systems (AFIS). AFIS can be substantially advanced by: 1) establishing a metric to evaluate the image quality accurately and 2) utilizing this metric to enable an automated enhancement process. This paper offers a novel localized quality measure (LQM) to evaluate the quality of fingerprint images, and a genetic localized quality measure enhancement (LQME) algorithm, which is tailored to iteratively enhance poor-quality fingerprint images. In addition, a method is introduced to automatically choose the enhancement algorithm's parameters based on the proposed measure such that it yields the best enhancement result. The presented LQM measure uses fingerprint image characteristics, which include sharpness, contrast, orientation certainty level, symmetry features, and imprints of friction ridge structure (minutiae) information. The FVC2004 Set B database containing fingerprint images from four different sensors and a total of 240 images (80 from each sensor) is used to evaluate the performance of the presented algorithms and methods. The computer simulations demonstrate that the LQM measure is useful in predicting the quality of the fingerprint images captured from various devices. Furthermore, the experiments show that LQME can recover retrievable-corrupt fingerprint regions.
- Published
- 2019
- Full Text
- View/download PDF
14. Dynamic Physical Activity Recommendation Delivered through a Mobile Fitness App: A Deep Learning Approach
- Author
Subramaniyaswamy Vairavasundaram, Vijayakumar Varadarajan, Deepthi Srinivasan, Varshini Balaganesh, Srijith Bharadwaj Damerla, Bhuvaneswari Swaminathan, and Logesh Ravi
- Subjects
data-driven, machine learning, tracking physical fitness, personalized recommendation system, mobile apps, walking step count, Mathematics, QA1-939
- Abstract
Regular physical activity has a positive impact on our physical and mental health. Adhering to a fixed physical activity regimen is essential for good health and mental wellbeing. Today, fitness trackers and smartphone applications are used to promote physical activity. These applications use step counts recorded by accelerometers to estimate physical activity. In this research, we performed a two-level clustering on a dataset based on individuals’ physical and physiological features, as well as past daily activity patterns. The proposed model exploits the user data with partial or complete features. To include the user with partial features, we trained the proposed model with the data of users who possess exclusive features. Additionally, we classified the users into several clusters to produce more accurate results for the users. This enables the proposed system to provide data-driven and personalized activity planning recommendations every day. A personalized physical activity plan is generated on the basis of hourly patterns for users according to their adherence and past recommended activity plans. Customization of activity plans can be achieved according to the user’s historical activity habits and current activity objective, as well as the likelihood of sticking to the plan. The proposed physical activity recommendation system was evaluated in real time, and the results demonstrated the improved performance over existing baselines.
- Published
- 2022
- Full Text
- View/download PDF
15. Transformer based Multitask Learning for Image Captioning and Object Detection
- Author
Basak, Debolena, Srijith, P. K., and Desarkar, Maunendra Sankar
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
- Abstract
In several real-world scenarios like autonomous navigation and mobility, to obtain a better visual understanding of the surroundings, image captioning and object detection play a crucial role. This work introduces a novel multitask learning framework that combines image captioning and object detection into a joint model. We propose TICOD, Transformer-based Image Captioning and Object detection model for jointly training both tasks by combining the losses obtained from image captioning and object detection networks. By leveraging joint training, the model benefits from the complementary information shared between the two tasks, leading to improved performance for image captioning. Our approach utilizes a transformer-based architecture that enables end-to-end network integration for image captioning and object detection and performs both tasks jointly. We evaluate the effectiveness of our approach through comprehensive experiments on the MS-COCO dataset. Our model outperforms the baselines from image captioning literature by achieving a 3.65% improvement in BERTScore.
Comment: Accepted at PAKDD 2024
- Published
- 2024
16. Smart junction: advanced zone-based traffic control system with integrated anomaly detector
- Author
S. P., Krishnendhu, Mohandas, Prabu, and C. S., Srijith
- Published
- 2024
- Full Text
- View/download PDF
17. Synthesis, structural characterization and uv analysis of praseodymium zirconate oxides
- Author
Srijith, S., Kavitha, V. T., and Asitha, L.R.
- Published
- 2018
- Full Text
- View/download PDF
18. Post Mortem Cooling Pattern In South India-A Basic Approach
- Author
Hussain, Jaffar AP, Srijith, Subhedar, Abhijit, Mohanty, Sujan Kumar, and Kumar, Virendra
- Published
- 2018
- Full Text
- View/download PDF
19. Hybrid CNN-LightGBM Architecture for Earthquake Event Classification in DAS Systems
- Author
Sasi, Deepika, Joseph, Thomas, and Kanakambaran, Srijith
- Published
- 2024
- Full Text
- View/download PDF
20. Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
- Author
Radhakrishnan, Srijith, Yang, Chao-Han Huck, Khan, Sumeer Ahmad, Kumar, Rohit, Kiani, Narsis A., Gomez-Cabrero, David, and Tegner, Jesper N.
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
We introduce a new cross-modal fusion technique designed for generative error correction in automatic speech recognition (ASR). Our methodology leverages both acoustic information and external linguistic representations to generate accurate speech transcription contexts. This marks a step towards a fresh paradigm in generative error correction within the realm of n-best hypotheses. Unlike the existing ranking-based rescoring methods, our approach adeptly uses distinct initialization techniques and parameter-efficient algorithms to boost ASR performance derived from pre-trained speech and text models. Through evaluation across diverse ASR datasets, we evaluate the stability and reproducibility of our fusion technique, demonstrating its improved word error rate relative (WERR) performance in comparison to n-best hypotheses by relatively 37.66%. To encourage future research, we have made our code and pre-trained models open source at https://github.com/Srijith-rkr/Whispering-LLaMA.
Comment: Accepted to EMNLP 2023 as main paper. 10 pages. Revised math notations. GitHub: https://github.com/Srijith-rkr/Whispering-LLaMA
- Published
- 2023
21. Sensor for the Characterization of 2D Angular Actuators with Picoradian Resolution and Nanoradian Accuracy with Microradian Range
- Author
Marco Pisani, Milena Astrua, and Srijith Bangaru Thirumalai Raj
- Subjects
angular actuators, angular sensors, encoders, autocollimators, Chemical technology, TP1-1185
- Abstract
High-precision angular actuators are used for highly demanding applications such as laser steering for photolithography. Piezo technology allows developing actuators with a resolution as low as a few nanoradians, with bandwidths as high as several kilohertz. In the most demanding applications, the actual performance of these instruments needs to be characterized. The best angular measurement instruments available today do not have sufficient resolution and/or bandwidth to satisfy these needs. At the Istituto Nazionale di Ricerca Metrologica (INRIM), a device was designed and built aiming at characterizing precision 2D angular actuators with a resolution surpassing the best devices on the market. The device is based on a multi-reflection scheme that allows multiplying the deflection angle by a factor of 70. The ultimate resolution of the device is 2 prad/√Hz over a measurement range of 36 µrad with a measurement band >10 kHz. The present work describes the working principle, the practical realization, and a case study on a top-level commercial angular actuator (Nano-MTA2 produced by Mad City Labs).
- Published
- 2020
- Full Text
- View/download PDF
22. A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model
- Author
Radhakrishnan, Srijith, Yang, Chao-Han Huck, Khan, Sumeer Ahmad, Kiani, Narsis A., Gomez-Cabrero, David, and Tegner, Jesper N.
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
In this work, we explore Parameter-Efficient-Learning (PEL) techniques to repurpose a General-Purpose-Speech (GSM) model for Arabic dialect identification (ADI). Specifically, we investigate different setups to incorporate trainable features into a multi-layer encoder-decoder GSM formulation under frozen pre-trained settings. Our architecture includes residual adapter and model reprogramming (input-prompting). We design a token-level label mapping to condition the GSM for Arabic Dialect Identification (ADI). This is challenging due to the high variation in vocabulary and pronunciation among the numerous regional dialects. We achieve new state-of-the-art accuracy on the ADI-17 dataset by vanilla fine-tuning. We further reduce the training budgets with the PEL method, which performs within 1.86% accuracy to fine-tuning using only 2.5% of (extra) network trainable parameters. Our study demonstrates how to identify Arabic dialects using a small dataset and limited computation with open source code and pre-trained models.
Comment: Accepted to Interspeech 2023, 5 pages. Code is available at: https://github.com/Srijith-rkr/KAUST-Whisper-Adapter under MIT license
- Published
- 2023
- Full Text
- View/download PDF
23. Continuous Depth Recurrent Neural Differential Equations
- Author
Anumasa, Srinivas, Gunapati, Geetakrishnasai, and Srijith, P. K.
- Subjects
Computer Science - Machine Learning
- Abstract
Recurrent neural networks (RNNs) have brought a lot of advancements in sequence labeling tasks and sequence data. However, their effectiveness is limited when the observations in the sequence are irregularly sampled, where the observations arrive at irregular time intervals. To address this, continuous time variants of the RNNs were introduced based on neural ordinary differential equations (NODE). They learn a better representation of the data using the continuous transformation of hidden states over time, taking into account the time interval between the observations. However, they are still limited in their capability as they use the discrete transformations and a fixed discrete number of layers (depth) over an input in the sequence to produce the output observation. We intend to address this limitation by proposing RNNs based on differential equations which model continuous transformations over both depth and time to predict an output for a given input in the sequence. Specifically, we propose continuous depth recurrent neural differential equations (CDR-NDE) which generalizes RNN models by continuously evolving the hidden states in both the temporal and depth dimensions. CDR-NDE considers two separate differential equations over each of these dimensions and models the evolution in the temporal and depth directions alternatively. We also propose the CDR-NDE-heat model based on partial differential equations which treats the computation of hidden states as solving a heat equation over time. We demonstrate the effectiveness of the proposed models by comparing against the state-of-the-art RNN models on real world sequence labeling problems and data.
- Published
- 2022
24. Transformer based Multitask Learning for Image Captioning and Object Detection.
- Author
Debolena Basak, P. K. Srijith, and Maunendra Sankar Desarkar
- Published
- 2024
- Full Text
- View/download PDF
25. A Study on Effects of Synthetic Data for Predicting the Remaining Useful Life of Aluminium Electrolytic Capacitors Using Bagging-Based Ensemble Learning
- Author
Bhattacharyya, Anindya, Srijith, K., Behera, R. P., Dasgupta, Arup, and Chakraborty, R. S. Series Editor: Kacprzyk, Janusz. Advisory Editors: Gomide, Fernando, Kaynak, Okyay, Liu, Derong, Pedrycz, Witold, Polycarpou, Marios M., Rudas, Imre J., and Wang, Jun. Editors: Das, Swagatam, Saha, Snehanshu, Coello Coello, Carlos A., and Bansal, Jagdish C.
- Published
- 2024
- Full Text
- View/download PDF
26. Layered Graphs: Applications and Algorithms
- Author
Bhadrachalam Chitturi, Srijith Balachander, Sandeep Satheesh, and Krithic Puthiyoppil
- Subjects
NP-complete, layered graph, quasi-polynomial time, dynamic programming, independent set, vertex cover, dominating set, string transformations, social networks, Industrial engineering. Management engineering, T55.4-60.8, Electronic computers. Computer science, QA75.5-76.95
- Abstract
The computation of distances between strings has applications in molecular biology, music theory and pattern recognition. One such measure, called short reversal distance, has applications in evolutionary distance computation. It has been shown that this problem can be reduced to the computation of a maximum independent set on the corresponding graph that is constructed from the given input strings. The constructed graphs primarily fall into a class that we call layered graphs. In a layered graph, each layer refers to a subgraph containing, at most, some k vertices. The inter-layer edges are restricted to the vertices in adjacent layers. We study the MIS, MVC, MDS, MCV and MCD problems on layered graphs where MIS computes the maximum independent set; MVC computes the minimum vertex cover; MDS computes the minimum dominating set; MCV computes the minimum connected vertex cover; and MCD computes the minimum connected dominating set. The MIS, MVC and MDS are computed in polynomial time if k=Θ(log|V|). MCV and MCD are computed polynomial time if k=O((log|V|)α), where α0, then MIS, MVC and MDS are computed in quasi-polynomial time. If k=Θ(log|V|), then MCV and MCD are computed in quasi-polynomial time.
- Published
- 2018
- Full Text
- View/download PDF
27. DialoGen: Generalized Long-Range Context Representation for Dialogue Systems
- Author
-
Dey, Suvodip, Desarkar, Maunendra Sankar, Ekbal, Asif, and Srijith, P. K.
- Subjects
Computer Science - Computation and Language - Abstract
Long-range context modeling is crucial to both dialogue understanding and generation. The most popular method for dialogue context representation is to concatenate the last-$k$ utterances in chronological order. However, this method may not be ideal for conversations containing long-range dependencies, i.e., when there is a need to look beyond last-$k$ utterances to generate a meaningful response. In this work, we propose DialoGen, a novel encoder-decoder based framework for dialogue generation with a generalized context representation that can look beyond the last-$k$ utterances. The main idea of our approach is to identify and utilize the most relevant historical utterances instead of last-$k$, which also enables the compact representation of dialogue history with fewer tokens. We study the effectiveness of our proposed method on both dialogue generation (open-domain) and understanding (DST). Even with a compact context representation, DialoGen performs comparably to the state-of-the-art models on the open-domain DailyDialog dataset. We observe a similar behavior on the DST task of the MultiWOZ dataset when the proposed context representation is applied to existing DST models. We also discuss the generalizability and interpretability of DialoGen and show that the relevance score of previous utterances agrees well with human cognition., Comment: Accepted at PACLIC 2023
- Published
- 2022
28. HyperHawkes: Hypernetwork based Neural Temporal Point Process
- Author
-
Dubey, Manisha, Srijith, P. K., and Desarkar, Maunendra Sankar
- Subjects
Computer Science - Machine Learning - Abstract
Temporal point processes serve as an essential tool for modeling time-to-event data in continuous time. Despite massive amounts of event sequence data from domains such as social media and healthcare, real-world application of temporal point processes faces two major challenges: 1) they do not generalize to predicting events from unseen sequences in dynamic environments, and 2) they cannot thrive in a continually evolving environment with minimal supervision while retaining previously learnt knowledge. To tackle these issues, we propose HyperHawkes, a hypernetwork based temporal point process framework which is capable of modeling time of occurrence of events for unseen sequences. Thereby, we solve the problem of zero-shot learning for time-to-event modeling. We also develop a hypernetwork based continual learning temporal point process for continuous modeling of time-to-event sequences with minimal forgetting. In this way, HyperHawkes augments the temporal point process with zero-shot modeling and continual learning capabilities. We demonstrate the application of the proposed framework through our experiments on two real-world datasets. Our results show the efficacy of the proposed approach in terms of predicting future events under zero-shot regime for unseen event sequences. We also show that the proposed model is able to predict sequences continually while retaining information from previous event sequences, hence mitigating catastrophic forgetting for time-to-event data., Comment: 9 pages, 2 figures
- Published
- 2022
29. Continual Learning with Dependency Preserving Hypernetworks
- Author
-
Chandra, Dupati Srikar, Varshney, Sakshi, Srijith, P. K., and Gupta, Sunil
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Humans learn continually throughout their lifespan by accumulating diverse knowledge and fine-tuning it for future tasks. When presented with a similar goal, neural networks suffer from catastrophic forgetting if data distributions across sequential tasks are not stationary over the course of learning. An effective approach to address such continual learning (CL) problems is to use hypernetworks which generate task dependent weights for a target network. However, the continual learning performance of existing hypernetwork based approaches is affected by the assumption of independence of the weights across the layers in order to maintain parameter efficiency. To address this limitation, we propose a novel approach that uses a dependency preserving hypernetwork to generate weights for the target network while also maintaining the parameter efficiency. We propose to use a recurrent neural network (RNN) based hypernetwork that can generate layer weights efficiently while allowing for dependencies across them. In addition, we propose novel regularisation and network growth techniques for the RNN based hypernetwork to further improve the continual learning performance. To demonstrate the effectiveness of the proposed methods, we conducted experiments on several image classification continual learning tasks and settings. We found that the proposed methods based on the RNN hypernetworks outperformed the baselines in all these CL settings and tasks., Comment: Paper got accepted in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023
- Published
- 2022
30. Cosmic Ray Rejection with Attention Augmented Deep Learning
- Author
-
Bhavanam, S. R., Channappayya, Sumohana S., Srijith, P. K., and Desai, Shantanu
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,Physics - Data Analysis, Statistics and Probability - Abstract
Cosmic Ray (CR) hits are the major contaminants in astronomical imaging and spectroscopic observations involving solid-state detectors. Correctly identifying and masking them is a crucial part of the image processing pipeline, since they may otherwise lead to spurious detections. For this purpose, we have developed and tested a novel Deep Learning based framework for the automatic detection of CR hits from astronomical imaging data from two different imagers: Dark Energy Camera (DECam) and Las Cumbres Observatory Global Telescope (LCOGT). We considered two baseline models, namely deepCR and Cosmic-CoNN, which are the current state-of-the-art learning based algorithms that were trained using Hubble Space Telescope (HST) ACS/WFC and LCOGT Network images respectively. We have experimented with the idea of augmenting the baseline models using Attention Gates (AGs) to improve the CR detection performance. We have trained our models on DECam data and demonstrate a consistent marginal improvement by adding AGs in True Positive Rate (TPR) at 0.01% False Positive Rate (FPR) and Precision at 95% TPR over the aforementioned baseline models for the DECam dataset. We demonstrate that the proposed AG augmented models provide significant gain in TPR at 0.01% FPR when tested on previously unseen LCO test data having images from three distinct telescope classes. Furthermore, we demonstrate that the proposed baseline models with and without attention augmentation outperform state-of-the-art models such as Astro-SCRAPPY, Maximask (that is trained natively on DECam data) and pre-trained ground-based Cosmic-CoNN. This study demonstrates that the AG module augmentation enables us to get better deepCR and Cosmic-CoNN models and to improve their generalization capability on unseen data., Comment: 21 pages, 23 figures. Accepted in Astronomy and Computing
- Published
- 2022
- Full Text
- View/download PDF
31. Transformer based Multitask Learning for Image Captioning and Object Detection
- Author
-
Basak, Debolena, primary, Srijith, P. K., additional, and Desarkar, Maunendra Sankar, additional
- Published
- 2024
- Full Text
- View/download PDF
32. InfraRisk: An Open-Source Simulation Platform for Asset-Level Resilience Analysis in Interconnected Infrastructure Networks
- Author
-
Balakrishnan, Srijith and Cassottana, Beatrice
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
Integrated simulation models are emerging as an alternative for analyzing large-scale interdependent infrastructure networks due to their modeling advantages over traditional interdependency models. This paper presents an open-source integrated simulation package for the asset-level analysis of interdependent infrastructure systems. The simulation platform, named 'InfraRisk' and developed in Python, can simulate disaster-induced infrastructure failures and subsequent post-disaster restoration in interconnected water-, power-, and road networks. InfraRisk consists of an infrastructure module, a hazard module, a recovery module, a simulation module, and a resilience quantification module. The infrastructure module integrates existing infrastructure network packages (wntr for water networks, pandapower for power systems, and a static traffic assignment model for transportation networks) through an interface that facilitates the network-level simulation of interdependent failures. The hazard module generates infrastructure component failure sequences based on various disaster characteristics. The recovery module determines repair sequences and assigns repair crews based on predefined heuristics-based recovery strategies or model predictive control (MPC) based optimization. Based on the schedule, the simulation module implements the network-wide simulation of the consequences of the disaster impacts and the recovery actions. The resilience quantification module offers system-level and consumer-level metrics to quantify both the risks and resilience of the integrated infrastructure networks against disaster events. InfraRisk provides a virtual platform for decision-makers to experiment and develop region-specific pre-disaster and post-disaster policies to enhance the overall resilience of interdependent urban infrastructure networks.
- Published
- 2022
- Full Text
- View/download PDF
33. Application of Clustering Algorithms for Dimensionality Reduction in Infrastructure Resilience Prediction Models
- Author
-
Balakrishnan, Srijith, Cassottana, Beatrice, and Verma, Arun
- Subjects
Computer Science - Machine Learning ,Electrical Engineering and Systems Science - Systems and Control - Abstract
Recent studies increasingly adopt simulation-based machine learning (ML) models to analyze critical infrastructure system resilience. For realistic applications, these ML models consider the component-level characteristics that influence the network response during emergencies. However, such an approach could result in a large number of features and cause ML models to suffer from the 'curse of dimensionality'. We present a clustering-based method that simultaneously minimizes the problem of high-dimensionality and improves the prediction accuracy of ML models developed for resilience analysis in large-scale interdependent infrastructure networks. The methodology has three parts: (a) generation of simulation dataset, (b) network component clustering, and (c) dimensionality reduction and development of prediction models. First, an interdependent infrastructure simulation model simulates the network-wide consequences of various disruptive events. The component-level features are extracted from the simulated data. Next, clustering algorithms are used to derive the cluster-level features by grouping component-level features based on their topological and functional characteristics. Finally, ML algorithms are used to develop models that predict the network-wide impacts of disruptive events using the cluster-level features. The applicability of the method is demonstrated using an interdependent power-water-transport testbed. The proposed method can be used to develop decision-support tools for post-disaster recovery of infrastructure networks.
- Published
- 2022
34. Developing resilience pathways for interdependent infrastructure networks: A simulation-based approach with consideration to risk preferences of decision-makers
- Author
-
Balakrishnan, Srijith, Jin, Lawrence, Cassottana, Beatrice, Costa, Alberto, and Sansavini, Giovanni
- Published
- 2024
- Full Text
- View/download PDF
35. Pressure feedback system for flow separation mitigation in scramjet intakes
- Author
-
John, Bibin, Dinesan, Deepu, Geca, Michal Jan, and M.S., Srijith
- Published
- 2024
- Full Text
- View/download PDF
36. Bayesian neural hawkes process for event uncertainty prediction
- Author
-
Dubey, Manisha, Palakkadavath, Ragja, and Srijith, P. K.
- Published
- 2023
- Full Text
- View/download PDF
37. Bayesian Neural Hawkes Process for Event Uncertainty Prediction
- Author
-
Dubey, Manisha, Palakkadavath, Ragja, and Srijith, P. K.
- Subjects
Computer Science - Machine Learning ,Computer Science - Social and Information Networks - Abstract
Event data consisting of time of occurrence of the events arises in several real-world applications. Recent works have introduced neural network based point processes for modeling event-times, and were shown to provide state-of-the-art performance in predicting event-times. However, neural point process models lack a good uncertainty quantification capability on predictions. A proper uncertainty quantification over event modeling will help in better decision making for many practical applications. Therefore, we propose a novel point process model, Bayesian Neural Hawkes process (BNHP) which leverages uncertainty modelling capability of Bayesian models and generalization capability of the neural networks to model event occurrence times. We augment the model with spatio-temporal modeling capability where it can consider uncertainty over predicted time and location of the events. Experiments on simulated and real-world datasets show that BNHP significantly improves prediction performance and uncertainty quantification for modelling events., Comment: 13 pages, 6 tables, 4 plots
- Published
- 2021
38. Bi-Directional Recurrent Neural Ordinary Differential Equations for Social Media Text Classification
- Author
-
Tamire, Maunika, Anumasa, Srinivas, and Srijith, P. K.
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Classification of posts in social media such as Twitter is difficult due to the noisy and short nature of texts. Sequence classification models based on recurrent neural networks (RNN) are popular for classifying posts that are sequential in nature. RNNs assume the hidden representation dynamics to evolve in a discrete manner and do not consider the exact time of the posting. In this work, we propose to use recurrent neural ordinary differential equations (RNODE) for social media post classification which consider the time of posting and allow the computation of hidden representation to evolve in a time-sensitive continuous manner. In addition, we propose a novel model, Bi-directional RNODE (Bi-RNODE), which can consider the information flow in both the forward and backward directions of posting times to predict the post label. Our experiments demonstrate that RNODE and Bi-RNODE are effective for the problem of stance classification of rumours in social media.
- Published
- 2021
39. Latent Time Neural Ordinary Differential Equations
- Author
-
Anumasa, Srinivas and Srijith, P. K.
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Neural ordinary differential equations (NODE) have been proposed as a continuous depth generalization to popular deep learning models such as Residual networks (ResNets). They provide parameter efficiency and automate the model selection process in deep learning models to some extent. However, they lack the much-required uncertainty modelling and robustness capabilities which are crucial for their use in several real-world applications such as autonomous driving and healthcare. We propose a novel and unique approach to model uncertainty in NODE by considering a distribution over the end-time $T$ of the ODE solver. The proposed approach, latent time NODE (LT-NODE), treats $T$ as a latent variable and applies Bayesian learning to obtain a posterior distribution over $T$ from the data. In particular, we use variational inference to learn an approximate posterior and the model parameters. Prediction is done by considering the NODE representations from different samples of the posterior and can be done efficiently using a single forward pass. As $T$ implicitly defines the depth of a NODE, the posterior distribution over $T$ would also help in model selection in NODE. We also propose adaptive latent time NODE (ALT-NODE), which allows each data point to have a distinct posterior distribution over end-times. ALT-NODE uses amortized variational inference to learn an approximate posterior using inference networks. We demonstrate the effectiveness of the proposed approaches in modelling uncertainty and robustness through experiments on synthetic and several real-world image classification data., Comment: Accepted at AAAI-2022
- Published
- 2021
40. Improving Robustness and Uncertainty Modelling in Neural Ordinary Differential Equations
- Author
-
Anumasa, Srinivas and Srijith, P. K.
- Subjects
Computer Science - Machine Learning - Abstract
Neural ordinary differential equations (NODE) have been proposed as a continuous depth generalization to popular deep learning models such as Residual networks (ResNets). They provide parameter efficiency and automate the model selection process in deep learning models to some extent. However, they lack the much-required uncertainty modelling and robustness capabilities which are crucial for their use in several real-world applications such as autonomous driving and healthcare. We propose a novel and unique approach to model uncertainty in NODE by considering a distribution over the end-time $T$ of the ODE solver. The proposed approach, latent time NODE (LT-NODE), treats $T$ as a latent variable and applies Bayesian learning to obtain a posterior distribution over $T$ from the data. In particular, we use variational inference to learn an approximate posterior and the model parameters. Prediction is done by considering the NODE representations from different samples of the posterior and can be done efficiently using a single forward pass. As $T$ implicitly defines the depth of a NODE, the posterior distribution over $T$ would also help in model selection in NODE. We also propose adaptive latent time NODE (ALT-NODE), which allows each data point to have a distinct posterior distribution over end-times. ALT-NODE uses amortized variational inference to learn an approximate posterior using inference networks. We demonstrate the effectiveness of the proposed approaches in modelling uncertainty and robustness through experiments on synthetic and several real-world image classification data., Comment: Winter Conference on Applications of Computer Vision, 2021
- Published
- 2021
- Full Text
- View/download PDF
41. An Investigation into Keystroke Dynamics and Heart Rate Variability as Indicators of Stress
- Author
-
Unni, Srijith, Gowda, Sushma Suryanarayana, and Smeaton, Alan F.
- Subjects
Computer Science - Human-Computer Interaction - Abstract
Lifelogging has become a prominent research topic in recent years. Wearable sensors like Fitbits and smart watches are now increasingly popular for recording one's activities. Some researchers are also exploring keystroke dynamics for lifelogging. Keystroke dynamics refers to the process of measuring and assessing a person's typing rhythm on digital devices. A digital footprint is created when a user interacts with devices like keyboards, mobile phones or touch screen panels, and the timing of the keystrokes is unique to each individual, though likely to be affected by factors such as fatigue, distraction or emotional stress. In this work we explore the relationship between keystroke dynamics, as measured by the timing for the top-10 most frequently occurring bigrams in English, and the emotional state and stress of an individual, as measured by heart rate variability (HRV). We collected keystroke data using the Loggerman application while HRV was simultaneously gathered. With this data we performed an analysis to determine the relationship between variations in keystroke dynamics and variations in HRV. Our conclusion is that we need to use a more detailed representation of keystroke timing than the top-10 bigrams, probably personalised to each user., Comment: 12 pages. To appear at MMM 2022, 28th International Conference on Multimedia Modeling, 5-8 April 2022, Phu Quoc, Vietnam
- Published
- 2021
42. Safety, tolerability, viral kinetics, and immune correlates of protection in healthy, seropositive UK adults inoculated with SARS-CoV-2: a single-centre, open-label, phase 1 controlled human infection study
- Author
-
Alparaque, Maricel, Anid, Liisa, Barnes, Eleanor, Benamore, Rachel, Bharti, Neha, Patel, Bhumika, Burns, Adrian, Byard, Nicholas, Conway, Oliver, Cooper, Cushla, Crowther, Charlotte, Dunachie, Susanna J, Johnstone, Trudi, Jose, Jyolsna, Luciw, Michael, Mujadidi, Yama, Nehiweze, Aiseosa, Nyamunda, Sibongile, Orobiyi-Rieba, Maria, Parvelikudy, Bindu, Platt, Abigail, Pswarayi, Dzikamayi, Quaddy, Jack, Samuel, Binnie Elizabeth, Sette, Alessandro, Sodipo, Victoria, Srijith, Preethu, Stone, Helen, Turner, Cheryl, Valmores, Mary Ann, Voaides, Alexandru, Vuddamalay, Gavindren, Jackson, Susan, Marshall, Julia L, Mawer, Andrew, Lopez-Ramon, Raquel, Harris, Stephanie A, Satti, Iman, Hughes, Eileen, Preston-Jones, Hannah, Cabrera Puig, Ingrid, Longet, Stephanie, Tipton, Tom, Laidlaw, Stephen, Doherty, Rebecca Powell, Morrison, Hazel, Mitchell, Robert, Tanner, Rachel, Ateere, Alberta, Stylianou, Elena, Wu, Meng-San, Fredsgaard-Jones, Timothy P W, Breuer, Judith, Rapeport, Garth, Ferreira, Vanessa M, Gleeson, Fergus, Pollard, Andrew J, Carroll, Miles, Catchpole, Andrew, Chiu, Christopher, and McShane, Helen
- Published
- 2024
- Full Text
- View/download PDF
43. Electrical detection of RNA cancer biomarkers at the single-molecule level
- Author
-
Pattiya Arachchillage, Keshani G. Gunasinghe, Chandra, Subrata, Williams, Ajoke, Piscitelli, Patrick, Pham, Jennifer, Castillo, Aderlyn, Florence, Lily, Rangan, Srijith, and Artes Vivancos, Juan M.
- Published
- 2023
- Full Text
- View/download PDF
44. Monte Carlo DropBlock for Modelling Uncertainty in Object Detection
- Author
-
Deepshikha, Kumari, Yelleni, Sai Harsha, Srijith, P. K., and Mohan, C Krishna
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
With the advancements made in deep learning, computer vision problems like object detection and segmentation have seen a great improvement in performance. However, in many real-world applications such as autonomous driving vehicles, the risk associated with incorrect predictions of objects is very high. Standard deep learning models for object detection such as YOLO models are often overconfident in their predictions and do not take into account the uncertainty in predictions on out-of-distribution data. In this work, we propose an efficient and effective approach to model uncertainty in object detection and segmentation tasks using Monte-Carlo DropBlock (MC-DropBlock) based inference. The proposed approach applies drop-block during training time and test time on the convolutional layer of the deep learning models such as YOLO. We show that this leads to a Bayesian convolutional neural network capable of capturing the epistemic uncertainty in the model. Additionally, we capture the aleatoric uncertainty using a Gaussian likelihood. We demonstrate the effectiveness of the proposed approach on modeling uncertainty in object detection and segmentation tasks using out-of-distribution experiments. Experimental results show that MC-DropBlock improves the generalization, calibration, and uncertainty modeling capabilities of YOLO models in object detection and segmentation.
- Published
- 2021
45. Subset-of-Data Variational Inference for Deep Gaussian-Processes Regression
- Author
-
Jain, Ayush, Srijith, P. K., and Khan, Mohammad Emtiyaz
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
Deep Gaussian Processes (DGPs) are multi-layer, flexible extensions of Gaussian processes but their training remains challenging. Sparse approximations simplify the training but often require optimization over a large number of inducing inputs and their locations across layers. In this paper, we simplify the training by setting the locations to a fixed subset of data and sampling the inducing inputs from a variational distribution. This reduces the trainable parameters and computation cost without significant performance degradations, as demonstrated by our empirical results on regression problems. Our modifications simplify and stabilize DGP training while making it amenable to sampling schemes for setting the inducing inputs., Comment: Accepted in the 37th Conference on Uncertainty in Artificial Intelligence (UAI 2021)
- Published
- 2021
46. Adiabatic Quantum Feature Selection for Sparse Linear Regression
- Author
-
Desu, Surya Sai Teja, Srijith, P. K., Rao, M. V. Panduranga, and Sivadasan, Naveen
- Subjects
Computer Science - Machine Learning ,Quantum Physics ,Statistics - Machine Learning - Abstract
Linear regression is a popular machine learning approach to learn and predict real valued outputs or dependent variables from independent variables or features. In many real world problems, it is beneficial to perform sparse linear regression to identify important features helpful in predicting the dependent variable. It not only helps in getting interpretable results but also avoids overfitting when the number of features is large and the amount of data is small. The most natural way to achieve this is by using 'best subset selection', which penalizes non-zero model parameters by adding the $\ell_0$ norm over parameters to the least squares loss. However, this makes the objective function non-convex and intractable even for a small number of features. This paper aims to address the intractability of sparse linear regression with the $\ell_0$ norm using adiabatic quantum computing, a quantum computing paradigm that is particularly useful for solving optimization problems faster. We formulate the $\ell_0$ optimization problem as a Quadratic Unconstrained Binary Optimization (QUBO) problem and solve it using the D-Wave adiabatic quantum computer. We study and compare the quality of the QUBO solution on synthetic and real world datasets. The results demonstrate the effectiveness of the proposed adiabatic quantum computing approach in finding the optimal solution. The QUBO solution matches the optimal solution for a wide range of sparsity penalty values across the datasets., Comment: 8 pages, 2 tables
- Published
- 2021
47. CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks
- Author
-
Varshney, Sakshi, Verma, Vinay Kumar, K, Srijith P, Carin, Lawrence, and Rai, Piyush
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition ,Statistics - Machine Learning - Abstract
We present a continual learning approach for generative adversarial networks (GANs), by designing and leveraging parameter-efficient feature map transformations. Our approach is based on learning a set of global and task-specific parameters. The global parameters are fixed across tasks whereas the task-specific parameters act as local adapters for each task, and help in efficiently obtaining task-specific feature maps. Moreover, we propose an element-wise addition of residual bias in the transformed feature space, which further helps stabilize GAN training in such settings. Our approach also leverages task similarity information based on the Fisher information matrix. Leveraging this knowledge from previous tasks significantly improves the model performance. In addition, the similarity measure also helps reduce the parameter growth in continual adaptation and helps to learn a compact model. In contrast to the recent approaches for continually-learned GANs, the proposed approach provides a memory-efficient way to perform effective continual data generation. Through extensive experiments on challenging and diverse datasets, we show that the feature-map-transformation approach outperforms state-of-the-art methods for continually-learned GANs, with substantially fewer parameters. The proposed method generates high-quality samples that can also improve the generative-replay-based continual learning for discriminative tasks., Comment: Under Submission
- Published
- 2021
48. Galaxy Morphology Classification using Neural Ordinary Differential Equations
- Author
-
Gupta, Raghav, Srijith, P. K., and Desai, Shantanu
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
We introduce a continuous depth version of the Residual Network (ResNet) called Neural ordinary differential equations (NODE) for the purpose of galaxy morphology classification. We carry out a classification of galaxy images from the Galaxy Zoo 2 dataset, consisting of five distinct classes, and obtain an accuracy between 91-95%, depending on the image class. We train NODE with different numerical techniques such as adjoint and Adaptive Checkpoint Adjoint (ACA) and compare them against ResNet. While ResNet has certain drawbacks, such as time consuming architecture selection (e.g. the number of layers) and the requirement of a large dataset needed for training, NODE can overcome these limitations. Through our results, we show that the accuracy of NODE is comparable to ResNet, and the number of parameters used is about one-third as compared to ResNet, thus leading to a smaller memory footprint, which would benefit next generation surveys., Comment: 10 pages, 5 figures. Now also used NODE_ACA. Accepted for publication in Astronomy and Computing
- Published
- 2020
- Full Text
- View/download PDF
49. Delay Differential Neural Networks
- Author
-
Anumasa, Srinivas and Srijith, P. K.
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Neural and Evolutionary Computing - Abstract
Neural ordinary differential equations (NODEs) treat computation of intermediate feature vectors as trajectories of an ordinary differential equation parameterized by a neural network. In this paper, we propose a novel model, delay differential neural networks (DDNN), inspired by delay differential equations (DDEs). The proposed model considers the derivative of the hidden feature vector as a function of the current feature vector and past feature vectors (history). The function is modelled as a neural network and consequently, it leads to continuous depth alternatives to many recent ResNet variants. We propose two different DDNN architectures, depending on the way current and past feature vectors are considered. For training DDNNs, we provide a memory-efficient adjoint method for computing gradients and back-propagating through the network. DDNN improves the data efficiency of NODE by further reducing the number of parameters without affecting the generalization performance. Experiments conducted on synthetic and real-world image classification datasets such as Cifar10 and Cifar100 show the effectiveness of the proposed models.
- Published
- 2020
50. Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition.
- Author
-
Srijith Radhakrishnan, Chao-Han Huck Yang, Sumeer Ahmad Khan, Rohit Kumar, Narsis A. Kiani, David Gomez-Cabrero, and Jesper Tegnér
- Published
- 2023
- Full Text
- View/download PDF