Descriptor: "Generative models" / Search Limiters: Peer Reviewed - Searchworks@Jio Institute Digital Library Search Results

1. Synthesizing Spreading-out features for generative zero-shot image classification

Author: Liu, Jingren, Sun, Ke, Zhang, Zheng, Long, Yang, Yang, Wankou, Yan, Yunyang, and Zhang, Haofeng
Published: 2025
Full Text: View/download PDF

2. Synthetic generated data for intelligent corrosion classification in oil and gas pipelines

Author: Ramos, Leo Thomas, Casas, Edmundo, and Rivas-Echeverría, Francklin
Published: 2025
Full Text: View/download PDF

3. Fast gradient-free activation maximization for neurons in spiking neural networks

Author: Pospelov, Nikita, Chertkov, Andrei, Beketov, Maxim, Oseledets, Ivan, and Anokhin, Konstantin
Published: 2025
Full Text: View/download PDF

4. Content-aware preserving image generation

Author: Le, Giang H., Nguyen, Anh Q., Kang, Byeongkeun, and Lee, Yeejin
Published: 2025
Full Text: View/download PDF

5. A novel framework for diverse video generation from a single video using frame-conditioned denoising diffusion probabilistic model and ConvNeXt-V2

Author: Verma, Ayushi, Badal, Tapas, and Bansal, Abhay
Published: 2025
Full Text: View/download PDF

6. Generative AI in the context of assistive technologies: Trends, limitations and future directions

Author: Fu, Biying, Hadid, Abdenour, and Damer, Naser
Published: 2025
Full Text: View/download PDF

7. Learning temporal maps of dynamics for mobile robots

Author: Shi, Junyi and Kucner, Tomasz Piotr
Published: 2025
Full Text: View/download PDF

8. Unveiling the nexus between energy storage and electricity markets in academic publications. A data-driven analysis of emerging trends and market dynamics using NLP, sentiment analysis and probabilistic modeling

Author: Oprea, Simona-Vasilica and Bâra, Adela
Published: 2025
Full Text: View/download PDF

9. Synthetic ECG signals generation: A scoping review

Author: Zanchi, Beatrice, Monachino, Giuliana, Fiorillo, Luigi, Conte, Giulio, Auricchio, Angelo, Tzovara, Athina, and Faraci, Francesca D.
Published: 2025
Full Text: View/download PDF

10. Augmenting a spine CT scans dataset using VAEs, GANs, and transfer learning for improved detection of vertebral compression fractures

Author: El Kojok, Zeina, Al Khansa, Hadi, Trad, Fouad, and Chehab, Ali
Published: 2025
Full Text: View/download PDF

11. Fault diagnosis of photovoltaic arrays with different degradation levels based on cross-domain adaptive generative adversarial network

Author: Lin, Peijie, Guo, Feng, Lin, Yaohai, Cheng, Shuying, Lu, Xiaoyang, Chen, Zhicong, and Wu, Lijun
Published: 2025
Full Text: View/download PDF

12. Class-wise and instance-wise contrastive learning for zero-shot learning based on VAEGAN

Author: Zheng, Baolong, Li, Zhanshan, and Li, Jingyao
Published: 2025
Full Text: View/download PDF

13. Generative AI for synthetic data across multiple medical modalities: A systematic review of recent developments and challenges

Author: Ibrahim, Mahmoud, Khalil, Yasmina Al, Amirrajab, Sina, Sun, Chang, Breeuwer, Marcel, Pluim, Josien, Elen, Bart, Ertaylan, Gökhan, and Dumontier, Michel
Published: 2025
Full Text: View/download PDF

14. TG-ERC: Utilizing three generation models to handle emotion recognition in conversation tasks

Author: Gou, Zhinan, Long, Yuchen, Sun, Jieli, and Gao, Kai
Published: 2025
Full Text: View/download PDF

15. Single-image reflectance and transmittance estimation from any flatbed scanner

Author: Rodriguez-Pardo, Carlos, Pascual-Hernandez, David, Rodriguez-Vazquez, Javier, Lopez-Moreno, Jorge, and Garces, Elena
Published: 2025
Full Text: View/download PDF

16. A lightweight generative model for interpretable subject-level prediction

Author: Mauri, Chiara, Cerri, Stefano, Puonti, Oula, Mühlau, Mark, and Van Leemput, Koen
Published: 2025
Full Text: View/download PDF

17. Bridging gaps with computer vision: AI in (bio)medical imaging and astronomy

Author: Rezaei, S., Chegeni, A., Javadpour, A., VafaeiSadr, A., Cao, L., Röttgering, H., and Staring, M.
Published: 2025
Full Text: View/download PDF

18. Structure-based protein and small molecule generation using EGNN and diffusion models: A comprehensive review

Author: Soleymani, Farzan, Paquet, Eric, Viktor, Herna Lydia, and Michalowski, Wojtek
Published: 2024
Full Text: View/download PDF

19. AI vs. human-generated content and accounts on Instagram: User preferences, evaluations, and ethical considerations

Author: Park, Jeongeun, Oh, Changhoon, and Kim, Ha Young
Published: 2024
Full Text: View/download PDF

20. Can generative AI replace immunofluorescent staining processes? A comparison study of synthetically generated cellpainting images from brightfield

Author: Xing, Xiaodan, Murdoch, Siofra, Tang, Chunling, Papanastasiou, Giorgos, Cross-Zamirski, Jan, Guo, Yunzhe, Xiao, Xianglu, Schönlieb, Carola-Bibiane, Wang, Yinhai, and Yang, Guang
Published: 2024
Full Text: View/download PDF

21. The minimal computational substrate of fluid intelligence

Author: Nelson, Amy P.K., Mole, Joe, Pombo, Guilherme, Gray, Robert J., Ruffle, James K., Chan, Edgar, Rees, Geraint E., Cipolotti, Lisa, and Nachev, Parashkev
Published: 2024
Full Text: View/download PDF

22. RealSinger: Ultra-realistic singing voice generation via stochastic differential equations

Author: Shi, Ziqiang and Wu, Shoule
Published: 2024
Full Text: View/download PDF

23. Leveraging VQ-VAE tokenization for autoregressive modeling of medical time series

Author: Lee, Yoonhyung, Chae, Younhyung, and Jung, Kyomin
Published: 2024
Full Text: View/download PDF

24. Generative discovery of safer chemical alternatives using diffusion modeling: A case study in green solvent design for cyclohexane/benzene extractive distillation

Author: Tan, Zhichao, Lin, Kunsen, Zhao, Youcai, and Zhou, Tao
Published: 2025
Full Text: View/download PDF

25. Preserving logical and functional dependencies in synthetic tabular data

Author: Umesh, Chaithra, Schultz, Kristian, Mahendra, Manjunath, Bej, Saptarshi, and Wolkenhauer, Olaf
Published: 2025
Full Text: View/download PDF

26. Molecule generation for drug design: A graph learning perspective

Author: Yang, Nianzu, Wu, Huaijin, Zeng, Kaipeng, Li, Yang, Bao, Siyuan, and Yan, Junchi
Published: 2024
Full Text: View/download PDF

27. Diff-Props: is Semantics Preserved within a Diffusion Model?

Author: Bonechi, Simone, Andreini, Paolo, Corradini, Barbara Toniella, and Scarselli, Franco
Published: 2024
Full Text: View/download PDF

28. Towards a Definition of Generative Artificial Intelligence.

Author: Ronge, Raphael, Maier, Markus, and Rathgeber, Benjamin
Abstract: The concept of Generative Artificial Intelligence (GenAI) is ubiquitous in the public discourse, yet rarely defined precisely. We clarify main concepts that are usually discussed in connection to GenAI and argue that one ought to distinguish between the technical and the public discourse. In order to show its complex development and associated conceptual ambiguities, we offer a historical-systematic reconstruction of GenAI and explicitly discuss two exemplary cases: the generative status of the Large Language Model BERT and the differences between protein structure predictions from AlphaFold 2 and 3. Our analysis shows that there is no unique and unambiguous definition of GenAI based on a purely technical account of the term. Following this conclusion, we argue that the public discourse is not simply a less complex way of speaking, but instead transcends its technical basis. As a means to structure this newly emerging discussion landscape we introduce a non-exhaustive list of four central aspects of GenAI: (multi-)modality, interaction, flexibility, and productivity. These dimensions constitute a first step towards defining GenAI beyond its technical basis. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

29. Interactive sequential generative models for team sports.

Author: Fassmeyer, Dennis, Cordes, Moritz, and Brefeld, Ulf
Subjects: GRAPH neural networks, TEAM sports, ARTIFICIAL intelligence, AUTOENCODER, INFORMATION networks
Abstract: Understanding spatiotemporal coordination of players in team sports is key to movement models, pattern detection, and computational tactics. Existing generative models propose to capture all stochasticity by a single latent variable and may suffer from entangled representations, or aim to uncover interaction structures of players but then do not focus on their generative ability. As a remedy, we propose a hierarchical latent variable model for predicting trajectories of multiple players. In the generative model, both, discrete role assignments and a latent interaction graph are sampled to allow for different models in subsequent node updates and message passing operations between nodes, where standard Gaussian latent variables are employed per agent and timestep. We cast our approach as a variational autoencoder that provides a disentangled latent space to capture variability in team sport movements and propose a neural architecture for its optimization. We empirically evaluate our approach on tracking data from basketball and soccer and observe that our contribution outperforms the state-of-art in all experiments. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

30. Beyond traditional visual object tracking: a survey.

Author: Abdelaziz, Omar, Shehata, Mohamed, and Mohamed, Mohamed
Abstract: Single object tracking is a vital task of many applications in critical fields. However, it is still considered one of the most challenging vision tasks. In recent years, computer vision, especially object tracking, witnessed the introduction or adoption of many novel techniques, setting new fronts for performance. In this survey, we visit some of the cutting-edge techniques in vision, such as Sequence Models, Generative Models, Self-supervised Learning, Unsupervised Learning, Reinforcement Learning, Meta-Learning, Continual Learning, and Domain Adaptation, focusing on their application in single object tracking. We propose a novel categorization of single object tracking methods based on novel techniques and trends. Also, we conduct a comparative analysis of the performance reported by the methods presented on popular tracking benchmarks. Moreover, we analyze the pros and cons of the presented approaches and present a guide for non-traditional techniques in single object tracking. Finally, we suggest potential avenues for future research in single-object tracking. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

31. Pr-Ge-Ne: Efficient Encoding of Pervasive Video Sensing Streams by Pruned Generative Networks.

Author: Wu, Ji-Yan, Gamlath, Kasun, and Misra, Archan
Subjects: STREAMING video & television, VIDEO compression, ARTIFICIAL intelligence, RELATIVE motion, DEEP learning
Abstract: While video sensing, performed by resource-constrained pervasive devices, is a key enabler of many machine intelligence applications, the high energy and bandwidth overheads of streaming video transmission continue to present formidable deployment challenges. Motivated by the recent advancements in deep learning models, this article proposes the usage of a Generative Network-based technique for resource-efficient streaming video compression and transmission. However, we empirically show that while such generative network-based models offer superior compression gains compared to H.265, additional DNN optimization mechanisms are needed to substantially reduce their encoder complexity. Our proposed optimized system, dubbed Pr-Ge-Ne, adopts a carefully pruned encoder-decoder DNN, on the pervasive device, to efficiently encode a latent vector representation of intra-frame relative motion, and then uses a generator network at the decoder to reconstruct the frames by overlaying such motion information to "animate" an initial reference frame. By evaluating three representative streaming video datasets, we show that Pr-Ge-Ne achieves around 6–10-fold reduction in video transmission rates (with negligible impact on the accuracy of machine perception tasks) compared to H.265, while simultaneously reducing latency and energy overheads on a pervasive device by ~90% and 15–50%, respectively. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

32. Precipitation nowcasting with generative diffusion models.

Author: Asperti, Andrea, Merizzi, Fabio, Paparella, Alberto, Pedrazzi, Giorgio, Angelinelli, Matteo, and Colamonaco, Stefano
Abstract: In recent years traditional numerical methods for accurate weather prediction have been increasingly challenged by deep learning methods. Numerous historical datasets used for short and medium-range weather forecasts are typically organized into a regular spatial grid structure. This arrangement closely resembles images: each weather variable can be visualized as a map or, when considering the temporal axis, as a video. Several classes of generative models, comprising Generative Adversarial Networks, Variational Autoencoders, or the recent Denoising Diffusion Models have largely proved their applicability to the next-frame prediction problem, and is thus natural to test their performance on the weather prediction benchmarks. Diffusion models are particularly appealing in this context, due to the intrinsically probabilistic nature of weather forecasting: what we are really interested to model is the probability distribution of weather indicators, whose expected value is the most likely prediction. In our study, we focus on a specific subset of the ERA-5 dataset, which includes hourly data pertaining to Central Europe from the years 2016 to 2021. Within this context, we examine the efficacy of diffusion models in handling the task of precipitation nowcasting, with a lead time of 1 to 3 hours. Our work is conducted in comparison to the performance of well-established U-Net models, as documented in the existing literature. An additional comparative analysis has been done with the forecasting capabilities of the CERRA system, part of the Copernicus Climate Change Service. The novelty of our approach, Generative Ensemble Diffusion (GED), lies in its innovative use of a diffusion model to generate a diverse set of possible weather scenarios. These scenarios are then amalgamated into a single prediction in a post-processing phase. 
This approach mimics the usual weather forecasting technique consisting in running an ensemble of numerical simulations under slightly different initial conditions by exploiting instead the intrinsic stochasticity of the generative model. In comparison to recent deep learning models addressing the same problem, our approach results in approximately a 25% reduction in the mean squared error. Reverse diffusion is a core concept in our GED approach, is particularly relevant to weather forecasting. In the context of diffusion models, reverse diffusion refers to the process of iteratively refining a noisy initial prediction into a coherent and realistic forecast. By leveraging reverse diffusion, our model effectively simulates the complex temporal dynamics of weather systems, mirroring the inherent uncertainty and variability in weather patterns. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

33. Continuous Generative Neural Networks: A Wavelet-Based Architecture in Function Spaces.

Author: Alberti, Giovanni S., Santacesaria, Matteo, and Sciutto, Silvia
Subjects: INVERSE problems, FUNCTION spaces, SPACE (Architecture), NONLINEAR functions, COMPUTER simulation
Abstract: In this work, we present and study Continuous Generative Neural Networks (CGNNs), namely, generative models in the continuous setting: the output of a CGNN belongs to an infinite-dimensional function space. The architecture is inspired by DCGAN, with one fully connected layer, several convolutional layers and nonlinear activation functions. In the continuous L2 setting, the dimensions of the spaces of each layer are replaced by the scales of a multiresolution analysis of a compactly supported wavelet. We present conditions on the convolutional filters and on the nonlinearity that guarantee that a CGNN is injective. This theory finds applications to inverse problems, and allows for deriving Lipschitz stability estimates for (possibly nonlinear) infinite-dimensional inverse problems with unknowns belonging to the manifold generated by a CGNN. Several numerical simulations, including signal deblurring, illustrate and validate this approach. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

34. Generative Models for the Psychology of Art and Aesthetics.

Author: Hertzmann, Aaron
Subjects: GENERATIVE artificial intelligence, PSYCHOLOGY of art, ARTISTIC creation, AESTHETICS of art, COMPUTER graphics
Abstract: This paper describes how computational generative models can describe aspects of the artistic process, and how these generative models can provide tools for formulating and testing psychological theories of art. The term "generative models" here refers to algorithms that can generate artistic imagery, video, text, or other artistic media, including techniques developed in both computer graphics and AI research. Generative models can both describe artistic processes and offer useful experimental tools. This paper first outlines different ways to understand the types of research in generative models. It then surveys several recent examples of using generative models to develop theories and to perform experiments. The paper then discusses misleading uses of the concept of "AI-generated art" in psychological studies, and the need for study of our relationship with new artistic technologies. Finally, the paper offers a few remarks on pursuing interdisciplinary research across psychology and computer graphics. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

35. Data augmentation in predictive maintenance applicable to hydrogen combustion engines: a review.

Author: Schwarz, Alexander, Rahal, Jhonny Rodriguez, Sahelices, Benjamín, Barroso-García, Verónica, Weis, Ronny, and Duque Antón, Simon
Abstract: Machine-learning-based predictive maintenance models, i.e. models that predict breakdowns of machines based on condition information, have a high potential to minimize maintenance costs in industrial applications by determining the best possible time to perform maintenance. Modern machines have sensors that can collect all relevant data of the operating condition and for legacy machines which are still widely used in the industry, retrofit sensors are readily, easily and inexpensively available. With the help of this data it is possible to train such a predictive maintenance model. The main problem is that most data is obtained from normal operating conditions, whereas only limited data are from failures. This leads to highly unbalanced data sets, which makes it very difficult, if not impossible, to train a predictive maintenance model that can detect faults reliably and timely. Another issue is the lack of available real data due to privacy concerns. To address these problems, a suitable data generation strategy is needed. In this work, a literature review is conducted to identify a solution approach for a suitable data augmentation strategy that can be applied to our specific use case of hydrogen combustion engines in the automotive field. This literature review shows that, among the different state-of-the-art proposals, the most promising for the generation of reliable synthetic data are the ones based on generative models. The analysis of the different metrics used in the state of the art allows to identify the most suitable ones to evaluate the quality of generated signals. Finally, an open problem in research in this area is identified and it is the need to validate the plausibility of the data generated. The generation of results in this area will contribute decisively to the development of predictive maintenance models. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

36. How transfer learning is used in generative models for image classification: improved accuracy.

Author: Ebrahimzadeh, Danial, Sharif, Sarah, and Banad, Yaser
Abstract: Recent breakthroughs in generative neural networks have paved the way for transformative capabilities, particularly in their capacity to generate novel data, notably in the realm of images. The integration of these models with the increasingly popular technique of transfer learning, designed for proficient feature extraction, holds the promise of enhancing overall performance. This paper delves into the exploration of employing generative models in conjunction with transfer learning methods for feature extraction, with a specific focus on image classification tasks. Our investigation aims to scrutinize the effectiveness of leveraging generative models alongside pre-trained models as feature extractors in the context of image classification. To the best of our knowledge, our investigation is the first to link transfer learning and generative models for a discriminative task under one roof. The proposed approach undergoes rigorous evaluation on two distinct datasets, employing specific metrics to gauge the model’s performance. The results exhibit a notable nearly 10% enhancement achieved through the integration of generative models, underscoring their potential for achieving heightened accuracy in image classification. These findings highlight significant advancements in image classification accuracy, surpassing the performance of conventional Artificial Neural Network (ANN) models. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

37. GeoGail: A Model-Based Imitation Learning Framework for Human Trajectory Synthesizing.

Author: Wu, Yuchen, Wang, Huandong, Gao, Changzheng, Jin, Depeng, and Li, Yong
Subjects: GENERATIVE adversarial networks, LEARNING, IMITATIVE behavior, DECISION making
Abstract: Synthesized human trajectories are crucial for a large number of applications. Existing solutions are mainly based on the generative adversarial network (GAN), which is limited due to the lack of modeling the human decision-making process. In this article, we propose a novel imitation learning-based method to synthesize human trajectories. This model utilizes a novel semantics-based interaction mechanism between the decision-making strategy and visitations to diverse geographical locations to model them in the semantic domain in a uniform manner. To augment the modeling ability to the real-world human decision-making policy, we propose a feature extraction model to extract the internal latent factors of variation of different individuals and then propose a novel self-attention-based policy net to capture the long-term correlation of mobility and decision-making patterns. Then, to better reward users' mobility behavior, we propose a novel multi-scale reward net combined with mutual information to model the instant reward, long-term reward, and individual characteristics in a cohesive manner. Extensive experimental results on two real-world trajectory datasets show that our proposed model can synthesize the most high-quality trajectory data compared with six state-of-the-art baselines in terms of a number of key usability metrics and can well support practical applications based on trajectory data, demonstrating its effectiveness. Furthermore, our proposed method can learn explainable knowledge automatically from data, including explainable statistical features of trajectories and statistical relation between decision-making policy and features. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

38. Parents and Children: Distinguishing Multimodal Deepfakes from Natural Images.

Author: Amoroso, Roberto, Morelli, Davide, Cornia, Marcella, Baraldi, Lorenzo, Del Bimbo, Alberto, and Cucchiara, Rita
Subjects: TRANSFORMER models, STABLE Diffusion, IMAGE recognition (Computer vision), DEEPFAKES, NATURAL languages
Abstract: Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language. While these models have numerous benefits across various sectors, they have also raised concerns about the potential misuse of fake images and cast new pressures on fake image detection. In this work, we pioneer a systematic study on deepfake detection generated by state-of-the-art diffusion models. Firstly, we conduct a comprehensive analysis of the performance of contrastive and classification-based visual features, respectively, extracted from CLIP-based models and ResNet or Vision Transformer (ViT)-based architectures trained on image classification datasets. Our results demonstrate that fake images share common low-level cues, which render them easily recognizable. Further, we devise a multimodal setting wherein fake images are synthesized by different textual captions, which are used as seeds for a generator. Under this setting, we quantify the performance of fake detection strategies and introduce a contrastive-based disentangling method that lets us analyze the role of the semantics of textual descriptions and low-level perceptual cues. Finally, we release a new dataset, called COCOFake, containing about 1.2 million images generated from the original COCO image–caption pairs using two recent text-to-image diffusion models, namely Stable Diffusion v1.4 and v2.0. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

39. New Metrics and Dataset for Biological Development Video Generation.

Author: Celard, Pedro, Iglesias, Eva Lorenzo, Sorribes-Fernández, Jose Manuel, Borrajo, Lourdes, and Vieira, Adrián Seara
Subjects: GENERATIVE adversarial networks, CONVOLUTIONAL neural networks, HIGH resolution imaging, BIOLOGICAL evolution, NEURAL development
Abstract: Image generative models have advanced in many areas to produce synthetic images of high resolution and detail. This success has enabled its use in the biomedical field, paving the way for the generation of videos showing the biological evolution of its content. Despite the power of generative video models, their use has not yet extended to time-based development, focusing almost exclusively on generating motion in space. This situation is largely due to the lack of specific datasets and metrics to measure the individual quality of videos, particularly when there is no ground truth available for comparison. We propose a new dataset, called GoldenDOT, which tracks the evolution of apples cut in parallel over 10 days, allowing to observe their progress over time while remaining static. In addition, four new metrics are proposed that provide different analyses of the generated videos as a whole and individually. In this article, the proposed dataset and measures are used to study three state-of-the-art video generative models and their feasibility for video generation with biological development: Temporal GAN version 2 (TGANv2), Low-Dimensional Video Discriminator Generative Adversarial Network (LDVDGAN), and Video Diffusion Model (VDM). Among them, the TGANv2 model has managed to obtain the best results in most metrics, including those already known in the state of the art, demonstrating the viability of the new proposed metrics and their congruence with these standard measures. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

40. Enhancing Generative Class Incremental Learning Performance With a Model Forgetting Approach

Author: Taro Togo, Ren Togo, Keisuke Maeda, Takahiro Ogawa, and Miki Haseyama
Subjects: Class incremental learning, continual learning, generative models, machine unlearning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: This study presents a novel approach to Generative Class Incremental Learning (GCIL) by introducing the forgetting mechanism, aimed at dynamically managing class information for better adaptation to streaming data. GCIL is one of the hot topics in the field of computer vision, and it is considered one of the important tasks in society as one of the continual learning approaches for generative models. The ability to forget is a crucial brain function that facilitates continual learning by selectively discarding less relevant information for humans. However, in the field of machine learning models, the concept of intentionally forgetting has not been extensively investigated. In this study, we aim to bridge this gap by incorporating the forgetting mechanisms into GCIL, thereby examining their impact on the models' ability to learn in continual learning. Through our experiments, we have found that integrating the forgetting mechanisms significantly enhances the models' performance in acquiring new knowledge, underscoring the positive role that strategic forgetting plays in the process of continual learning.
Published: 2025
Full Text: View/download PDF

41. AI-Synthesized Image Detection: Source Camera Fingerprinting to Discern the Authenticity of Digital Images

Author: Manisha, Chang-Tsun Li, and Karunakar A. Kotegar
Subjects: AI-synthesized image detection, camera fingerprinting, digital image forensics, generative models, image authenticity, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Detecting AI-synthesized images remains a challenge due to their increasing realism. Traditional methods often fall short in addressing this evolving landscape where testing images can be produced by more powerful generative models or combined with various post-processing operations. To address this, we propose a robust method capitalizing on the distinctive global fingerprints present in real images from physical cameras and fixed patterns in synthesized images generated by generative models. Our approach builds on adapting camera fingerprinting initially designed for source camera identification, leveraging global fingerprints from the mid-frequency band. We rigorously evaluated our methodology through cross-model testing, demonstrating its exceptional generalization capability across different generative models and datasets. Our results, validated on the DF³ and CNN-aug datasets with a broader range of generative models, highlight the method’s effectiveness in detecting AI-synthesized images and distinguishing between generative models. Comparisons with state-of-the-art methods show that our approach achieves superior performance, particularly in scenarios involving post-processing operations. This camera fingerprinting approach stands out for its resilience and accuracy, offering a promising solution for image forensics in addressing the challenges posed by sophisticated AI-generated content.
Published: 2025
Full Text: View/download PDF

42. Deep Learning for Traffic Scene Understanding: A Review

Author: Parya Dolatyabi, Jacob Regan, and Mahdi Khodayar
Subjects: Deep learning, traffic scene understanding, discriminative models, generative models, domain adaptation, classification, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: This review paper presents an in-depth analysis of deep learning (DL) models applied to traffic scene understanding, a key aspect of modern intelligent transportation systems. It examines fundamental techniques such as classification, object detection, and segmentation, and extends to more advanced applications like action recognition, object tracking, path prediction, scene generation and retrieval, anomaly detection, Image-to-Image Translation (I2IT), and person re-identification (Person Re-ID). The paper synthesizes insights from a broad range of studies, tracing the evolution from traditional image processing methods to sophisticated DL techniques, such as Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs). The review also explores three primary categories of domain adaptation (DA) methods: clustering-based, discrepancy-based, and adversarial-based, highlighting their significance in traffic scene understanding. The significance of Hyperparameter Optimization (HPO) is also discussed, emphasizing its critical role in enhancing model performance and efficiency, particularly in adapting DL models for practical, real-world use. Special focus is given to the integration of these models in real-world applications, including autonomous driving, traffic management, and pedestrian safety. The review also addresses key challenges in traffic scene understanding, such as occlusions, the dynamic nature of urban traffic, and environmental complexities like varying weather and lighting conditions. By critically analyzing current technologies, the paper identifies limitations in existing research and proposes areas for future exploration. It underscores the need for improved interpretability, real-time processing, and the integration of multi-modal data. This review serves as a valuable resource for researchers and practitioners aiming to apply or advance DL techniques in traffic scene understanding.
Published: 2025
Full Text: View/download PDF

43. Guided Conditional Diffusion Classifier (ConDiff) for Enhanced Prediction of Infection in Diabetic Foot Ulcers

Author: Palawat Busaranuvong, Emmanuel Agu, Deepak Kumar, Shefalika Gautam, Reza Saadati Fard, Bengisu Tulu, and Diane Strong
Subjects: Diabetic foot ulcers, diffusion models, distance-based image classification, generative models, wound infection, Computer applications to medicine. Medical informatics, R858-859.7, Medical technology, R855-855.5
Abstract: Goal: To accurately detect infections in Diabetic Foot Ulcers (DFUs) using photographs taken at the Point of Care (POC). Achieving high performance is critical for preventing complications and amputations, as well as minimizing unnecessary emergency department visits and referrals. Methods: This paper proposes the Guided Conditional Diffusion Classifier (ConDiff). This novel deep-learning framework combines guided image synthesis with a denoising diffusion model and distance-based classification. The process involves (1) generating guided conditional synthetic images by injecting Gaussian noise to a guide (input) image, followed by denoising the noise-perturbed image through a reverse diffusion process, conditioned on infection status and (2) classifying infections based on the minimum Euclidean distance between synthesized images and the original guide image in embedding space. Results: ConDiff demonstrated superior performance with an average accuracy of 81% that outperformed state-of-the-art (SOTA) models by at least 3%. It also achieved the highest sensitivity of 85.4%, which is crucial in clinical domains while significantly improving specificity to 74.4%, surpassing the best SOTA model. Conclusions: ConDiff not only improves the diagnosis of DFU infections but also pioneers the use of generative discriminative models for detailed medical image analysis, offering a promising approach for improving patient outcomes.
Published: 2025
Full Text: View/download PDF

44. VAEneu: a new avenue for VAE application on probabilistic forecasting

Author: Koochali, Alireza, Tahaei, Ensiye, Dengel, Andreas, and Ahmed, Sheraz
Abstract: This paper introduces VAEneu, a novel autoregressive method for multistep ahead univariate probabilistic time series forecasting, designed to address the challenges of generating sharp and well-calibrated probabilistic forecasts without assuming a specific parametric form for the predictive distribution. VAEneu leverages the Conditional VAE framework and optimizes the likelihood of the predictive distribution using the Continuous Ranked Probability Score (CRPS), a strictly proper scoring rule, as the loss function. This approach enables the model to learn flexible, sharp, and well-calibrated predictive distributions without the need for a tractable likelihood function. In a comprehensive empirical study, VAEneu is rigorously benchmarked against 12 baseline models across 12 datasets, demonstrating superior performance in both forecasting accuracy and uncertainty quantification. VAEneu provides a valuable tool for quantifying future uncertainties, and our extensive empirical study lays the foundation for future comparative studies for univariate multistep ahead probabilistic forecasting. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

45. Transfer learning with pre-trained conditional generative models

Author: Yamaguchi, Shin’ya, Kanai, Sekitoshi, Kumagai, Atsutoshi, Chijiwa, Daiki, and Kashima, Hisashi
Abstract: Transfer learning is crucial in training deep neural networks on new target tasks. Current transfer learning methods always assume at least one of (i) Source and target task label spaces overlap, (ii) Source datasets are available, and (iii) Target network architectures are consistent with source ones. However, holding these assumptions is difficult in practical settings because the target task rarely has the same labels as the source task, the source dataset access is restricted due to storage costs and privacy, and the target architecture is often specialized to each task. To transfer source knowledge without these assumptions, we propose a transfer learning method that uses deep generative models and is composed of the following two stages: pseudo pre-training (PP) and pseudo semi-supervised learning (P-SSL). PP trains a target architecture with an artificial dataset synthesized by using conditional source generative models. P-SSL applies SSL algorithms to labeled target data and unlabeled pseudo samples, which are generated by cascading the source classifier and generative models to condition them with target samples. Our experimental results indicate that our method can outperform the baselines of scratch training and knowledge distillation. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

46. Comprehensive exploration of diffusion models in image generation: a survey

Author: Chen, Hang, Xiang, Qian, Hu, Jiaxin, Ye, Meilin, Yu, Chao, Cheng, Hao, and Zhang, Lei
Subjects: DATA privacy, ARTIFICIAL intelligence, COPYRIGHT, DATA security, SOCIAL impact
Abstract: The rapid development of deep learning technology has led to the emergence of diffusion models as a promising generative model with diverse applications. These include image generation, audio and video synthesis, molecular design, and text generation. The distinctive generation mechanism and exceptional generation quality of diffusion models have made them a valuable tool in these diverse fields. However, with the extensive deployment of diffusion models in the domain of image generation, concerns pertaining to data privacy, data security, and artistic ethics have emerged with increasing prominence. Given the accelerated pace of development in the field of diffusion models, the majority of extant surveys are deficient in two respects: firstly, they fail to encompass the latest advances in diffusion-based image synthesis; and secondly, they seldom consider the potential social implications of diffusion models. In order to address these issues, this paper presents a comprehensive survey of the most recent applications of diffusion models in the field of image generation. Furthermore, it provides an in-depth analysis of the potential social impacts that may result from their use. Firstly, this paper presents a systematic survey of the background principles and theoretical foundations of diffusion models. Subsequently, this paper provides a detailed examination of the most recent applications of diffusion models across a range of image generation subfields, including style transfer, image completion, image editing, super-resolution, and beyond. Finally, we present a comprehensive examination of these social issues, addressing data privacy concerns, such as the potential for data leakage and the implementation of protective measures during model training. We also analyse the risk of malicious exploitation of the model and the defensive strategies employed to mitigate such risks. Additionally, we examine the implications of the authenticity and originality of generated images on artistic creativity and copyright protection. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

47. Hackathon as a testing ground for creating digital tools in domestic Oriental studies

Author: Kudakaev R.F., Mokretskiy A.Ch., and Kostyrkin A.V.
Subjects: oriental studies, generative models, research & development, large language models, machine translation, South Asia. Southeast Asia. East Asia, KN, Bibliography. Library science. Information resources, History of Asia, DS1-937
Abstract: With the formation of a new technological paradigm and global competition for leadership in the digital space, the attention of experts is shifting towards the growth of political, economic and R&D influence of Eastern countries, which imposes new demands on research methods and tools. The article summarizes the experience of enhancing the academic research process through involving young IT professionals in the Oriental studies in competitive mode. During two weeks of the “Hackathon” contest organized by Yandex Cloud, Napoleon IT and AI Talent Hub in collaboration with the experts from the Institute on Oriental Studies and the Institute of China and Contemporary Asia of the Russian Academy of Sciences, contestants were challenged to develop a chat bot employing generative models and machine translation to analyze news flows of East Asian countries, primarily China and Japan. A review of the winners’ approaches and solutions proves overall feasibility of the idea and shows that many specific linguistic and engineering tasks that were relevant only recently have already been successfully solved. Therefore, when planning and developing next-generation digital tools, it is necessary to operate at functionally and semantically higher levels of generalization closer to human reasoning.
Published: 2024
Full Text: View/download PDF

48. Prompts for generative artificial intelligence in legal discourse

Author: Alexander E. Kirpichev
Subjects: ai, generative models, prompts, legal actions, copyright, legal practice, legal education, standardization of prompts, human-ai interaction, legal regulation of ai, Law
Abstract: The development of generative models of artificial intelligence (AI) poses new challenges for legal science and practice. This requires understanding of the legal nature of prompts (queries to AI) and development of appropriate legal regulation. The article aims to determine the legal significance of prompts and outlines the prospects for their research in the context of the interaction between law and AI. The study is based on the analysis of contemporary scientific literature devoted to the problems of legal regulation of AI, as well as investigation of the first cases of the use of generative AI models in legal practice and education. Methods of legal qualification, comparative legal analysis, and legal modeling are applied. Prompts are qualified as legal actions (legal facts in the strict sense), which opens the path to addressing the applicability of copyright criteria to them. The potential and risks of using prompts in legal practice and education are identified, and the need for standardizing prompts and developing specialized methods for teaching lawyers to interact with AI is substantiated. Prompts, as a tool for human-AI interaction, represent a fundamentally important subject of legal research, upon which the prospects for AI application in law largely rely. The article concludes that interdisciplinary and international studies are necessary to unite the efforts of legal professionals, AI specialists, and the generative models themselves in developing optimal legal solutions.
Published: 2024
Full Text: View/download PDF

49. Let’s Practice Better... on Cats: Description and Visualisation of Artistic Images in Generative AI Models

Author: Ruslan Khandogin and Nina S. Proner
Subjects: digital art, generative models, neural networks, artistic image, visualisation, socio-cultural context, prompt, dall·e, stable diffusion, kandinsky, Communication. Mass media, P87-96
Abstract: Artificial Intelligence (AI) plays an increasingly prominent role in various spheres of life in today’s world, including generation of a variety of visual content from selfie stream processing to creating works of digital art. The present paper raises the question of whether AI is capable of creating real art or it just imitates its external form. The paper examines the specificity of prompts: from concrete named ones to interpretive descriptive queries in linguistic, artistic and socio-cultural contexts. The article dwells upon some important aspects of evaluating the quality of keyword extraction algorithms and their relation to artistic practice. The authors rely on semiotic analysis to uncover encoded meanings and imports in the text. The article emphasises that the literary text is at the top of the hierarchy of cultural texts; it is characterised by intentionality and coherence and represents a complex semantic field where key words and images interact with the explicit and implicit contexts. The study examines and analyses the visualised images of Cheshire Cat, Cat Behemoth and Tomcat Murr created by the authors with the use of three generative neural networks: Stable Diffusion, Dall‑E and Kandinsky. Understanding and visualising the literary text by generative systems and models realising specific algorithms requires the ability to reveal its multilayered semantics and connection with the cultural context, which ultimately helps to understand the in-depth meanings of the work and its place in culture. Consideration of the operational quality of algorithms for keyword system extraction and image generation is deemed possible from the point of view of their structural organisation. Generative algorithms create an imitative reality, while the immanence of the artistic value determines the uniqueness and meanings of the created figurative world. The article can be useful to anyone interested in the substance and specificity of digital art, the relationship between technological innovations and socio-cultural context, the creation and visualisation of artistic images in generative AI models, their conceptualisation and interpretation.
Published: 2024
Full Text: View/download PDF

50. Zero-Shot Food Image Detection Based on Transformer

Author: Jingru SONG, Weiqing MIN, Pengfei ZHOU, Quanrui RAO, Guorui SHENG, Yancun YANG, Lili WANG, and Shuqiang JIANG
Subjects: food image detection, zero-shot learning, generative models, transformer, deep learning, Food processing and manufacture, TP368-456
Abstract: As a fundamental task in food computing, food detection played a crucial role in locating and identifying food items from input images, particularly in applications such as intelligent canteen settlement and dietary health management. However, food categories were constantly updating in practical scenarios, making it difficult for food detectors trained on fixed categories to accurately detect previously unseen food categories. To address this issue, this paper proposed a zero-shot food image detection method. Firstly, a Transformer-based food primitive generator was constructed, where each primitive contained fine-grained attributes relevant to food categories. These primitives could be selectively assembled based on the food characteristics to synthesize new food features. Secondly, an enhancement component of visual feature disentanglement was proposed in order to impose more constraints on the visual features of unseen food categories. The visual features of food images were decomposed into semantically related features and semantically unrelated features, thereby better transferring semantic knowledge of food categories to their visual features. The proposed method was extensively evaluated on the ZSFooD and UEC-FOOD256 datasets through numerous experiments and ablation studies. Under the zero-shot detection (ZSD) setting, optimal average precision on unseen classes reached 4.9% and 24.1%, respectively, demonstrating the effectiveness of the proposed approach. Under the generalized zero-shot detection (GZSD) setting, the harmonic mean of visible and unseen classes reaches 5.8% and 22.0%, respectively, further validating the effectiveness of the proposed method.
Published: 2024
Full Text: View/download PDF

1,089 results on '"Generative models"'
