2,167 results for "Generative models"
Search Results
252. Selectively Increasing the Diversity of GAN-Generated Samples
- Author
-
Dubiński, Jan, Deja, Kamil, Wenzel, Sandro, Rokita, Przemysław, Trzcinski, Tomasz, Tanveer, Mohammad, editor, Agarwal, Sonali, editor, Ozawa, Seiichi, editor, Ekbal, Asif, editor, and Jatowt, Adam, editor
- Published
- 2023
- Full Text
- View/download PDF
253. DEEPFAKE CLI: Accelerated Deepfake Detection Using FPGAs
- Author
-
Bhilare, Omkar, Singh, Rahul, Paranjape, Vedant, Chittupalli, Sravan, Suratkar, Shraddha, Kazi, Faruk, Takizawa, Hiroyuki, editor, Shen, Hong, editor, Hanawa, Toshihiro, editor, Hyuk Park, Jong, editor, Tian, Hui, editor, and Egawa, Ryusuke, editor
- Published
- 2023
- Full Text
- View/download PDF
254. Optimization of Annealed Importance Sampling Hyperparameters
- Author
-
Goshtasbpour, Shirin, Perez-Cruz, Fernando, Amini, Massih-Reza, editor, Canu, Stéphane, editor, Fischer, Asja, editor, Guns, Tias, editor, Kralj Novak, Petra, editor, and Tsoumakas, Grigorios, editor
- Published
- 2023
- Full Text
- View/download PDF
255. GP-Based Generative Adversarial Models
- Author
-
Machado, Penousal, Baeta, Francisco, Martins, Tiago, Correia, João, Trujillo, Leonardo, editor, Winkler, Stephan M., editor, and Silva, Sara, editor
- Published
- 2023
- Full Text
- View/download PDF
256. Brain Tumor Synthetic Data Generation with Adaptive StyleGANs
- Author
-
Tariq, Usama, Qureshi, Rizwan, Zafar, Anas, Aftab, Danyal, Wu, Jia, Alam, Tanvir, Shah, Zubair, Ali, Hazrat, Longo, Luca, editor, and O’Reilly, Ruairi, editor
- Published
- 2023
- Full Text
- View/download PDF
257. Spot the Fake Lungs: Generating Synthetic Medical Images Using Neural Diffusion Models
- Author
-
Ali, Hazrat, Murad, Shafaq, Shah, Zubair, Longo, Luca, editor, and O’Reilly, Ruairi, editor
- Published
- 2023
- Full Text
- View/download PDF
258. Bokeh-Loss GAN: Multi-stage Adversarial Training for Realistic Edge-Aware Bokeh
- Author
-
Lee, Brian, Lei, Fei, Chen, Huaijin, Baudron, Alexis, Karlinsky, Leonid, editor, Michaeli, Tomer, editor, and Nishino, Ko, editor
- Published
- 2023
- Full Text
- View/download PDF
259. Applying Disentanglement in the Medical Domain: An Introduction for the MAD Workshop
- Author
-
Fragemann, Jana, Liu, Xiao, Li, Jianning, Tsaftaris, Sotirios A., Egger, Jan, Kleesiek, Jens, Fragemann, Jana, editor, Li, Jianning, editor, Liu, Xiao, editor, Tsaftaris, Sotirios A., editor, Egger, Jan, editor, and Kleesiek, Jens, editor
- Published
- 2023
- Full Text
- View/download PDF
260. Retinotopic Image Encoding by Samples of Counts
- Author
-
Antsiperov, Viacheslav, Kershner, Vladislav, De Marsico, Maria, editor, Sanniti di Baja, Gabriella, editor, and Fred, Ana, editor
- Published
- 2023
- Full Text
- View/download PDF
261. Probabilistic Models
- Author
-
Joshi, Ameet V.
- Published
- 2023
- Full Text
- View/download PDF
262. Deep Generative Models Under GAN: Variants, Applications, and Privacy Issues
- Author
-
Raveendran, Remya, Raj, Ebin Deni, Bhateja, Vikrant, editor, Sunitha, K. V. N., editor, Chen, Yen-Wei, editor, and Zhang, Yu-Dong, editor
- Published
- 2023
- Full Text
- View/download PDF
263. A Variational Autoencoder—General Adversarial Networks (VAE-GAN) Based Model for Ligand Designing
- Author
-
Mukesh, K., Ippatapu Venkata, Srisurya, Chereddy, Spandana, Anbazhagan, E., Oviya, I. R., Gupta, Deepak, editor, Khanna, Ashish, editor, Bhattacharyya, Siddhartha, editor, Hassanien, Aboul Ella, editor, Anand, Sameer, editor, and Jaiswal, Ajay, editor
- Published
- 2023
- Full Text
- View/download PDF
264. Synthesizing Vehicle Speed-Related Features with Neural Networks
- Author
-
Michal Krepelka and Jiri Vrany
- Subjects
Hardware-in-the-loop, generative models, synthetic data, MLP, LSTM, Mechanical engineering and machinery, TJ1-1570, Machine design and drawing, TJ227-240, Motor vehicles. Aeronautics. Astronautics, TL1-4050 - Abstract
In today’s automotive industry, digital technology trends such as Big Data, Digital Twin, and Hardware-in-the-loop simulations using synthetic data offer opportunities that have the potential to transform the entire industry towards being more software-oriented and thus more effective and environmentally friendly. In this paper, we propose generative models to synthesize car features related to vehicle speed: brake pressure, percentage of the pressed throttle pedal, engaged gear, and engine RPM. Synthetic data are essential to digitize Hardware-in-the-loop integration testing of the vehicle’s dashboard, navigation, or infotainment and for Digital Twin simulations. We trained models based on a Multilayer Perceptron and a bidirectional Long Short-Term Memory neural network for each feature. These models were evaluated on a real-world dataset and demonstrated sufficient accuracy in predicting the desired features. Combining our current research with previous work on generating a speed profile for an arbitrary trip, where Open Street Map data and elevation data are available, allows us to digitally drive this trip. At the time of writing, we are unaware of any similar data-driven approach for generating desired speed-related features.
- Published
- 2023
- Full Text
- View/download PDF
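The abstract in record 264 trains one model per speed-related feature, based on a Multilayer Perceptron and a bidirectional Long Short-Term Memory network. Below is a minimal PyTorch sketch of a bidirectional LSTM regressor of that general shape, assuming a windowed speed sequence as input and a single target feature (e.g., engine RPM) as output; the layer sizes and window length are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of a bidirectional LSTM that maps a speed window to one
# synthesized feature (e.g., engine RPM); all sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SpeedFeatureLSTM(nn.Module):
    def __init__(self, input_size=1, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_size, 1)  # 2x because of the two directions

    def forward(self, x):             # x: (batch, time, 1) speed sequence
        out, _ = self.lstm(x)         # (batch, time, 2 * hidden)
        return self.head(out)         # per-time-step feature prediction

model = SpeedFeatureLSTM()
speed = torch.randn(8, 100, 1)        # 8 dummy trips, 100 time steps each
rpm_pred = model(speed)               # (8, 100, 1)
```

In the setup described by the abstract, one such model would be trained per feature (brake pressure, throttle percentage, gear, RPM) against the real-world dataset.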
265. Start small: Training controllable game level generators without training data by learning at multiple sizes
- Author
-
Yahia Zakaria, Magda Fayek, and Mayada Hadhoud
- Subjects
Procedural content generation, Level generation, Deep learning, Generative flow networks, Generative models, Engineering (General). Civil engineering (General), TA1-2040 - Abstract
A level generator is a tool that generates game levels from noise. Training a generator without a dataset suffers from feedback sparsity, since it is unlikely to generate a playable level via random exploration. A common solution is shaped rewards, which guide the generator to achieve subgoals towards level playability, but they take effort to design and require game-specific domain knowledge. This paper proposes a novel approach to train generators without datasets or shaped rewards by learning at multiple level sizes starting from small sizes and up to the desired sizes. The denser feedback at small sizes negates the need for shaped rewards. Additionally, the generators learn to build levels at various sizes, including sizes they were not trained for. We apply our approach to train recurrent autoregressive generative flow networks (GFlowNets) for controllable level generation. We also adapt diversity sampling to be compatible with GFlowNets. The results show that our generators create diverse playable levels at various sizes for Sokoban, Zelda, and Danger Dave. When compared with controllable reinforcement learning level generators for Sokoban, the results show that our generators achieve better controllability and competitive diversity, while being 9× faster at training and level generation.
- Published
- 2023
- Full Text
- View/download PDF
266. A survey of deep learning-based 3D shape generation
- Author
-
Qun-Ce Xu, Tai-Jiang Mu, and Yong-Liang Yang
- Subjects
3D representations, geometry learning, generative models, deep learning, Electronic computers. Computer science, QA75.5-76.95 - Abstract
Deep learning has been successfully used for tasks in the 2D image domain. Research on 3D computer vision and deep geometry learning has also attracted attention. Considerable achievements have been made regarding feature extraction and discrimination of 3D shapes. Following recent advances in deep generative models such as generative adversarial networks, effective generation of 3D shapes has become an active research topic. Unlike 2D images with a regular grid structure, 3D shapes have various representations, such as voxels, point clouds, meshes, and implicit functions. For deep learning of 3D shapes, shape representation has to be taken into account as there is no unified representation that can cover all tasks well. Factors such as the representativeness of geometry and topology often largely affect the quality of the generated 3D shapes. In this survey, we comprehensively review works on deep-learning-based 3D shape generation by classifying and discussing them in terms of the underlying shape representation and the architecture of the shape generator. The advantages and disadvantages of each class are further analyzed. We also consider the 3D shape datasets commonly used for shape generation. Finally, we present several potential research directions that hopefully can inspire future works on this topic.
- Published
- 2023
- Full Text
- View/download PDF
267. Emotional body language synthesis for humanoid robots
- Author
-
Marmpena, Asimina
- Subjects
629.8, human-robot interaction, social robotics, affective computing, generative models, variational autoencoder, deep learning, emotional body language generation - Abstract
In the next decade, societies will witness a rise in service robots deployed in social environments, such as schools, homes, or shops, where they will operate as assistants, public relation agents, or companions. People are expected to willingly engage and collaborate with these robots to accomplish positive outcomes. To facilitate collaboration, robots need to comply with the behavioural and social norms used by humans in their daily interactions. One such behavioural norm is the expression of emotion through body language. Previous work on emotional body language synthesis for humanoid robots has been mainly focused on hand-coded design methods, often employing features extracted from human body language. However, the hand-coded design is cumbersome and results in a limited number of expressions with low variability. This limitation can be at the expense of user engagement since the robotic behaviours will appear repetitive and predictable, especially in long-term interaction. Furthermore, design approaches strictly based on human emotional body language might not transfer effectively on robots because of their simpler morphology. Finally, most previous work is using six or fewer basic emotion categories in the design and the evaluation phase of emotional expressions. This approach might result in lossy compression of the granularity in emotion expression. The current thesis presents a methodology for developing a complete framework of emotional body language generation for a humanoid robot, intending to address these three limitations. Our starting point is a small set of animations designed by professional animators with the robot morphology in mind. We conducted an initial user study to acquire reliable dimensional labels of valence and arousal for each animation. In the next step, we used the motion sequences from these animations to train a Variational Autoencoder, a deep learning model, to generate numerous new animations in an unsupervised setting. Finally, we extended the model to condition the generative process with valence and arousal attributes, and we conducted a user study to evaluate the interpretability of the animations in terms of valence, arousal, and dominance. The results indicate moderate to strong interpretability.
- Published
- 2021
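The thesis in record 267 conditions a Variational Autoencoder on valence and arousal so that generated robot animations can be steered along those affective dimensions. Below is a compact PyTorch sketch of a conditional VAE of that general form, assuming flattened pose frames as input; the dimensions, network widths, and loss weighting are illustrative assumptions rather than the thesis architecture.

```python
# Minimal conditional VAE sketch: condition = (valence, arousal); sizes are assumptions.
import torch
import torch.nn as nn

class ConditionalVAE(nn.Module):
    def __init__(self, pose_dim=60, cond_dim=2, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(pose_dim + cond_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim + cond_dim, 128), nn.ReLU(),
                                     nn.Linear(128, pose_dim))

    def forward(self, x, cond):
        h = self.encoder(torch.cat([x, cond], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        recon = self.decoder(torch.cat([z, cond], dim=-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon, kl

model = ConditionalVAE()
poses = torch.randn(4, 60)                      # dummy flattened pose frames
va = torch.tensor([[0.8, 0.2]] * 4)             # e.g. high valence, low arousal
recon, kl = model(poses, va)
loss = nn.functional.mse_loss(recon, poses) + kl
```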
268. Comparison of Affine and Rational Quadratic Spline Coupling and Autoregressive Flows through Robust Statistical Tests
- Author
-
Andrea Coccaro, Marco Letizia, Humberto Reyes-González, and Riccardo Torre
- Subjects
machine learning, generative models, density estimation, normalizing flows, Mathematics, QA1-939 - Abstract
Normalizing flows have emerged as a powerful brand of generative models, as they not only allow for efficient sampling of complicated target distributions but also deliver density estimation by construction. We propose here an in-depth comparison of coupling and autoregressive flows, both based on symmetric (affine) and non-symmetric (rational quadratic spline) bijectors, considering four different architectures: real-valued non-Volume preserving (RealNVP), masked autoregressive flow (MAF), coupling rational quadratic spline (C-RQS), and autoregressive rational quadratic spline (A-RQS). We focus on a set of multimodal target distributions of increasing dimensionality ranging from 4 to 400. The performances were compared by means of different test statistics for two-sample tests, built from known distance measures: the sliced Wasserstein distance, the dimension-averaged one-dimensional Kolmogorov–Smirnov test, and the Frobenius norm of the difference between correlation matrices. Furthermore, we included estimations of the variance of both the metrics and the trained models. Our results indicate that the A-RQS algorithm stands out both in terms of accuracy and training speed. Nonetheless, all the algorithms are generally able, without too much fine-tuning, to learn complicated distributions with limited training data and in a reasonable time of the order of hours on a Tesla A40 GPU. The only exception is the C-RQS, which takes significantly longer to train, does not always provide good accuracy, and becomes unstable for large dimensionalities. All algorithms were implemented using TensorFlow2 and TensorFlow Probability and have been made available on GitHub.
- Published
- 2024
- Full Text
- View/download PDF
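Record 268 builds its two-sample test statistics from known distance measures, including the sliced Wasserstein distance and the Frobenius norm of the difference between correlation matrices. Below is a small NumPy sketch of those two quantities, assuming equal-sized sample arrays of shape (n, d); the number of random projections and the toy data are arbitrary illustrative choices, not the paper's settings.

```python
import numpy as np

def sliced_wasserstein(x, y, n_proj=128, seed=0):
    """Approximate sliced Wasserstein-1 distance between equal-sized samples of shape (n, d)."""
    assert x.shape == y.shape, "this simple estimator assumes equal sample sizes"
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)                   # random unit direction
        px, py = np.sort(x @ theta), np.sort(y @ theta)  # 1-D projections
        total += np.mean(np.abs(px - py))                # 1-D W1 for equal-sized samples
    return total / n_proj

def correlation_frobenius(x, y):
    """Frobenius norm of the difference between the two sample correlation matrices."""
    return np.linalg.norm(np.corrcoef(x, rowvar=False) - np.corrcoef(y, rowvar=False))

x = np.random.default_rng(1).normal(size=(1000, 4))       # dummy 4-D samples
y = np.random.default_rng(2).normal(size=(1000, 4)) + 0.1
print(sliced_wasserstein(x, y), correlation_frobenius(x, y))
```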
269. Open-Vocabulary Predictive World Models from Sensor Observations
- Author
-
Robin Karlsson, Ruslan Asfandiyarov, Alexander Carballo, Keisuke Fujii, Kento Ohtani, and Kazuya Takeda
- Subjects
world models, open-vocabulary semantics, generative models, BEV generation, continual learning, self-supervised learning, Chemical technology, TP1-1185 - Abstract
Cognitive scientists believe that adaptable intelligent agents like humans perform spatial reasoning tasks by learned causal mental simulation. The problem of learning these simulations is called predictive world modeling. We present the first framework for learning an open-vocabulary predictive world model (OV-PWM) from sensor observations. The model is implemented through a hierarchical variational autoencoder (HVAE) capable of predicting diverse and accurate fully observed environments from accumulated partial observations. We show that the OV-PWM can model high-dimensional embedding maps of latent compositional embeddings representing sets of overlapping semantics inferable by sufficient similarity inference. The OV-PWM simplifies the prior two-stage closed-set PWM approach to a single-stage end-to-end learning method. CARLA simulator experiments show that the OV-PWM can learn compact latent representations and generate diverse and accurate worlds with fine details like road markings, achieving 69 mIoU over six query semantics on an urban evaluation sequence. We propose the OV-PWM as a versatile continual learning paradigm for providing spatio-semantic memory and learned internal simulation capabilities to future general-purpose mobile robots.
- Published
- 2024
- Full Text
- View/download PDF
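The 69 mIoU reported in record 269 is a mean intersection-over-union over six query semantics. Below is a short NumPy sketch of how mIoU is typically computed from integer-labelled prediction and ground-truth maps; this is the standard definition, not the authors' evaluation code, and the dummy maps are purely illustrative.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for integer-labelled maps of the same shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                       # ignore classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.random.randint(0, 6, size=(128, 128))     # dummy 6-class prediction map
target = np.random.randint(0, 6, size=(128, 128))   # dummy ground truth
print(mean_iou(pred, target, num_classes=6))
```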
270. Generating Urban Road Networks with Conditional Diffusion Models
- Author
-
Xiaoyan Gu, Mengmeng Zhang, Jinxin Lyu, and Quansheng Ge
- Subjects
road network generation, generative models, diffusion models, geospatial context, Geography (General), G1-922 - Abstract
The auto-generation of urban roads can greatly improve efficiency and productivity in urban planning and designing. However, it has also raised concerns amongst researchers over the past decade. In this paper, we present an image-based urban road network generation framework using conditional diffusion models. We first trained a diffusion model capable of generating road images with similar characteristics to the ground truth using four context factors. Then, we used the trained model as the generator to synthesize road images conditioned in a geospatial context. Finally, we converted the generated road images into road networks with several post-processes. The experiments conducted in five cities of the United States showed that our model can generate reasonable road networks, maintaining the layouts and styles of real examples. Moreover, our model has the ability to show the obstructive effect of geographic barriers on urban roads. By comparing models with different context factors as input, we find that the model that considers all four factors generally performs the best. The most important factor in guiding the shape of road networks is intersections, implying that the development of urban roads is not only restricted by the natural environment but is more strongly influenced by human design.
- Published
- 2024
- Full Text
- View/download PDF
271. EDiffuRec: An Enhanced Diffusion Model for Sequential Recommendation
- Author
-
Hanbyul Lee and Junghyun Kim
- Subjects
sequential recommendation, diffusion models, generative models, noise distribution, Mathematics, QA1-939 - Abstract
Sequential recommender models should capture evolving user preferences over time, but there is a risk of obtaining biased results such as false positives and false negatives due to noisy interactions. Generative models effectively learn the underlying distribution and uncertainty of the given data to generate new data, and they exhibit robustness against noise. In particular, utilizing the Diffusion model, which generates data through a multi-step process of adding and removing noise, enables stable and effective recommendations. The Diffusion model typically leverages a Gaussian distribution with a mean fixed at zero, but there is potential for performance improvement in generative models by employing distributions with higher degrees of freedom. Therefore, we propose a Diffusion model-based sequential recommender model that uses a new noise distribution. The proposed model improves performance through a Weibull distribution with two parameters determining shape and scale, a modified Transformer architecture based on Macaron Net, normalized loss, and a learning rate warmup strategy. Experimental results on four types of real-world e-commerce data show that the proposed model achieved performance gains ranging from a minimum of 2.53% to a maximum of 13.52% across HR@K and NDCG@K metrics compared to the existing Diffusion model-based sequential recommender model.
- Published
- 2024
- Full Text
- View/download PDF
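The gains in record 271 are reported on HR@K and NDCG@K. Below is a minimal NumPy sketch of these two ranking metrics under the common leave-one-out evaluation, where each user has a single held-out target item to which the model assigns a rank; this setup is an assumption made for illustration, not a detail taken from the paper.

```python
import numpy as np

def hr_at_k(ranks, k):
    """Hit ratio@K: fraction of users whose held-out item is ranked in the top K (1-based ranks)."""
    ranks = np.asarray(ranks)
    return float(np.mean(ranks <= k))

def ndcg_at_k(ranks, k):
    """NDCG@K with one relevant item per user: 1/log2(rank + 1) if it is in the top K, else 0."""
    ranks = np.asarray(ranks, dtype=float)
    gains = np.where(ranks <= k, 1.0 / np.log2(ranks + 1.0), 0.0)
    return float(np.mean(gains))

ranks = [1, 3, 12, 2, 7]            # hypothetical ranks of each user's held-out item
print(hr_at_k(ranks, 10), ndcg_at_k(ranks, 10))
```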
272. GAN-Based Anomaly Detection Tailored for Classifiers
- Author
-
Ľubomír Králik, Martin Kontšek, Ondrej Škvarek, and Martin Klimo
- Subjects
anomaly detection, generative models, deep neural networks, machine learning, Mathematics, QA1-939 - Abstract
Pattern recognition systems always misclassify anomalies, which can be dangerous for uninformed users. Therefore, anomalies must be filtered out from each classification. The main challenge for the anomaly filter design is the huge number of possible anomaly samples compared with the number of samples in the training set. Tailoring the filter for the given classifier is just the first step in this reduction. This paper tests the hypothesis that a filter trained to avoid “near” anomalies will also refuse the “far” anomalies, so the anomaly detector is then just a classifier distinguishing between “far real” and “near anomaly” samples. As the “far real” samples generator, a Generative Adversarial Network (GAN) fake generator was used, which transforms normally distributed random seeds into fakes similar to the training samples. The paper proves the assumption that seeds unused in fake training will generate anomalies. These seeds are distinguished according to their Chebyshev norms: while the fakes have seeds within a hypersphere of a given radius, the near anomalies have seeds near the cover of that hypersphere. Experiments with various anomaly test sets have shown that GAN-based anomaly detectors create a reliable anti-anomaly shield under the above assumptions. The proposed anomaly detector is tailored to the given classifier, but it is limited by the need for the database on which the classifier was trained to be available.
- Published
- 2024
- Full Text
- View/download PDF
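Record 272 separates GAN seeds by their Chebyshev norms: seeds inside a hypersphere of a given radius yield "far real" fakes, while seeds near its cover yield "near anomaly" samples. Below is a small NumPy sketch of that split; the latent dimensionality, radius, and shell width are arbitrary illustrative values, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
R, margin = 3.0, 0.5                        # illustrative radius and shell width
z = rng.normal(size=(10000, 128))           # normally distributed GAN seeds
cheb = np.max(np.abs(z), axis=1)            # Chebyshev (max-coordinate) norm per seed

fake_seeds = z[cheb <= R]                               # inside the ball: "far real" fakes
anomaly_seeds = z[(cheb > R) & (cheb <= R + margin)]    # near the cover: "near anomaly" fakes
print(len(fake_seeds), len(anomaly_seeds))
```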
273. Exploring chemical space — Generative models and their evaluation
- Author
-
Martin Vogt
- Subjects
Artificial intelligence, Chemical space, Chemical space exploration, Deep neural networks, Generative models, Inverse QSAR/QSPR, Science (General), Q1-390 - Abstract
Recent advances in the field of artificial intelligence, specifically regarding deep learning methods, have invigorated research into novel ways for the exploration of chemical space. Compared to more traditional methods that rely on chemical fragments and combinatorial recombination, deep generative models generate molecules in a non-transparent way that defies easy rationalization. However, this opaque nature also promises to explore uncharted chemical space in novel ways that do not rely on structural similarity directly. These aspects and the complexity of training such models make model assessment regarding novelty, uniqueness, and distribution of generated molecules a central aspect. This perspective gives an overview of current methodologies for chemical space exploration with an emphasis on deep neural network approaches. Key aspects of generative models include the choice of molecular representation, the targeted chemical space, and the methodology for assessing and validating chemical space coverage.
- Published
- 2023
- Full Text
- View/download PDF
274. Novel and flexible parameter estimation methods for data-consistent inversion in mechanistic modelling
- Author
-
Timothy Rumbell, Jaimit Parikh, James Kozloski, and Viatcheslav Gurev
- Subjects
stochastic inverse problem ,generative models ,computational modelling ,mechanistic modelling ,parameter inference ,data-consistent inversion ,Science - Abstract
Predictions for physical systems often rely upon knowledge acquired from ensembles of entities, e.g. ensembles of cells in biological sciences. For qualitative and quantitative analysis, these ensembles are simulated with parametric families of mechanistic models (MMs). Two classes of methodologies, based on Bayesian inference and population of models, currently prevail in parameter estimation for physical systems. However, in Bayesian analysis, uninformative priors for MM parameters introduce undesirable bias. Here, we propose how to infer parameters within the framework of stochastic inverse problems (SIPs), also termed data-consistent inversion, wherein the prior targets only uncertainties that arise due to MM non-invertibility. To demonstrate, we introduce new methods to solve SIPs based on rejection sampling, Markov chain Monte Carlo, and generative adversarial networks (GANs). In addition, to overcome limitations of SIPs, we reformulate SIPs based on constrained optimization and present a novel GAN to solve the constrained optimization problem.
- Published
- 2023
- Full Text
- View/download PDF
275. An Introduction to Predictive Processing Models of Perception and Decision‐Making.
- Author
-
Sprevak, Mark and Smith, Ryan
- Abstract
The predictive processing framework includes a broad set of ideas, which might be articulated and developed in a variety of ways, concerning how the brain may leverage predictive models when implementing perception, cognition, decision‐making, and motor control. This article provides an up‐to‐date introduction to the two most influential theories within this framework: predictive coding and active inference. The first half of the paper (Sections 2–5) reviews the evolution of predictive coding, from early ideas about efficient coding in the visual system to a more general model encompassing perception, cognition, and motor control. The theory is characterized in terms of the claims it makes at Marr's computational, algorithmic, and implementation levels of description, and the conceptual and mathematical connections between predictive coding, Bayesian inference, and variational free energy (a quantity jointly evaluating model accuracy and complexity) are explored. The second half of the paper (Sections 6–8) turns to recent theories of active inference. Like predictive coding, active inference models assume that perceptual and learning processes minimize variational free energy as a means of approximating Bayesian inference in a biologically plausible manner. However, these models focus primarily on planning and decision‐making processes that predictive coding models were not developed to address. Under active inference, an agent evaluates potential plans (action sequences) based on their expected free energy (a quantity that combines anticipated reward and information gain). The agent is assumed to represent the world as a partially observable Markov decision process with discrete time and discrete states. Current research applications of active inference models are described, including a range of simulation work, as well as studies fitting models to empirical data. The paper concludes by considering future research directions that will be important for further development of both models. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
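Record 275 describes variational free energy as jointly evaluating model accuracy and complexity, and expected free energy as combining anticipated reward with information gain. One common way these quantities are written in the active inference literature is shown below (a standard textbook formulation given for orientation, not necessarily the exact notation of the article):

```latex
% Variational free energy, written as complexity minus accuracy:
F \;=\; \underbrace{D_{\mathrm{KL}}\!\left[\,q(s)\;\|\;p(s)\,\right]}_{\text{complexity}}
      \;-\; \underbrace{\mathbb{E}_{q(s)}\!\left[\log p(o \mid s)\right]}_{\text{accuracy}}

% Expected free energy of a policy \pi, combining (negative) information gain
% and (negative) expected log-preference over outcomes ("anticipated reward"):
G(\pi) \;=\; -\,\mathbb{E}_{q(o,s \mid \pi)}\!\left[\log q(s \mid o,\pi) - \log q(s \mid \pi)\right]
        \;-\; \mathbb{E}_{q(o \mid \pi)}\!\left[\log p(o \mid C)\right]
```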
276. Quantifying quality of class-conditional generative models in time series domain.
- Author
-
Koochali, Alireza, Walch, Maria, Thota, Sankrutyayan, Schichtel, Peter, Dengel, Andreas, and Ahmed, Sheraz
- Subjects
PROBABILISTIC generative models, TIME series analysis - Abstract
Despite recent breakthroughs in the domain of implicit generative models, evaluating these models remains a challenging task. With no single metric to assess overall performance, various existing metrics only offer partial information. This issue is further compounded for unintuitive data types such as time series, where manual inspection is infeasible. This deficiency hinders the confident application of modern implicit generative models on time series data. To alleviate this problem, we propose two new metrics, the InceptionTime Score (ITS) and the Fréchet InceptionTime Distance (FITD), to assess the quality of class-conditional generative models on time series data. We conduct extensive experiments on 80 different datasets to study the discriminative capabilities of the proposed metrics alongside two existing evaluation metrics: Train on Synthetic Test on Real (TSTR) and Train on Real Test on Synthetic (TRTS). Our evaluations reveal that the proposed evaluation metrics, i.e., ITS and FITD in combination with TSTR, can accurately assess class-conditional generative model performance and detect common issues in implicit generative models. Our findings suggest that the proposed evaluation framework can be a valuable tool for confidently applying modern implicit generative models in time series analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
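The FITD metric in record 276 follows the Fréchet-distance recipe familiar from FID: fit Gaussians to embeddings of real and generated samples (here produced by an InceptionTime network rather than an image Inception network) and compare the two Gaussians. Below is a NumPy/SciPy sketch of the Fréchet distance between two embedding sets; the embedding extractor itself is assumed to exist elsewhere, and the toy data are placeholders.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(real_emb, fake_emb):
    """Fréchet distance between Gaussians fitted to two embedding sets of shape (n, d)."""
    mu_r, mu_f = real_emb.mean(axis=0), fake_emb.mean(axis=0)
    cov_r = np.cov(real_emb, rowvar=False)
    cov_f = np.cov(fake_emb, rowvar=False)
    covmean = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):              # drop tiny imaginary parts from numerics
        covmean = covmean.real
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))

real_emb = np.random.default_rng(0).normal(size=(500, 32))          # placeholder embeddings
fake_emb = np.random.default_rng(1).normal(size=(500, 32)) + 0.05
print(frechet_distance(real_emb, fake_emb))
```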
277. Digital staining in optical microscopy using deep learning - a review.
- Author
-
Kreiss, Lucas, Jiang, Shaowei, Li, Xiang, Xu, Shiqi, Zhou, Kevin C., Lee, Kyung Chul, Mühlberg, Alexander, Kim, Kanghyun, Chaware, Amey, Ando, Michael, Barisoni, Laura, Lee, Seung Ah, Zheng, Guoan, Lafata, Kyle J., Friedrich, Oliver, and Horstmeyer, Roarke
- Subjects
MICROSCOPY, DEEP learning, CONTRAST media, THREE-dimensional imaging, SAMPLING (Process), NANOBIOTECHNOLOGY - Abstract
Until recently, conventional biochemical staining had the undisputed status as well-established benchmark for most biomedical problems related to clinical diagnostics, fundamental research and biotechnology. Despite this role as gold-standard, staining protocols face several challenges, such as a need for extensive, manual processing of samples, substantial time delays, altered tissue homeostasis, limited choice of contrast agents, 2D imaging instead of 3D tomography and many more. Label-free optical technologies, on the other hand, do not rely on exogenous and artificial markers, by exploiting intrinsic optical contrast mechanisms, where the specificity is typically less obvious to the human observer. Over the past few years, digital staining has emerged as a promising concept to use modern deep learning for the translation from optical contrast to established biochemical contrast of actual stainings. In this review article, we provide an in-depth analysis of the current state-of-the-art in this field, suggest methods of good practice, identify pitfalls and challenges and postulate promising advances towards potential future implementations and applications. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
278. ECG Synthesis via Diffusion-Based State Space Augmented Transformer.
- Author
-
Zama, Md Haider and Schwenker, Friedhelm
- Subjects
*ELECTROCARDIOGRAPHY, *TIME series analysis, *DATA distribution, *HUMAN activity recognition, *CARDIOVASCULAR diseases, *WORLD health, *ELECTRIC transformers - Abstract
Cardiovascular diseases (CVDs) are a major global health concern, causing significant morbidity and mortality. AI's integration with healthcare offers promising solutions, with data-driven techniques, including ECG analysis, emerging as powerful tools. However, privacy concerns pose a major barrier to distributing healthcare data for addressing data-driven CVD classification. To address confidentiality issues related to sensitive health data distribution, we propose leveraging artificially synthesized data generation. Our contribution introduces a novel diffusion-based model coupled with a State Space Augmented Transformer. This synthesizes conditional 12-lead electrocardiograms based on the 12 multilabeled heart rhythm classes of the PTB-XL dataset, with each lead depicting the heart's electrical activity from different viewpoints. Recent advances establish diffusion models as groundbreaking generative tools, while the State Space Augmented Transformer captures long-term dependencies in time series data. The quality of generated samples was assessed using metrics like Dynamic Time Warping (DTW) and Maximum Mean Discrepancy (MMD). To evaluate authenticity, we assessed the similarity of performance of a pre-trained classifier on both generated and real ECG samples. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
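Sample quality in record 278 is scored with Dynamic Time Warping (DTW) and Maximum Mean Discrepancy (MMD). Below is a compact NumPy sketch of the textbook DTW dynamic program between two 1-D sequences, such as a real and a synthetic ECG lead; this is the classic algorithm, not the authors' implementation, and the signals are dummies.

```python
import numpy as np

def dtw(a, b):
    """Dynamic Time Warping distance between 1-D sequences a and b (absolute-difference cost)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

real_lead = np.sin(np.linspace(0, 6 * np.pi, 200))            # dummy ECG-like signals
synth_lead = np.sin(np.linspace(0, 6 * np.pi, 180) + 0.1)
print(dtw(real_lead, synth_lead))
```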
279. O IMAGINE PIXELATĂ ASUPRA ÎNCĂLCĂRII DREPTURILOR DE AUTOR ÎN UTILIZAREA INTELIGENȚEI ARTIFICIALE. [A Pixelated Picture of Copyright Infringement in the Use of Artificial Intelligence]
- Author
-
STANCIU, Marius
- Subjects
ARTIFICIAL intelligence, MACHINE learning - Abstract
Copyright of Romanian Journal of Intellectual Property Law / Revista Română de Dreptul Proprietăţii Intelectuale is the property of Universul Juridic Publishing House and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
280. Metrics and methods for robustness evaluation of neural networks with generative models.
- Author
-
Buzhinsky, Igor, Nerinovsky, Arseny, and Tripakis, Stavros
- Subjects
IMAGE recognition (Computer vision), COMPUTER vision, EVALUATION methodology, DISTRIBUTION (Probability theory) - Abstract
Recent studies have shown that modern deep neural network classifiers are easy to fool, assuming that an adversary is able to slightly modify their inputs. Many papers have proposed adversarial attacks, defenses and methods to measure robustness to such adversarial perturbations. However, most commonly considered adversarial examples are based on perturbations in the input space of the neural network that are unlikely to arise naturally. Recently, especially in computer vision, researchers discovered "natural" perturbations, such as rotations, changes of brightness, or more high-level changes, but these perturbations have not yet been systematically used to measure the performance of classifiers. In this paper, we propose several metrics to measure the robustness of classifiers to natural adversarial examples, and methods to evaluate them. These metrics, called latent space performance metrics, are based on the ability of generative models to capture probability distributions. On four image classification case studies, we evaluate the proposed metrics for several classifiers, including ones trained in conventional and robust ways. We find that the latent counterparts of adversarial robustness are associated with the accuracy of the classifier rather than its conventional adversarial robustness, but the latter is still reflected in the properties of the found latent perturbations. In addition, our novel method of finding latent adversarial perturbations demonstrates that these perturbations are often perceptually small. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
281. Medical variational autoencoder and generative adversarial network for medical imaging.
- Author
-
Rguibi, Zakaria, Hajami, Abdelmajid, Zitouni, Dya, Maleh, Yassine, and Elqaraoui, Amine
- Subjects
GENERATIVE adversarial networks, DIAGNOSTIC imaging, IMAGE analysis, TASK analysis, IMAGE fusion - Abstract
Generative adversarial networks have achieved promising results in the medical imaging field. One of the most significant challenges in this regard is the lack of, or limited, data sharing. In our work, an approach for combining generative adversarial network (GAN) and variational autoencoder (VAE) models has been proposed to improve the accuracy and efficiency of medical image analysis tasks. Our approach leverages the capacity of VAEs to acquire condensed feature representations, and the ability of GANs to generate high-quality synthetic images, to learn an embedding that keeps high-level abstract visual qualities. The Inception score (IS) and Fréchet inception distance (FID) have been computed in order to demonstrate the high quality of the generated images. Based on the score results, our approach demonstrates the potential of VAE-GAN fusion models and clearly outperforms existing methods on a variety of medical image analysis tasks. The suggested algorithm is explained, as are the results and evaluations. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
282. Generative models of morphogenesis in developmental biology.
- Author
-
Stillman, Namid R. and Mayor, Roberto
- Subjects
*DEVELOPMENTAL biology, *CELL differentiation, *MORPHOGENESIS, *MACHINE learning, *CELL motility, *CELL migration, *ENVIRONMENTALISM - Abstract
Understanding the mechanism by which cells coordinate their differentiation and migration is critical to our understanding of many fundamental processes such as wound healing, disease progression, and developmental biology. Mathematical models have been an essential tool for testing and developing our understanding, such as models of cells as soft spherical particles, reaction-diffusion systems that couple cell movement to environmental factors, and multi-scale multi-physics simulations that combine bottom-up rule-based models with continuum laws. However, mathematical models can often be loosely related to data or have so many parameters that model behaviour is weakly constrained. Recent methods in machine learning introduce new means by which models can be derived and deployed. In this review, we discuss examples of mathematical models of aspects of developmental biology, such as cell migration, and how these models can be combined with these recent machine learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
283. In Search of Disentanglement in Tandem Mass Spectrometry Datasets.
- Author
-
Abram, Krzysztof Jan and McCloskey, Douglas
- Subjects
*MASS spectrometry, *COLLISION induced dissociation - Abstract
Generative modeling and representation learning of tandem mass spectrometry data aim to learn an interpretable and instrument-agnostic digital representation of metabolites directly from MS/MS spectra. Interpretable and instrument-agnostic digital representations would facilitate comparisons of MS/MS spectra between instrument vendors and enable better and more accurate queries of large MS/MS spectra databases for metabolite identification. In this study, we apply generative modeling and representation learning using variational autoencoders to understand the extent to which tandem mass spectra can be disentangled into their factors of generation (e.g., collision energy, ionization mode, instrument type, etc.) with minimal prior knowledge of the factors. We find that variational autoencoders can disentangle tandem mass spectra data with the proper choice of hyperparameters into meaningful latent representations aligned with known factors of variation. We develop a two-step approach to facilitate the selection of models that are disentangled, which could be applied to other complex and high-dimensional data sets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
284. ContourGAN: Auto‐contouring of organs at risk in abdomen computed tomography images using generative adversarial network.
- Author
-
Francis, Seenia, Jayaraj, P. B., Pournami, P. N., and Puzhakkal, Niyas
- Subjects
*GENERATIVE adversarial networks, *ABDOMEN, *ORGANS (Anatomy), *COMPUTED tomography, *PANCREAS, *DEEP learning - Abstract
Accurately identifying and contouring the organs at risk (OARs) is a crucial step in radiation treatment planning for precise dose calculation. This task becomes especially challenging in computed tomography (CT) images due to the irregular boundaries of the organs under study. The method currently employed in clinical practice is the manual contouring of CT images, which tends to be highly tedious and time‐consuming. The results are also prone to variations depending on the observer's skill level, environment, or equipment types. A deep learning‐based automatic contouring technique for segmenting OARs would help eliminate these problems and generate consistent results with minimal time and human effort. Our approach is to design a conditional generative adversarial network (GAN)‐based technique for the semantic segmentation of OARs in abdominal CT images. The residual blocks of the generator network have a multi‐scale context layer that explores more generic characteristics, greatly enhancing performance and lowering losses. A comparative analysis is undertaken based on various assessment measures widely employed in segmentation. The results show substantial improvement, with mean dice scores of 98.0%, 96.6%, 98.2%, and 86.1% for the respective organs—liver, kidney, spleen, and pancreas—in the abdominal CT. The proposed GAN‐based model could accurately segment the four abdominal organs, including the liver, kidney, spleen, and pancreas. The obtained results prove that the suggested model is able to compete with existing state‐of‐the‐art abdominal OAR segmentation techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
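The segmentation quality in record 284 is summarised by per-organ Dice scores. Below is a minimal NumPy sketch of the Dice coefficient between binary predicted and ground-truth masks; the dummy masks are purely illustrative.

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for binary masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

pred = np.zeros((64, 64), dtype=int); pred[10:40, 10:40] = 1        # dummy organ mask
target = np.zeros((64, 64), dtype=int); target[12:42, 12:42] = 1    # dummy ground truth
print(dice_score(pred, target))
```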
285. A survey of deep learning-based 3D shape generation.
- Author
-
Xu, Qun-Ce, Mu, Tai-Jiang, and Yang, Yong-Liang
- Subjects
DEEP learning, GENERATIVE adversarial networks, COMPUTER vision, FEATURE extraction, FORM perception, POINT cloud - Abstract
Deep learning has been successfully used for tasks in the 2D image domain. Research on 3D computer vision and deep geometry learning has also attracted attention. Considerable achievements have been made regarding feature extraction and discrimination of 3D shapes. Following recent advances in deep generative models such as generative adversarial networks, effective generation of 3D shapes has become an active research topic. Unlike 2D images with a regular grid structure, 3D shapes have various representations, such as voxels, point clouds, meshes, and implicit functions. For deep learning of 3D shapes, shape representation has to be taken into account as there is no unified representation that can cover all tasks well. Factors such as the representativeness of geometry and topology often largely affect the quality of the generated 3D shapes. In this survey, we comprehensively review works on deep-learning-based 3D shape generation by classifying and discussing them in terms of the underlying shape representation and the architecture of the shape generator. The advantages and disadvantages of each class are further analyzed. We also consider the 3D shape datasets commonly used for shape generation. Finally, we present several potential research directions that hopefully can inspire future works on this topic. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
286. Investigating Self-Rationalizing Models for Commonsense Reasoning.
- Author
-
Rancourt, Fanny, Vondrlik, Paula, Maupomé, Diego, and Meurs, Marie-Jean
- Subjects
LANGUAGE models, NATURAL language processing - Abstract
The rise of explainable natural language processing spurred a bulk of work on datasets augmented with human explanations, as well as technical approaches to leverage them. Notably, generative large language models offer new possibilities, as they can output a prediction as well as an explanation in natural language. This work investigates the capabilities of fine-tuned text-to-text transfer Transformer (T5) models for commonsense reasoning and explanation generation. Our experiments suggest that while self-rationalizing models achieve interesting results, a significant gap remains: classifiers consistently outperformed self-rationalizing models, and a substantial fraction of model-generated explanations are not valid. Furthermore, training with expressive free-text explanations substantially altered the inner representation of the model, suggesting that they supplied additional information and may bridge the knowledge gap. Our code is publicly available, and the experiments were run on open-access datasets, hence allowing full reproducibility. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
287. Synthesizing Vehicle Speed-Related Features with Neural Networks.
- Author
-
Krepelka, Michal and Vrany, Jiri
- Subjects
DIGITAL technology, DIGITAL twins, HARDWARE-in-the-loop simulation, DIGITAL computer simulation, BIG data, AUTOMOBILE industry - Abstract
In today's automotive industry, digital technology trends such as Big Data, Digital Twin, and Hardware-in-the-loop simulations using synthetic data offer opportunities that have the potential to transform the entire industry towards being more software-oriented and thus more effective and environmentally friendly. In this paper, we propose generative models to synthesize car features related to vehicle speed: brake pressure, percentage of the pressed throttle pedal, engaged gear, and engine RPM. Synthetic data are essential to digitize Hardware-in-the-loop integration testing of the vehicle's dashboard, navigation, or infotainment and for Digital Twin simulations. We trained models based on a Multilayer Perceptron and a bidirectional Long Short-Term Memory neural network for each feature. These models were evaluated on a real-world dataset and demonstrated sufficient accuracy in predicting the desired features. Combining our current research with previous work on generating a speed profile for an arbitrary trip, where Open Street Map data and elevation data are available, allows us to digitally drive this trip. At the time of writing, we are unaware of any similar data-driven approach for generating desired speed-related features. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
288. Early adversity changes the economic conditions of mouse structural brain network organization.
- Author
-
Carozza, Sofia, Holmes, Joni, Vértes, Petra E., Bullmore, Ed, Arefin, Tanzil M., Pugliese, Alexa, Zhang, Jiangyang, Kaffman, Arie, Akarca, Danyal, and Astle, Duncan E.
- Abstract
Early adversity can change educational, cognitive, and mental health outcomes. However, the neural processes through which early adversity exerts these effects remain largely unknown. We used generative network modeling of the mouse connectome to test whether unpredictable postnatal stress shifts the constraints that govern the organization of the structural connectome. A model that trades off the wiring cost of long‐distance connections with topological homophily (i.e., links between regions with shared neighbors) generated simulations that successfully replicate the rodent connectome. The imposition of early life adversity shifted the best‐performing parameter combinations toward zero, heightening the stochastic nature of the generative process. Put simply, unpredictable postnatal stress changes the economic constraints that reproduce rodent connectome organization, introducing greater randomness into the development of the simulations. While this change may constrain the development of cognitive abilities, it could also reflect an adaptive mechanism that facilitates effective responses to future challenges. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
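Record 288 fits a generative network model that trades off the wiring cost of long-distance connections against topological homophily. In the form such models usually take (stated here as an assumption, since the abstract does not spell out the equation), each unconnected node pair is assigned a wiring probability proportional to a power of its distance times a power of its homophily term, and edges are added one at a time from that distribution. A short NumPy sketch of that probability rule, with all quantities and exponents illustrative:

```python
import numpy as np

def wiring_probabilities(dist, homophily, eta, gamma, adj, eps=1e-5):
    """Normalised wiring probabilities P_ij ∝ d_ij**eta * (k_ij + eps)**gamma for unconnected pairs.
    dist: pairwise distances; homophily: e.g. a shared-neighbour matching index;
    eta (typically negative) penalises long connections, gamma rewards homophily;
    adj: current binary adjacency matrix (existing edges and self-pairs are excluded)."""
    mask = ~adj.astype(bool)
    np.fill_diagonal(mask, False)
    p = np.zeros_like(dist, dtype=float)
    p[mask] = (dist[mask] ** eta) * ((homophily[mask] + eps) ** gamma)
    return p / p.sum()

rng = np.random.default_rng(0)
coords = rng.random((20, 3))                                       # dummy node positions
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
homophily = rng.random((20, 20))                                   # placeholder homophily matrix
adj = np.zeros((20, 20), dtype=int)
p = wiring_probabilities(dist, homophily, eta=-3.0, gamma=0.3, adj=adj)
```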
289. A review on Generative Adversarial Networks for image generation.
- Author
-
Trevisan de Souza, Vinicius Luis, Marques, Bruno Augusto Dorta, Batagelo, Harlen Costa, and Gois, João Paulo
- Subjects
*GENERATIVE adversarial networks, *DEEP learning, *IMAGE processing - Abstract
Generative Adversarial Networks (GANs) are a type of deep learning architecture that uses two networks, namely a generator and a discriminator, that, by competing against each other, strive to create realistic but previously unseen samples. They have become a popular research topic in recent years, particularly for image processing and synthesis, leading to many advances and applications in various fields. With the profusion of published works and interest from professionals of different areas, surveys on GANs are necessary, mainly for those who are starting on this topic. In this work, we cover the basics and notable architectures of GANs, focusing on their applications in image generation. We also discuss how the challenges to be addressed in GAN architectures have been faced, such as mode coverage, stability, convergence, and evaluating image quality using metrics. • A review on GANs for image generation, aiming at readers who are new to the area. • A comprehensive overview of GAN fundamentals, and methods to address the most common issues. • A detailed explanation of how various works applied GANs in image-based applications. • A discussion of future directions for this area. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
290. Disentangling high-level factors and their features with conditional vector quantized VAEs.
- Author
-
Zou, Kaifeng, Faisan, Sylvain, Heitz, Fabrice, and Valette, Sébastien
- Subjects
*RANDOM variables, *LATENT variables - Abstract
• We propose a disentangled representation that models the labels and their features. • Our proposal includes a novel dependency structure and a two-step learning procedure. • It allows accurate control of labels and their features in the generated images. • We show several examples of label and feature manipulation for 2D images. • Our approach improves disentanglement properties and the quality of generated images. Two recent works have shown the benefit of modeling both high-level factors and their related features to learn disentangled representations with variational autoencoders (VAE). We propose here a novel VAE-based approach that follows this principle. Inspired by conditional VAE, the features are no longer treated as random variables over which integration must be performed. Instead, they are deterministically computed from the input data using a neural network whose parameters can be estimated jointly with those of the decoder and of the encoder. Moreover, the quality of the generated images has been improved by using discrete latent variables and a two-step learning procedure, which makes it possible to increase the size of the latent space without altering the disentanglement properties of the model. Results obtained on two different datasets validate the proposed approach that achieves better performance than the two aforementioned works in terms of disentanglement, while providing higher quality images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
291. Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models.
- Author
-
Alexanderson, Simon, Nagy, Rajmund, Beskow, Jonas, and Henter, Gustav Eje
- Subjects
DANCE, AUTOMATIC speech recognition, INTERPOLATION, GESTURE, LISTENING - Abstract
Diffusion models have experienced a surge of interest as highly expressive yet efficiently trainable probabilistic models. We show that these models are an excellent fit for synthesising human motion that co-occurs with audio, e.g., dancing and co-speech gesticulation, since motion is complex and highly ambiguous given audio, calling for a probabilistic description. Specifically, we adapt the DiffWave architecture to model 3D pose sequences, putting Conformers in place of dilated convolutions for improved modelling power. We also demonstrate control over motion style, using classifier-free guidance to adjust the strength of the stylistic expression. Experiments on gesture and dance generation confirm that the proposed method achieves top-of-the-line motion quality, with distinctive styles whose expression can be made more or less pronounced. We also synthesise path-driven locomotion using the same model architecture. Finally, we generalise the guidance procedure to obtain product-of-expert ensembles of diffusion models and demonstrate how these may be used for, e.g., style interpolation, a contribution we believe is of independent interest. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
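Record 291 controls motion style with classifier-free guidance. In its usual form, the denoiser is evaluated once with and once without the conditioning signal, and the two predictions are blended with a guidance weight that sets how strongly the conditioning (e.g., a motion style) is expressed. A minimal Python sketch follows; the function names and the stand-in denoiser are placeholders, not the paper's API.

```python
def guided_prediction(denoiser, x_t, t, cond, guidance_weight):
    """Classifier-free guidance: blend conditional and unconditional denoiser outputs.
    guidance_weight = 0 gives the unconditional model, 1 the conditional one,
    and values > 1 exaggerate the conditioning (a more pronounced style)."""
    eps_uncond = denoiser(x_t, t, cond=None)   # conditioning dropped
    eps_cond = denoiser(x_t, t, cond=cond)     # conditioning applied
    return eps_uncond + guidance_weight * (eps_cond - eps_uncond)

# Toy usage with a stand-in denoiser that ignores the timestep.
denoiser = lambda x_t, t, cond: x_t * (0.9 if cond is None else 0.8)
print(guided_prediction(denoiser, x_t=1.0, t=10, cond="style", guidance_weight=2.0))
```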
292. 3DShape2VecSet: A 3D Shape Representation for Neural Fields and Generative Diffusion Models.
- Author
-
Zhang, Biao, Tang, Jiapeng, Nießner, Matthias, and Wonka, Peter
- Subjects
POINT cloud, POINT set theory - Abstract
We introduce 3DShape2VecSet, a novel shape representation for neural fields designed for generative diffusion models. Our shape representation can encode 3D shapes given as surface models or point clouds, and represents them as neural fields. The concept of neural fields has previously been combined with a global latent vector, a regular grid of latent vectors, or an irregular grid of latent vectors. Our new representation encodes neural fields on top of a set of vectors. We draw from multiple concepts, such as the radial basis function representation, and the cross attention and self-attention function, to design a learnable representation that is especially suitable for processing with transformers. Our results show improved performance in 3D shape encoding and 3D shape generative modeling tasks. We demonstrate a wide variety of generative applications: unconditioned generation, category-conditioned generation, text-conditioned generation, point-cloud completion, and image-conditioned generation. Code: https://1zb.github.io/3DShape2VecSet/. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
293. Flexible Isosurface Extraction for Gradient-Based Mesh Optimization.
- Author
-
Shen, Tianchang, Munkberg, Jacob, Hasselgren, Jon, Yin, Kangxue, Wang, Zian, Chen, Wenzheng, Gojcic, Zan, Fidler, Sanja, Sharp, Nicholas, and Gao, Jun
- Subjects
SCALAR field theory, AUTOMATIC differentiation, ISOGEOMETRIC analysis, TOPOLOGICAL property, DEGREES of freedom, CUBES - Abstract
This work considers gradient-based mesh optimization, where we iteratively optimize for a 3D surface mesh by representing it as the isosurface of a scalar field, an increasingly common paradigm in applications including photogrammetry, generative modeling, and inverse physics. Existing implementations adapt classic isosurface extraction algorithms like Marching Cubes or Dual Contouring; these techniques were designed to extract meshes from fixed, known fields, and in the optimization setting they lack the degrees of freedom to represent high-quality feature-preserving meshes, or suffer from numerical instabilities. We introduce FlexiCubes, an isosurface representation specifically designed for optimizing an unknown mesh with respect to geometric, visual, or even physical objectives. Our main insight is to introduce additional carefully-chosen parameters into the representation, which allow local flexible adjustments to the extracted mesh geometry and connectivity. These parameters are updated along with the underlying scalar field via automatic differentiation when optimizing for a downstream task. We base our extraction scheme on Dual Marching Cubes for improved topological properties, and present extensions to optionally generate tetrahedral and hierarchically-adaptive meshes. Extensive experiments validate FlexiCubes on both synthetic benchmarks and real-world applications, showing that it offers significant improvements in mesh quality and geometric fidelity. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
294. Universal Adversarial Training Using Auxiliary Conditional Generative Model-Based Adversarial Attack Generation.
- Author
-
Dingeto, Hiskias and Kim, Juntae
- Subjects
MACHINE learning, GENERATIVE adversarial networks, DATA augmentation, BOOSTING algorithms - Abstract
While Machine Learning has become the holy grail of modern-day computing, it has many security flaws that have yet to be addressed and resolved. Adversarial attacks are one of these security flaws, in which an attacker appends noise to data samples that machine learning models take as input with the aim of fooling the model. Various adversarial training methods have been proposed that augment adversarial examples in the training dataset for defense against such attacks. However, a general limitation exists where a robust model can only protect itself against adversarial attacks that are known or similar to those it was trained on. To address this limitation, this paper proposes a Universal Adversarial Training algorithm using adversarial examples generated by an Auxiliary Classifier Generative Adversarial Network (AC-GAN) in parallel with other data augmentation techniques, such as the mixup method. This method builds on a previously proposed technique, Adversarial Training, in which adversarial examples produced by gradient-based methods are augmented and added to the training data. Our method improves the AC-GAN architecture for adversarial example generation to make it more suitable for adversarial training by updating different loss terms and testing its performance against various attacks compared to other robust adversarial models. In this way, it becomes apparent that generative models are better suited for boosting adversarial robustness through adversarial training. When tested using various attack types, our proposed model had an average accuracy of 97.48% on the MNIST dataset and 94.02% on the CelebA dataset, proving that generative models have a higher chance of boosting adversarial security through adversarial training. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
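A minimal sketch of the training-loop idea in entry 294: adversarial examples coming from a generator are mixed into the clean batch, optionally combined with mixup, and the classifier is trained on the augmented batch. The AC-GAN itself is replaced by a trivial noise stub here; every name below is illustrative, and nothing reproduces the authors' architecture or loss terms.

```python
# Sketch of generator-augmented adversarial training with mixup (illustrative only).
# Assumptions: PyTorch >= 1.10 (soft-label cross_entropy); `adv_generator` is a stub
# standing in for the paper's AC-GAN adversarial example generator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
    def forward(self, x):
        return self.net(x)

def mixup(x, y, alpha=0.2):
    # Standard mixup: convex combination of pairs of examples and soft labels.
    lam = torch.distributions.Beta(alpha, alpha).sample()
    idx = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[idx], lam * y + (1 - lam) * y[idx]

def adv_generator(x, y, eps=0.1):
    # Stub for the AC-GAN: here just bounded random noise, not a learned attack.
    return (x + eps * torch.randn_like(x)).clamp(0, 1)

model = TinyClassifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(64, 1, 28, 28)                          # placeholder batch for illustration
y = F.one_hot(torch.randint(0, 10, (64,)), 10).float() # one-hot labels (soft-label friendly)

x_adv = adv_generator(x, y)                            # generator-produced "adversarial" batch
x_aug, y_aug = mixup(torch.cat([x, x_adv]), torch.cat([y, y]))
loss = F.cross_entropy(model(x_aug), y_aug)            # train on clean + generated examples
opt.zero_grad()
loss.backward()
opt.step()
```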
295. A Generative Account of Latent Abstractions
- Author
-
Xie, Sirui
- Subjects
Artificial intelligence, Decision-making, Generative models, Latent abstraction - Abstract
Abstractions are fundamental to human intelligence, extending far beyond pattern recognition. They enable the distillation and organization of complex information into structured knowledge, facilitate the succinct communication of intricate ideas, and empower us to navigate complex decision-making scenarios with consistent value prediction. The ability to abstract is particularly fascinating because abstractions are not inherently present in raw data --- they are latent variables underlying our observations. Despite the recent phenomenal advances in modeling data distributions, Generative Artificial Intelligence (GenAI) systems still lack robust principles for the autonomous emergence of latent abstractions. This dissertation studies the problem of unsupervised latent abstraction learning, focusing on developing modeling, learning, and inference methods for latent-variable generative models across diverse high-dimensional data modalities. The core premise is that by incorporating algebraic, geometric, and statistical structures into the latent space and generator, we can cultivate representations of latent variables that explain observed data in alignment with human understanding. The dissertation consists of four parts. The first three explore the generative constructs of latent abstractions for Category, Object, and Decision, respectively. Part I examines the basic structure of categories, emphasizing their symbol-vector duality. We develop a latent-variable text model with a coupling of symbols and vectors in its representations. We investigate another representation that is both discrete and continuous --- iconic symbols --- in a visual communication game. Part II enriches the abstract structure by shifting focus to object-centric abstractions in visual data. We introduce a generative model that disentangles objects from backgrounds in the latent space. We then rethink the algebraic structures of object abstractions and propose a novel metric that measures compositionality as a more generic form than disentanglement. Part III incorporates situational context by introducing a sequential decision-making aspect with trajectory data. Here, latent abstractions manifest as actions and plans. We bridge the theories of decision-making and generative modeling, proving that the inference of latent decisions enhances consistency with the model's understanding while optimizing intrinsic values. Whereas these three parts adopt the paradigm of directly learning from raw data, Part IV introduces a dialectic discussion with an alternative paradigm, Knowledge Distillation. We demonstrate how to distill from and accelerate the state-of-the-art massive-scale data-space models by re-purposing our methods and techniques for latent-variable generative modeling. Together, the contributions of this dissertation enable GenAI systems to overcome the critical bottlenecks of alignment, efficiency, and consistency in representation, inference, and decision-making.
- Published
- 2024
296. Electrocardiogram Synthesis Using Denoising Diffusion Probabilistic Models and Bidirectional State-Space Models
- Author
-
Alsharif, Haya Adnan N
- Subjects
Statistics, Computer science, Artificial intelligence, Diffusion Models, ECG Synthesis, Electrocardiography, Generative Models, Signal Processing, Synthetic data - Abstract
This thesis investigates the application of Denoising Diffusion Probabilistic Models (DDPM) for synthesizing 12-lead Electrocardiogram (ECG) signals. Utilizing classifier-free guidance along with the bidirectional Mamba State Space Model (SSM) within a DiffWave framework, we developed a model capable of both unconditional and conditional ECG signal generation. Despite the promising potential of Mamba for enhancing temporal signal encoding, its performance compared to the time-invariant SSM model, S4, was either worse or inconclusive, likely due to the limitations of current metrics. Visual assessments often contradicted the automated metrics, indicating a significant gap in current evaluation methods. We also explored the feasibility of training models on all 12 leads, contrasting previous studies that used fewer leads. Our findings indicate that diffusion models can adequately learn the linear relationships between leads without significantly increasing model size. Overall, our work reduces the need for extensive data pre- or post-processing, streamlines ECG data generation, and highlights the limitations of existing evaluation methodologies, suggesting the need for further evaluation.
- Published
- 2024
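Entry 296 relies on classifier-free guidance, whose core step is simple to state: the noise-prediction network is queried once with the conditioning signal and once without it, and the two predictions are extrapolated by a guidance weight. The sketch below shows just that combination step with a dummy network; the DiffWave/Mamba architecture from the thesis is not reproduced, and all names and shapes are illustrative assumptions.

```python
# Sketch of the classifier-free guidance combination at one reverse-diffusion step.
# Assumptions: PyTorch; `eps_model(x_t, t, cond)` is any noise-prediction network that
# accepts cond=None as the "null" condition. Only the guidance arithmetic is real here.
import torch

def classifier_free_guidance(eps_model, x_t, t, cond, w=2.0):
    eps_cond = eps_model(x_t, t, cond)      # prediction with the condition (e.g. a diagnosis label)
    eps_uncond = eps_model(x_t, t, None)    # prediction with the condition dropped
    # Guided prediction: extrapolate away from the unconditional estimate.
    return eps_uncond + w * (eps_cond - eps_uncond)

# Dummy usage with a stand-in network and an illustrative 12-lead signal shape.
def dummy_eps_model(x_t, t, cond):
    return torch.zeros_like(x_t) if cond is None else 0.1 * x_t

x_t = torch.randn(4, 12, 1000)              # (batch, leads, samples) at diffusion step t
eps = classifier_free_guidance(dummy_eps_model, x_t, t=torch.tensor(500), cond="afib", w=1.5)
```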
297. Multi-Track Music Generation with Latent Diffusion Models
- Author
-
Karchkhadze, Tornike
- Subjects
Music, diffusion models, generative models, machine learning, multi track - Abstract
In recent years, diffusion models have demonstrated promising results in cross-modal generation tasks within generative media, encompassing image, video, and audio generation. This development has introduced a great deal of novelty to audio and music-related tasks, such as text-to-sound and text-to-music generation. However, these text-controlled music generation models typically focus on capturing global musical attributes, such as genre and mood, and do not allow for the more fine-grained control that composers might desire. Music composition is a complex, multilayered task that frequently involves intricate musical arrangements as an essential part of the creative process. This task requires composers to carefully align each instrument with existing tracks in terms of beat, dynamics, harmony, and melody, demanding a level of precision and control over individual tracks that current text-driven prompts often fail to provide. In this work, we address these challenges by presenting a multi-track music generation model, one of the first of its kind. Our model, by learning the joint probability of tracks sharing a context, is capable of generating music across several tracks that correspond well to each other, either conditionally or unconditionally. We achieve this by extending MusicLDM, a latent diffusion model for music, into a multi-track generative model. Additionally, our model is capable of arrangement generation, where it can generate any subset of tracks given the others (e.g., generating a piano track that complements given bass and drum tracks). We compared our model with existing multi-track generative models and demonstrated that our model achieves considerable improvements across objective metrics, for both total and arrangement generation tasks. Additionally, we demonstrated that our model is capable of meaningful conditional generation with text and reference musical audio, corresponding well to the text meaning and reference audio content/style. Sound examples from this work can be found at https://mtmusicldm.github.io.
- Published
- 2024
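One common way to realize "generate any subset of tracks given the others" is inpainting-style masking in the latent space: at every denoising step the latents of the known tracks are overwritten with appropriately noised encodings of the reference audio, while the missing tracks are left to the model. The abstract does not state that this is how the thesis implements arrangement generation, so the sketch below only illustrates that generic masking idea; `denoise_step` and `q_sample` are hypothetical helpers.

```python
# Sketch of inpainting-style arrangement generation (a generic technique, not necessarily
# the thesis' mechanism). Assumptions: PyTorch; `denoise_step` runs one reverse step over
# all tracks jointly and `q_sample` forward-noises known latents to level t.
import torch

def arrange(denoise_step, q_sample, z_known, known_mask, steps=50):
    # z_known: (B, tracks, C, T) latents of the reference tracks (zeros elsewhere).
    # known_mask: 1 where a track is given, 0 where it should be generated.
    z = torch.randn_like(z_known)                                      # start all tracks from noise
    for t in reversed(range(steps)):
        z = denoise_step(z, t)                                         # model proposes all tracks
        z = known_mask * q_sample(z_known, t) + (1 - known_mask) * z   # clamp the given tracks
    return z

# Dummy usage with trivial stand-ins: tracks 0-1 are given, tracks 2-3 are generated.
B, tracks, C, T = 1, 4, 8, 256
z_ref = torch.randn(B, tracks, C, T)
mask = torch.zeros(B, tracks, 1, 1)
mask[:, :2] = 1.0
z_out = arrange(lambda z, t: 0.9 * z, lambda z, t: z, z_ref * mask, mask)
```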
298. Robust Modeling through Causal Priors and Data Purification in Machine Learning
- Author
-
Bhat, Sunay Gajanan
- Subjects
Computer science, Computer engineering, Electrical engineering, Artificial Intelligence, Causality, Generative Models, Machine Learning, Poison Defense, Robust Classification - Abstract
The continued success and ubiquity of machine learning techniques, particularly Deep Learning, have necessitated research in robust model training to enhance generalization capabilities and security against incomplete data, distributional shifts, and adversarial attacks. This thesis presents two primary sets of contributions to robust modeling in machine learning through the use of causal priors and data purification with generative models such as the Variational Autoencoder (VAE), Energy-Based Model (EBM), and Denoising Diffusion Probabilistic Model (DDPM), focusing on image datasets. In the first set of contributions, we use structural causal priors in the latent spaces of VAEs. Initially, we demonstrate counterfactual synthetic data generation outside the training data distribution. This technique allows for the creation of diverse and novel data points, which is critical to enhancing model robustness and generalization capabilities. We utilize a similar VAE architecture to compare causal structural (graphical) hypotheses, showing that the fit of generated data from various hypotheses on distributionally shifted test data is an effective method for hypothesis comparison. Additionally, we explore using augmentations in the latent space of a VAE as an efficient and effective way to generate realistic augmented data. The second set of contributions focuses on data purification using EBMs and DDPMs. We propose a framework of universal data purification methods to defend against train-time data poisoning attacks. This framework utilizes stochastic transforms realized via iterative Langevin dynamics of EBMs, DDPMs, or both, to purify poisoned data with minimal impact on classifier generalization. Our specially trained EBMs and DDPMs provide state-of-the-art defense against various poisoning attacks while preserving natural accuracy. Preprocessing data with these techniques pushes poisoned images into the natural, clean image manifold, effectively neutralizing adversarial perturbations. The framework achieves state-of-the-art performance without needing attack or classifier-specific information, even when the generative models are trained on poisoned or distributionally shifted data. Beyond defense against data poisoning, our framework also shows promise in applications such as the degradation and removal of unwanted intellectual property. The flexibility and generality of these data purification techniques represent a significant step forward in the adversarial model training paradigm. All of these methods enable new perspectives and approaches to robust machine learning, advancing an essential field in artificial intelligence research.
- Published
- 2024
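The purification framework in entry 298 centers on running a few steps of Langevin dynamics under a generative model's energy so that poisoned inputs are pushed back toward the clean data manifold before classifier training. The sketch below shows the generic update only; the specially trained EBMs/DDPMs from the thesis are replaced by an arbitrary energy network, and the step sizes are illustrative assumptions.

```python
# Sketch of Langevin-dynamics purification under an energy-based model.
# Assumptions: PyTorch; `energy_net(x)` returns a scalar energy per sample; the step count
# and step size are illustrative, not the thesis' settings.
import torch

def langevin_purify(energy_net, x, steps=20, step_size=1e-2):
    x = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        energy = energy_net(x).sum()
        grad = torch.autograd.grad(energy, x)[0]
        with torch.no_grad():
            # Gradient descent on the energy plus Gaussian noise (the stochastic transform).
            x = x - step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(x)
        x.requires_grad_(True)
    return x.detach()

# Dummy usage: a quadratic energy simply pulls samples toward the origin.
x_poisoned = torch.randn(8, 3, 32, 32)
x_purified = langevin_purify(lambda x: 0.5 * (x ** 2).flatten(1).sum(dim=1), x_poisoned)
```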
299. Magicmol: a light-weighted pipeline for drug-like molecule evolution and quick chemical space exploration
- Author
-
Lin Chen, Qing Shen, and Jungang Lou
- Subjects
Generative models, Reinforcement learning, Deep learning, Synthetic accessibility, De novo drug design, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5 - Abstract
The flourishing of machine learning and deep learning methods has boosted the development of cheminformatics, especially in applications such as drug discovery and new-material exploration. Lower time and space costs make it possible for scientists to search the enormous chemical space. Recently, some work has combined reinforcement learning strategies with recurrent neural network (RNN)-based models to optimize the properties of generated small molecules, notably improving a batch of critical factors for these candidates. However, a common problem among these RNN-based methods is that several generated molecules are difficult to synthesize despite possessing desirable properties such as high binding affinity. At the same time, RNN-based frameworks reproduce the molecule distribution of the training set better than other categories of models during molecule exploration tasks. Thus, to optimize the whole exploration process and make it contribute to the optimization of specified molecules, we devised a lightweight pipeline called Magicmol; this pipeline uses a re-mastered RNN network and the SELFIES representation instead of SMILES. Our backbone model achieved extraordinary performance while reducing training cost; moreover, we devised reward truncation strategies to eliminate the model collapse problem. Additionally, adopting the SELFIES representation makes it possible to use STONED-SELFIES as a post-processing procedure for specified molecule optimization and quick chemical space exploration.
- Published
- 2023
- Full Text
- View/download PDF
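A key point in entry 299 is the switch from SMILES to the SELFIES representation, which guarantees that any token string decodes to a valid molecule and is what makes STONED-style random mutation usable as a cheap post-processing and exploration step. Below is a minimal sketch using the open-source `selfies` package; the Magicmol pipeline itself is not reproduced, and the one-token mutation is a simplified, illustrative stand-in for STONED-SELFIES.

```python
# Sketch: SELFIES round-trip plus one random token mutation (STONED-style, simplified).
# Assumptions: the `selfies` package (pip install selfies) is installed; this is not the
# Magicmol code, only an illustration of why the representation aids exploration.
import random
import selfies as sf

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"                  # aspirin
s = sf.encoder(smiles)                               # SMILES -> SELFIES
tokens = list(sf.split_selfies(s))                   # tokenize the SELFIES string

# Replace one token with a random token from the semantically robust alphabet.
alphabet = list(sf.get_semantic_robust_alphabet())
i = random.randrange(len(tokens))
tokens[i] = random.choice(alphabet)

mutant = sf.decoder("".join(tokens))                 # any token string decodes to a valid SMILES
print(smiles, "->", mutant)
```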
300. MolFilterGAN: a progressively augmented generative adversarial network for triaging AI-designed molecules
- Author
-
Xiaohong Liu, Wei Zhang, Xiaochu Tong, Feisheng Zhong, Zhaojun Li, Zhaoping Xiong, Jiacheng Xiong, Xiaolong Wu, Zunyun Fu, Xiaoqin Tan, Zhiguo Liu, Sulin Zhang, Hualiang Jiang, Xutong Li, and Mingyue Zheng
- Subjects
De novo molecule design, Generative models, Deep learning, Virtual screening, Compound quality control, Information technology, T58.5-58.64, Chemistry, QD1-999 - Abstract
Artificial intelligence (AI)-based molecular design methods, especially deep generative models for generating novel molecular structures, have fueled the ambition to explore unknown chemical space without relying on brute-force enumeration. However, whether designed by AI or by human experts, molecules still need to be synthesized and biologically evaluated, and this trial-and-error process remains a resource-intensive endeavor. AI-based drug design methods therefore face a major challenge: how to prioritize the molecular structures with the most potential for subsequent drug development. This study shows that common filtering approaches based on traditional screening metrics fail to differentiate AI-designed molecules. To address this issue, we propose a novel molecular filtering method, MolFilterGAN, based on a progressively augmented generative adversarial network. Comparative analysis shows that MolFilterGAN outperforms conventional screening approaches based on drug-likeness or synthetic-accessibility metrics. Retrospective analysis of AI-designed discoidin domain receptor 1 (DDR1) inhibitors shows that MolFilterGAN significantly increases the efficiency of molecular triaging. Further evaluation of MolFilterGAN on eight external ligand sets suggests that it is useful for triaging or enriching bioactive compounds across a wide range of target types. These results highlight the importance of MolFilterGAN for evaluating molecules holistically and for further accelerating molecular discovery, especially in combination with advanced AI generative models.
- Published
- 2023
- Full Text
- View/download PDF
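At inference time, the triaging step in entry 300 amounts to scoring candidate molecules with the trained discriminator and keeping the highest-scoring fraction. The sketch below illustrates only that ranking step; the progressive augmentation scheme and the real discriminator architecture are not reproduced, and `featurize` and `discriminator` are hypothetical placeholders rather than MolFilterGAN components.

```python
# Sketch of discriminator-based triaging of generated molecules (illustrative only).
# Assumptions: PyTorch; `featurize` and `discriminator` are placeholders, not MolFilterGAN's.
import torch

def featurize(smiles_list):
    # Placeholder featurizer: fixed-size bit vector derived from string hashes.
    feats = [torch.tensor([float((hash(s) >> k) & 1) for k in range(64)]) for s in smiles_list]
    return torch.stack(feats)

discriminator = torch.nn.Sequential(torch.nn.Linear(64, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))

def triage(smiles_list, keep_fraction=0.1):
    scores = discriminator(featurize(smiles_list)).squeeze(-1)   # higher score = keep
    k = max(1, int(keep_fraction * len(smiles_list)))
    top = torch.topk(scores, k).indices.tolist()
    return [smiles_list[i] for i in top]

print(triage(["CCO", "c1ccccc1", "CC(=O)OC1=CC=CC=C1C(=O)O", "CCN"], keep_fraction=0.5))
```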