33 results for "Unpaired image-to-image translation"
Search Results
2. Latent-SDE: guiding stochastic differential equations in latent space for unpaired image-to-image translation.
- Author
-
Zhang, Xianjie, Li, Min, He, Yujie, Gou, Yao, and Zhang, Yusen
- Subjects
STOCHASTIC differential equations, DOGS, PIXELS
- Abstract
Score-based diffusion models have shown promising results in unpaired image-to-image translation (I2I). However, the existing methods only perform unpaired I2I in pixel space, which requires high computation costs. To this end, we propose guiding stochastic differential equations in latent space (Latent-SDE) that extracts domain-specific and domain-independent features of the image in the latent space to calculate the loss and guides the inference process of a pretrained SDE in the latent space for unpaired I2I. To refine the image in the latent space, we propose a latent time-travel strategy that increases the sampling timestep. Empirically, we compare Latent-SDE to the baseline of the score-based diffusion model on three widely adopted unpaired I2I tasks under two metrics. Latent-SDE achieves state-of-the-art on Cat → Dog and is competitive on the other two tasks. Our code will be freely available for public use upon acceptance at https://github.com/zhangXJ147/Latent-SDE. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. Latent-SDE: guiding stochastic differential equations in latent space for unpaired image-to-image translation
- Author
-
Xianjie Zhang, Min Li, Yujie He, Yao Gou, and Yusen Zhang
- Subjects
Unpaired image-to-image translation, Score-based diffusion models, Latent space, Latent time-travel, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
- Abstract
Score-based diffusion models have shown promising results in unpaired image-to-image translation (I2I). However, the existing methods only perform unpaired I2I in pixel space, which requires high computation costs. To this end, we propose guiding stochastic differential equations in latent space (Latent-SDE) that extracts domain-specific and domain-independent features of the image in the latent space to calculate the loss and guides the inference process of a pretrained SDE in the latent space for unpaired I2I. To refine the image in the latent space, we propose a latent time-travel strategy that increases the sampling timestep. Empirically, we compare Latent-SDE to the baseline of the score-based diffusion model on three widely adopted unpaired I2I tasks under two metrics. Latent-SDE achieves state-of-the-art on Cat → Dog and is competitive on the other two tasks. Our code will be freely available for public use upon acceptance at https://github.com/zhangXJ147/Latent-SDE.
- Published
- 2024
- Full Text
- View/download PDF
4. Enhancing Night-to-Day Image Translation with Semantic Prior and Reference Image Guidance
- Author
-
Ning, Junzhi, Gong, Mingming, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bao, Zhifeng, editor, Borovica-Gajic, Renata, editor, Qiu, Ruihong, editor, Choudhury, Farhana, editor, and Yang, Zhengyi, editor
- Published
- 2024
- Full Text
- View/download PDF
5. Photogenic Guided Image-to-Image Translation With Single Encoder
- Author
-
Rina Oh and T. Gonsalves
- Subjects
AI, deep learning, GAN, unpaired image-to-image translation, style synthesis translation, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
- Abstract
Image-to-image translation involves combining content and style from different images to generate new images. This technology is particularly valuable for exploring artistic aspects, such as how artists from different eras would depict scenes. Deep learning models are ideal for achieving these artistic styles. This study introduces an unpaired image-to-image translation architecture that extracts style features directly from input style images, without requiring a special encoder. Instead, the model uses a single encoder for the content image. To process the spatial features of the content image and the artistic features of the style image, a new normalization function called Direct Adaptive Instance Normalization with Pooling is developed. This function extracts style images more effectively, reducing the computational costs compared to existing guided image-to-image translation models. Additionally, we employed a Vision Transformer (ViT) in the Discriminator to analyze entire spatial features. The new architecture, named Single-Stream Image-to-Image Translation (SSIT), was tested on various tasks, including seasonal translation, weather-based environment transformation, and photo-to-art conversion. The proposed model successfully reflected the design information of the style images, particularly in translating photos to artworks, where it faithfully reproduced color characteristics. Moreover, the model consistently outperformed state-of-the-art translation models in each experiment, as confirmed by Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) scores.
- Published
- 2024
- Full Text
- View/download PDF
6. M-GenSeg: Domain Adaptation for Target Modality Tumor Segmentation with Annotation-Efficient Supervision
- Author
-
Alefsen, Malo, Vorontsov, Eugene, Kadoury, Samuel, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Greenspan, Hayit, editor, Madabhushi, Anant, editor, Mousavi, Parvin, editor, Salcudean, Septimiu, editor, Duncan, James, editor, Syeda-Mahmood, Tanveer, editor, and Taylor, Russell, editor
- Published
- 2023
- Full Text
- View/download PDF
7. Face Generation from Skull Photo Using GAN and 3D Face Models
- Author
-
Vo, Duy K., Bui, Len T., Le, Thai H., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, and Arai, Kohei, editor
- Published
- 2023
- Full Text
- View/download PDF
8. Controllable Unsupervised Snow Synthesis by Latent Style Space Manipulation.
- Author
-
Yang, Hanting, Carballo, Alexander, Zhang, Yuxiao, and Takeda, Kazuya
- Subjects
GENERATIVE adversarial networks, LATENT variables, DECODING algorithms
- Abstract
In the field of intelligent vehicle technology, there is a high dependence on images captured under challenging conditions to develop robust perception algorithms. However, acquiring these images can be both time-consuming and dangerous. To address this issue, unpaired image-to-image translation models offer a solution by synthesizing samples of the desired domain, thus eliminating the reliance on ground truth supervision. However, the current methods predominantly focus on single projections rather than multiple solutions, not to mention controlling the direction of generation, which creates a scope for enhancement. In this study, we propose a generative adversarial network (GAN)–based model, which incorporates both a style encoder and a content encoder, specifically designed to extract relevant information from an image. Further, we employ a decoder to reconstruct an image using these encoded features, while ensuring that the generated output remains within a permissible range by applying a self-regression module to constrain the style latent space. By modifying the hyperparameters, we can generate controllable outputs with specific style codes. We evaluate the performance of our model by generating snow scenes on the Cityscapes and the EuroCity Persons datasets. The results reveal the effectiveness of our proposed methodology, thereby reinforcing the benefits of our approach in the ongoing evolution of intelligent vehicle technology. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
9. Improving Generative Adversarial Networks for Patch-Based Unpaired Image-to-Image Translation
- Author
-
Moritz Bohland, Roman Bruch, Simon Bauerle, Luca Rettenberger, and Markus Reischl
- Subjects
GAN, unpaired image-to-image translation, 3D image synthesis, stitching, CycleGAN, tiling, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Deep learning models for image segmentation achieve high-quality results, but need large amounts of training data. Training data is primarily annotated manually, which is time-consuming and often not feasible for large-scale 2D and 3D images. Manual annotation can be reduced using synthetic training data generated by generative adversarial networks that perform unpaired image-to-image translation. As of now, large images need to be processed patch-wise during inference, resulting in local artifacts in border regions after merging the individual patches. To reduce these artifacts, we propose a new method that integrates overlapping patches into the training process. We incorporated our method into CycleGAN and tested it on our new 2D tiling strategy benchmark dataset. The results show that the artifacts are reduced by 85% compared to state-of-the-art weighted tiling. While our method increases training time, inference time decreases. Additionally, we demonstrate transferability to real-world 3D biological image data, receiving a high-quality synthetic dataset. Increasing the quality of synthetic training datasets can reduce manual annotation, increase the quality of model output, and can help develop and evaluate deep learning models.
- Published
- 2023
- Full Text
- View/download PDF
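The patch-wise inference described in entry 9 hinges on blending overlapping patches so that border artifacts cancel out when the patches are merged. The following is a minimal PyTorch sketch of weighted tiling; the paper's exact weighting scheme is not given in the abstract, so a separable Hann window is assumed here, and the function names (`hann_window_2d`, `stitch_patches`) are illustrative only.

```python
import torch

def hann_window_2d(patch_size: int) -> torch.Tensor:
    # Separable 2D Hann window that down-weights patch borders.
    w = torch.hann_window(patch_size, periodic=False)
    return torch.outer(w, w)

def stitch_patches(patches, coords, image_hw, patch_size):
    """Blend translated patches back into one image.

    patches: list of (C, p, p) tensors; coords: matching (y, x) top-left corners.
    Overlapping patches are assumed; otherwise border pixels receive near-zero weight.
    """
    C = patches[0].shape[0]
    H, W = image_hw
    out = torch.zeros(C, H, W)
    weight = torch.zeros(1, H, W)
    win = hann_window_2d(patch_size).unsqueeze(0)  # (1, p, p)
    for patch, (y, x) in zip(patches, coords):
        out[:, y:y + patch_size, x:x + patch_size] += patch * win
        weight[:, y:y + patch_size, x:x + patch_size] += win
    return out / weight.clamp_min(1e-8)
```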
10. Attention-Aided Generative Learning for Multi-Scale Multi-Modal Fundus Image Translation
- Author
-
Van-Nguyen Pham, Duc-Tai Le, Junghyun Bum, Eun Jung Lee, Jong Chul Han, and Hyunseung Choo
- Subjects
Conventional fundus images, deep learning, generative learning, ophthalmology, unpaired image-to-image translation, ultra wide-field fundus images, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Conventional fundus images (CFIs) and ultra-widefield fundus images (UFIs) are two fundamental image modalities in ophthalmology. While CFIs provide a detailed view of the optic nerve head and the posterior pole of an eye, their clinical use is associated with high costs and patient inconvenience due to the requirement of good pupil dilation. On the other hand, UFIs capture peripheral lesions, but their image quality is sensitive to factors such as pupil size, eye position, and eyelashes, leading to greater variability between examinations compared to CFIs. The widefield retina view of UFIs offers the theoretical possibility of generating CFIs from available UFIs to reduce patient examination costs. A recent study has shown the feasibility of this approach by leveraging deep learning techniques for the UFI-to-CFI translation task. However, the technique suffers from the heterogeneous scales of the image modalities and variations in the brightness of the training data. In this paper, we address these issues with a novel framework consisting of three stages: cropping, enhancement, and translation. The first stage is an optic disc-centered cropping strategy that helps to alleviate the scale difference between the two image domains. The second stage mitigates the variation in training data brightness and unifies the mask between the two modalities. In the last stage, we introduce an attention-aided generative learning model to translate a given UFI into the CFI domain. Our experimental results demonstrate the success of the proposed method on 1,011 UFIs, with 99.8% of the generated CFIs evaluated as good quality and usable. Expert evaluations confirm significant visual quality improvements in the generated CFIs compared to the UFIs, ranging from 10% to 80% for features such as optic nerve structure, vascular distribution, and drusen. Furthermore, using generated CFIs in an AI-based diagnosis system for age-related macular degeneration results in superior accuracy compared to UFIs and competitive performance relative to real CFIs. These results showcase the potential of our approach for automatic disease diagnosis and monitoring.
- Published
- 2023
- Full Text
- View/download PDF
11. Ultra-High-Resolution Unpaired Stain Transformation via Kernelized Instance Normalization
- Author
-
Ho, Ming-Yang, Wu, Min-Sheng, Wu, Che-Ming, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
12. Exploring the Possibility of Measuring Vertebrae Bone Structure Metrics Using MDCT Images: An Unpaired Image-to-Image Translation Method.
- Author
-
Jin, Dan, Zheng, Han, and Yuan, Huishu
- Subjects
X-ray computed microtomography, COMPUTED tomography, EARLY diagnosis, TOMOGRAPHY, LUMBAR vertebrae, POSSIBILITY
- Abstract
Bone structure metrics are vital for the evaluation of vertebral bone strength. However, the gold standard for measuring bone structure metrics, micro-Computed Tomography (micro-CT), cannot be used in vivo, which hinders the early diagnosis of fragility fractures. This paper used an unpaired image-to-image translation method to capture the mapping between clinical multidetector computed tomography (MDCT) and micro-CT images and then generated micro-CT-like images to measure bone structure metrics. MDCT and micro-CT images were scanned from 75 human lumbar spine specimens and formed training and testing sets. The generator in the model focused on learning both the structure and detailed pattern of bone trabeculae and generating micro-CT-like images, and the discriminator determined whether the generated images were micro-CT images or not. Based on similarity metrics (i.e., SSIM and FID) and bone structure metrics (i.e., bone volume fraction, trabecular separation and trabecular thickness), a set of comparisons were performed. The results show that the proposed method can perform better in terms of both similarity metrics and bone structure metrics and the improvement is statistically significant. In particular, we compared the proposed method with the paired image-to-image method and analyzed the pros and cons of the method used. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. A Deep Learning Approach for Digital Color Reconstruction of Van Gogh’s Paintings Using Unpaired Areas Under the Frame
- Author
-
Parvathaneni, Tirumala Sai Ram, Pisarenco, Maxim, and Onvlee, Hans
- Published
- 2024
- Full Text
- View/download PDF
14. Towards semantically continuous unpaired image-to-image translation via margin adaptive contrastive learning and wavelet transform.
- Author
-
Zhang, Heng, Yang, Yi-Jun, and Zeng, Wei
- Subjects
WAVELET transforms, DISCRETE wavelet transforms, TRANSFORMER models, PETRI nets, TRANSLATING & interpreting
- Abstract
Unpaired image-to-image translation aims to preserve the semantics of the input image while mimicking the style of target domains without paired data. However, existing methods often suffer from semantic distortions if the source and target domains have large mismatched semantic distributions. To address semantic distortions in translation outputs without paired supervision, we propose a Margin Adaptive Contrastive Learning Network (MACL-Net) that drives contrastive learning as a local semantic descriptor while using a pre-trained Vision Transformer (ViT) as a global semantic descriptor to learn domain-invariant features in the translation process. Specifically, we design a novel margin adaptive contrastive loss to enforce intra-class compactness and inter-class discrepancy. Besides, to better retain the semantic structure of the translated image and improve its fidelity, we use Discrete Wavelet Transform (DWT) to supplement the low-frequency and high-frequency information of the input image into the generator, and effectively fuse the feedforward features and inversed frequency information through a novel normalization scheme, Feature-Frequency Transformation Normalization (FFTN). In terms of experimental results, MACL-Net effectively reduces semantic distortions and generates translation outputs that outperform state-of-the-art techniques both quantitatively and qualitatively. • We propose MACL-Net to reduce image artifacts and object hallucinations. • We propose a novel Margin Adaptive Contrastive Loss (MACL). • We design an Adaptive Frequency Fusion (AFF) module. • We design a novel normalization scheme, Feature-Frequency Transformation Normalization (FFTN). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
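Entry 14 builds on a margin-based contrastive objective to pull corresponding patches together and push unrelated ones apart. The abstract does not spell out the margin-adaptive formula, so the sketch below shows a generic InfoNCE-style loss with a fixed additive margin on the positive cosine similarity; the function name and the `margin`/`tau` values are assumptions, not the authors' settings.

```python
import torch
import torch.nn.functional as F

def margin_contrastive_loss(anchor, positive, negatives, margin=0.2, tau=0.07):
    """InfoNCE-style contrastive loss with an additive margin on the positive pair.

    anchor, positive: (N, D) feature vectors; negatives: (N, K, D).
    The margin makes the positive pair harder, encouraging intra-class compactness
    and inter-class discrepancy, as described in the abstract above.
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos = (anchor * positive).sum(dim=-1, keepdim=True) - margin      # (N, 1)
    neg = torch.einsum("nd,nkd->nk", anchor, negatives)               # (N, K)
    logits = torch.cat([pos, neg], dim=1) / tau
    labels = torch.zeros(anchor.size(0), dtype=torch.long)            # positive is index 0
    return F.cross_entropy(logits, labels)
```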
15. Controllable Unsupervised Snow Synthesis by Latent Style Space Manipulation
- Author
-
Hanting Yang, Alexander Carballo, Yuxiao Zhang, and Kazuya Takeda
- Subjects
intelligent vehicles, snow scenes, unpaired image-to-image translation, diversity, style latent space, Gaussian distribution, Chemical technology, TP1-1185
- Abstract
In the field of intelligent vehicle technology, there is a high dependence on images captured under challenging conditions to develop robust perception algorithms. However, acquiring these images can be both time-consuming and dangerous. To address this issue, unpaired image-to-image translation models offer a solution by synthesizing samples of the desired domain, thus eliminating the reliance on ground truth supervision. However, the current methods predominantly focus on single projections rather than multiple solutions, not to mention controlling the direction of generation, which creates a scope for enhancement. In this study, we propose a generative adversarial network (GAN)–based model, which incorporates both a style encoder and a content encoder, specifically designed to extract relevant information from an image. Further, we employ a decoder to reconstruct an image using these encoded features, while ensuring that the generated output remains within a permissible range by applying a self-regression module to constrain the style latent space. By modifying the hyperparameters, we can generate controllable outputs with specific style codes. We evaluate the performance of our model by generating snow scenes on the Cityscapes and the EuroCity Persons datasets. The results reveal the effectiveness of our proposed methodology, thereby reinforcing the benefits of our approach in the ongoing evolution of intelligent vehicle technology.
- Published
- 2023
- Full Text
- View/download PDF
16. Generating Synthesized CT from Cone-Beam Computed Tomography (CBCT) Using Artifact Disentanglement Network for Image-Guided Radiotherapy (IGRT)
- Author
-
Cheng, Hanlin, Liu, Jiwei, Liu, Jianfei, Mao, Ronghu, Sun, Pengjian, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zhang, Junjie James, Series Editor, Jia, Yingmin, editor, Zhang, Weicun, editor, and Fu, Yongling, editor
- Published
- 2021
- Full Text
- View/download PDF
17. Towards Fine-Grained Control over Latent Space for Unpaired Image-to-Image Translation
- Author
-
Luo, Lei, Hsu, William, Wang, Shangxian, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Farkaš, Igor, editor, Masulli, Paolo, editor, Otte, Sebastian, editor, and Wermter, Stefan, editor
- Published
- 2021
- Full Text
- View/download PDF
18. Cross-Domain Cascaded Deep Translation
- Author
-
Katzir, Oren, Lischinski, Dani, Cohen-Or, Daniel, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Vedaldi, Andrea, editor, Bischof, Horst, editor, Brox, Thomas, editor, and Frahm, Jan-Michael, editor
- Published
- 2020
- Full Text
- View/download PDF
19. Normalization of HE-stained histological images using cycle consistent generative adversarial networks
- Author
-
Marlen Runz, Daniel Rusche, Stefan Schmidt, Martin R. Weihrauch, Jürgen Hesser, and Cleo-Aron Weis
- Subjects
Histology stain normalization, HE-stain, Digital pathology, Generative adversarial networks, Unpaired image-to-image translation, Style transfer, Pathology, RB1-214
- Abstract
Background: Histological images show strong variance (e.g. illumination, color, staining quality) due to differences in image acquisition, tissue processing, staining, etc. This can impede downstream image analysis such as staining intensity evaluation or classification. Methods to reduce these variances are called image normalization techniques. Methods: In this paper, we investigate the potential of CycleGAN (cycle consistent Generative Adversarial Network) for color normalization in hematoxylin-eosin stained histological images using daily clinical data with consideration of the variability of internal staining protocol variations. The network consists of a generator network G_B that learns to map an image X from a source domain A to a target domain B, i.e. G_B : X_A → X_B. In addition, a discriminator network D_B is trained to distinguish whether an image from domain B is real or generated. The same process is applied to another generator-discriminator pair (G_A, D_A), for the inverse mapping G_A : X_B → X_A. Cycle consistency ensures that a generated image is close to its original when being mapped backwards (G_A(G_B(X_A)) ≈ X_A and vice versa). We validate the CycleGAN approach on a breast cancer challenge and a follicular thyroid carcinoma data set for various stain variations. We evaluate the quality of the generated images compared to the original images using similarity measures. In addition, we apply stain normalization on pathological lymph node data from our institute and test the gain from normalization on a ResNet classifier pre-trained on the Camelyon16 data set. Results: Qualitative results of the images generated by our network are compared to original color distributions. Our evaluation indicates that by mapping images to a target domain, the similarity to training images from that domain improves up to 96%. We also achieve a high cycle consistency for the generator networks by obtaining similarity indices greater than 0.9. When applying the CycleGAN normalization to HE-stain images from our institute, the kappa-value of the ResNet-model that is only trained on Camelyon16 data is increased by more than 50%. Conclusions: CycleGANs have proven to efficiently normalize HE-stained images. The approach compensates for deviations resulting from image acquisition (e.g. different scanning devices) as well as from tissue staining (e.g. different staining protocols), and thus overcomes the staining variations in images from various institutions. The code is publicly available at https://github.com/m4ln/stainTransfer_CycleGAN_pytorch. The data set supporting the solutions is available at https://doi.org/10.11588/data/8LKEZF.
- Published
- 2021
- Full Text
- View/download PDF
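The CycleGAN setup in entry 19 (and its duplicate records below) is specified tightly enough to sketch: two generators G_B : X_A → X_B and G_A : X_B → X_A, two discriminators, and a cycle-consistency constraint G_A(G_B(X_A)) ≈ X_A. A minimal PyTorch rendering of the generator-side objective follows; the least-squares adversarial term, the L1 cycle term, and the weight `lambda_cyc` are common defaults assumed here rather than details taken from the paper.

```python
import torch
import torch.nn.functional as F

def cyclegan_generator_loss(G_B, G_A, D_A, D_B, x_a, x_b, lambda_cyc=10.0):
    """Generator-side CycleGAN objective (sketch).

    G_B maps domain A -> B, G_A maps B -> A; D_A and D_B judge realism in each domain.
    """
    fake_b = G_B(x_a)                          # A -> B
    fake_a = G_A(x_b)                          # B -> A
    pred_b, pred_a = D_B(fake_b), D_A(fake_a)
    # Adversarial terms (least-squares GAN): fool the target-domain discriminator.
    adv = F.mse_loss(pred_b, torch.ones_like(pred_b)) + \
          F.mse_loss(pred_a, torch.ones_like(pred_a))
    # Cycle consistency: G_A(G_B(x_a)) ≈ x_a and G_B(G_A(x_b)) ≈ x_b.
    cyc = F.l1_loss(G_A(fake_b), x_a) + F.l1_loss(G_B(fake_a), x_b)
    return adv + lambda_cyc * cyc
```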
20. Exploring the Possibility of Measuring Vertebrae Bone Structure Metrics Using MDCT Images: An Unpaired Image-to-Image Translation Method
- Author
-
Dan Jin, Han Zheng, and Huishu Yuan
- Subjects
micro-CT-like images, unpaired image-to-image translation, vertebrae, bone structure, Technology, Biology (General), QH301-705.5
- Abstract
Bone structure metrics are vital for the evaluation of vertebral bone strength. However, the gold standard for measuring bone structure metrics, micro-Computed Tomography (micro-CT), cannot be used in vivo, which hinders the early diagnosis of fragility fractures. This paper used an unpaired image-to-image translation method to capture the mapping between clinical multidetector computed tomography (MDCT) and micro-CT images and then generated micro-CT-like images to measure bone structure metrics. MDCT and micro-CT images were scanned from 75 human lumbar spine specimens and formed training and testing sets. The generator in the model focused on learning both the structure and detailed pattern of bone trabeculae and generating micro-CT-like images, and the discriminator determined whether the generated images were micro-CT images or not. Based on similarity metrics (i.e., SSIM and FID) and bone structure metrics (i.e., bone volume fraction, trabecular separation and trabecular thickness), a set of comparisons were performed. The results show that the proposed method can perform better in terms of both similarity metrics and bone structure metrics and the improvement is statistically significant. In particular, we compared the proposed method with the paired image-to-image method and analyzed the pros and cons of the method used.
- Published
- 2023
- Full Text
- View/download PDF
21. Cycle-consistent generative adversarial network guided by dual dedicated attention mechanisms (基于双专用注意力机制引导的循环生成对抗网络).
- Author
-
劳俊明, 叶武剑, 刘怡俊, and 袁凯奕
- Subjects
GENERATIVE adversarial networks, MACHINE learning, RADARSAT satellites
- Published
- 2022
- Full Text
- View/download PDF
22. Not every sample is efficient: Analogical generative adversarial network for unpaired image-to-image translation.
- Author
-
Zheng, Ziqiang, Yang, Jie, Yu, Zhibin, Wang, Yubo, Sun, Zhijian, and Zheng, Bing
- Subjects
GENERATIVE adversarial networks, PROBABILISTIC generative models
- Abstract
Image translation is to learn an effective mapping function that aims to convert an image from a source domain to another target domain. With the proposal and further developments of generative adversarial networks (GANs), the generative models have achieved great breakthroughs. The image-to-image (I2I) translation methods can mainly fall into two categories: Paired and Unpaired. The former paired methods usually require a large number of input–output sample pairs to perform one-side image translation, which heavily limits their practicability. To address the lack of paired samples, CycleGAN and its extensions utilize the cycle-consistency loss to provide an elegant and generic solution to perform unpaired I2I translation between two domains based on unpaired data. This thread of dual learning-based methods usually adopts a random sampling strategy for optimization and does not consider the content similarity between samples. However, not every sample is efficient and effective for the desired optimization or leads to optimal convergence. Inspired by analogical learning, which is to utilize the relationships and similarities between sample observations, we propose a novel generic metric-based sampling strategy to effectively select samples from different domains for training. Besides, we introduce a novel analogical adversarial loss to force the model to learn from the effective samples and alleviate the influence of the negative samples. Experimental results on various vision tasks have demonstrated the superior performance of the proposed method. The proposed method is also a generic framework that can be easily extended to other I2I translation methods and result in a performance gain. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
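Entry 22 replaces random sampling with a metric-based strategy that pairs source samples with the most similar target-domain samples. The abstract does not name the metric, so the sketch below assumes cosine similarity over pre-extracted feature vectors; `select_analogical_pairs` is an illustrative name, not the authors' API.

```python
import torch
import torch.nn.functional as F

def select_analogical_pairs(feats_a: torch.Tensor, feats_b: torch.Tensor, k: int = 1):
    """Metric-based sampling sketch: for each source sample, pick the k most similar
    target-domain samples by cosine similarity of pre-extracted features.

    feats_a: (Na, D), feats_b: (Nb, D). Returns (Na, k) indices into domain B.
    """
    a = F.normalize(feats_a, dim=-1)
    b = F.normalize(feats_b, dim=-1)
    sim = a @ b.t()                        # (Na, Nb) pairwise cosine similarity
    return sim.topk(k, dim=1).indices      # indices of the k most similar B samples
```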
23. Multi‐head mutual‐attention CycleGAN for unpaired image‐to‐image translation
- Author
-
Wei Ji, Jing Guo, and Yun Li
- Subjects
unpaired image‐to‐image translation, source image domain, multihead mutual‐attention CycleGAN model, image size, multihead mutual‐attention mechanism, photorealistic images, Photography, TR1-1050, Computer software, QA76.75-76.765
- Abstract
Image‐to‐image translation, i.e. from a source image domain to a target image domain, has made significant progress in recent years. The most popular method for unpaired image‐to‐image translation is CycleGAN. However, it cannot always learn the key features in target domains accurately and rapidly, so the CycleGAN model learns slowly and the translation quality needs to be improved. In this study, a multi‐head mutual‐attention CycleGAN (MMA‐CycleGAN) model is proposed for unpaired image‐to‐image translation. In MMA‐CycleGAN, the cycle‐consistency loss and adversarial loss in CycleGAN are still used, but a mutual‐attention (MA) mechanism is introduced, which allows attention‐driven, long‐range dependency modelling between the two image domains. Moreover, to efficiently deal with the large image size, the MA is further improved to the multi‐head mutual‐attention (MMA) mechanism. On the other hand, domain labels are adopted to simplify the MMA‐CycleGAN architecture, so only one generator is required to perform bidirectional translation tasks. Experiments on multiple datasets demonstrate that MMA‐CycleGAN is able to learn rapidly and obtain photo‐realistic images in a shorter time than CycleGAN.
- Published
- 2020
- Full Text
- View/download PDF
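The multi-head mutual-attention block of entry 23 lets features from each domain attend to the other, giving the long-range, cross-domain dependency modelling the abstract describes. Below is a minimal sketch built from `torch.nn.MultiheadAttention`; the residual connections, head count, and the assumption that both feature maps share the same shape are choices made here, not details from the paper.

```python
import torch
import torch.nn as nn

class MutualAttention(nn.Module):
    """Multi-head mutual (cross) attention between two domains' feature maps (sketch).

    `channels` must be divisible by `num_heads` for nn.MultiheadAttention.
    """
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn_ab = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.attn_ba = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor):
        # feat_a, feat_b: (B, C, H, W) feature maps, assumed to have the same shape.
        B, C, H, W = feat_a.shape
        a = feat_a.flatten(2).transpose(1, 2)            # (B, H*W, C)
        b = feat_b.flatten(2).transpose(1, 2)
        a2b, _ = self.attn_ab(query=a, key=b, value=b)   # domain A attends to B
        b2a, _ = self.attn_ba(query=b, key=a, value=a)   # domain B attends to A
        out_a = (a + a2b).transpose(1, 2).reshape(B, C, H, W)   # residual connection
        out_b = (b + b2a).transpose(1, 2).reshape(B, C, H, W)
        return out_a, out_b
```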
24. Semi-supervised video-driven facial animation transfer for production.
- Author
-
Moser, Lucio, Chien, Chinyu, Williams, Mark, Serra, Jose, Hendler, Darren, and Roble, Doug
- Subjects
MOTION capture (Human mechanics), LINEAR operators, RENDERING (Computer graphics), FACIAL expression, PHYSIOGNOMY, PARAMETERIZATION
- Abstract
We propose a simple algorithm for automatic transfer of facial expressions, from videos to a 3D character, as well as between distinct 3D characters through their rendered animations. Our method begins by learning a common, semantically-consistent latent representation for the different input image domains using an unsupervised image-to-image translation model. It subsequently learns, in a supervised manner, a linear mapping from the character images' encoded representation to the animation coefficients. At inference time, given the source domain (i.e., actor footage), it regresses the corresponding animation coefficients for the target character. Expressions are automatically remapped between the source and target identities despite differences in physiognomy. We show how our technique can be used in the context of markerless motion capture with controlled lighting conditions, for one actor and for multiple actors. Additionally, we show how it can be used to automatically transfer facial animation between distinct characters without consistent mesh parameterization and without engineered geometric priors. We compare our method with standard approaches used in production and with recent state-of-the-art models on single camera face tracking. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
25. Normalization of HE-stained histological images using cycle consistent generative adversarial networks.
- Author
-
Runz, Marlen, Rusche, Daniel, Schmidt, Stefan, Weihrauch, Martin R., Hesser, Jürgen, and Weis, Cleo-Aron
- Subjects
GENERATIVE adversarial networks, BREAST cancer, IMAGE analysis, THYROID cancer, HEMATOXYLIN & eosin staining
- Abstract
Background: Histological images show strong variance (e.g. illumination, color, staining quality) due to differences in image acquisition, tissue processing, staining, etc. This can impede downstream image analysis such as staining intensity evaluation or classification. Methods to reduce these variances are called image normalization techniques. Methods: In this paper, we investigate the potential of CycleGAN (cycle consistent Generative Adversarial Network) for color normalization in hematoxylin-eosin stained histological images using daily clinical data with consideration of the variability of internal staining protocol variations. The network consists of a generator network G_B that learns to map an image X from a source domain A to a target domain B, i.e. G_B : X_A → X_B. In addition, a discriminator network D_B is trained to distinguish whether an image from domain B is real or generated. The same process is applied to another generator-discriminator pair (G_A, D_A), for the inverse mapping G_A : X_B → X_A. Cycle consistency ensures that a generated image is close to its original when being mapped backwards (G_A(G_B(X_A)) ≈ X_A and vice versa). We validate the CycleGAN approach on a breast cancer challenge and a follicular thyroid carcinoma data set for various stain variations. We evaluate the quality of the generated images compared to the original images using similarity measures. In addition, we apply stain normalization on pathological lymph node data from our institute and test the gain from normalization on a ResNet classifier pre-trained on the Camelyon16 data set. Results: Qualitative results of the images generated by our network are compared to original color distributions. Our evaluation indicates that by mapping images to a target domain, the similarity to training images from that domain improves up to 96%. We also achieve a high cycle consistency for the generator networks by obtaining similarity indices greater than 0.9. When applying the CycleGAN normalization to HE-stain images from our institute, the kappa-value of the ResNet-model that is only trained on Camelyon16 data is increased by more than 50%. Conclusions: CycleGANs have proven to efficiently normalize HE-stained images. The approach compensates for deviations resulting from image acquisition (e.g. different scanning devices) as well as from tissue staining (e.g. different staining protocols), and thus overcomes the staining variations in images from various institutions. The code is publicly available at https://github.com/m4ln/stainTransfer_CycleGAN_pytorch. The data set supporting the solutions is available at https://doi.org/10.11588/data/8LKEZF. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
26. Multi‐head mutual‐attention CycleGAN for unpaired image‐to‐image translation.
- Author
-
Ji, Wei, Guo, Jing, and Li, Yun
- Abstract
Image‐to‐image translation, i.e. from a source image domain to a target image domain, has made significant progress in recent years. The most popular method for unpaired image‐to‐image translation is CycleGAN. However, it cannot always learn the key features in target domains accurately and rapidly, so the CycleGAN model learns slowly and the translation quality needs to be improved. In this study, a multi‐head mutual‐attention CycleGAN (MMA‐CycleGAN) model is proposed for unpaired image‐to‐image translation. In MMA‐CycleGAN, the cycle‐consistency loss and adversarial loss in CycleGAN are still used, but a mutual‐attention (MA) mechanism is introduced, which allows attention‐driven, long‐range dependency modelling between the two image domains. Moreover, to efficiently deal with the large image size, the MA is further improved to the multi‐head mutual‐attention (MMA) mechanism. On the other hand, domain labels are adopted to simplify the MMA‐CycleGAN architecture, so only one generator is required to perform bidirectional translation tasks. Experiments on multiple datasets demonstrate that MMA‐CycleGAN is able to learn rapidly and obtain photo‐realistic images in a shorter time than CycleGAN. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
27. FFPE++: Improving the quality of formalin-fixed paraffin-embedded tissue imaging via contrastive unpaired image-to-image translation.
- Author
-
Kassab, Mohamad, Jehanzaib, Muhammad, Başak, Kayhan, Demir, Derya, Keles, G. Evren, and Turan, Mehmet
- Subjects
TURING test, SQUAMOUS cell carcinoma, CLINICAL pathology, THYROID cancer, PAPILLARY carcinoma, GENERATIVE adversarial networks
- Abstract
Formalin-fixation and paraffin-embedding (FFPE) is a technique for preparing and preserving tissue specimens that has been utilized in histopathology since the late 19th century. This process is further complicated by FFPE preparation steps such as fixation, processing, embedding, microtomy, staining, and coverslipping, which often results in artifacts due to the complex histological and cytological characteristics of a tissue specimen. The term "artifacts" includes, but is not limited to, staining inconsistencies, tissue folds, chattering, pen marks, blurring, air bubbles, and contamination. The presence of artifacts may interfere with pathological diagnosis in disease detection, subtyping, grading, and choice of therapy. In this study, we propose FFPE++, an unpaired image-to-image translation method based on contrastive learning with a mixed channel-spatial attention module and self-regularization loss that drastically corrects the aforementioned artifacts in FFPE tissue sections. Turing tests were performed by 10 board-certified pathologists with more than 10 years of experience. These tests, which were performed for ovarian carcinoma, lung adenocarcinoma, lung squamous cell carcinoma, and papillary thyroid carcinoma, demonstrate the clear superiority of the proposed method in many clinical aspects compared with standard FFPE images. Based on the qualitative experiments and feedback from the Turing tests, we believe that FFPE++ can contribute to substantial diagnostic and prognostic accuracy in clinical pathology in the future and can also improve the performance of AI tools in digital pathology. The code and dataset are publicly available at https://github.com/DeepMIALab/FFPEPlus. • FFPE++ is unpaired image-to-image translation that corrects histologic artifacts. • FFPE++ incorporates contrastive learning to obtain high-quality FFPE images. • Mixed-channel spatial attention is used to achieve accurate morphological structures. • Cancer Genome Atlas (TCGA) slides were used for the study. • Three surveys were performed to determine the clinical effectiveness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
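Entry 27 mentions a mixed channel-spatial attention module inside its contrastive translation network. The abstract gives no internals, so the sketch below follows the widely used CBAM pattern (channel attention from pooled descriptors, then spatial attention from channel-pooled maps); the class name and reduction ratio are assumptions, and the authors' actual module in the linked repository may differ.

```python
import torch
import torch.nn as nn

class MixedChannelSpatialAttention(nn.Module):
    """CBAM-style mixed channel-spatial attention block (generic sketch)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel attention from global average- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention from channel-wise average and max maps.
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```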
28. Exploring the Possibility of Measuring Vertebrae Bone Structure Metrics Using MDCT Images: An Unpaired Image-to-Image Translation Method
- Author
-
Dan Jin, Han Zheng, and Huishu Yuan
- Subjects
micro-CT-like images, unpaired image-to-image translation, vertebrae, bone structure
- Abstract
Bone structure metrics are vital for the evaluation of vertebral bone strength. However, the gold standard for measuring bone structure metrics, micro-Computed Tomography (micro-CT), cannot be used in vivo, which hinders the early diagnosis of fragility fractures. This paper used an unpaired image-to-image translation method to capture the mapping between clinical multidetector computed tomography (MDCT) and micro-CT images and then generated micro-CT-like images to measure bone structure metrics. MDCT and micro-CT images were scanned from 75 human lumbar spine specimens and formed training and testing sets. The generator in the model focused on learning both the structure and detailed pattern of bone trabeculae and generating micro-CT-like images, and the discriminator determined whether the generated images were micro-CT images or not. Based on similarity metrics (i.e., SSIM and FID) and bone structure metrics (i.e., bone volume fraction, trabecular separation and trabecular thickness), a set of comparisons were performed. The results show that the proposed method can perform better in terms of both similarity metrics and bone structure metrics and the improvement is statistically significant. In particular, we compared the proposed method with the paired image-to-image method and analyzed the pros and cons of the method used.
- Published
- 2023
- Full Text
- View/download PDF
29. Normalization of HE-stained histological images using cycle consistent generative adversarial networks
- Author
-
Jürgen Hesser, Martin R. Weihrauch, Daniel Rusche, Stefan Schmidt, Marlen Runz, and Cleo-Aron Weis
- Subjects
Normalization (statistics), Pathology, medicine.medical_specialty, Histology, Generative adversarial networks, H&E stain, Color, Breast Neoplasms, Pathology and Forensic Medicine, Style transfer, Histology stain normalization, chemistry.chemical_compound, Unpaired image-to-image translation, Adenocarcinoma, Follicular, HE-stain, medicine, Image Processing, Computer-Assisted, Humans, Digital pathology, RB1-214, Thyroid Neoplasms, Coloring Agents, Hematoxylin, Mathematics, Models, Statistical, Eosin, Staining and Labeling, Research, Reproducibility of Results, Deep learning, General Medicine, chemistry, Standard protocol, Eosine Yellowish-(YS), Female
- Abstract
Background: Histological images show strong variance (e.g. illumination, color, staining quality) due to differences in image acquisition, tissue processing, staining, etc. This can impede downstream image analysis such as staining intensity evaluation or classification. Methods to reduce these variances are called image normalization techniques. Methods: In this paper, we investigate the potential of CycleGAN (cycle consistent Generative Adversarial Network) for color normalization in hematoxylin-eosin stained histological images using daily clinical data with consideration of the variability of internal staining protocol variations. The network consists of a generator network G_B that learns to map an image X from a source domain A to a target domain B, i.e. G_B : X_A → X_B. In addition, a discriminator network D_B is trained to distinguish whether an image from domain B is real or generated. The same process is applied to another generator-discriminator pair (G_A, D_A), for the inverse mapping G_A : X_B → X_A. Cycle consistency ensures that a generated image is close to its original when being mapped backwards (G_A(G_B(X_A)) ≈ X_A and vice versa). We validate the CycleGAN approach on a breast cancer challenge and a follicular thyroid carcinoma data set for various stain variations. We evaluate the quality of the generated images compared to the original images using similarity measures. In addition, we apply stain normalization on pathological lymph node data from our institute and test the gain from normalization on a ResNet classifier pre-trained on the Camelyon16 data set. Results: Qualitative results of the images generated by our network are compared to original color distributions. Our evaluation indicates that by mapping images to a target domain, the similarity to training images from that domain improves up to 96%. We also achieve a high cycle consistency for the generator networks by obtaining similarity indices greater than 0.9. When applying the CycleGAN normalization to HE-stain images from our institute, the kappa-value of the ResNet-model that is only trained on Camelyon16 data is increased by more than 50%. Conclusions: CycleGANs have proven to efficiently normalize HE-stained images. The approach compensates for deviations resulting from image acquisition (e.g. different scanning devices) as well as from tissue staining (e.g. different staining protocols), and thus overcomes the staining variations in images from various institutions. The code is publicly available at https://github.com/m4ln/stainTransfer_CycleGAN_pytorch. The data set supporting the solutions is available at https://doi.org/10.11588/data/8LKEZF.
- Published
- 2021
30. Multi‐head mutual‐attention CycleGAN for unpaired image‐to‐image translation
- Author
-
Yun Li, Wei Ji, and Jing Guo
- Subjects
Computer science, Image processing, 02 engineering and technology, multihead mutual‐attention mechanism, Translation (geometry), source image domain, Domain (software engineering), Image (mathematics), QA76.75-76.765, 0202 electrical engineering, electronic engineering, information engineering, Photography, Computer software, Electrical and Electronic Engineering, Language translation, TR1-1050, Image resolution, business.industry, unpaired image‐to‐image translation, 020206 networking & telecommunications, Pattern recognition, image size, photorealistic images, multihead mutual‐attention CycleGAN model, Signal Processing, Image translation, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, business, Software, Generator (mathematics)
- Abstract
Image-to-image translation, i.e. from a source image domain to a target image domain, has made significant progress in recent years. The most popular method for unpaired image-to-image translation is CycleGAN. However, it cannot always learn the key features in target domains accurately and rapidly, so the CycleGAN model learns slowly and the translation quality needs to be improved. In this study, a multi-head mutual-attention CycleGAN (MMA-CycleGAN) model is proposed for unpaired image-to-image translation. In MMA-CycleGAN, the cycle-consistency loss and adversarial loss in CycleGAN are still used, but a mutual-attention (MA) mechanism is introduced, which allows attention-driven, long-range dependency modelling between the two image domains. Moreover, to efficiently deal with the large image size, the MA is further improved to the multi-head mutual-attention (MMA) mechanism. On the other hand, domain labels are adopted to simplify the MMA-CycleGAN architecture, so only one generator is required to perform bidirectional translation tasks. Experiments on multiple datasets demonstrate that MMA-CycleGAN is able to learn rapidly and obtain photo-realistic images in a shorter time than CycleGAN.
- Published
- 2020
31. AttentionGAN: Unpaired Image-to-Image Translation Using Attention-Guided Generative Adversarial Networks
- Author
-
Hong Liu, Nicu Sebe, Dan Xu, Philip H. S. Torr, and Hao Tang
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Networks and Communications, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Attention guided, generative adversarial networks (GANs), unpaired image-to-image translation, Semantics, Translation (geometry), Machine Learning (cs.LG), Domain (software engineering), Image (mathematics), Discriminative model, Artificial Intelligence, FOS: Electrical engineering, electronic engineering, information engineering, Code (cryptography), Visual artifact, business.industry, Image and Video Processing (eess.IV), Pattern recognition, Electrical Engineering and Systems Science - Image and Video Processing, Computer Science Applications, Image translation, Artificial intelligence, business, Software
- Abstract
State-of-the-art methods in image-to-image translation are capable of learning a mapping from a source domain to a target domain with unpaired image data. Though the existing methods have achieved promising results, they still produce visual artifacts, being able to translate low-level information but not high-level semantics of input images. One possible reason is that generators do not have the ability to perceive the most discriminative parts between the source and target domains, thus making the generated images low quality. In this paper, we propose a new Attention-Guided Generative Adversarial Networks (AttentionGAN) for the unpaired image-to-image translation task. AttentionGAN can identify the most discriminative foreground objects and minimize the change of the background. The attention-guided generators in AttentionGAN are able to produce attention masks, and then fuse the generation output with the attention masks to obtain high-quality target images. Accordingly, we also design a novel attention-guided discriminator which only considers attended regions. Extensive experiments are conducted on several generative tasks with eight public datasets, demonstrating that the proposed method is effective to generate sharper and more realistic images compared with existing competitive models. The code is available at https://github.com/Ha0Tang/AttentionGAN. Accepted to TNNLS; an extended version of a paper published in IJCNN 2019.
- Published
- 2021
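Entry 31's key step is fusing generated foreground content with learned attention masks while keeping the background of the input image untouched. The sketch below shows one plausible form of that fusion with K foreground masks plus one background mask; the tensor layout and the softmax normalisation are assumptions made here, and the authors' exact scheme is in the repository linked in the abstract.

```python
import torch

def fuse_with_attention(content, foregrounds, masks):
    """Mask-weighted fusion of generated foregrounds with the input background (sketch).

    content:      input image, shape (B, C, H, W)
    foregrounds:  generated foreground candidates, shape (B, K, C, H, W)
    masks:        attention logits, shape (B, K+1, H, W); the last mask keeps the background
    """
    masks = torch.softmax(masks, dim=1)                       # masks sum to 1 per pixel
    fg_masks = masks[:, :-1].unsqueeze(2)                     # (B, K, 1, H, W)
    bg_mask = masks[:, -1:].unsqueeze(2)                      # (B, 1, 1, H, W)
    foreground = (fg_masks * foregrounds).sum(dim=1)          # attended generated content
    background = (bg_mask * content.unsqueeze(1)).sum(dim=1)  # untouched input regions
    return foreground + background                            # (B, C, H, W)
```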
32. Cross-domain object detection using unsupervised image translation.
- Author
-
Arruda, Vinicius F., Berriel, Rodrigo F., Paixão, Thiago M., Badue, Claudine, De Souza, Alberto F., Sebe, Nicu, and Oliveira-Santos, Thiago
- Subjects
GENERATIVE adversarial networks, AUTONOMOUS vehicles
- Abstract
Unsupervised domain adaptation for object detection addresses the adaptation of detectors trained in a source domain to work accurately in an unseen target domain. Recently, methods approaching the alignment of the intermediate features have proven to be promising, achieving state-of-the-art results. However, these methods are laborious to implement and hard to interpret. Although promising, there is still room for improvements to close the performance gap toward the upper-bound (when training with the target data). In this work, we propose a method to generate an artificial dataset in the target domain to train an object detector. We employed two unsupervised image translators (CycleGAN and an AdaIN-based model) using only annotated data from the source domain and non-annotated data from the target domain. Our key contributions are the proposal of a less complex yet more effective method that also has improved interpretability. Results on real-world scenarios for autonomous driving show significant improvements, outperforming state-of-the-art methods in most cases, further closing the gap toward the upper-bound. • A simple yet effective method for detecting objects on unsupervised domain adaptation. • Artificially generated images are useful for unsupervised domain adaptation. • An extensive comparison with the state-of-the-art is provided. • Experiments in three scenarios: synthetic data, adverse weather, and cross-camera. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
33. EdgeGAN: One-way mapping generative adversarial network based on the edge information for unpaired training set.
- Author
-
Li, Yijie, Liang, Qiaokang, Li, Zhengwei, Lei, Youcheng, Sun, Wei, Wang, Yaonan, and Zhang, Dan
- Subjects
GENERATIVE adversarial networks, IMAGE converters, SUBSET selection, EDGE detection (Image processing), PARAMETERS (Statistics)
- Abstract
Image conversion has attracted mounting attention due to its practical applications. This paper proposes a lightweight network structure that can be trained on unpaired training sets to complete one-way image mapping, based on the generative adversarial network (GAN) and a fixed-parameter edge detection convolution kernel. Compared with the cycle consistent adversarial network (CycleGAN), the proposed network features a simpler structure, fewer parameters (only 37.48% of the parameters in CycleGAN), and less training cost (only 35.47% of the GPU memory usage and 17.67% of the single iteration time in CycleGAN). Remarkably, cyclic consistency is no longer mandatory for ensuring the consistency of the content before and after image mapping. This network has achieved significant processing effects in some image translation tasks, and its effectiveness and validity have been well demonstrated through typical experiments. In the quantitative classification evaluation based on VGG-16, the algorithm proposed in this paper has achieved superior performance. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
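Entry 33 relies on a fixed-parameter edge-detection convolution to keep content consistent through a one-way mapping, which is why cycle consistency is no longer mandatory. The abstract does not state which kernel is fixed; the sketch below assumes Sobel kernels and an L1 penalty between edge maps, with illustrative function names.

```python
import torch
import torch.nn.functional as F

# Fixed (non-trainable) Sobel kernels; the paper only says the edge-detection
# convolution has fixed parameters, so Sobel is an assumption here.
SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)

def edge_map(img: torch.Tensor) -> torch.Tensor:
    """Per-channel gradient magnitude of an image batch (B, C, H, W)."""
    B, C, H, W = img.shape
    kx = SOBEL_X.repeat(C, 1, 1, 1)
    ky = SOBEL_Y.repeat(C, 1, 1, 1)
    gx = F.conv2d(img, kx, padding=1, groups=C)   # depthwise horizontal gradients
    gy = F.conv2d(img, ky, padding=1, groups=C)   # depthwise vertical gradients
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def edge_consistency_loss(real: torch.Tensor, fake: torch.Tensor) -> torch.Tensor:
    # Penalise differences in edge structure before and after translation,
    # standing in for cycle consistency in a one-way mapping.
    return F.l1_loss(edge_map(fake), edge_map(real))
```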