46 results for "Shim, Hyunjung"
Search Results
2. Entropy regularization for weakly supervised object localization
- Author
-
Hwang, Dongjun, Ha, Jung-Woo, Shim, Hyunjung, and Choe, Junsuk
- Published
- 2023
- Full Text
- View/download PDF
3. CoMix: Collaborative filtering with mixup for implicit datasets
- Author
-
Moon, Jaewan, Jeong, Yoonki, Chae, Dong-Kyu, Choi, Jaeho, Shim, Hyunjung, and Lee, Jongwuk
- Published
- 2023
- Full Text
- View/download PDF
4. Knowledge distillation meets recommendation: collaborative distillation for top-N recommendation
- Author
-
Lee, Jae-woong, Choi, Minjin, Sael, Lee, Shim, Hyunjung, and Lee, Jongwuk
- Published
- 2022
- Full Text
- View/download PDF
5. Distilling from professors: Enhancing the knowledge distillation of teachers
- Author
-
Bang, Duhyeon, Lee, Jongwuk, and Shim, Hyunjung
- Published
- 2021
- Full Text
- View/download PDF
6. Region-based dropout with attention prior for weakly supervised object localization
- Author
-
Choe, Junsuk, Han, Dongyoon, Yun, Sangdoo, Ha, Jung-Woo, Oh, Seong Joon, and Shim, Hyunjung
- Published
- 2021
- Full Text
- View/download PDF
7. GridMix: Strong regularization through local context mapping
- Author
-
Baek, Kyungjune, Bang, Duhyeon, and Shim, Hyunjung
- Published
- 2021
- Full Text
- View/download PDF
8. Adapting low‐dose CT denoisers for texture preservation using zero‐shot local noise‐level matching.
- Author
-
Ko, Youngjun, Song, Seongjong, Baek, Jongduk, and Shim, Hyunjung
- Subjects
IMAGE denoising, COMPUTED tomography, SUPERCONDUCTING quantum interference devices, DEEP learning, RADIOLOGISTS - Abstract
Background: In enhancing the image quality of low‐dose computed tomography (LDCT), various denoising methods have achieved meaningful improvements. However, they commonly produce over‐smoothed results; the denoised images tend to be more blurred than the normal‐dose targets (NDCTs). Furthermore, many recent denoising methods employ deep learning (DL)‐based models, which require a vast number of CT images (or image pairs). Purpose: Our goal is to address the problem of over‐smoothed results and to design an algorithm that achieves plausible denoising without requiring a large training dataset. Over‐smoothed images negatively affect diagnosis and treatment because radiologists have built their clinical experience on NDCT. Moreover, a large‐scale training dataset is often not available in clinical situations. To overcome these limitations, we propose locally‐adaptive noise‐level matching (LANCH), which enforces that the output retain the same noise level and characteristics as the NDCT without additional training. Methods: We represent the NDCT image as the pixel‐wise weighted sum of an over‐smoothed output from an off‐the‐shelf denoiser (OSD) and the difference between the LDCT image and the OSD output. LANCH determines a 2D ratio map (i.e., a pixel‐wise weight matrix) by locally matching the noise level of the output to that of the NDCT, where the LDCT‐to‐NDCT device flux (mAs) ratio reveals the NDCT noise level. Thereby, LANCH can preserve important details in the LDCT and enhance the sharpness of noise‐free regions. Note that LANCH can enhance any LDCT denoiser without additional training data (i.e., zero‐shot). Results: The proposed method is applicable to any OSD, yielding significant improvements in texture plausibility over the baseline denoisers both quantitatively and qualitatively. Notably, the denoising accuracy achieved by our method with a zero‐shot denoiser was comparable or superior to that of the best training‐based denoisers; our results showed 1% and 33% gains in terms of SSIM and DISTS, respectively. A reader study with experienced radiologists showed significant image quality improvements, a gain of +1.18 on a five‐point mean opinion score scale. Conclusions: We propose a technique to enhance any low‐dose CT denoiser by leveraging the fundamental physical relationship between x‐ray flux and noise variance. Our method operates in a zero‐shot condition: only a single low‐dose CT image is required for the enhancement process. We demonstrate that our approach is comparable or even superior to supervised DL‐based denoisers trained on numerous CT images. Extensive experiments illustrate that our method consistently improves the performance of all tested LDCT denoisers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
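The pixel-wise weighting rule described in the LANCH abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the box-window local noise estimate, the window size, and all function names are my assumptions; only the flux-to-noise relation (noise variance scaling as 1/mAs) comes from the abstract.

```python
import numpy as np

def local_std(img, k=7):
    """Local standard deviation via a k-by-k sliding window (edges reflected)."""
    pad = k // 2
    p = np.pad(img, pad, mode="reflect")
    win = np.lib.stride_tricks.sliding_window_view(p, (k, k))
    mean = win.mean(axis=(-1, -2))
    sq = (win ** 2).mean(axis=(-1, -2))
    return np.sqrt(np.maximum(sq - mean ** 2, 0.0))

def lanch(ldct, osd_out, mas_ratio, k=7, eps=1e-8):
    """Zero-shot local noise-level matching (sketch of the abstract's idea).

    ldct      : low-dose image
    osd_out   : over-smoothed output of any off-the-shelf denoiser (OSD)
    mas_ratio : LDCT-to-NDCT flux ratio (mAs_low / mAs_normal), < 1
    """
    residual = ldct - osd_out            # removed noise plus lost texture
    sigma_res = local_std(residual, k)   # local noise level of the residual
    # Noise variance scales as 1/mAs, so sigma_ndct = sigma_ldct * sqrt(mas_ratio)
    sigma_target = sigma_res * np.sqrt(mas_ratio)
    # Pixel-wise ratio map: add back just enough residual to reach the target level
    m = np.clip(sigma_target / (sigma_res + eps), 0.0, 1.0)
    return osd_out + m * residual
```

Because the ratio map is computed directly from the two images and the mAs ratio, no training data is involved, which is what makes the approach zero-shot.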
9. Discriminator Feature-Based Inference by Recycling the Discriminator of GANs
- Author
-
Bang, Duhyeon, Kang, Seoungyoon, and Shim, Hyunjung
- Published
- 2020
- Full Text
- View/download PDF
10. Semantic-aware neural style transfer
- Author
-
Park, Joo Hyun, Park, Song, and Shim, Hyunjung
- Published
- 2019
- Full Text
- View/download PDF
11. Robust approach to inverse lighting using RGB-D images
- Author
-
Choe, Junsuk and Shim, Hyunjung
- Published
- 2018
- Full Text
- View/download PDF
12. Mismatched image identification using histogram of loop closure error for feature-based optical mapping
- Author
-
Elibol, Armagan, Chong, Nak-Young, Shim, Hyunjung, Kim, Jinwhan, Gracias, Nuno, and Garcia, Rafael
- Published
- 2019
- Full Text
- View/download PDF
13. Skewed stereo time-of-flight camera for translucent object imaging
- Author
-
Lee, Seungkyu and Shim, Hyunjung
- Published
- 2015
- Full Text
- View/download PDF
14. Utilization of an attentive map to preserve anatomical features for training convolutional neural‐network‐based low‐dose CT denoiser.
- Author
-
Han, Minah, Shim, Hyunjung, and Baek, Jongduk
- Subjects
CONVOLUTIONAL neural networks, SUPERVISED learning, COMPUTED tomography - Abstract
Background: The purpose of a convolutional neural network (CNN)‐based denoiser is to increase the diagnostic accuracy of low‐dose computed tomography (LDCT) imaging. To increase diagnostic accuracy, a method is needed that reflects diagnosis‐related features during the denoising process. Purpose: To provide a training strategy for LDCT denoisers that relies more on diagnostic task‐related features to improve diagnostic accuracy. Methods: An attentive map derived from a lesion classifier (i.e., determining whether a lesion is present or not) is created to represent the extent to which each pixel influences the classifier's decision. This map is used as a weight to emphasize important parts of the image. The proposed training method consists of two steps. In the first step, the initial parameters of the CNN denoiser are trained using LDCT and normal‐dose CT image pairs via supervised learning. In the second step, the learned parameters are readjusted using the attentive map to restore the fine details of the image. Results: Structural details and contrast are better preserved in images generated by the denoiser trained via the proposed method than in those generated by conventional denoisers. The proposed denoiser also yields higher lesion detectability and localization accuracy than conventional denoisers. Conclusions: A denoiser trained using the proposed method preserves small structures and contrast in the denoised images better than a conventionally trained one. Specifically, using the attentive map improves the lesion detectability and localization accuracy of the denoiser. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
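The second training phase described above amounts to re-weighting a pixel-level loss by the classifier-derived attentive map. A minimal sketch, assuming a simple `1 + alpha * attention` weighting (the exact form used in the paper is not given in the abstract):

```python
import numpy as np

def attentive_weighted_mse(denoised, target, attention, alpha=1.0):
    """Pixel-wise weighted MSE, with attended pixels counting more.

    attention : map in [0, 1] marking pixels that drive the lesion classifier
    alpha     : how strongly attended pixels are emphasized (illustrative)
    """
    weight = 1.0 + alpha * attention
    return np.mean(weight * (denoised - target) ** 2)
```

With `attention` identically zero this reduces to plain MSE, so the second phase only *reshapes* the first-phase objective rather than replacing it.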
15. Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data
- Author
-
Baek, Kyungjune and Shim, Hyunjung
- Subjects
FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition - Abstract
Transfer learning for GANs successfully improves generation performance under low-shot regimes. However, existing studies show that a model pretrained on a single benchmark dataset does not generalize to various target datasets. More importantly, the pretrained model can be vulnerable to copyright or privacy risks as membership inference attacks advance. To resolve both issues, we propose an effective and unbiased data synthesizer, namely Primitives-PS, inspired by the generic characteristics of natural images. Specifically, we utilize 1) the generic statistics of the frequency magnitude spectrum, 2) elementary shapes (i.e., image composition via elementary shapes) for representing structure information, and 3) the existence of saliency as a prior. Since our synthesizer only considers the generic properties of natural images, a single model pretrained on our dataset can be consistently transferred to various target datasets, and even outperforms previous methods pretrained on natural images in terms of Fréchet inception distance. Extensive analysis, ablation studies, and evaluations demonstrate that each component of our data synthesizer is effective, and provide insights into the desirable properties of a pretrained model for the transferability of GANs. (CVPR 2022)
- Published
- 2022
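The three ingredients listed in the abstract (frequency statistics, elementary shapes, saliency) can be combined into a toy synthesizer like the one below. All concrete choices here, a 1/f magnitude spectrum, random rectangles, and a single bright disk, are my illustrative stand-ins, not the paper's actual primitives:

```python
import numpy as np

def synth_primitive(size=64, n_shapes=5, rng=None):
    """Generate one synthetic pretraining image in the spirit of Primitives-PS."""
    if rng is None:
        rng = np.random.default_rng()
    # 1) background with a 1/f magnitude spectrum and random phase
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    freq = np.sqrt(fy ** 2 + fx ** 2)
    freq[0, 0] = 1.0  # avoid division by zero at DC
    phase = np.exp(2j * np.pi * rng.random((size, size)))
    img = np.real(np.fft.ifft2(phase / freq))
    # 2) overlay random rectangles (elementary shapes for structure)
    for _ in range(n_shapes):
        y, x = rng.integers(0, size - 8, 2)
        h, w = rng.integers(4, 16, 2)
        img[y:y + h, x:x + w] = rng.normal()
    # 3) a single salient bright disk
    cy, cx = rng.integers(size // 4, 3 * size // 4, 2)
    yy, xx = np.ogrid[:size, :size]
    img[(yy - cy) ** 2 + (xx - cx) ** 2 < (size // 8) ** 2] += img.std() * 3
    # normalize to [0, 1]
    return (img - img.min()) / (img.max() - img.min() + 1e-8)
```

Since nothing here depends on any real photograph, images produced this way carry no copyright or membership-privacy payload, which is the property the abstract emphasizes.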
16. Online underwater optical mapping for trajectories with gaps
- Author
-
Elibol, Armagan, Shim, Hyunjung, Hong, Seonghun, Kim, Jinwhan, Gracias, Nuno, and Garcia, Rafael
- Published
- 2016
- Full Text
- View/download PDF
17. Threshold Matters in WSSS: Manipulating the Activation for the Robust and Accurate Segmentation Model Against Thresholds
- Author
-
Lee, Minhyun, Kim, Dongseob, and Shim, Hyunjung
- Subjects
FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition - Abstract
Weakly-supervised semantic segmentation (WSSS) has recently gained much attention for its promise to train segmentation models with only image-level labels. Existing WSSS methods commonly argue that the sparse coverage of CAM is the performance bottleneck of WSSS. This paper provides analytical and empirical evidence that the actual bottleneck may not be sparse coverage but the global thresholding scheme applied after CAM. We then show that this issue can be mitigated by satisfying two conditions: 1) reducing the imbalance in foreground activation and 2) increasing the gap between foreground and background activation. Based on these findings, we propose a novel activation manipulation network with a per-pixel classification loss and a label conditioning module. Per-pixel classification naturally induces two-level activation in activation maps, which can penalize the most discriminative parts, promote the less discriminative parts, and deactivate background regions. Label conditioning imposes that the output labels of pseudo-masks must be among the true image-level labels; it penalizes wrong activation assigned to non-target classes. Through extensive analysis and evaluations, we demonstrate that each component helps produce accurate pseudo-masks, achieving robustness against the choice of the global threshold. Finally, our model achieves state-of-the-art records on both the PASCAL VOC 2012 and MS COCO 2014 datasets. (CVPR 2022)
- Published
- 2022
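For context, the global thresholding scheme that the abstract identifies as the bottleneck is the standard CAM-to-pseudo-mask step sketched below. This is the baseline being criticized, not the proposed activation manipulation network; the label-masking line merely mirrors the spirit of label conditioning:

```python
import numpy as np

def cam_to_pseudo_mask(cams, image_labels, threshold=0.3):
    """Baseline pseudo-mask generation with a single global threshold.

    cams         : (C, H, W) class activation maps, one per class
    image_labels : boolean (C,) image-level labels; only true classes may fire
    threshold    : one global threshold applied to max-normalized activations
    """
    cams = cams.copy()
    cams[~image_labels] = 0.0                  # restrict to true image-level labels
    peak = cams.max(axis=(1, 2), keepdims=True)
    norm = cams / np.maximum(peak, 1e-8)       # per-class max-normalization
    fg = norm.max(axis=0)                      # strongest class score per pixel
    # pixels below the global threshold become background (-1)
    return np.where(fg > threshold, norm.argmax(axis=0), -1)
```

The paper's observation is that one `threshold` value must serve every image, so imbalanced foreground activation or a small foreground-background gap makes the resulting masks fragile.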
18. Automatic color realism enhancement for computer generated images
- Author
-
Shim, Hyunjung and Lee, Seungkyu
- Published
- 2012
- Full Text
- View/download PDF
19. A streak artifact reduction algorithm in sparse‐view CT using a self‐supervised neural representation.
- Author
-
Kim, Byeongjoon, Shim, Hyunjung, and Baek, Jongduk
- Subjects
IMAGE reconstruction algorithms, TRANSFER functions, INVERSE problems, COMPUTED tomography, IMAGE reconstruction, INSPECTION & review, LUNGS, TORSO - Abstract
Purpose: Sparse‐view computed tomography (CT) has been attracting attention for its reduced radiation dose and scanning time. However, analytical image reconstruction methods suffer from streak artifacts due to insufficient projection views. Recently, various deep learning‐based methods have been developed to solve this ill‐posed inverse problem. Despite their promising results, they are easily overfitted to the training data, showing limited generalizability to unseen systems and patients. In this work, we propose a novel streak artifact reduction algorithm that provides a system‐ and patient‐specific solution. Methods: Motivated by the fact that streak artifacts are deterministic errors, we regenerate the same artifacts from a prior CT image under the same system geometry. This prior image need not be perfect but should contain patient‐specific information and be consistent with full‐view projection data for accurate regeneration of the artifacts. To this end, we use a coordinate‐based neural representation that often causes image blur but can greatly suppress the streak artifacts while having multiview consistency. By employing techniques in neural radiance fields originally proposed for scene representations, the neural representation is optimized to the measured sparse‐view projection data via self‐supervised learning. Then, we subtract the regenerated artifacts from the analytically reconstructed original image to obtain the final corrected image. Results: To validate the proposed method, we used simulated data of extended cardiac‐torso phantoms and the 2016 NIH‐AAPM‐Mayo Clinic Low‐Dose CT Grand Challenge and experimental data of physical pediatric and head phantoms. The performance of the proposed method was compared with a total variation‐based iterative reconstruction method, naive application of the neural representation, and a convolutional neural network‐based method. 
In visual inspection, the small anatomical features were best preserved by the proposed method. The proposed method also achieved the best scores in visual information fidelity, modulation transfer function, and lung nodule segmentation. Conclusions: The results on both simulated and experimental data suggest that the proposed method can effectively reduce streak artifacts while preserving small anatomical structures that are easily blurred or replaced with misleading features by existing methods. Since the proposed method does not require any additional training datasets, it would be useful in clinical practice, where large datasets cannot be collected. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
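The core subtraction step of the abstract above, regenerating the deterministic streaks from a prior image under the same geometry and removing them, can be written generically. The projection and reconstruction operators are passed in as callables because this sketch does not fix a particular CT implementation:

```python
import numpy as np

def remove_streaks(fbp_sparse, prior, forward_project, reconstruct):
    """Streak removal by artifact regeneration (sketch of the abstract's idea).

    fbp_sparse      : analytic reconstruction from the sparse-view data
    prior           : patient-specific prior image (in the paper, the output of
                      a self-supervised neural representation)
    forward_project : callable, image -> sparse-view sinogram (same geometry)
    reconstruct     : callable, sinogram -> image (same analytic algorithm)
    """
    # Streaks are deterministic for a fixed geometry, so running the same
    # sparse-view pipeline on the prior regenerates the same artifacts.
    regenerated = reconstruct(forward_project(prior))
    artifact = regenerated - prior
    return fbp_sparse - artifact
```

The prior need not be perfect: any error that is consistent across the projection/reconstruction round trip cancels in the subtraction, which is why a slightly blurry but multiview-consistent neural representation suffices.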
20. It Ain't Over Till It's Over?: Correlates of Post-Separation Abuse Among Unmarried Women in the Republic of Korea.
- Author
-
Shim, Hyunjung, Wilkes, Nicole, and Hayes, Brittany E.
- Subjects
STATISTICS, AGE distribution, DATING violence, INTIMATE partner violence, INCOME, SEX distribution, T-test (Statistics), PSYCHOSOCIAL factors, SEX crimes, DESCRIPTIVE statistics, SINGLE women, LOGISTIC regression analysis, VICTIMS, AGGRESSION (Psychology), STATISTICAL sampling, MARITAL status, CONTROL (Psychology) - Abstract
This study investigates the correlates of post-separation abuse among unmarried women in the Republic of Korea (n = 744). The study employs a logistic regression model to consider if prior intimate partner violence victimization, relationship characteristics, and separation characteristics are associated with post-separation abuse. The results showed that experiencing coercive control by the former partner during the relationship, initiating the separation, and having a lower income than her former partner's income increased the odds of post-separation abuse. The findings imply that programs designed to prevent victimization or enhance victims' safety need to consider broader relationship and separation contexts. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
21. Time-of-flight sensor and color camera calibration for multi-view acquisition
- Author
-
Shim, Hyunjung, Adelsberger, Rolf, Kim, James Dokyoon, Rhee, Seon-Min, Rhee, Taehyun, Sim, Jae-Young, Gross, Markus, and Kim, Changyeong
- Published
- 2012
- Full Text
- View/download PDF
22. Few-shot Font Generation with Localized Style Representations and Factorization
- Author
-
Park, Song, Chun, Sanghyuk, Cha, Junbum, Lee, Bado, and Shim, Hyunjung
- Subjects
FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, General Medicine - Abstract
Automatic few-shot font generation is a practical and widely studied problem because manual designs are expensive and sensitive to the expertise of designers. Existing few-shot font generation methods aim to disentangle the style and content elements from a few reference glyphs, and mainly focus on a universal style representation for each font style. However, such an approach limits the model's ability to represent diverse local styles, making it unsuitable for the most complex letter systems, e.g., Chinese, whose characters consist of a varying number of components (often called "radicals") with highly complex structure. In this paper, we propose a novel font generation method that learns localized styles, namely component-wise style representations, instead of universal styles. The proposed style representations enable us to synthesize complex local details in text designs. However, learning component-wise styles solely from reference glyphs is infeasible in the few-shot font generation scenario when a target script has a large number of components, e.g., over 200 for Chinese. To reduce the number of reference glyphs, we simplify component-wise styles into a product of a component factor and a style factor, inspired by low-rank matrix factorization. Thanks to the combination of a strong representation and a compact factorization strategy, our method shows remarkably better few-shot font generation results (with only 8 reference glyph images) than other state-of-the-art methods, without utilizing strong locality supervision, e.g., the location of each component, skeletons, or strokes. The source code is available at https://github.com/clovaai/lffont. (Accepted at AAAI 2021; the first two authors contributed equally.)
- Published
- 2020
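The low-rank factorization described above replaces one style vector per (font, component) pair with a per-font factor times a shared per-component factor. A minimal sketch with illustrative shapes (the real model factorizes learned embeddings inside a network, not raw vectors):

```python
import numpy as np

def component_style(style_factor, component_factors):
    """Component-wise styles as a rank-style factorization.

    style_factor      : (d,) factor for one font, estimated from few glyphs
    component_factors : (n_components, d) factors shared across all fonts
    returns           : (n_components, d) per-component style vectors
    """
    return component_factors * style_factor[None, :]
```

With n components and d-dimensional styles, each new font then contributes only d parameters instead of n×d, which is why a handful of reference glyphs can cover hundreds of components.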
23. Two-phase learning-based 3D deblurring method for digital breast tomosynthesis images.
- Author
-
Choi, Yunsu, Han, Minah, Jang, Hanjoo, Shim, Hyunjung, and Baek, Jongduk
- Subjects
TOMOSYNTHESIS, STANDARD deviations, BREAST imaging, ARTIFICIAL neural networks, CONVOLUTIONAL neural networks - Abstract
In digital breast tomosynthesis (DBT) systems, projection data are acquired from a limited number of angles. Consequently, the reconstructed images contain severe blurring artifacts that might heavily degrade the DBT image quality and cause difficulties in detecting lesions. In this study, we propose a two-phase learning approach for artifact compensation in a coarse-to-fine manner to mitigate blurring artifacts effectively along all viewing directions of the DBT image volume (i.e., along the axial, coronal, and sagittal planes) to improve the detection performance of lesions. The proposed method employs a convolutional neural network model comprising two submodels/phases, with Phase 1 performing three-dimensional (3D) deblurring and Phase 2 performing additional 2D deblurring. To investigate the effects of loss functions on the proposed model's deblurring performance, we evaluated several loss functions, such as the pixel-based loss function, adversarial-based loss function, and perception-based loss function. Compared with the DBT image, the mean squared error of the image and the root mean squared errors of the gradient of the image decreased by 82.8% and 44.9%, respectively, and the contrast-to-noise ratio increased by 183.4% in the in-focus plane. We verified that the proposed method sequentially restored the missing frequency components as the DBT images were processed through the Phase 1 and Phase 2 steps. These results indicate that the proposed method performs effective 3D deblurring, significantly reducing the blurring artifacts in the in-focus plane and other planes of the DBT image, thus improving the detection performance of lesions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
24. Attention-Based Dropout Layer for Weakly Supervised Single Object Localization and Semantic Segmentation.
- Author
-
Choe, Junsuk, Lee, Seungho, and Shim, Hyunjung
- Subjects
WEAK localization (Quantum mechanics), IMAGE segmentation, FEATURE extraction, TASK analysis - Abstract
Both weakly supervised single object localization and semantic segmentation techniques learn an object's location using only image-level labels. However, these techniques tend to cover only the most discriminative part of the object rather than the entire object. To address this problem, we propose an attention-based dropout layer, which utilizes the attention mechanism to locate the entire object efficiently. To achieve this, we devise two key components: 1) hiding the most discriminative part from the model to capture the entire object, and 2) highlighting the informative region to improve the classification power of the model. These allow the classifier to maintain reasonable accuracy while the entire object is covered. Through extensive experiments, we demonstrate that the proposed method effectively improves weakly supervised single object localization accuracy, achieving a new state-of-the-art localization accuracy on CUB-200-2011 and an accuracy comparable to existing state-of-the-art methods on ImageNet-1k. The proposed method is also effective in improving weakly supervised semantic segmentation performance on Pascal VOC and MS COCO. Furthermore, the proposed method is more efficient than existing techniques in terms of parameter and computation overheads, and it can be easily applied to various backbone networks. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
25. Low‐dose CT denoising via convolutional neural network with an observer loss function.
- Author
-
Han, Minah, Shim, Hyunjung, and Baek, Jongduk
- Subjects
CONVOLUTIONAL neural networks, COMPUTED tomography, VISUAL training - Abstract
Purpose: Convolutional neural network (CNN)‐based denoising is an effective method for reducing complex computed tomography (CT) noise. However, the image blur induced by denoising processes is a major concern. The main source of image blur is the pixel‐level loss (e.g., mean squared error [MSE] and mean absolute error [MAE]) used to train a CNN denoiser. To reduce the image blur, feature‐level loss is utilized to train a CNN denoiser. A CNN denoiser trained using visual geometry group (VGG) loss can preserve the small structures, edges, and texture of the image. However, VGG loss, derived from an ImageNet‐pretrained image classifier, is not optimal for training a CNN denoiser for CT images. ImageNet contains natural RGB images, so the features extracted by the ImageNet‐pretrained model cannot represent the characteristics of CT images that are highly correlated with diagnosis. Furthermore, a CNN denoiser trained with VGG loss causes bias in CT number. Therefore, we propose to use a binary classification network trained on CT images as a feature extractor and newly define the feature‐level loss as observer loss. Methods: As obtaining labeled CT images for training a classification network is difficult, we create labels by inserting simulated lesions. We conduct two separate classification tasks, signal‐known‐exactly (SKE) and signal‐known‐statistically (SKS), and define the corresponding feature‐level losses as SKE loss and SKS loss, respectively. We use SKE loss and SKS loss to train CNN denoisers. Results: Compared to pixel‐level losses, a CNN denoiser trained using observer loss (i.e., SKE loss or SKS loss) is effective in preserving structure, edges, and texture. Observer loss also resolves the bias in CT number, which is a problem of VGG loss. Comparing the observer losses from the SKE and SKS tasks, SKS yields images with a noise structure more similar to that of the reference images.
Conclusions: Using observer loss to train a CNN denoiser is effective in preserving structure, edges, and texture in denoised images and in preventing CT number bias. In particular, when using SKS loss, the denoised images have a noise structure similar to that of the reference images. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
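In essence, the observer loss described above swaps the ImageNet-pretrained VGG feature extractor for a CT-trained lesion classifier inside an ordinary feature-level loss. A minimal sketch; the MSE comparison of features is an assumption, as the abstract does not state the exact distance:

```python
import numpy as np

def observer_loss(denoised, target, feature_extractor):
    """Feature-level loss with a task-relevant feature extractor.

    feature_extractor : any callable mapping an image to a feature vector
                        (in the paper, a binary lesion classifier trained
                        on CT images; here it is a free parameter)
    """
    f_d = feature_extractor(denoised)
    f_t = feature_extractor(target)
    return np.mean((f_d - f_t) ** 2)
```

The design choice is entirely in `feature_extractor`: VGG loss, SKE loss, and SKS loss all share this form and differ only in what network supplies the features.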
26. Palliative Care in the Inner City: Patient Religious Affiliation, Underinsurance, and Symptom Attitude
- Author
-
Francoeur, Richard B., Payne, Richard, Raveis, Victoria H., and Shim, Hyunjung
- Published
- 2007
- Full Text
- View/download PDF
27. Editable Generative Adversarial Networks: Generating and Editing Faces Simultaneously
- Author
-
Baek, Kyungjune, Bang, Duhyeon, and Shim, Hyunjung
- Subjects
FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition - Abstract
We propose a novel framework for simultaneously generating and manipulating face images with desired attributes. While state-of-the-art attribute editing techniques have achieved impressive performance in creating realistic attribute effects, they only address the image editing problem, using the input image as the condition of the model. Recently, several studies have attempted to tackle both the novel face generation and attribute editing problems with a single solution. However, their image quality is still unsatisfactory. Our goal is to develop a single unified model that can simultaneously create and edit high-quality face images with desired attributes. A key idea of our work is to decompose the image into a latent vector and an attribute vector in a low-dimensional representation, and then utilize the GAN framework to map the low-dimensional representation to the image. In this way, we can address both the generation and editing problems by learning the generator. In qualitative and quantitative evaluations, the proposed algorithm outperforms recent algorithms addressing the same problem. We also show that our model achieves competitive performance with the state-of-the-art attribute editing technique in terms of attribute editing quality.
- Published
- 2018
28. Generative Adversarial Networks for Unsupervised Object Co-localization
- Author
-
Choe, Junsuk, Park, Joo Hyun, and Shim, Hyunjung
- Subjects
FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition - Abstract
This paper introduces a novel approach for unsupervised object co-localization using Generative Adversarial Networks (GANs). A GAN is a powerful tool that can implicitly learn unknown data distributions in an unsupervised manner. From the observation that the GAN discriminator is highly influenced by pixels where objects appear, we analyze the internal layers of the discriminator and visualize the activated pixels. Our important finding is that the high image diversity of a GAN, a main goal in GAN research, is ironically disadvantageous for object localization: such discriminators focus not only on the target object but also on other objects, such as background objects. Based on extensive evaluations and experimental studies, we show that image diversity and localization performance are negatively correlated. In addition, our approach achieves meaningful accuracy for unsupervised object co-localization on publicly available benchmark datasets, even comparable to state-of-the-art weakly-supervised approaches.
- Published
- 2018
29. MGGAN: Solving Mode Collapse using Manifold Guided Training
- Author
-
Bang, Duhyeon and Shim, Hyunjung
- Subjects
FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition - Abstract
Mode collapse is a critical problem in training generative adversarial networks. To alleviate mode collapse, several recent studies have introduced new objective functions, network architectures, or alternative training schemes. However, their achievements often come at the cost of image quality. In this paper, we propose a new algorithm, the manifold guided generative adversarial network (MGGAN), which adds a guidance network to an existing GAN architecture to induce the generator to learn all modes of the data distribution. Based on extensive evaluations, we show that our algorithm resolves mode collapse without losing image quality. In particular, we demonstrate that our algorithm is easily extendable to various existing GANs. Experimental analysis shows that the proposed algorithm is an effective and efficient tool for training GANs.
- Published
- 2018
30. A convolutional neural network‐based model observer for breast CT images.
- Author
-
Kim, Gihun, Han, Minah, Shim, Hyunjung, and Baek, Jongduk
- Subjects
BREAST imaging, ARTIFICIAL neural networks, RANDOM noise theory, REAR-screen projection, BREAST, NONLINEAR functions - Abstract
Purpose: In this paper, we propose a convolutional neural network (CNN)‐based efficient model observer for breast computed tomography (CT) images. Methods: We first showed that the CNN‐based model observer provided similar detection performance to the ideal observer (IO) for signal‐known‐exactly and background‐known‐exactly detection tasks with an uncorrelated Gaussian background noise image. We then demonstrated that a single‐layer CNN without a nonlinear activation function provided similar detection performance in breast CT images to the Hotelling observer (HO). To train the CNN‐based model observer, we generated simulated breast CT images to produce a training dataset in which different background noise structures were generated using filtered back projection with a ramp, or a Hanning weighted ramp, filter. Circular, elliptical, and spiculated signals were used for the detection tasks. The optimal depth and the number of channels for the CNN‐based model observer were determined for each task. The detection performances of the HO and a channelized Hotelling observer (CHO) with Laguerre‐Gauss (LG) and partial least squares (PLS) channels were also estimated for comparison. Results: The results showed that the CNN‐based model observer provided higher detection performance than the HO, LG‐CHO, and PLS‐CHO for all tasks. In addition, it was shown that the proposed CNN‐based model observer provided higher detection performance than the HO using a smaller training dataset. Conclusions: In the presence of nonlinearity in the CNN, the proposed CNN‐based model observer showed better performance than other linear observers. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
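For reference, the Hotelling observer (HO) that the abstract uses as a linear baseline computes a test statistic w·x from a template w; a single-layer CNN without a nonlinearity learns the same kind of linear template. The sketch below is a textbook construction, not the paper's code, and the regularization constant is my assumption:

```python
import numpy as np

def hotelling_template(signal_imgs, background_imgs):
    """Estimate the Hotelling template w = S^-1 (mu_signal - mu_background).

    signal_imgs, background_imgs : (N, H, W) stacks of training images,
                                   flattened to vectors internally
    """
    s = signal_imgs.reshape(len(signal_imgs), -1)
    b = background_imgs.reshape(len(background_imgs), -1)
    mean_diff = s.mean(axis=0) - b.mean(axis=0)
    # pooled covariance of the two classes, regularized for invertibility
    cov = 0.5 * (np.cov(s, rowvar=False) + np.cov(b, rowvar=False))
    cov += 1e-6 * np.eye(cov.shape[0])
    return np.linalg.solve(cov, mean_diff)

def hotelling_statistic(img, w):
    """Linear test statistic: larger values favor 'signal present'."""
    return img.ravel() @ w
```

The paper's point is that a CNN with nonlinear activations can exceed this linear decision rule, while a single linear layer merely recovers it.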
31. A performance comparison of convolutional neural network‐based image denoising methods: The effect of loss functions on low‐dose CT images.
- Author
-
Kim, Byeongjoon, Han, Minah, Shim, Hyunjung, and Baek, Jongduk
- Subjects
IMAGE denoising, COST functions, TRANSFER functions, POWER spectra - Abstract
Purpose: Convolutional neural network (CNN)‐based image denoising techniques have shown promising results in low‐dose CT denoising. However, CNN often introduces blurring in denoised images when trained with a widely used pixel‐level loss function. Perceptual loss and adversarial loss have been proposed recently to further improve the image denoising performance. In this paper, we investigate the effect of different loss functions on image denoising performance using task‐based image quality assessment methods for various signals and dose levels. Methods: We used a modified version of U‐net that was effective at reducing the correlated noise in CT images. The loss functions used for comparison were two pixel‐level losses (i.e., the mean‐squared error and the mean absolute error), Visual Geometry Group network‐based perceptual loss (VGG loss), adversarial loss used to train the Wasserstein generative adversarial network with gradient penalty (WGAN‐GP), and their weighted summation. Each image denoising method was applied to reconstructed images and sinogram images independently and validated using the extended cardiac‐torso (XCAT) simulation and Mayo Clinic datasets. In the XCAT simulation, we generated fan‐beam CT datasets with four different dose levels (25%, 50%, 75%, and 100% of a normal‐dose level) using 10 XCAT phantoms and inserted signals in a test set. The signals had two different shapes (spherical and spiculated), sizes (4 and 12 mm), and contrast levels (60 and 160 HU). To evaluate signal detectability, we used a detection task SNR (tSNR) calculated from a non‐prewhitening model observer with an eye filter. We also measured the noise power spectrum (NPS) and modulation transfer function (MTF) to compare the noise and signal transfer properties. Results: Compared to CNNs without VGG loss, VGG‐loss‐based CNNs achieved a more similar tSNR to that of the normal‐dose CT for all signals at different dose levels except for a small signal at the 25% dose level. 
For a low-contrast signal at the 25% or 50% dose level, adding other losses to the VGG loss yielded better performance than using the VGG loss alone. The NPS shapes from VGG-loss-based CNNs closely matched those of normal-dose CT images, while CNNs without VGG loss overly reduced the mid-to-high-frequency noise power at all dose levels. The MTF also showed that VGG-loss-based CNNs better preserved high resolution for all dose and contrast levels. We also observed that an additional WGAN-GP loss helps improve the noise and signal transfer properties of VGG-loss-based CNNs. Conclusions: The evaluation results using tSNR, NPS, and MTF indicate that VGG-loss-based CNNs are more effective than those without VGG loss for natural denoising of low-dose images, and that the WGAN-GP loss further improves the denoising performance of VGG-loss-based CNNs, which corresponds with the qualitative evaluation. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
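The abstract above compares pixel-level, perceptual (VGG), and adversarial (WGAN-GP) losses and their weighted summation. Below is a minimal NumPy sketch of such a weighted combined loss; `feat_fn` and `critic_fn` are hypothetical stand-ins for a pretrained VGG feature extractor and a WGAN-GP critic, not the paper's actual networks, and the weights are illustrative.

```python
import numpy as np

def combined_loss(denoised, target, feat_fn, critic_fn,
                  w_pix=1.0, w_vgg=0.1, w_adv=0.001):
    """Weighted sum of pixel, perceptual, and adversarial loss terms.

    feat_fn stands in for a pretrained VGG feature extractor and
    critic_fn for a WGAN-GP critic; both are assumptions here.
    """
    pix = np.mean((denoised - target) ** 2)                    # pixel-level MSE
    vgg = np.mean((feat_fn(denoised) - feat_fn(target)) ** 2)  # perceptual term
    adv = -np.mean(critic_fn(denoised))                        # generator's WGAN term
    return w_pix * pix + w_vgg * vgg + w_adv * adv

# Toy stand-ins: identity "features" and a mean-intensity "critic".
feat = lambda x: x
critic = lambda x: x.mean(axis=(-2, -1))
x = np.zeros((1, 8, 8)); y = np.ones((1, 8, 8))
loss = combined_loss(x, y, feat, critic)
```

In practice each term would be computed on network outputs during training; the weighting controls the trade-off between pixel fidelity and perceptual sharpness that the paper evaluates.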
32. Developing a Visual Stopping Criterion for Image Mosaicing Using Invariant Color Histograms.
- Author
-
Elibol, Armagan and Shim, Hyunjung
- Published
- 2015
- Full Text
- View/download PDF
33. Recovering Translucent Objects Using a Single Time-of-Flight Depth Camera.
- Author
-
Shim, Hyunjung and Lee, Seungkyu
- Subjects
- *
IMAGE reconstruction , *THREE-dimensional imaging , *CAMERAS , *TRANSLUCENCY (Optics) , *REFRACTION (Optics) - Abstract
Translucency introduces great challenges to 3-D acquisition because of complicated light behaviors such as refraction and transmittance. In this paper, we describe the development of a unified 3-D data acquisition framework that reconstructs translucent objects using a single commercial time-of-flight (ToF) camera. In our capture scenario, we record a depth map and intensity image of the scene twice using a static ToF camera: first, we capture the depth map and intensity image of an arbitrary background, and then we position the translucent foreground object and record a second depth map and intensity image with both the foreground and the background. As a result of material characteristics, the translucent object yields systematic distortions in the depth map. We developed a new distance representation that interprets the depth distortion induced by translucency. By analyzing ToF depth sensing principles, we constructed a distance model governed by the level of translucency, foreground depth, and background depth. Using an analysis-by-synthesis approach, we can recover the 3-D geometry of a translucent object from a pair of depth maps and their intensity images. Extensive evaluation and case studies demonstrate that our method is effective for modeling the nonlinear depth distortion due to translucency and for reconstructing 3-D translucent objects. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
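The record above describes a distance model governed by the translucency level, foreground depth, and background depth, inverted by analysis-by-synthesis. A toy sketch under the simplifying assumption that the measured depth is a translucency-weighted blend of foreground and background depth; the paper's actual model is derived from ToF phase-measurement principles and is more involved.

```python
def measured_depth(t, d_fg, d_bg):
    # Toy forward model: blend of foreground and background depth,
    # weighted by translucency t in [0, 1]. A hypothetical stand-in
    # for the paper's ToF-principle-based distance model.
    return (1.0 - t) * d_fg + t * d_bg

def recover_translucency(d_meas, d_fg, d_bg):
    # Analysis-by-synthesis in closed form for this linear toy model:
    # choose the t that reproduces the observed (distorted) depth.
    return (d_meas - d_fg) / (d_bg - d_fg)

d_fg, d_bg = 1.2, 3.0            # foreground/background depth in meters
t_true = 0.4
d_obs = measured_depth(t_true, d_fg, d_bg)   # simulated distorted reading
t_est = recover_translucency(d_obs, d_fg, d_bg)
```

With a nonlinear forward model, the same recovery step would be an iterative fit rather than a closed-form division.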
34. Estimating all frequency lighting using a color/depth image.
- Author
-
Shim, Hyunjung
- Abstract
This paper presents a novel approach to estimating lighting using a pair of images: one color and one depth. To effectively model all-frequency lighting, we introduce a hybrid lighting representation: the combination of spherical harmonic basis functions and point lights. Building on the existing framework of spherical-harmonics-based diffuse reflection, we divide the color image into diffuse and non-diffuse reflections. Then, we use the diffuse reflection to estimate the low-frequency lighting. For high-frequency lighting, we obtain the specular reflections by analyzing the non-diffuse reflection. Knowing the specular reflections and the scene geometry, we can compute the direction of the point lights by inverting the reflected ray about the surface normal. Then, we optimize the intensity of the point lights using an analysis-by-synthesis paradigm. By superimposing the low- and high-frequency lighting, we recover the lighting present in the scene. While existing methods use the low-frequency lighting to infer the high-frequency lighting, we propose to use the non-diffuse reflection to directly estimate the high-frequency lighting. In this way, we make good use of the non-diffuse reflections in scene analysis and understanding. Experimental results show that the proposed approach is an effective solution for lighting estimation in real-world environments. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
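The abstract above computes point-light directions by inverting the reflected ray about the surface normal. For a perfect-mirror specular highlight this is the standard reflection formula L = 2(N·V)N − V; a minimal sketch, assuming unit view and normal vectors (the paper additionally handles intensities and SH low-frequency lighting).

```python
import numpy as np

def point_light_direction(normal, view):
    # Invert the reflected ray about the surface normal: for a mirror
    # specular highlight, the light lies along the reflection of the
    # view vector, L = 2(N.V)N - V.
    n = normal / np.linalg.norm(normal)
    v = view / np.linalg.norm(view)
    return 2.0 * np.dot(n, v) * n - v

# A highlight seen head-on along the normal implies the light also
# lies along the normal.
L = point_light_direction(np.array([0.0, 0.0, 1.0]),
                          np.array([0.0, 0.0, 1.0]))
```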
35. Streak artifacts reduction algorithm using an implicit neural representation in sparse-view CT.
- Author
-
Kim, Byeongjoon, Shim, Hyunjung, and Baek, Jongduk
- Published
- 2021
- Full Text
- View/download PDF
36. Probabilistic Approach to Realistic Face Synthesis With a Single Uncalibrated Image.
- Author
-
Shim, Hyunjung
- Subjects
- *
HUMAN facial recognition software , *CALIBRATION , *IMAGE processing , *MATHEMATICAL models , *VECTOR analysis , *ALGORITHMS , *INFERENCE (Logic) - Abstract
This paper presents a novel approach to automatic face modeling for realistic synthesis from an unknown face image, using a probabilistic face diffuse model and a generic face specular map. We construct a probabilistic face diffuse model for estimating the albedo and normals of the input face. Then, we develop a generic face specular map for estimating the specularity of the face. Using the estimated albedo, normal, and specular information, we can realistically synthesize the face under arbitrary lighting and viewing directions. Unlike many existing techniques, our approach can extract both the diffuse and specular information of the face without involving an intensive 3-D matching procedure. We conduct three different experiments to show our improvement over the prior art. First, we compare the proposed algorithm with previous techniques, including the state of the art, to demonstrate our achievement in realistic face synthesis. Moreover, we evaluate the proposed algorithm against non-automatic face modeling techniques through a subjective user study. This evaluation is meaningful in that it tells us how far our results, as well as others', are from the real photograph in terms of perceptual quality. Finally, we apply our face model to improving face recognition performance under varying illumination conditions and show that the proposed algorithm is effective in enhancing the face recognition rate. Thanks to the compact representation and the effective inference scheme, our technique is applicable to many practical applications, such as avatar creation, digital face cloning, face normalization, de-identification and many others. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
37. Weakly-supervised progressive denoising with unpaired CT images.
- Author
-
Kim, Byeongjoon, Shim, Hyunjung, and Baek, Jongduk
- Subjects
- *
COMPUTED tomography , *IMAGE denoising , *SUPERVISED learning , *SIGNAL detection , *INSPECTION & review , *SIGNAL denoising - Abstract
• We present a new weakly-supervised denoising framework using unpaired CT data.
• We progressively denoise an image to tackle the difficulty of directly estimating a clean image.
• We provide a diagnosis-oriented image quality assessment to demonstrate the effectiveness of our method.
• Results demonstrate that our method achieves denoising performance close to that of the supervised learning method.
Although low-dose CT imaging has attracted great interest due to its reduced radiation risk to patients, it suffers from severe and complex noise. Recent fully-supervised methods have shown impressive performance on the CT denoising task. However, they require a huge amount of paired normal-dose and low-dose CT images, which is generally unavailable in real clinical practice. To address this problem, we propose a weakly-supervised denoising framework that generates paired original and noisier CT images from unpaired CT images using a physics-based noise model. Our denoising framework also includes a progressive denoising module that bypasses the challenge of mapping from low-dose to normal-dose CT images directly by progressively compensating for the small noise gap. To quantitatively evaluate diagnostic image quality, we present the noise power spectrum and signal detection accuracy, which are well correlated with visual inspection. The experimental results demonstrate that our method achieves remarkable performance, even superior to fully-supervised CT denoising with respect to signal detectability. Moreover, our framework increases flexibility in data collection, allowing us to utilize any unpaired data at any dose level. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
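The record above pairs a noise-injection step (generating noisier images from unpaired low-dose data) with a progressive denoising module that closes the dose gap in small increments. A toy sketch of both ideas; the additive Gaussian noise and the rolling-average "denoiser" are illustrative stand-ins, since real CT noise is signal-dependent and correlated and the paper's denoiser is a learned network.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(img, sigma):
    # Simplified stand-in for the paper's physics-based CT noise model:
    # plain additive Gaussian noise of strength sigma.
    return img + rng.normal(0.0, sigma, img.shape)

def progressive_denoise(img, steps, step_fn):
    # Bridge the low-dose -> normal-dose gap in several small steps
    # instead of one direct mapping.
    for _ in range(steps):
        img = step_fn(img)
    return img

noisy = add_noise(np.zeros((16, 16)), sigma=0.5)
# Toy per-step denoiser: average each pixel with its 4 wrap-around
# neighbors (a learned network would take this role in the paper).
smooth = lambda x: (x + np.roll(x, 1, 0) + np.roll(x, -1, 0)
                    + np.roll(x, 1, 1) + np.roll(x, -1, 1)) / 5.0
out = progressive_denoise(noisy, steps=4, step_fn=smooth)
```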
38. Automatic color realism enhancement for virtual reality.
- Author
-
Shim, Hyunjung and Lee, Seungkyu
- Abstract
Photorealism has been one of the essential goals of virtual reality. State-of-the-art techniques employ various rendering algorithms to simulate physically accurate light transport for generating the photorealistic appearance of a scene. However, they still require labor-intensive tone mapping and color tuning by an experienced artist. In this paper, we propose an automatic photorealism enhancement algorithm that manipulates the color distribution of graphics to match that of real photographs. Based on the hypothesis that photorealism is highly correlated with the frequency of color characteristics appearing in real photographs, we find principal color components. Then, we transfer the statistical characteristics of photographs onto graphics to enhance their photorealism. Experiments and a user study confirm the effectiveness of the proposed method. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
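The abstract above transfers statistical color characteristics from photographs onto rendered graphics. A minimal sketch in the same spirit, matching per-channel mean and standard deviation; the paper additionally works with principal color components, which this simplification omits.

```python
import numpy as np

def transfer_color_stats(graphics, photo):
    # Match each channel's mean and standard deviation of the rendered
    # image to those of a real photograph (a simple statistical
    # transfer; the paper's method is richer).
    g = graphics.astype(float)
    out = np.empty_like(g)
    for c in range(g.shape[-1]):
        gc, pc = g[..., c], photo[..., c].astype(float)
        std = gc.std() if gc.std() > 0 else 1.0
        out[..., c] = (gc - gc.mean()) / std * pc.std() + pc.mean()
    return out

graphics = np.zeros((4, 4, 3)); graphics[0, 0] = 1.0   # toy render
photo = np.full((4, 4, 3), 0.5)                        # toy photograph
result = transfer_color_stats(graphics, photo)
```

Mean/std matching of this kind is the classic statistical color-transfer baseline; a perceptually uniform color space would normally be used rather than raw RGB.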
39. Rigid and non-rigid motion artifact reduction in X-ray CT using attention module.
- Author
-
Ko, Youngjun, Moon, Seunghyuk, Baek, Jongduk, and Shim, Hyunjung
- Subjects
- *
COMPUTER-assisted image analysis (Medicine) , *X-rays , *CONE beam computed tomography , *IMAGING systems , *MOTION , *BESSEL beams , *EULER-Bernoulli beam theory - Abstract
• Existing methods are limited to specific motions or customized CT systems.
• We propose a new real-time method using residual learning and an attention module.
• Our model is a generalized framework that handles any CT setup and motion type.
Motion artifacts are a major factor that can degrade the diagnostic performance of computed tomography (CT) images. In particular, motion artifacts become considerably more severe when an imaging system requires a long scan time, such as in dental CT or cone-beam CT (CBCT) applications, where patients generate rigid and non-rigid motions. To address this problem, we propose a new real-time technique for motion artifact reduction that utilizes a deep residual network with an attention module. Our attention module was designed to increase the model capacity by amplifying or attenuating the residual features according to their importance. We trained and evaluated the network by creating four benchmark datasets with rigid motions or with both rigid and non-rigid motions under a step-and-shoot fan-beam CT (FBCT) or a CBCT. Each dataset provides a set of motion-corrupted CT images and their ground-truth CT image pairs. The strong modeling power of the proposed network allowed us to successfully handle motion artifacts from the two CT systems under various motion scenarios in real time. As a result, the proposed model demonstrated clear performance benefits. In addition, we compared our model with Wasserstein generative adversarial network (WGAN)-based models and a deep residual network (DRN)-based model, which are among the most powerful techniques for CT denoising and natural RGB image deblurring, respectively. Based on extensive analysis and comparisons using the four benchmark datasets, we confirmed that our model outperforms these competitors. Our benchmark datasets and implementation code are available at https://github.com/youngjun-ko/ct_mar_attention. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
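The attention module above amplifies or attenuates residual features according to their importance. A minimal sketch of such gating; in the paper the importance scores come from a learned subnetwork, whereas here they are supplied directly, and the tensor shapes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_residual(features, scores):
    # Scale residual features channel-wise by a sigmoid-gated
    # importance score (scores would be produced by a small learned
    # subnetwork in practice; here they are given as inputs).
    gate = sigmoid(scores)[:, None, None]   # (C,) -> (C, 1, 1)
    return features * gate

residual = np.ones((2, 4, 4))               # toy (C, H, W) residual map
scores = np.array([10.0, -10.0])            # keep ch. 0, suppress ch. 1
gated = attention_residual(residual, scores)
```

The gated residual would then be added back to the network's main path, so unimportant residual channels contribute little to the output.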
40. Few-Shot Font Generation With Weakly Supervised Localized Representations.
- Author
-
Park S, Chun S, Cha J, Lee B, and Shim H
- Abstract
Automatic few-shot font generation aims to solve a well-defined, real-world problem because manual font designs are expensive and sensitive to the expertise of designers. Existing methods learn to disentangle style and content elements by developing a universal style representation for each font style. However, this approach limits the model's ability to represent diverse local styles because it is unsuitable for complicated letter systems. For example, Chinese characters consist of a varying number of components (often called "radicals") with a highly complex structure. In this paper, we propose a novel font generation method that learns localized styles, namely component-wise style representations, instead of universal styles. The proposed style representations enable synthesizing complex local details in text designs. However, learning component-wise styles solely from a few reference glyphs is infeasible when a target script has a large number of components, for example, over 200 for Chinese. To reduce the number of required reference glyphs, we represent component-wise styles by a product of component and style factors, inspired by low-rank matrix factorization. Owing to the combination of a strong representation and a compact factorization strategy, our method shows remarkably better few-shot font generation results (with only eight reference glyphs) than other state-of-the-art methods. Moreover, our method does not utilize strong locality supervision, such as the location of each component, skeletons, or strokes. The source code is available at https://github.com/clovaai/lffont.
- Published
- 2024
- Full Text
- View/download PDF
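The factorization idea above represents component-wise styles as a product of per-component and per-style factors, so the model never stores a full component-by-style table. A minimal sketch with illustrative dimensions (the counts and rank below are assumptions, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

# Instead of a full (n_comp x n_style x dim) style tensor, keep
# low-rank per-component and per-style factors and combine on demand.
n_comp, n_style, dim, rank = 200, 8, 32, 6
comp_factor = rng.normal(size=(n_comp, rank))        # one row per component
style_factor = rng.normal(size=(n_style, rank, dim)) # one slice per style

def component_style(c, s):
    # Style code for component c rendered in font style s:
    # a rank-`rank` bilinear combination of the two factors.
    return comp_factor[c] @ style_factor[s]

code = component_style(3, 1)
```

The storage drops from n_comp·n_style·dim entries to n_comp·rank + n_style·rank·dim, which is what makes learning from only a few reference glyphs feasible.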
41. Saliency as Pseudo-Pixel Supervision for Weakly and Semi-Supervised Semantic Segmentation.
- Author
-
Lee M, Lee S, Lee J, and Shim H
- Abstract
Existing studies on semantic segmentation using image-level weak supervision have several limitations, including sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. To overcome these challenges, we propose a novel framework, an improved version of Explicit Pseudo-pixel Supervision (EPS++), which learns from pixel-level feedback by combining two types of weak supervision. Specifically, the image-level label provides the object identity via the localization map, and the saliency map from an off-the-shelf saliency detection model offers rich object boundaries. We devise a joint training strategy to fully utilize the complementary relationship between disparate information. Notably, we suggest an Inconsistent Region Drop (IRD) strategy, which effectively handles errors in saliency maps using fewer hyper-parameters than EPS. Our method can obtain accurate object boundaries and discard co-occurring pixels, significantly improving the quality of pseudo-masks. Experimental results show that EPS++ effectively resolves the key challenges of semantic segmentation using weak supervision, resulting in new state-of-the-art performances on three benchmark datasets in a weakly supervised semantic segmentation setting. Furthermore, we show that the proposed method can be extended to solve the semi-supervised semantic segmentation problem using image-level weak supervision. Surprisingly, the proposed model also achieves new state-of-the-art performances on two popular benchmark datasets.
- Published
- 2023
- Full Text
- View/download PDF
42. Evaluation for Weakly Supervised Object Localization: Protocol, Metrics, and Datasets.
- Author
-
Choe J, Oh SJ, Chun S, Lee S, Akata Z, and Shim H
- Abstract
Weakly-supervised object localization (WSOL) has gained popularity in recent years for its promise to train localization models with only image-level labels. Since the seminal WSOL work of class activation mapping (CAM), the field has focused on how to expand the attention regions to cover objects more broadly and localize them better. However, these strategies rely on full localization supervision for validating hyperparameters and model selection, which is in principle prohibited under the WSOL setup. In this paper, we argue that the WSOL task is ill-posed with only image-level labels, and propose a new evaluation protocol where full supervision is limited to a small held-out set not overlapping with the test set. We observe that, under our protocol, the five most recent WSOL methods have not made a major improvement over the CAM baseline. Moreover, we report that existing WSOL methods have not reached the few-shot learning baseline, where the full supervision at validation time is used for model training instead. Based on our findings, we discuss some future directions for WSOL. Source code and dataset are available at https://github.com/clovaai/wsolevaluation.
- Published
- 2023
- Full Text
- View/download PDF
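Evaluation protocols like the one above score localization by comparing predicted boxes against ground truth. A minimal sketch of the standard IoU-thresholded localization accuracy commonly used in WSOL benchmarks (class-correctness checking and the paper's specific metric variants are omitted):

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def loc_accuracy(preds, gts, thr=0.5):
    # Fraction of images whose predicted box overlaps the ground-truth
    # box with IoU >= thr.
    hits = sum(iou(p, g) >= thr for p, g in zip(preds, gts))
    return hits / len(preds)

preds = [(0, 0, 10, 10), (0, 0, 2, 2)]
gts = [(0, 0, 10, 10), (5, 5, 9, 9)]
acc = loc_accuracy(preds, gts)
```

Under the paper's protocol, thresholds and other hyperparameters would be tuned only on the small fully-supervised held-out split, never on the test set.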
43. Robust approach to reconstructing transparent objects using a time-of-flight depth camera.
- Author
-
Kim K and Shim H
- Abstract
This study presents a robust approach to reconstructing a three-dimensional (3-D) translucent object using a single time-of-flight depth camera with simple user marks. Because the appearance of translucent objects depends on the light's interaction with the surrounding environment, measurements from depth cameras are considerably biased or invalid. Although several existing methods attempt to model the depth error of translucent objects, their models remain partial because of object assumptions and sensitivity to noise. In this study, we introduce a ground plane and a piece-wise linear surface model as priors and construct a robust 3-D reconstruction framework for translucent objects. These two depth priors are combined with the depth error model built on the time-of-flight principle. Extensive evaluation on various real data reveals that the proposed method substantially improves the accuracy and reliability of 3-D reconstruction for translucent objects.
- Published
- 2017
- Full Text
- View/download PDF
44. Hybrid exposure for depth imaging of a time-of-flight depth sensor.
- Author
-
Shim H and Lee S
- Abstract
A time-of-flight (ToF) depth sensor produces noisy range data due to scene properties such as surface materials and reflectivity. Sensor measurements frequently include either saturated or severely noisy depth values, and the effective depth accuracy is far below the sensor's ideal specification. In this paper, we propose a hybrid exposure technique for depth imaging with a ToF sensor to improve depth quality. Our method automatically determines an optimal depth for each pixel using two exposure conditions. To show that our algorithm is effective, we compare it with two conventional methods in qualitative and quantitative manners, demonstrating the superior performance of the proposed algorithm.
- Published
- 2014
- Full Text
- View/download PDF
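The hybrid exposure method above picks an optimal depth per pixel from two exposure conditions. A hedged sketch of such per-pixel fusion; the selection criterion below (reject saturated long-exposure pixels, otherwise prefer the lower-noise reading) is an illustrative assumption, not the paper's exact rule.

```python
import numpy as np

def fuse_exposures(d_short, d_long, sat_long, noise_short, noise_long):
    # Per-pixel choice between two ToF exposures: discard saturated
    # long-exposure pixels, otherwise keep whichever exposure reports
    # lower noise (hypothetical criterion for illustration).
    prefer_short = sat_long | (noise_short < noise_long)
    return np.where(prefer_short, d_short, d_long)

d_short = np.full((2, 2), 1.0)                 # short-exposure depth map
d_long = np.full((2, 2), 2.0)                  # long-exposure depth map
sat_long = np.array([[True, False], [False, False]])
noise_short = np.array([[0.1, 0.1], [0.5, 0.5]])
noise_long = np.array([[0.2, 0.2], [0.2, 0.2]])
fused = fuse_exposures(d_short, d_long, sat_long, noise_short, noise_long)
```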
45. A probabilistic approach to realistic face synthesis with a single uncalibrated image.
- Author
-
Shim H
- Subjects
- Calibration, Data Interpretation, Statistical, Humans, Image Enhancement methods, Reproducibility of Results, Sensitivity and Specificity, Algorithms, Biometry methods, Face anatomy & histology, Image Interpretation, Computer-Assisted methods, Imaging, Three-Dimensional methods, Pattern Recognition, Automated methods, Subtraction Technique
- Abstract
This paper presents a novel approach to automatic face modeling for realistic synthesis from an unknown face image, using a probabilistic face diffuse model and a generic face specular map. We construct a probabilistic face diffuse model for estimating the albedo and normals of the input face. Then, we develop a generic face specular map for estimating the specularity of the face. Using the estimated albedo, normal, and specular information, we can realistically synthesize the face under arbitrary lighting and viewing directions. Unlike many existing techniques, our approach can extract both the diffuse and specular information of the face without involving an intensive 3D matching procedure. We conduct three different experiments to show our improvement over the prior art. First, we compare the proposed algorithm with previous techniques, including the state of the art, to demonstrate our achievement in realistic face synthesis. Moreover, we evaluate the proposed algorithm against non-automatic face modeling techniques through a subjective study. This evaluation is meaningful in that it tells us how far the proposed algorithm, as well as others, is from the real photograph in terms of perceptual quality. Finally, we apply our face model to improving face recognition performance under varying illumination conditions and show that the proposed algorithm is effective in enhancing the face recognition rate. Thanks to the compact representation and the effective inference scheme, our technique is applicable to many practical applications, such as avatar creation, digital face cloning, face normalization, deidentification and many others.
- Published
- 2012
- Full Text
- View/download PDF
46. A subspace model-based approach to face relighting under unknown lighting and poses.
- Author
-
Shim H, Luo J, and Chen T
- Subjects
- Humans, Reproducibility of Results, Sensitivity and Specificity, Algorithms, Face anatomy & histology, Facial Expression, Image Enhancement methods, Image Interpretation, Computer-Assisted methods, Imaging, Three-Dimensional methods, Lighting methods
- Abstract
We present a new approach to face relighting by jointly estimating the pose, reflectance functions, and lighting from as few as one image of a face. Upon such estimation, we can synthesize the face image under any prescribed new lighting condition. In contrast to commonly used face shape models or shape-dependent models, we neither recover nor assume the 3-D face shape during the estimation process. Instead, we train a pose- and pixel-dependent subspace model of the reflectance function using a face database that contains samples of pose and illumination for a large number of individuals (e.g., the CMU PIE database and the Yale database). Using this subspace model, we can estimate the pose, the reflectance functions, and the lighting condition of any given face image. Our approach lends itself to practical applications thanks to many desirable properties, including the preservation of the non-Lambertian skin reflectance properties and facial hair, as well as reproduction of various shadows on the face. Extensive experiments show that, compared to recent representative face relighting techniques, our method successfully produces better results, in terms of subjective and objective quality, without reconstructing a 3-D shape.
- Published
- 2008
- Full Text
- View/download PDF
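The relighting work above trains a pose- and pixel-dependent subspace model of the reflectance function from a face database. A minimal sketch of the subspace idea using plain PCA via SVD; the sample counts and dimensions are illustrative, and the paper's model is conditioned on pose and pixel location rather than being a single global subspace.

```python
import numpy as np

rng = np.random.default_rng(2)

# Learn a low-dimensional subspace from sampled reflectance functions
# (rows = training samples, columns = reflectance dimensions).
samples = rng.normal(size=(100, 20))
mean = samples.mean(axis=0)
_, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
basis = vt[:5]                      # top-5 principal directions

def project(refl):
    # Estimate a face's reflectance function as its projection onto
    # the learned subspace; coeff is the compact representation used
    # for estimation and relighting.
    coeff = basis @ (refl - mean)
    return mean + basis.T @ coeff, coeff

recon, coeff = project(samples[0])
```

Given a new face image, estimation then amounts to finding the subspace coefficients (together with pose and lighting) that best explain the observed pixels.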