8 results on '"Kim, Chaewon"'
Search Results
2. RADIO: Reference-Agnostic Dubbing Video Synthesis
- Author
Lee, Dongyeun; Kim, Chaewon; Yu, Sangjoon; Yoo, Jaejun; and Park, Gyeong-Moon
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
One of the most challenging problems in audio-driven talking head generation is achieving high-fidelity detail while ensuring precise synchronization. Given only a single reference image, extracting meaningful identity attributes becomes even more challenging, often causing the network to mirror the facial and lip structures too closely. To address these issues, we introduce RADIO, a framework engineered to yield high-quality dubbed videos regardless of the pose or expression in reference images. The key is to modulate the decoder layers using a latent space composed of audio and reference features. Additionally, we incorporate ViT blocks into the decoder to emphasize high-fidelity details, especially in the lip region. Our experimental results demonstrate that RADIO achieves high synchronization without loss of fidelity. In harsh scenarios where the reference frame deviates significantly from the ground truth, our method outperforms state-of-the-art methods, highlighting its robustness.
- Comment
Accepted by WACV 2024
- Published
- 2023
3. Atom-level design strategy for hydrogen evolution reaction of transition metal dichalcogenides catalysts
- Author
Lee, Sangjin; Lee, Sujin; Kim, Chaewon; and Han, Young-Kyu
- Subjects
Condensed Matter - Materials Science
- Abstract
Two-dimensional transition metal dichalcogenides are among the most promising materials for water-splitting catalysts. While a variety of methods have been applied to promote the hydrogen evolution reaction on transition metal dichalcogenides, doping with transition metal heteroatoms has attracted much attention since it provides effective ways to optimize the hydrogen adsorption and H2 generation reactions. Herein, we provide in-depth and systematic analyses of the trends in the free energy of hydrogen adsorption (ΔG_H*), the most well-known descriptor for evaluating hydrogen evolution reaction performance, in doped transition metal dichalcogenides. Using a total of 150 doped transition metal dichalcogenides, we carried out an atom-level analysis of the origin of the ΔG_H* changes upon transition metal heteroatom doping, and suggest two key factors that govern the hydrogen adsorption process on doped transition metal dichalcogenides: 1) the changes in the charge of the chalcogen atoms where hydrogen atoms are adsorbed, for the early transition metal doped structures, and 2) the structural deformation energies accompanying the introduced dopants, for the late transition metal doped structures. Based on our findings, we interpret from a new perspective how vacancies in the TM-doped TMDs can provide optimal ΔG_H* in HER. We suggest electrostatic control for early TM doped systems and structural control for late TM doped systems as effective strategies for achieving thermoneutral ΔG_H* in TMDs.
- Published
- 2023
4. WaGI : Wavelet-based GAN Inversion for Preserving High-frequency Image Details
- Author
Moon, Seung-Jun; Kim, Chaewon; and Park, Gyeong-Moon
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
Recent GAN inversion models focus on preserving image-specific details through various methods, e.g., generator tuning or feature mixing. While those are helpful for preserving details compared to a naive low-rate latent inversion, they still fail to maintain high-frequency features precisely. In this paper, we point out that the existing GAN inversion models have inherent limitations in both structural and training aspects, which preclude the delicate reconstruction of high-frequency features. In particular, we prove that the widely used loss term in GAN inversion, i.e., L2, is biased toward reconstructing mainly low-frequency features. To overcome this problem, we propose a novel GAN inversion model, coined WaGI, which handles high-frequency features explicitly by using a novel wavelet-based loss term and a newly proposed wavelet fusion scheme. To the best of our knowledge, WaGI is the first attempt to interpret GAN inversion in the frequency domain. We demonstrate that WaGI shows outstanding results on both inversion and editing, compared to the existing state-of-the-art GAN inversion models. Notably, WaGI robustly preserves high-frequency features of images even in the editing scenario. We will release our code with the pre-trained model after the review.
- Published
- 2022
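The WaGI abstract above contrasts a plain L2 loss with a wavelet-based loss that treats high-frequency content explicitly. The following is only a minimal sketch of that general idea on 1-D signals, assuming a one-level orthonormal Haar transform; the function names `haar_1d` and `band_weighted_loss` and the `detail_weight` parameter are hypothetical and are not taken from the paper, whose actual loss and fusion scheme are not described in this record.

```python
import math


def haar_1d(signal):
    """One-level orthonormal Haar transform of an even-length sequence.

    Returns (approx, detail): the low-frequency and high-frequency bands.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail


def band_weighted_loss(pred, target, detail_weight=1.0):
    """Squared error computed per Haar band (hypothetical illustration).

    With detail_weight == 1.0 this equals the plain summed squared error,
    because the transform is orthonormal. A larger detail_weight penalizes
    errors in the high-frequency band more strongly.
    """
    pred_low, pred_high = haar_1d(pred)
    tgt_low, tgt_high = haar_1d(target)
    low = sum((p - t) ** 2 for p, t in zip(pred_low, tgt_low))
    high = sum((p - t) ** 2 for p, t in zip(pred_high, tgt_high))
    return low + detail_weight * high


if __name__ == "__main__":
    pred = [1.0, 2.0, 3.0, 4.0]
    target = [1.0, 2.0, 3.0, 5.0]
    print(band_weighted_loss(pred, target))                     # same as plain squared L2
    print(band_weighted_loss(pred, target, detail_weight=4.0))  # high band upweighted
```

Splitting the error by band makes the trade-off explicit: a single scalar weight controls how much the high-frequency residual contributes, which a plain L2 loss cannot express.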
5. Zero-shot Blind Image Denoising via Implicit Neural Representations
- Author
Kim, Chaewon; Lee, Jaeho; and Shin, Jinwoo
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
Recent denoising algorithms based on the "blind-spot" strategy show impressive blind image denoising performance without utilizing any external dataset. While these methods excel in recovering highly contaminated images, we observe that such algorithms are often less effective under a low-noise or real-noise regime. To address this gap, we propose an alternative denoising strategy that leverages the architectural inductive bias of implicit neural representations (INRs), based on our two findings: (1) an INR tends to fit the low-frequency clean image signal faster than the high-frequency noise, and (2) INR layers that are closer to the output play more critical roles in fitting higher-frequency parts. Building on these observations, we propose a denoising algorithm that maximizes the innate denoising capability of INRs by penalizing the growth of deeper layer weights. We show that our method outperforms existing zero-shot denoising methods under an extensive set of low-noise or real-noise scenarios.
- Comment
8 pages, 3 figures
- Published
- 2022
6. Scaling Neural Tangent Kernels via Sketching and Random Features
- Author
Zandieh, Amir; Han, Insu; Avron, Haim; Shoham, Neta; Kim, Chaewon; and Shin, Jinwoo
- Subjects
Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Data Structures and Algorithms
- Abstract
The Neural Tangent Kernel (NTK) characterizes the behavior of infinitely-wide neural networks trained under least squares loss by gradient descent. Recent works also report that NTK regression can outperform finitely-wide neural networks trained on small-scale datasets. However, the computational complexity of kernel methods has limited their use in large-scale learning tasks. To accelerate learning with NTK, we design a near input-sparsity time approximation algorithm for NTK, by sketching the polynomial expansions of arc-cosine kernels: our sketch for the convolutional counterpart of NTK (CNTK) can transform any image using a linear runtime in the number of pixels. Furthermore, we prove a spectral approximation guarantee for the NTK matrix, by combining random features (based on leverage score sampling) of the arc-cosine kernels with a sketching algorithm. We benchmark our methods on various large-scale regression and classification tasks and show that a linear regressor trained on our CNTK features matches the accuracy of exact CNTK on the CIFAR-10 dataset while achieving a 150x speedup.
- Comment
This is a merger of arXiv:2104.01351, arXiv:2104.00415
- Published
- 2021
7. Random Features for the Neural Tangent Kernel
- Author
Han, Insu; Avron, Haim; Shoham, Neta; Kim, Chaewon; and Shin, Jinwoo
- Subjects
Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
The Neural Tangent Kernel (NTK) has revealed connections between deep neural networks and kernel methods, with insights into optimization and generalization. Motivated by this, recent works report that NTK can achieve better performance than training neural networks on small-scale datasets. However, results under large-scale settings are hardly studied due to the computational limitations of kernel methods. In this work, we propose an efficient feature map construction for the NTK of a fully-connected ReLU network, which enables us to apply it to large-scale datasets. We combine random features of the arc-cosine kernels with a sketching-based algorithm which can run in time linear in both the number of data points and the input dimension. We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions achieving comparable error bounds, both in theory and in practice. We additionally utilize leverage score based sampling for improved bounds on arc-cosine random features and prove a spectral approximation guarantee of the proposed feature map to the NTK matrix of a two-layer neural network. We benchmark a variety of machine learning tasks to demonstrate the superiority of the proposed scheme. In particular, our algorithm can run tens of magnitude faster than the exact kernel methods for large-scale settings without performance loss.
- Published
- 2021
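The abstract of "Random Features for the Neural Tangent Kernel" above builds on random features of arc-cosine kernels. As a rough illustration of that one ingredient only, the sketch below approximates the order-1 arc-cosine kernel with ReLU random features drawn from a standard Gaussian. It does not implement the paper's NTK feature map, its sketching step, or leverage score sampling; the function names and the feature count are assumptions chosen for the example.

```python
import math
import random


def arccos1_kernel(x, y):
    """Exact order-1 arc-cosine kernel:
    k(x, y) = ||x|| ||y|| (sin t + (pi - t) cos t) / pi, t = angle(x, y).
    """
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    dot = sum(a * b for a, b in zip(x, y))
    cos_t = max(-1.0, min(1.0, dot / (nx * ny)))  # clamp for numerical safety
    theta = math.acos(cos_t)
    return nx * ny * (math.sin(theta) + (math.pi - theta) * cos_t) / math.pi


def sample_weights(num_features, dim, seed=0):
    """Draw num_features Gaussian directions w ~ N(0, I_dim)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]


def relu_random_features(x, weights):
    """Feature map phi(x) = sqrt(2 / m) * relu(W x), with m = len(weights).

    In expectation over W, phi(x) . phi(y) equals the order-1 arc-cosine kernel.
    """
    scale = math.sqrt(2.0 / len(weights))
    return [scale * max(0.0, sum(w_i * x_i for w_i, x_i in zip(w, x))) for w in weights]


if __name__ == "__main__":
    x, y = [1.0, 0.0, 0.0], [0.6, 0.8, 0.0]
    weights = sample_weights(num_features=50000, dim=3)
    phi_x = relu_random_features(x, weights)
    phi_y = relu_random_features(y, weights)
    approx = sum(a * b for a, b in zip(phi_x, phi_y))
    print("exact :", arccos1_kernel(x, y))
    print("approx:", approx)
```

The Monte Carlo error shrinks roughly as one over the square root of the number of features, so in this plain construction higher accuracy requires proportionally more features.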
8. 2018 Robotic Scene Segmentation Challenge
- Author
Allan, Max; Kondo, Satoshi; Bodenstedt, Sebastian; Leger, Stefan; Kadkhodamohammadi, Rahim; Luengo, Imanol; Fuentes, Felix; Flouty, Evangello; Mohammed, Ahmed; Pedersen, Marius; Kori, Avinash; Alex, Varghese; Krishnamurthi, Ganapathy; Rauber, David; Mendel, Robert; Palm, Christoph; Bano, Sophia; Saibro, Guinther; Shih, Chi-Sheng; Chiang, Hsun-An; Zhuang, Juntang; Yang, Junlin; Iglovikov, Vladimir; Dobrenkii, Anton; Reddiboina, Madhu; Reddy, Anubhav; Liu, Xingtong; Gao, Cong; Unberath, Mathias; Kim, Myeonghyeon; Kim, Chanho; Kim, Chaewon; Kim, Hyejin; Lee, Gyeongmin; Ullah, Ihsan; Luna, Miguel; Park, Sang Hyun; Azizian, Mahdi; Stoyanov, Danail; Maier-Hein, Lena; and Speidel, Stefanie
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
- Abstract
In 2015 we began a sub-challenge at the EndoVis workshop at MICCAI in Munich using endoscope images of ex-vivo tissue with automatically generated annotations from robot forward kinematics and instrument CAD models. However, the limited background variation and simple motion rendered the dataset uninformative in learning about which techniques would be suitable for segmentation in real surgery. In 2017, at the same workshop in Quebec, we introduced the robotic instrument segmentation dataset, with 10 teams participating in the challenge to perform binary, articulating parts, and type segmentation of da Vinci instruments. This challenge included realistic instrument motion and more complex porcine tissue as background and was widely addressed with modifications on U-Nets and other popular CNN architectures. In 2018 we added to the complexity by introducing a set of anatomical objects and medical devices to the segmented classes. To avoid over-complicating the challenge, we continued with porcine data, which is dramatically simpler than human tissue due to the lack of fatty tissue occluding many organs.
- Published
- 2020