Author: "Lee Donghwan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Lee Donghwan"' showing total 626 results

1. Finite-Time Analysis of Simultaneous Double Q-learning

Author: Na, Hyunjun and Lee, Donghwan
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Systems and Control
Abstract: $Q$-learning is one of the most fundamental reinforcement learning (RL) algorithms. Despite its widespread success in various applications, it is prone to overestimation bias in the $Q$-learning update. To address this issue, double $Q$-learning employs two independent $Q$-estimators which are randomly selected and updated during the learning process. This paper proposes a modified double $Q$-learning, called simultaneous double $Q$-learning (SDQ), with its finite-time analysis. SDQ eliminates the need for random selection between the two $Q$-estimators, and this modification allows us to analyze double $Q$-learning through the lens of a novel switching system framework facilitating efficient finite-time analysis. Empirical studies demonstrate that SDQ converges faster than double $Q$-learning while retaining the ability to mitigate the maximization bias. Finally, we derive a finite-time expected error bound for SDQ.
Comment: 25 pages, 3 figures
Published: 2024

2. A finite time analysis of distributed Q-learning

Author: Lim, Han-Dong and Lee, Donghwan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Multiagent Systems
Abstract: Multi-agent reinforcement learning (MARL) has witnessed a remarkable surge in interest, fueled by the empirical success achieved in applications of single-agent reinforcement learning (RL). In this study, we consider a distributed Q-learning scenario, wherein a number of agents cooperatively solve a sequential decision making problem without access to the central reward function which is an average of the local rewards. In particular, we study finite-time analysis of a distributed Q-learning algorithm, and provide a new sample complexity result of $\tilde{\mathcal{O}}\left( \min\left\{\frac{1}{\epsilon^2}\frac{t_{\text{mix}}}{(1-\gamma)^6 d_{\min}^4 } ,\frac{1}{\epsilon}\frac{\sqrt{|\mathcal{S}||\mathcal{A}|}}{(1-\sigma_2(\boldsymbol{W}))(1-\gamma)^4 d_{\min}^3} \right\}\right)$ under tabular lookup
Published: 2024

3. Unified ODE Analysis of Smooth Q-Learning Algorithms

Author: Lee, Donghwan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Convergence of Q-learning has been the focus of extensive research over the past several decades. Recently, an asymptotic convergence analysis for Q-learning was introduced using a switching system framework. This approach applies the so-called ordinary differential equation (ODE) approach to prove the convergence of the asynchronous Q-learning modeled as a continuous-time switching system, where notions from switching system theory are used to prove its asymptotic stability without using explicit Lyapunov arguments. However, to prove stability, restrictive conditions, such as quasi-monotonicity, must be satisfied for the underlying switching systems, which makes it hard to easily generalize the analysis method to other reinforcement learning algorithms, such as the smooth Q-learning variants. In this paper, we present a more general and unified convergence analysis that improves upon the switching system approach and can analyze Q-learning and its smooth variants. The proposed analysis is motivated by previous work on the convergence of synchronous Q-learning based on $p$-norm serving as a Lyapunov function. However, the proposed analysis addresses more general ODE models that can cover both asynchronous Q-learning and its smooth versions with simpler frameworks.
Published: 2024

4. Finite-Time Error Analysis of Soft Q-Learning: Switching System Approach

Author: Jeong, Narim and Lee, Donghwan
Subjects: Computer Science - Machine Learning
Abstract: Soft Q-learning is a variation of Q-learning designed to solve entropy regularized Markov decision problems where an agent aims to maximize the entropy regularized value function. Despite its empirical success, there have been limited theoretical studies of soft Q-learning to date. This paper aims to offer a novel and unified finite-time, control-theoretic analysis of soft Q-learning algorithms. We focus on two types of soft Q-learning algorithms: one utilizing the log-sum-exp operator and the other employing the Boltzmann operator. By using dynamical switching system models, we derive novel finite-time error bounds for both soft Q-learning algorithms. We hope that our analysis will deepen the current understanding of soft Q-learning by establishing connections with switching system models and may even pave the way for new frameworks in the finite-time analysis of other reinforcement learning algorithms.
Comment: 18 pages
Published: 2024

5. Analysis of Off-Policy Multi-Step TD-Learning with Linear Function Approximation

Author: Lee, Donghwan
Subjects: Electrical Engineering and Systems Science - Systems and Control, Computer Science - Machine Learning
Abstract: This paper analyzes multi-step TD-learning algorithms within the 'deadly triad' scenario, characterized by linear function approximation, off-policy learning, and bootstrapping. In particular, we prove that n-step TD-learning algorithms converge to a solution as the sampling horizon n increases sufficiently. The paper is divided into two parts. In the first part, we comprehensively examine the fundamental properties of their model-based deterministic counterparts, including projected value iteration, gradient descent algorithms, and the control theoretic approach, which can be viewed as prototype deterministic algorithms whose analysis plays a pivotal role in understanding and developing their model-free reinforcement learning counterparts. In particular, we prove that these algorithms converge to meaningful solutions when n is sufficiently large. Based on these findings, two n-step TD-learning algorithms are proposed and analyzed, which can be seen as the model-free reinforcement learning counterparts of the gradient and control theoretic algorithms.
Published: 2024

6. Finite-Time Error Analysis of Online Model-Based Q-Learning with a Relaxed Sampling Model

Author: Lim, Han-Dong, Lee, HyeAnn, and Lee, Donghwan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Reinforcement learning has witnessed significant advancements, particularly with the emergence of model-based approaches. Among these, $Q$-learning has proven to be a powerful algorithm in model-free settings. However, the extension of $Q$-learning to a model-based framework remains relatively unexplored. In this paper, we delve into the sample complexity of $Q$-learning when integrated with a model-based approach. Through theoretical analyses and empirical evaluations, we seek to elucidate the conditions under which model-based $Q$-learning excels in terms of sample efficiency compared to its model-free counterpart.
Published: 2024

7. Harnessing Membership Function Dynamics for Stability Analysis of T-S Fuzzy Systems

Author: Lee, Donghwan and Kim, Do-Wan
Subjects: Electrical Engineering and Systems Science - Systems and Control, Mathematics - Optimization and Control
Abstract: The main goal of this paper is to develop a new linear matrix inequality (LMI) condition for the asymptotic stability of continuous-time Takagi-Sugeno (T-S) fuzzy systems. A key advantage of this new condition is its independence from the bounds on the time-derivatives of the membership functions, a requirement present in the existing approaches. This is achieved by introducing a novel fuzzy Lyapunov function that incorporates an augmented state vector. Notably, this augmented state vector encompasses the membership functions, allowing the dynamics of these functions to be integrated into the proposed condition. This inclusion of additional information about the membership function serves to reduce the conservativeness of the suggested stability condition. To demonstrate the effectiveness of the proposed method, examples are provided.
Comment: arXiv admin note: substantial text overlap with arXiv:2309.06841
Published: 2024

8. A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks

Author: Moniri, Behrad, Lee, Donghwan, Hassani, Hamed, and Dobriban, Edgar
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Feature learning is thought to be one of the fundamental reasons for the success of deep neural networks. It is rigorously known that in two-layer fully-connected neural networks under certain conditions, one step of gradient descent on the first layer can lead to feature learning, characterized by the appearance of a separated rank-one component -- spike -- in the spectrum of the feature matrix. However, with a constant gradient descent step size, this spike only carries information from the linear component of the target function and therefore learning non-linear components is impossible. We show that with a learning rate that grows with the sample size, such training in fact introduces multiple rank-one components, each corresponding to a specific polynomial feature. We further prove that the limiting large-dimensional and large sample training and test errors of the updated neural networks are fully characterized by these spikes. By precisely analyzing the improvement in the training and test errors, we demonstrate that these non-linear features can enhance learning.
Published: 2023

9. Suppressing Overestimation in Q-Learning through Adversarial Behaviors

Author: Lee, HyeAnn and Lee, Donghwan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The goal of this paper is to propose a new Q-learning algorithm with a dummy adversarial player, which is called dummy adversarial Q-learning (DAQ), that can effectively regulate the overestimation bias in standard Q-learning. With the dummy player, the learning can be formulated as a two-player zero-sum game. The proposed DAQ unifies several Q-learning variations to control overestimation biases, such as maxmin Q-learning and minmax Q-learning (proposed in this paper) in a single framework. The proposed DAQ is a simple but effective way to suppress the overestimation bias through dummy adversarial behaviors and can be easily applied to off-the-shelf reinforcement learning algorithms to improve performance. A finite-time convergence of DAQ is analyzed from an integrated perspective by adapting an adversarial Q-learning. The performance of the suggested DAQ is empirically demonstrated under various benchmark environments.
Comment: Annual Allerton 2024
Published: 2023

10. A primal-dual perspective for distributed TD-learning

Author: Lim, Han-Dong and Lee, Donghwan
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: The goal of this paper is to investigate distributed temporal difference (TD) learning for a networked multi-agent Markov decision process. The proposed approach is based on distributed optimization algorithms, which can be interpreted as primal-dual ordinary differential equation (ODE) dynamics subject to null-space constraints. Based on the exponential convergence behavior of the primal-dual ODE dynamics subject to null-space constraints, we examine the behavior of the final iterate in various distributed TD-learning scenarios, considering both constant and diminishing step-sizes and incorporating both i.i.d. and Markovian observation models. Unlike existing methods, the proposed algorithm does not require the assumption that the underlying communication network structure is characterized by a doubly stochastic matrix.
Published: 2023

11. Relaxed Conditions for Parameterized Linear Matrix Inequality in the Form of Nested Fuzzy Summations

Author: Kim, Do Wan and Lee, Donghwan
Subjects: Mathematics - Optimization and Control, Electrical Engineering and Systems Science - Systems and Control
Abstract: The aim of this study is to investigate less conservative conditions for parameterized linear matrix inequalities (PLMIs) that are formulated as nested fuzzy summations. Such PLMIs are commonly encountered in stability analysis and control design problems for Takagi-Sugeno (T-S) fuzzy systems. Utilizing the weighted inequality of arithmetic and geometric means (AM-GM inequality), we develop new, less conservative linear matrix inequalities for the PLMIs. This methodology enables us to efficiently handle the product of membership functions that have intersecting indices. Through empirical case studies, we demonstrate that our proposed conditions produce less conservative results compared to existing approaches in the literature.
Comment: This work has been submitted to IEEE Transactions on Systems, Man and Cybernetics: Systems for possible publication
Published: 2023

12. On the Local Quadratic Stability of T-S Fuzzy Systems in the Vicinity of the Origin

Author: Lee, Donghwan and Kim, Do Wan
Subjects: Electrical Engineering and Systems Science - Systems and Control, Computer Science - Artificial Intelligence
Abstract: The main goal of this paper is to introduce new local stability conditions for continuous-time Takagi-Sugeno (T-S) fuzzy systems. These stability conditions are based on linear matrix inequalities (LMIs) in combination with quadratic Lyapunov functions. Moreover, they integrate information on the membership functions at the origin and effectively leverage the linear structure of the underlying nonlinear system in the vicinity of the origin. As a result, the proposed conditions are proved to be less conservative compared to existing methods using fuzzy Lyapunov functions in the literature. Moreover, we establish that the proposed methods offer necessary and sufficient conditions for the local exponential stability of T-S fuzzy systems. The paper also includes discussions on the inherent limitations associated with fuzzy Lyapunov approaches. To demonstrate the theoretical results, we provide comprehensive examples that elucidate the core concepts and validate the efficacy of the proposed conditions.
Published: 2023

13. Continuous-Time Distributed Dynamic Programming for Networked Multi-Agent Markov Decision Processes

Author: Lee, Donghwan, Lim, Han-Dong, and Kim, Do Wan
Subjects: Electrical Engineering and Systems Science - Systems and Control, Computer Science - Artificial Intelligence
Abstract: The main goal of this paper is to investigate continuous-time distributed dynamic programming (DP) algorithms for networked multi-agent Markov decision problems (MAMDPs). In our study, we adopt a distributed multi-agent framework where individual agents have access only to their own rewards, lacking insights into the rewards of other agents. Moreover, each agent has the ability to share its parameters with neighboring agents through a communication network, represented by a graph. We first introduce a novel distributed DP, inspired by the distributed optimization method of Wang and Elia. Next, a new distributed DP is introduced through a decoupling process. The convergence of the DP algorithms is proved through systems and control perspectives. The study in this paper sets the stage for new distributed temporal difference learning algorithms.
Published: 2023

14. Bias reduction for semi-competing risks frailty model with rare events: application to a chronic kidney disease cohort study in South Korea

Author: Kim, Jayoun, Jeong, Boram, Ha, Il Do, Oh, Kook-Hwan, Jung, Ji Yong, Jeong, Jong Cheol, and Lee, Donghwan
Published: 2024
Full Text: View/download PDF

15. Temporal Difference Learning with Experience Replay

Author: Lim, Han-Dong and Lee, Donghwan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Temporal-difference (TD) learning is widely regarded as one of the most popular algorithms in reinforcement learning (RL). Despite its widespread use, it has only been recently that researchers have begun to actively study its finite time behavior, including the finite time bound on mean squared error and sample complexity. On the empirical side, experience replay has been a key ingredient in the success of deep RL algorithms, but its theoretical effects on RL have yet to be fully understood. In this paper, we present a simple decomposition of the Markovian noise terms and provide finite-time error bounds for TD-learning with experience replay. Specifically, under the Markovian observation model, we demonstrate that for both the averaged iterate and final iterate cases, the error term induced by a constant step-size can be effectively controlled by the size of the replay buffer and the mini-batch sampled from the experience replay buffer.
Published: 2023

16. Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity

Author: Huang, Xinmeng, Xu, Kan, Lee, Donghwan, Hassani, Hamed, Bastani, Hamsa, and Dobriban, Edgar
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Statistics - Methodology
Abstract: Large and complex datasets are often collected from several, possibly heterogeneous sources. Multitask learning methods improve efficiency by leveraging commonalities across datasets while accounting for possible differences among them. Here, we study multitask linear regression and contextual bandits under sparse heterogeneity, where the source/task-associated parameters are equal to a global parameter plus a sparse task-specific term. We propose a novel two-stage estimator called MOLAR that leverages this structure by first constructing a covariate-wise weighted median of the task-wise linear regression estimates and then shrinking the task-wise estimates towards the weighted median. Compared to task-wise least squares estimates, MOLAR improves the dependence of the estimation error on the data dimension. Extensions of MOLAR to generalized linear models and constructing confidence intervals are discussed in the paper. We then apply MOLAR to develop methods for sparsely heterogeneous multitask contextual bandits, obtaining improved regret guarantees over single-task bandit methods. We further show that our methods are minimax optimal by providing a number of lower bounds. Finally, we support the efficiency of our methods by performing experiments on both synthetic data and the PISA dataset on student educational outcomes from heterogeneous countries.
Comment: Journal of the American Statistical Association, 2024
Published: 2023
Full Text: View/download PDF

17. Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach

Author: Lee, Donghwan
Subjects: Electrical Engineering and Systems Science - Systems and Control, Computer Science - Computer Science and Game Theory, Computer Science - Machine Learning
Abstract: The objective of this paper is to investigate the finite-time analysis of a Q-learning algorithm applied to two-player zero-sum Markov games. Specifically, we establish a finite-time analysis of both the minimax Q-learning algorithm and the corresponding value iteration method. To enhance the analysis of both value iteration and Q-learning, we employ the switching system model of minimax Q-learning and the associated value iteration. This approach provides further insights into minimax Q-learning and facilitates a more straightforward and insightful convergence analysis. We anticipate that the introduction of these additional insights has the potential to uncover novel connections and foster collaboration between concepts in the fields of control theory and reinforcement learning communities.
Comment: arXiv admin note: text overlap with arXiv:2205.05455
Published: 2023

18. On Some Geometric Behavior of Value Iteration on the Orthant: Switching System Perspective

Author: Lee, Donghwan
Subjects: Mathematics - Optimization and Control, Electrical Engineering and Systems Science - Systems and Control
Abstract: In this paper, the primary goal is to offer additional insights into the value iteration through the lens of switching system models in the control community. These models establish a connection between value iteration and switching system theory and reveal additional geometric behaviors of value iteration in solving discounted Markov decision problems. Specifically, the main contributions of this paper are twofold: 1) We provide a switching system model of value iteration and, based on it, offer a different proof for the contraction property of the value iteration. 2) Furthermore, from the additional insights, new geometric behaviors of value iteration are proven when the initial iterate lies in a special region. We anticipate that the proposed perspectives might have the potential to be a useful tool, applicable in various settings. Therefore, further development of these methods could be a valuable avenue for future research.
Published: 2023

19. TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering

Author: Choi, Jaehoon, Jung, Dongki, Lee, Taejae, Kim, Sangwook, Jung, Youngdong, Manocha, Dinesh, and Lee, Donghwan
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present a new pipeline for acquiring a textured mesh in the wild with a single smartphone which offers access to images, depth maps, and valid poses. Our method first introduces an RGBD-aided structure from motion, which can yield filtered depth maps and refines camera poses guided by corresponding depth. Then, we adopt the neural implicit surface reconstruction method, which allows for high-quality mesh and develops a new training process for applying a regularization provided by classical multi-view stereo methods. Moreover, we apply a differentiable rendering to fine-tune incomplete texture maps and generate textures which are perceptually closer to the original scene. Our pipeline can be applied to any common objects in the real world without the need for either in-the-lab environments or accurate mask images. We demonstrate results of captured objects with complex shapes and validate our method numerically against existing 3D reconstruction and texture mapping methods.
Comment: Accepted to CVPR23. Project Page: https://jh-choi.github.io/TMO/
Published: 2023

20. Backstepping Temporal Difference Learning

Author: Lim, Han-Dong and Lee, Donghwan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Off-policy learning ability is an important feature of reinforcement learning (RL) for practical applications. However, even one of the most elementary RL algorithms, temporal-difference (TD) learning, is known to suffer from divergence issues when the off-policy scheme is used together with linear function approximation. To overcome the divergent behavior, several off-policy TD-learning algorithms, including gradient-TD learning (GTD), and TD-learning with correction (TDC), have been developed. In this work, we provide a unified view of such algorithms from a purely control-theoretic perspective, and propose a new convergent algorithm. Our method relies on the backstepping technique, which is widely used in nonlinear control theory. Finally, convergence of the proposed algorithm is experimentally verified in environments where the standard TD-learning is known to be unstable.
Published: 2023

21. Demystifying Disagreement-on-the-Line in High Dimensions

Author: Lee, Donghwan, Moniri, Behrad, Huang, Xinmeng, Dobriban, Edgar, and Hassani, Hamed
Subjects: Statistics - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Evaluating the performance of machine learning models under distribution shift is challenging, especially when we only have unlabeled data from the shifted (target) domain, along with labeled data from the original (source) domain. Recent work suggests that the notion of disagreement, the degree to which two models trained with different randomness differ on the same input, is a key to tackle this problem. Experimentally, disagreement and prediction error have been shown to be strongly connected, which has been used to estimate model performance. Experiments have led to the discovery of the disagreement-on-the-line phenomenon, whereby the classification error under the target domain is often a linear function of the classification error under the source domain; and whenever this property holds, disagreement under the source and target domain follow the same linear relation. In this work, we develop a theoretical foundation for analyzing disagreement in high-dimensional random features regression; and study under what conditions the disagreement-on-the-line phenomenon occurs in our setting. Experiments on CIFAR-10-C, Tiny ImageNet-C, and Camelyon17 are consistent with our theory and support the universality of the theoretical findings.
Published: 2023

22. Is the Availability of Biosimilar Adalimumab Associated with Budget Savings? A Difference-in-Difference Analysis of 14 Countries

Author: Woo, Hyunjung, Shin, Gyeongseon, Lee, Donghwan, Kwon, Hye-Young, and Bae, SeungJin
Published: 2024
Full Text: View/download PDF

23. Block Double-Submission Attack: Block Withholding Can Be Self-Destructive

Author: Lee, Suhyeon, Lee, Donghwan, and Kim, Seungjoo
Subjects: Computer Science - Cryptography and Security
Abstract: Proof-of-Work (PoW) is a Sybil control mechanism adopted in blockchain-based cryptocurrencies. It prevents the attempt of malicious actors to manipulate distributed ledgers. Bitcoin has successfully suppressed double-spending by accepting the longest PoW chain. Nevertheless, PoW encountered several major security issues surrounding mining competition. One of them is a Block WithHolding (BWH) attack that can exploit a widespread and cooperative environment called a mining pool. This attack takes advantage of untrustworthy relationships between mining pools and participating agents. Moreover, detecting or responding to attacks is challenging due to the nature of mining pools. In this paper, however, we suggest that BWH attacks also have a comparable trust problem. Because a BWH attacker cannot have complete control over BWH agents, they can betray the belonging mining pool and seek further benefits by trading with victims. We prove that this betrayal is not only valid in all attack parameters but also provides double benefits; finally, it is the best strategy for BWH agents. Furthermore, our study implies that BWH attacks may encounter self-destruction of their own revenue, contrary to their intention.
Comment: This paper is an extended version of a paper accepted to ACM Advances in Financial Technologies - AFT 2022
Published: 2022

24. Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View

Author: Lim, Han-Dong and Lee, Donghwan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Q-learning has long been one of the most popular reinforcement learning algorithms, and theoretical analysis of Q-learning has been an active research topic for decades. Although research on asymptotic convergence analysis of Q-learning has a long tradition, non-asymptotic convergence has only recently come under active study. The main goal of this paper is to investigate new finite-time analysis of asynchronous Q-learning under Markovian observation models via a control system viewpoint. In particular, we introduce a discrete-time time-varying switching system model of Q-learning with diminishing step-sizes for our analysis, which significantly improves recent development of the switching system analysis with constant step-sizes, and leads to $\mathcal{O}\left( \sqrt{\frac{\log k}{k}} \right)$ convergence rate that is comparable to or better than most of the state of the art results in the literature. Meanwhile, a technique using the similarity transformation is newly applied to avoid the difficulty in the analysis posed by diminishing step-sizes. The proposed analysis brings in additional insights, covers different scenarios, and provides new simplified templates for analysis to deepen our understanding of Q-learning via its unique connection to discrete-time switching systems.
Published: 2022

25. Collaborative Learning of Discrete Distributions under Heterogeneity and Communication Constraints

Author: Huang, Xinmeng, Lee, Donghwan, Dobriban, Edgar, and Hassani, Hamed
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: In modern machine learning, users often have to collaborate to learn the distribution of the data. Communication can be a significant bottleneck. Prior work has studied homogeneous users -- i.e., whose data follow the same discrete distribution -- and has provided optimal communication-efficient methods for estimating that distribution. However, these methods rely heavily on homogeneity, and are less applicable in the common case when users' discrete distributions are heterogeneous. Here we consider a natural and tractable model of heterogeneity, where users' discrete distributions only vary sparsely, on a small number of entries. We propose a novel two-stage method named SHIFT: First, the users collaborate by communicating with the server to learn a central distribution; relying on methods from robust statistics. Then, the learned central distribution is fine-tuned to estimate their respective individual distribution. We show that SHIFT is minimax optimal in our model of heterogeneity and under communication constraints. Further, we provide experimental results using both synthetic data and $n$-gram frequency estimation in the text domain, which corroborate its efficiency.
Published: 2022

26. Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark

Author: Humenberger, Martin, Cabon, Yohann, Pion, Noé, Weinzaepfel, Philippe, Lee, Donghwan, Guérin, Nicolas, Sattler, Torsten, and Csurka, Gabriela
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Visual localization, i.e., camera pose estimation in a known scene, is a core component of technologies such as autonomous driving and augmented reality. State-of-the-art localization approaches often rely on image retrieval techniques for one of two purposes: (1) provide an approximate pose estimate or (2) determine which parts of the scene are potentially visible in a given query image. It is common practice to use state-of-the-art image retrieval algorithms for both of them. These algorithms are often trained for the goal of retrieving the same landmark under a large range of viewpoint changes which often differs from the requirements of visual localization. In order to investigate the consequences for visual localization, this paper focuses on understanding the role of image retrieval for multiple visual localization paradigms. First, we introduce a novel benchmark setup and compare state-of-the-art retrieval representations on multiple datasets using localization performance as metric. Second, we investigate several definitions of "ground truth" for image retrieval. Using these definitions as upper bounds for the visual localization paradigms, we show that there is still significant room for improvement. Third, using these tools and in-depth analysis, we show that retrieval performance on classical landmark retrieval or place recognition tasks correlates only for some but not all paradigms to localization performance. Finally, we analyze the effects of blur and dynamic scenes in the images. We conclude that there is a need for retrieval approaches specifically designed for localization paradigms. Our benchmark and evaluation protocols are available at https://github.com/naver/kapture-localization
Comment: International Journal of Computer Vision (2022). arXiv admin note: text overlap with arXiv:2011.11946
Published: 2022
Full Text: View/download PDF

27. Finite-Time Analysis of Temporal Difference Learning: Discrete-Time Linear System Perspective

Author: Lee, Donghwan and Kim, Do Wan
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Systems and Control
Abstract: TD-learning is a fundamental algorithm in the field of reinforcement learning (RL), that is employed to evaluate a given policy by estimating the corresponding value function for a Markov decision process. While significant progress has been made in the theoretical analysis of TD-learning, recent research has uncovered guarantees concerning its statistical efficiency by developing finite-time error bounds. This paper aims to contribute to the existing body of knowledge by presenting a novel finite-time analysis of tabular temporal difference (TD) learning, which makes direct and effective use of discrete-time stochastic linear system models and leverages Schur matrix properties. The proposed analysis can cover both on-policy and off-policy settings in a unified manner. By adopting this approach, we hope to offer new and straightforward templates that not only shed further light on the analysis of TD-learning and related RL algorithms but also provide valuable insights for future research in this domain., Comment: arXiv admin note: text overlap with arXiv:2112.14417
Published: 2022

28. On the local quadratic stability of T–S fuzzy systems in the vicinity of the origin

Author: Lee, Donghwan and Kim, Do Wan
Published: 2024
Full Text: View/download PDF

29. EXpanding Technology-Enabled, Nurse-Delivered Chronic Disease Care (EXTEND): Protocol and Baseline Data for a Randomized Trial

Author: German, Jashalynn, Yang, Qing, Hatch, Daniel, Lewinski, Allison, Bosworth, Hayden B., Kaufman, Brystana G., Chatterjee, Ranee, Pennington, Gina, Matters, Doreen, Lee, Donghwan, Urlichich, Diana, Kokosa, Sarah, Canupp, Holly, Gregory, Patrick, Roberson, Cindy Leslie, Smith, Benjamin, Huber, Sherry, Doukellis, Katheryn, Deal, Tammi, Burns, Rose, Crowley, Matthew J., and Shaw, Ryan J.
Published: 2024
Full Text: View/download PDF

30. Relaxed conditions for parameterized linear matrix inequality in the form of nested fuzzy summations

Author: Kim, Do Wan and Lee, Donghwan
Published: 2025
Full Text: View/download PDF

31. Three-dimensional path-following control of nonlinear autonomous underwater vehicles with actuator saturation

Author: Kim, Moon Hwan, Lee, Donghwan, and Kim, Do Wan
Published: 2025
Full Text: View/download PDF

32. Changes in Mental Health Among Adolescents in South Korea Before and After COVID-19: An Interrupted Time Series Analysis From 2015 to 2022

Author: Kim, Yeonjae, Park, Hyewon, Bhan, YooWha, Lee, Donghwan, Oh, Chang-Mo, Lee, Weon Young, and Park, Bomi
Published: 2025
Full Text: View/download PDF

33. A Single Correspondence Is Enough: Robust Global Registration to Avoid Degeneracy in Urban Environments

Author: Lim, Hyungtae, Yeon, Suyong, Ryu, Soohyun, Lee, Yonghan, Kim, Youngji, Yun, Jaeseong, Jung, Euigon, Lee, Donghwan, and Myung, Hyun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: Global registration using 3D point clouds is a crucial technology for mobile platforms to achieve localization or manage loop-closing situations. In recent years, numerous researchers have proposed global registration methods to address a large number of outlier correspondences. Unfortunately, the degeneracy problem, which represents the phenomenon in which the number of estimated inliers becomes lower than three, is still potentially inevitable. To tackle the problem, a degeneracy-robust decoupling-based global registration method is proposed, called Quatro. In particular, our method employs quasi-SO(3) estimation by leveraging the Atlanta world assumption in urban environments to avoid degeneracy in rotation estimation. Thus, the minimum degree of freedom (DoF) of our method is reduced from three to one. As verified in indoor and outdoor 3D LiDAR datasets, our proposed method yields robust global registration performance compared with other global registration methods, even for distant point cloud pairs. Furthermore, the experimental results confirm the applicability of our method as a coarse alignment. Our code is available: https://github.com/url-kaist/quatro., Comment: 8 pages. Accepted by ICRA 2022
Published: 2022

34. SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning

Author: Choi, Jaehoon, Jung, Dongki, Lee, Yonghan, Kim, Deokhwa, Manocha, Dinesh, and Lee, Donghwan
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Monocular depth estimation in the wild inherently predicts depth up to an unknown scale. To resolve the scale ambiguity issue, we present a learning algorithm that leverages monocular simultaneous localization and mapping (SLAM) with proprioceptive sensors. Such monocular SLAM systems can provide metrically scaled camera poses. Given these metric poses and monocular sequences, we propose a self-supervised learning method for the pre-trained supervised monocular depth networks to enable metrically scaled depth estimation. Our approach is based on a teacher-student formulation which guides our network to predict high-quality depths. We demonstrate that our approach is useful for various applications such as mobile robot navigation and is applicable to diverse environments. Our full system shows improvements over recent self-supervised depth estimation and completion methods on EuRoC, OpenLORIS, and ScanNet datasets.
Published: 2022

35. T-Cal: An optimal test for the calibration of predictive models

Author: Lee, Donghwan, Huang, Xinmeng, Hassani, Hamed, and Dobriban, Edgar
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: The prediction accuracy of machine learning methods is steadily increasing, but the calibration of their uncertainty predictions poses a significant challenge. Numerous works focus on obtaining well-calibrated predictive models, but less is known about reliably assessing model calibration. This limits our ability to know when algorithms for improving calibration have a real effect, and when their improvements are merely artifacts due to random noise in finite datasets. In this work, we consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem. The null hypothesis is that the predictive model is calibrated, while the alternative hypothesis is that the deviation from calibration is sufficiently large. We find that detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions. When the conditional class probabilities are H\"older continuous, we propose T-Cal, a minimax optimal test for calibration based on a debiased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE). We further propose Adaptive T-Cal, a version that is adaptive to unknown smoothness. We verify our theoretical findings with a broad range of experiments, including with several popular deep neural net architectures and several standard post-hoc calibration methods. T-Cal is a practical general-purpose tool, which -- combined with classical tests for discrete-valued predictors -- can be used to test the calibration of virtually any probabilistic classification method., Comment: The implementation of T-Cal is available at https://github.com/dh7401/T-Cal
Published: 2022

36. Regularized Q-learning

Author: Lim, Han-Dong and Lee, Donghwan
Subjects: Computer Science - Machine Learning
Abstract: Q-learning is a widely used algorithm in the reinforcement learning community. Under the lookup table setting, its convergence is well established. However, its behavior is known to be unstable in the linear function approximation case. This paper develops a new Q-learning algorithm that converges when linear function approximation is used. We prove that simply adding an appropriate regularization term ensures convergence of the algorithm. We prove its stability using a recent analysis tool based on switching system models. Moreover, we experimentally show that it converges in environments where Q-learning with linear function approximation is known to diverge. We also provide an error bound on the solution where the algorithm converges., Comment: NeurIPS2024
Published: 2022

37. Control Theoretic Analysis of Temporal Difference Learning

Author: Lee, Donghwan and Kim, Do Wan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Systems and Control
Abstract: The goal of this manuscript is to conduct a control-theoretic analysis of Temporal Difference (TD) learning algorithms. TD-learning serves as a cornerstone in the realm of reinforcement learning, offering a methodology for approximating the value function associated with a given policy in a Markov Decision Process. Despite several existing works that have contributed to the theoretical understanding of TD-learning, it is only in recent years that researchers have been able to establish concrete guarantees on its statistical efficiency. In this paper, we introduce a finite-time, control-theoretic framework for analyzing TD-learning, leveraging established concepts from the field of linear systems control. Consequently, this paper provides additional insights into the mechanics of TD learning and the broader landscape of reinforcement learning, all while employing straightforward analytical tools derived from control theory., Comment: The contents of this paper have some overlaps with some other arxiv paper we have submitted. Therefore, this paper is redundant in my opinion
Published: 2021

38. Lossless convexification and duality

Author: Lee, Donghwan
Published: 2024
Full Text: View/download PDF

39. New Versions of Gradient Temporal Difference Learning

Author: Lee, Donghwan, Lim, Han-Dong, Park, Jihoon, and Choi, Okyong
Subjects: Computer Science - Machine Learning
Abstract: Sutton, Szepesv\'{a}ri and Maei introduced the first gradient temporal-difference (GTD) learning algorithms compatible with both linear function approximation and off-policy training. The goal of this paper is (a) to propose some variants of GTDs with extensive comparative analysis and (b) to establish new theoretical analysis frameworks for the GTDs. These variants are based on convex-concave saddle-point interpretations of GTDs, which effectively unify all the GTDs into a single framework, and provide simple stability analysis based on recent results on primal-dual gradient dynamics. Finally, numerical comparative analysis is given to evaluate these approaches.
Published: 2021

40. On the Semidefinite Duality of Finite-Horizon LQG Problem

Author: Lee, Donghwan
Subjects: Mathematics - Optimization and Control, Electrical Engineering and Systems Science - Systems and Control
Abstract: In this paper, our goal is to study fundamental foundations of linear quadratic Gaussian (LQG) control problems for stochastic linear time-invariant systems via Lagrangian duality of semidefinite programming (SDP) problems. In particular, we derive an SDP formulation of the finite-horizon LQG problem, and its Lagrangian duality. Moreover, we prove that the Riccati equation for LQG can be derived from the KKT optimality condition of the corresponding SDP problem. Besides, the proposed primal problem efficiently decouples the system matrices and the gain matrix. This allows us to develop new convex relaxations of non-convex structured control design problems such as the decentralized control problem. We expect that this work would provide new insights on the LQG problem and may potentially facilitate developments of new formulations of various optimal control problems. Numerical examples are given to demonstrate the effectiveness of the proposed methods., Comment: arXiv admin note: substantial text overlap with arXiv:2108.01457
Published: 2021

41. DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes

Author: Jung, Dongki, Choi, Jaehoon, Lee, Yonghan, Kim, Deokhwa, Kim, Changick, Manocha, Dinesh, and Lee, Donghwan
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present a novel approach for estimating depth from a monocular camera as it moves through complex and crowded indoor environments, e.g., a department store or a metro station. Our approach predicts absolute scale depth maps over the entire scene consisting of a static background and multiple moving people, by training on dynamic scenes. Since it is difficult to collect dense depth maps from crowded indoor environments, we design our training framework without requiring depths produced from depth sensing devices. Our network leverages RGB images and sparse depth maps generated from traditional 3D reconstruction methods to estimate dense depth maps. We use two constraints to handle depth for non-rigidly moving people without tracking their motion explicitly. We demonstrate that our approach offers consistent improvements over recent depth estimation methods on the NAVERLABS dataset, which includes complex and crowded scenes.
Published: 2021

42. Lossless Convexification and Duality

Author: Lee, Donghwan
Subjects: Mathematics - Optimization and Control
Abstract: The main goal of this paper is to investigate strong duality of non-convex semidefinite programming problems (SDPs). In the optimization community, it is well-known that a convex optimization problem satisfies strong duality if the Slater's condition holds. However, this result cannot be directly generalized to non-convex problems. In this paper, we prove that a class of non-convex SDPs with special structures satisfies strong duality under the Slater's condition. Such a class of SDPs arises in SDP-based control analysis and design approaches. Throughout the paper, several examples are given to support the proposed results. We expect that the proposed analysis can potentially deepen our understanding of non-convex SDPs arising in the control community, and promote their analysis based on KKT conditions.
Published: 2021
Full Text: View/download PDF

43. K-ras mutation detected by peptide nucleic acid-clamping polymerase chain reaction, Ki-67, S100P, and SMAD4 expression can improve the diagnostic accuracy of inconclusive pancreatic EUS-FNB specimens

Author: Kim, Bo-Hyung, Kwon, Minji, Lee, Donghwan, Park, Se Woo, and Shin, Eun
Published: 2024
Full Text: View/download PDF

44. Bayesian Stackelberg game approach for cyber mission impact assessment

Author: Lee, Donghwan, Kim, Donghwa, Ahn, Myung Kil, and Lee, Seongkee
Published: 2024
Full Text: View/download PDF

45. Endoplasmic reticular stress as an emerging therapeutic target for chronic pain: a narrative review

Author: Kim, Harper S., Lee, Donghwan, and Shen, Shiqian
Published: 2024
Full Text: View/download PDF

46. Convergence of Dynamic Programming on the Semidefinite Cone

Author: Lee, Donghwan
Subjects: Mathematics - Optimization and Control, Electrical Engineering and Systems Science - Systems and Control
Abstract: The goal of this paper is to investigate new and simple convergence analysis of dynamic programming for linear quadratic regulator problem of discrete-time linear time-invariant systems. In particular, bounds on errors are given in terms of both matrix inequalities and matrix norm. Under a mild assumption on the initial parameter, we prove that the Q-value iteration exponentially converges to the optimal solution. Moreover, a global asymptotic convergence is also presented. These results are then extended to the policy iteration. We prove that in contrast to the Q-value iteration, the policy iteration always converges exponentially fast. An example is given to illustrate the results.
Published: 2021

47. Data-Driven Control Design with LMIs and Dynamic Programming

Author: Lee, Donghwan and Kim, Do Wan
Subjects: Mathematics - Optimization and Control, Electrical Engineering and Systems Science - Systems and Control
Abstract: The goal of this paper is to develop data-driven control design and evaluation strategies based on linear matrix inequalities (LMIs) and dynamic programming. We consider deterministic discrete-time LTI systems, where the system model is unknown. We propose efficient data collection schemes from the state-input trajectories together with data-driven LMIs to design state-feedback controllers for stabilization and linear quadratic regulation (LQR) problem. In addition, we investigate theoretically guaranteed exploration schemes to acquire valid data from the trajectories under different scenarios. In particular, we prove that as more and more data is accumulated, the collected data becomes valid for the proposed algorithms with higher probability. Finally, data-driven dynamic programming algorithms with convergence guarantees are then discussed.
Published: 2021

48. Multi-Objective LQG Design with Primal-Dual Method

Author: Lee, Donghwan and Kim, Do Wan
Subjects: Mathematics - Optimization and Control, Electrical Engineering and Systems Science - Systems and Control
Abstract: The goal of this paper is to study a multi-objective linear quadratic Gaussian (LQG) control problem. In particular, we consider an optimal control problem minimizing a quadratic cost over a finite time horizon for linear stochastic systems subject to control energy constraints. To solve the problem, we suggest an efficient bisection line search algorithm which is computationally efficient compared to other approaches such as the semidefinite programming. The main idea is to use the Lagrangian function and Karush-Kuhn-Tucker (KKT) optimality conditions to solve the constrained optimization problem. The Lagrange multiplier is searched using the bisection line search. Numerical examples are given to demonstrate the effectiveness of the proposed methods.
Published: 2021

49. Large-scale Localization Datasets in Crowded Indoor Spaces

Author: Lee, Donghwan, Ryu, Soohyun, Yeon, Suyong, Lee, Yonghan, Kim, Deokhwa, Han, Cheolho, Cabon, Yohann, Weinzaepfel, Philippe, Guérin, Nicolas, Csurka, Gabriela, and Humenberger, Martin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Estimating the precise location of a camera using visual localization enables interesting applications such as augmented reality or robot navigation. This is particularly useful in indoor environments where other localization technologies, such as GNSS, fail. Indoor spaces impose interesting challenges on visual localization algorithms: occlusions due to people, textureless surfaces, large viewpoint changes, low light, repetitive textures, etc. Existing indoor datasets are either comparably small or do only cover a subset of the mentioned challenges. In this paper, we introduce 5 new indoor datasets for visual localization in challenging real-world environments. They were captured in a large shopping mall and a large metro station in Seoul, South Korea, using a dedicated mapping platform consisting of 10 cameras and 2 laser scanners. In order to obtain accurate ground truth camera poses, we developed a robust LiDAR SLAM which provides initial poses that are then refined using a novel structure-from-motion based optimization. We present a benchmark of modern visual localization algorithms on these challenging datasets showing superior performance of structure-based methods using robust image features. The datasets are available at: https://naverlabs.com/datasets
Published: 2021

50. Simulation Studies on Deep Reinforcement Learning for Building Control with Human Interaction

Author: Lee, Donghwan, He, Niao, Lee, Seungjae, Karava, Panagiota, and Hu, Jianghai
Subjects: Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Systems and Control
Abstract: The building sector is the largest energy consumer in the world, and there has been considerable research interest in energy consumption and comfort management of buildings. Inspired by recent advances in reinforcement learning (RL), this paper aims at assessing the potential of RL in building climate control problems with occupant interaction. We apply a recent RL approach, called DDPG (deep deterministic policy gradient), for the continuous building control tasks and assess its performance with simulation studies in terms of its ability to handle (a) the partial state observability due to sensor limitations; (b) complex stochastic system with high-dimensional state-spaces, which are jointly continuous and discrete; (c) uncertainties due to ambient weather conditions, occupant's behavior, and comfort feelings. In particular, the partial observability and uncertainty due to the occupant interaction significantly complicate the control problem. Through simulation studies, the policy learned by DDPG demonstrates reasonable performance and computational tractability.
Published: 2021
