Author: "Wu TI" / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wu TI"' showing total 10 results

Start Over Author "Wu TI" Publication Type Reports

10 results on '"Wu TI"'

1. ResTNet: Defense against Adversarial Policies via Transformer in Computer Go

Author: Wu, Tai-Lin, Wu, Ti-Rong, Shih, Chung-Chin, Ju, Yan-Ru, and Wu, I-Chen
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Although AlphaZero has achieved superhuman levels in Go, recent research has highlighted its vulnerability in particular situations requiring a more comprehensive understanding of the entire board. To address this challenge, this paper introduces ResTNet, a network that interleaves residual networks and Transformer. Our empirical experiments demonstrate several advantages of using ResTNet. First, it not only improves playing strength but also enhances the ability of global information. Second, it defends against an adversary Go program, called cyclic-adversary, tailor-made for attacking AlphaZero algorithms, significantly reducing the average probability of being attacked rate from 70.44% to 23.91%. Third, it improves the accuracy from 59.15% to 80.01% in correctly recognizing ladder patterns, which are one of the challenging patterns for Go AIs. Finally, ResTNet offers a potential explanation of the decision-making process and can also be applied to other games like Hex. To the best of our knowledge, ResTNet is the first to integrate residual networks and Transformer in the context of AlphaZero for board games, suggesting a promising direction for enhancing AlphaZero's global understanding.
Published: 2024

2. Evaluation of a motion measurement system for PET imaging studies

Author: Wang, Junxiang, Wu, Ti, Iordachita, Iulian I., and Kazanzides, Peter
Subjects: Computer Science - Robotics
Abstract: Positron Emission Tomography (PET) enables functional imaging of deep brain structures, but the bulk and weight of current systems preclude their use during many natural human activities, such as locomotion. The proposed long-term solution is to construct a robotic system that can support an imaging system surrounding the subject's head, and then move the system to accommodate natural motion. This requires a system to measure the motion of the head with respect to the imaging ring, for use by both the robotic system and the image reconstruction software. We report here the design and experimental evaluation of a parallel string encoder mechanism for sensing this motion. Our preliminary results indicate that the measurement system may achieve accuracy within 0.5 mm, especially for small motions, with improved accuracy possible through kinematic calibration.
Published: 2023
Full Text: View/download PDF

3. Game Solving with Online Fine-Tuning

Author: Wu, Ti-Rong, Guei, Hung, Wei, Ting Han, Shih, Chung-Chin, Chin, Jui-Te, and Wu, I-Chen
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computer Science and Game Theory, Computer Science - Machine Learning
Abstract: Game solving is a similar, yet more difficult task than mastering a game. Solving a game typically means to find the game-theoretic value (outcome given optimal play), and optionally a full strategy to follow in order to achieve that outcome. The AlphaZero algorithm has demonstrated super-human level play, and its powerful policy and value predictions have also served as heuristics in game solving. However, to solve a game and obtain a full strategy, a winning response must be found for all possible moves by the losing player. This includes very poor lines of play from the losing side, for which the AlphaZero self-play process will not encounter. AlphaZero-based heuristics can be highly inaccurate when evaluating these out-of-distribution positions, which occur throughout the entire search. To address this issue, this paper investigates applying online fine-tuning while searching and proposes two methods to learn tailor-designed heuristics for game solving. Our experiments show that using online fine-tuning can solve a series of challenging 7x7 Killall-Go problems, using only 23.54% of computation time compared to the baseline without online fine-tuning. Results suggest that the savings scale with problem size. Our method can further be extended to any tree search algorithm for problem solving. Our code is available at https://rlg.iis.sinica.edu.tw/papers/neurips2023-online-fine-tuning-solver., Comment: Accepted by the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
Published: 2023

4. MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games

Author: Wu, Ti-Rong, Guei, Hung, Peng, Pei-Chiun, Huang, Po-Wei, Wei, Ting Han, Shih, Chung-Chin, and Tsai, Yun-Jui
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: This paper presents MiniZero, a zero-knowledge learning framework that supports four state-of-the-art algorithms, including AlphaZero, MuZero, Gumbel AlphaZero, and Gumbel MuZero. While these algorithms have demonstrated super-human performance in many games, it remains unclear which among them is most suitable or efficient for specific tasks. Through MiniZero, we systematically evaluate the performance of each algorithm in two board games, 9x9 Go and 8x8 Othello, as well as 57 Atari games. For two board games, using more simulations generally results in higher performance. However, the choice of AlphaZero and MuZero may differ based on game properties. For Atari games, both MuZero and Gumbel MuZero are worth considering. Since each game has unique characteristics, different algorithms and simulations yield varying results. In addition, we introduce an approach, called progressive simulation, which progressively increases the simulation budget during training to allocate computation more efficiently. Our empirical results demonstrate that progressive simulation achieves significantly superior performance in two board games. By making our framework and trained models publicly available, this paper contributes a benchmark for future research on zero-knowledge learning algorithms, assisting researchers in algorithm selection and comparison against these zero-knowledge learning baselines. Our code and data are available at https://rlg.iis.sinica.edu.tw/papers/minizero., Comment: Accepted by IEEE Transactions on Games
Published: 2023

5. A Local-Pattern Related Look-Up Table

Author: Shih, Chung-Chin, Wei, Ting Han, Wu, Ti-Rong, and Wu, I-Chen
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: This paper describes a Relevance-Zone pattern table (RZT) that can be used to replace a traditional transposition table. An RZT stores exact game values for patterns that are discovered during a Relevance-Zone-Based Search (RZS), which is the current state-of-the-art in solving L&D problems in Go. Positions that share the same pattern can reuse the same exact game value in the RZT. The pattern matching scheme for RZTs is implemented using a radix tree, taking into consideration patterns with different shapes. To improve the efficiency of table lookups, we designed a heuristic that prevents redundant lookups. The heuristic can safely skip previously queried patterns for a given position, reducing the overhead to 10% of the original cost. We also analyze the time complexity of the RZT both theoretically and empirically. Experiments show the overhead of traversing the radix tree in practice during lookup remain flat logarithmically in relation to the number of entries stored in the table. Experiments also show that the use of an RZT instead of a traditional transposition table significantly reduces the number of searched nodes on two data sets of 7x7 and 19x19 L&D Go problems., Comment: Submitted to IEEE Transactions on Games (under review)
Published: 2022

6. Are AlphaZero-like Agents Robust to Adversarial Perturbations?

Author: Lan, Li-Cheng, Zhang, Huan, Wu, Ti-Rong, Tsai, Meng-Yu, Wu, I-Chen, and Hsieh, Cho-Jui
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Robotics
Abstract: The success of AlphaZero (AZ) has demonstrated that neural-network-based Go AIs can surpass human performance by a large margin. Given that the state space of Go is extremely large and a human player can play the game from any legal state, we ask whether adversarial states exist for Go AIs that may lead them to play surprisingly wrong actions. In this paper, we first extend the concept of adversarial examples to the game of Go: we generate perturbed states that are ``semantically'' equivalent to the original state by adding meaningless moves to the game, and an adversarial state is a perturbed state leading to an undoubtedly inferior action that is obvious even for Go beginners. However, searching the adversarial state is challenging due to the large, discrete, and non-differentiable search space. To tackle this challenge, we develop the first adversarial attack on Go AIs that can efficiently search for adversarial states by strategically reducing the search space. This method can also be extended to other board games such as NoGo. Experimentally, we show that the actions taken by both Policy-Value neural network (PV-NN) and Monte Carlo tree search (MCTS) can be misled by adding one or two meaningless stones; for example, on 58\% of the AlphaGo Zero self-play games, our method can make the widely used KataGo agent with 50 simulations of MCTS plays a losing action by adding two meaningless stones. We additionally evaluated the adversarial examples found by our algorithm with amateur human Go players and 90\% of examples indeed lead the Go agent to play an obviously inferior action. Our code is available at \url{https://PaperCode.cc/GoAttack}., Comment: Accepted by Neurips 2022
Published: 2022

7. A Novel Approach to Solving Goal-Achieving Problems for Board Games

Author: Shih, Chung-Chin, Wu, Ti-Rong, Wei, Ting Han, and Wu, I-Chen
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Goal-achieving problems are puzzles that set up a specific situation with a clear objective. An example that is well-studied is the category of life-and-death (L&D) problems for Go, which helps players hone their skill of identifying region safety. Many previous methods like lambda search try null moves first, then derive so-called relevance zones (RZs), outside of which the opponent does not need to search. This paper first proposes a novel RZ-based approach, called the RZ-Based Search (RZS), to solving L&D problems for Go. RZS tries moves before determining whether they are null moves post-hoc. This means we do not need to rely on null move heuristics, resulting in a more elegant algorithm, so that it can also be seamlessly incorporated into AlphaZero's super-human level play in our solver. To repurpose AlphaZero for solving, we also propose a new training method called Faster to Life (FTL), which modifies AlphaZero to entice it to win more quickly. We use RZS and FTL to solve L&D problems on Go, namely solving 68 among 106 problems from a professional L&D book while a previous program solves 11 only. Finally, we discuss that the approach is generic in the sense that RZS is applicable to solving many other goal-achieving problems for board games., Comment: Accepted by AAAI2022. In this version, supplementary materials are added
Published: 2021

8. Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search

Author: Lan, Li-Cheng, Tsai, Meng-Yu, Wu, Ti-Rong, Wu, I-Chen, and Hsieh, Cho-Jui
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Monte Carlo tree search (MCTS) has achieved state-of-the-art results in many domains such as Go and Atari games when combining with deep neural networks (DNNs). When more simulations are executed, MCTS can achieve higher performance but also requires enormous amounts of CPU and GPU resources. However, not all states require a long searching time to identify the best action that the agent can find. For example, in 19x19 Go and NoGo, we found that for more than half of the states, the best action predicted by DNN remains unchanged even after searching 2 minutes. This implies that a significant amount of resources can be saved if we are able to stop the searching earlier when we are confident with the current searching result. In this paper, we propose to achieve this goal by predicting the uncertainty of the current searching status and use the result to decide whether we should stop searching. With our algorithm, called Dynamic Simulation MCTS (DS-MCTS), we can speed up a NoGo agent trained by AlphaZero 2.5 times faster while maintaining a similar winning rate. Also, under the same average simulation count, our method can achieve a 61% winning rate against the original program., Comment: Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)
Published: 2020

9. Accelerating and Improving AlphaZero Using Population Based Training

Author: Wu, Ti-Rong, Wei, Ting-Han, and Wu, I-Chen
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Multiagent Systems, Computer Science - Neural and Evolutionary Computing
Abstract: AlphaZero has been very successful in many games. Unfortunately, it still consumes a huge amount of computing resources, the majority of which is spent in self-play. Hyperparameter tuning exacerbates the training cost since each hyperparameter configuration requires its own time to train one run, during which it will generate its own self-play records. As a result, multiple runs are usually needed for different hyperparameter configurations. This paper proposes using population based training (PBT) to help tune hyperparameters dynamically and improve strength during training time. Another significant advantage is that this method requires a single run only, while incurring a small additional time cost, since the time for generating self-play records remains unchanged though the time for optimization is increased following the AlphaZero training algorithm. In our experiments for 9x9 Go, the PBT method is able to achieve a higher win rate for 9x9 Go than the baselines, each with its own hyperparameter configuration and trained individually. For 19x19 Go, with PBT, we are able to obtain improvements in playing strength. Specifically, the PBT agent can obtain up to 74% win rate against ELF OpenGo, an open-source state-of-the-art AlphaZero program using a neural network of a comparable capacity. This is compared to a saturated non-PBT agent, which achieves a win rate of 47% against ELF OpenGo under the same circumstances., Comment: accepted by AAAI2020 as oral presentation. In this version, supplementary materials are added
Published: 2020

10. Multi-Labelled Value Networks for Computer Go

Author: Wu, Ti-Rong, Wu, I-Chen, Chen, Guan-Wun, Wei, Ting-han, Lai, Tung-Yi, Wu, Hung-Chun, and Lan, Li-Cheng
Subjects: Computer Science - Artificial Intelligence, Computer Science - Learning
Abstract: This paper proposes a new approach to a novel value network architecture for the game Go, called a multi-labelled (ML) value network. In the ML value network, different values (win rates) are trained simultaneously for different settings of komi, a compensation given to balance the initiative of playing first. The ML value network has three advantages, (a) it outputs values for different komi, (b) it supports dynamic komi, and (c) it lowers the mean squared error (MSE). This paper also proposes a new dynamic komi method to improve game-playing strength. This paper also performs experiments to demonstrate the merits of the architecture. First, the MSE of the ML value network is generally lower than the value network alone. Second, the program based on the ML value network wins by a rate of 67.6% against the program based on the value network alone. Third, the program with the proposed dynamic komi method significantly improves the playing strength over the baseline that does not use dynamic komi, especially for handicap games. To our knowledge, up to date, no handicap games have been played openly by programs using value networks. This paper provides these programs with a useful approach to playing handicap games., Comment: This version was also submitted to IEEE TCIAIG on May 30, 2017
Published: 2017

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

10 results on '"Wu TI"'

1. ResTNet: Defense against Adversarial Policies via Transformer in Computer Go

2. Evaluation of a motion measurement system for PET imaging studies

3. Game Solving with Online Fine-Tuning

4. MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games

5. A Local-Pattern Related Look-Up Table

6. Are AlphaZero-like Agents Robust to Adversarial Perturbations?

7. A Novel Approach to Solving Goal-Achieving Problems for Board Games

8. Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search

9. Accelerating and Improving AlphaZero Using Population Based Training

10. Multi-Labelled Value Networks for Computer Go

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Publication Type

Database

10 results on '"Wu TI"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources