Author: "Topcu, U." / Topic: artificial intelligence (cs.ai) - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Topcu, U."' showing total 4 results

1. Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

Author: Carr, S., Jansen, Nils, Topcu, U., and Bessiere, C.
Subjects: FOS: Computer and information sciences, 050101 languages & linguistics, Computer science, Computer Science - Artificial Intelligence, 05 social sciences, Partially observable Markov decision process, Context (language use), 02 engineering and technology, Formal methods, Machine learning, Artificial Intelligence (cs.AI), Recurrent neural network, Software Science, 0202 electrical engineering, electronic engineering, information engineering, Benchmark (computing), 020201 artificial intelligence & image processing, 0501 psychology and cognitive sciences, Temporal logic, Artificial intelligence, Markov decision process, Formal verification
Abstract: Recurrent neural networks (RNNs) have emerged as an effective representation of control policies in sequential decision-making problems. However, a major drawback in the application of RNN-based policies is the difficulty in providing formal guarantees on the satisfaction of behavioral specifications, e.g., safety and/or reachability. By integrating techniques from formal methods and machine learning, we propose an approach to automatically extract a finite-state controller (FSC) from an RNN, which, when composed with a finite-state system model, is amenable to existing formal verification tools. Specifically, we introduce an iterative modification to the so-called quantized bottleneck insertion technique to create an FSC as a randomized policy with memory. For the cases in which the resulting FSC fails to satisfy the specification, verification generates diagnostic information. We utilize this information to either adjust the amount of memory in the extracted FSC or perform focused retraining of the RNN. While generally applicable, we detail the resulting iterative procedure in the context of policy synthesis for partially observable Markov decision processes (POMDPs), which is known to be notoriously hard. The numerical experiments show that the proposed approach outperforms traditional POMDP synthesis methods by three orders of magnitude while remaining within 2% of optimal benchmark values. (8 pages, 5 figures, 1 table)
Published: 2020
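
Illustrative sketch (not taken from the paper): the abstract above describes extracting a finite-state controller as a randomized policy with memory by quantizing the RNN. The Python below is a simplified stand-in that assumes fixed-threshold ternary quantization of recorded hidden states rather than the trained quantized bottleneck the authors use. The names `quantize`, `extract_fsc`, and `TRACE`, the threshold value, and the sample data are all hypothetical, and the memory-update function of a full FSC is omitted.

```python
from collections import Counter, defaultdict


def quantize(hidden, threshold=0.33):
    """Map each hidden-state coordinate to -1, 0, or +1.

    Assumption: fixed thresholds, no training. This only mimics the
    discretization idea behind a quantized bottleneck.
    """
    return tuple(
        0 if abs(h) < threshold else (1 if h > 0 else -1) for h in hidden
    )


def extract_fsc(trace):
    """Estimate a randomized action choice per (memory node, observation).

    trace: iterable of (hidden_state, observation, action) triples,
    assumed to be recorded from rollouts of an RNN policy.
    Returns (node_ids, action_dist), where node_ids maps each quantized
    hidden state to an integer memory node and
    action_dist[(node, observation)] = {action: probability}.
    """
    node_ids = {}
    counts = defaultdict(Counter)
    for hidden, obs, action in trace:
        # Assign the next free integer to a quantized state seen for the first time.
        node = node_ids.setdefault(quantize(hidden), len(node_ids))
        counts[(node, obs)][action] += 1
    action_dist = {
        key: {a: n / sum(c.values()) for a, n in c.items()}
        for key, c in counts.items()
    }
    return node_ids, action_dist


# Hypothetical rollout data: two-dimensional hidden states.
TRACE = [
    ([0.9, -0.1], "o0", "a"),
    ([0.8, 0.05], "o0", "a"),
    ([0.7, 0.2], "o0", "b"),
    ([-0.6, 0.5], "o1", "b"),
]

node_ids, action_dist = extract_fsc(TRACE)
```

With this toy trace, the first three hidden states collapse to one memory node and the fourth to another, so the estimated controller has two memory nodes. Raising or lowering the number of quantization levels would be one way to adjust the amount of memory, in the spirit of the iterative procedure the abstract mentions.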

2. Counterexample-Guided Strategy Improvement for POMDPs Using Recurrent Neural Networks

Author: Carr, S., Jansen, Nils, Wimmer, R., Serban, A.C., Becker, B., Topcu, U., and Kraus, S.
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer science, 020208 electrical & electronic engineering, 02 engineering and technology, Machine Learning (cs.LG), Artificial Intelligence (cs.AI), Recurrent neural network, 0202 electrical engineering, electronic engineering, information engineering, Software Science, 020201 artificial intelligence & image processing, Artificial intelligence, Digital Security, Counterexample
Abstract: We study strategy synthesis for partially observable Markov decision processes (POMDPs). The particular problem is to determine strategies that provably adhere to (probabilistic) temporal logic constraints. This problem is computationally intractable and theoretically hard. We propose a novel method that combines techniques from machine learning and formal verification. First, we train a recurrent neural network (RNN) to encode POMDP strategies. The RNN accounts for memory-based decisions without the need to expand the full belief space of a POMDP. Secondly, we restrict the RNN-based strategy to represent a finite-memory strategy and implement it on a specific POMDP. For the resulting finite Markov chain, efficient formal verification techniques provide provable guarantees against temporal logic specifications. If the specification is not satisfied, counterexamples supply diagnostic information. We use this information to improve the strategy by iteratively training the RNN. Numerical experiments show that the proposed method elevates the state of the art in POMDP solving by up to three orders of magnitude in terms of solving times and model sizes.
Published: 2019

3. Strategy Synthesis in POMDPs via Game-Based Abstractions

Author: Winterer, L., Junges, J.S.L., Wimmer, R., Jansen, Nils, Topcu, U., Katoen, J.-P., and Becker, B.
Subjects: FOS: Computer and information sciences, Computer Science - Robotics, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Software Science, Robotics (cs.RO)
Abstract: We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications. Verification and strategy synthesis for POMDPs are, however, computationally intractable in general. We alleviate this difficulty by focusing on planning applications and exploiting typical structural properties of such scenarios; for instance, we assume that the agent has the ability to observe its own position inside an environment. We propose an abstraction refinement framework which turns such a POMDP model into a (fully observable) probabilistic two-player game (PG). For the obtained PGs, efficient verification and synthesis tools make it possible to determine strategies with optimal safety and performance measures, which approximate optimal schedulers on the POMDP. If the approximation is too coarse to satisfy the given specifications, a refinement scheme improves the computed strategies. As a running example, we use planning problems where an agent moves inside an environment with randomly moving obstacles and restricted observability. We demonstrate that the proposed method advances the state of the art by solving problems several orders of magnitude larger than those that can be handled by existing POMDP solvers. Furthermore, this method gives guarantees on safety constraints, which is not supported by the majority of the existing solvers.
Published: 2019

4. Human-in-the-Loop Synthesis of Partially Observable Markov Decision Processes

Author: Carr, S., Jansen, N.H., Wimmer, R., Fu, J., Topcu, U., and Berg, J.M.
Subjects: FOS: Computer and information sciences, 0209 industrial biotechnology, Theoretical computer science, Markov chain, Computer science, Computer Science - Artificial Intelligence, Autonomous agent, Partially observable Markov decision process, Markov process, Observable, 02 engineering and technology, 020901 industrial engineering & automation, Artificial Intelligence (cs.AI), 0202 electrical engineering, electronic engineering, information engineering, Software Science, Human-in-the-loop, 020201 artificial intelligence & image processing, Observability, Markov decision process
Abstract: We study planning problems where autonomous agents operate inside environments that are subject to uncertainties and not fully observable. Partially observable Markov decision processes (POMDPs) are a natural formal model to capture such problems. Because of the potentially huge or even infinite belief space in POMDPs, synthesis with safety guarantees is, in general, computationally intractable. We propose an approach that aims to circumvent this difficulty: in scenarios that can be partially or fully simulated in a virtual environment, we actively integrate a human user to control an agent. While the user repeatedly tries to safely guide the agent in the simulation, we collect data from the human input. Via behavior cloning, we translate the data into a strategy for the POMDP. The strategy resolves all nondeterminism and non-observability of the POMDP, resulting in a discrete-time Markov chain (MC). The efficient verification of this MC gives quantitative insights into the quality of the inferred human strategy by proving or disproving given system specifications. When the quality of the strategy is not sufficient, we propose a refinement method using counterexamples presented to the human. Experiments show that by including humans in the POMDP verification loop we improve the state of the art by orders of magnitude in terms of scalability.
Published: 2018
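
Illustrative sketch (not taken from the paper): the abstract above notes that fixing a strategy for a POMDP yields a discrete-time Markov chain that can then be verified. The Python below shows that step on a made-up four-state model, assuming a memoryless, observation-based randomized strategy and using plain value iteration as a stand-in for a probabilistic model checker. The model, the strategy, and every name (`TRANSITIONS`, `OBSERVATION`, `STRATEGY`, `induce_markov_chain`, `reach_probability`) are hypothetical.

```python
# Hypothetical toy POMDP: state 2 is the goal, state 3 is unsafe; both absorb.
# TRANSITIONS[state][action] = {next_state: probability}
TRANSITIONS = {
    0: {"a": {1: 0.9, 3: 0.1}, "b": {0: 1.0}},
    1: {"a": {2: 0.8, 3: 0.2}, "b": {2: 0.5, 0: 0.5}},
    2: {"a": {2: 1.0}, "b": {2: 1.0}},
    3: {"a": {3: 1.0}, "b": {3: 1.0}},
}
# Each state emits one observation; the strategy only sees the observation.
OBSERVATION = {0: "o0", 1: "o1", 2: "done", 3: "done"}
# STRATEGY[observation] = {action: probability} (assumed memoryless).
STRATEGY = {"o0": {"a": 1.0}, "o1": {"a": 0.5, "b": 0.5}, "done": {"a": 1.0}}


def induce_markov_chain(transitions, observation, strategy):
    """Resolve all action choices with the strategy, leaving a Markov chain."""
    chain = {}
    for state, by_action in transitions.items():
        row = {}
        for action, p_action in strategy[observation[state]].items():
            for nxt, p in by_action[action].items():
                row[nxt] = row.get(nxt, 0.0) + p_action * p
        chain[state] = row
    return chain


def reach_probability(chain, target, avoid, iterations=1000):
    """Probability of reaching `target` without entering `avoid` first.

    Computed by in-place value iteration from below; a fixed iteration
    count is used here for simplicity instead of a convergence check.
    """
    value = {s: 1.0 if s in target else 0.0 for s in chain}
    for _ in range(iterations):
        for s in chain:
            if s in target or s in avoid:
                continue
            value[s] = sum(p * value[t] for t, p in chain[s].items())
    return value


chain = induce_markov_chain(TRANSITIONS, OBSERVATION, STRATEGY)
values = reach_probability(chain, target={2}, avoid={3})
```

Comparing `values[0]` against a required threshold is the kind of check that proves or disproves a specification such as "reach the goal safely with probability at least p" for the cloned strategy.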
