Author: "Georgiev, Petko" / Topic: fos: computer and information sciences - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Georgiev, Petko"' showing total 9 results

1. Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback

Author: Abramson, Josh, Ahuja, Arun, Carnevale, Federico, Georgiev, Petko, Goldin, Alex, Hung, Alden, Landon, Jessica, Lhotka, Jirka, Lillicrap, Timothy, Muldal, Alistair, Powell, George, Santoro, Adam, Scully, Guy, Srivastava, Sanjana, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, Yan, Chen, and Zhu, Rui
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Human-Computer Interaction, Computer Science - Multiagent Systems, Machine Learning (cs.LG), Human-Computer Interaction (cs.HC), Multiagent Systems (cs.MA)
Abstract: An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback. Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated, embodied agents trained to a base level of competency with imitation learning. First, we collected data of humans interacting with agents in a simulated 3D world. We then asked annotators to record moments where they believed that agents either progressed toward or regressed from their human-instructed goal. Using this annotation data we leveraged a novel method - which we call "Inter-temporal Bradley-Terry" (IBT) modelling - to build a reward model that captures human judgments. Agents trained to optimise rewards delivered from IBT reward models improved with respect to all of our metrics, including subsequent human judgment during live interactions with agents. Altogether our results demonstrate how one can successfully leverage human judgments to improve agent behaviour, allowing us to use reinforcement learning in complex, embodied domains without programmatic reward functions. Videos of agent behaviour may be found at https://youtu.be/v_Z9F2_eKk4.
Published: 2022

2. A data-driven approach for learning to control computers

Author: Humphreys, Peter C, Raposo, David, Pohlen, Toby, Thornton, Gregory, Chhaparia, Rachita, Muldal, Alistair, Abramson, Josh, Georgiev, Petko, Goldin, Alex, Santoro, Adam, and Lillicrap, Timothy
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
Abstract: It would be useful for machines to use computers as humans do so that they can aid us in everyday tasks. This is a setting in which there is also the potential to leverage large-scale expert demonstrations and human judgements of interactive behaviour, which are two ingredients that have driven much recent success in AI. Here we investigate the setting of computer control using keyboard and mouse, with goals specified via natural language. Instead of focusing on hand-designed curricula and specialized action spaces, we focus on developing a scalable method centered on reinforcement learning combined with behavioural priors informed by actual human-computer interactions. We achieve state-of-the-art and human-level mean performance across all tasks within the MiniWob++ benchmark, a challenging suite of computer control problems, and find strong evidence of cross-task transfer. These results demonstrate the usefulness of a unified human-agent interface when training machines to use computers. Altogether our results suggest a formula for achieving competency beyond MiniWob++ and towards controlling computers, in general, as a human would.
Published: 2022
Full Text: View/download PDF

3. Intra-agent speech permits zero-shot task acquisition

Author: Yan, Chen, Carnevale, Federico, Georgiev, Petko, Santoro, Adam, Guy, Aurelia, Muldal, Alistair, Hung, Chia-Chun, Abramson, Josh, Lillicrap, Timothy, and Wayne, Gregory
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computation and Language (cs.CL), Machine Learning (cs.LG)
Abstract: Human language learners are exposed to a trickle of informative, context-sensitive language, but a flood of raw sensory data. Through both social language use and internal processes of rehearsal and practice, language learners are able to build high-level, semantic representations that explain their perceptions. Here, we take inspiration from such processes of "inner speech" in humans (Vygotsky, 1934) to better understand the role of intra-agent speech in embodied behavior. First, we formally pose intra-agent speech as a semi-supervised problem and develop two algorithms that enable visually grounded captioning with little labeled language data. We then experimentally compute scaling curves over different amounts of labeled data and compare the data efficiency against a supervised learning baseline. Finally, we incorporate intra-agent speech into an embodied, mobile manipulator agent operating in a 3D virtual world, and show that with as few as 150 additional image captions, intra-agent speech endows the agent with the ability to manipulate and answer questions about a new object without any related task-directed experience (zero-shot). Taken together, our experiments suggest that modelling intra-agent speech is effective in enabling embodied agents to learn new tasks efficiently and without direct interaction experience.
Published: 2022
Full Text: View/download PDF

4. Evaluating Multimodal Interactive Agents

Author: Abramson, Josh, Ahuja, Arun, Carnevale, Federico, Georgiev, Petko, Goldin, Alex, Hung, Alden, Landon, Jessica, Lillicrap, Timothy, Muldal, Alistair, Richards, Blake, Santoro, Adam, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, and Yan, Chen
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster proxy metrics often do not correlate well with interactive evaluation. In this paper, we assess the merits of these existing evaluation metrics and present a novel approach to evaluation called the Standardised Test Suite (STS). The STS uses behavioural scenarios mined from real human interaction data. Agents see replayed scenario context, receive an instruction, and are then given control to complete the interaction offline. These agent continuations are recorded and sent to human annotators to mark as success or failure, and agents are ranked according to the proportion of continuations in which they succeed. The resulting STS is fast, controlled, interpretable, and representative of naturalistic interactions. Altogether, the STS consolidates much of what is desirable across many of our standard evaluation metrics, allowing us to accelerate research progress towards producing agents that can interact naturally with humans. A video may be found at https://youtu.be/YR1TngGORGQ.
Published: 2022
Full Text: View/download PDF

5. Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning

Author: DeepMind Interactive Agents Team, Abramson, Josh, Ahuja, Arun, Brussee, Arthur, Carnevale, Federico, Cassin, Mary, Fischer, Felix, Georgiev, Petko, Goldin, Alex, Gupta, Mansi, Harley, Tim, Hill, Felix, Humphreys, Peter C, Hung, Alden, Landon, Jessica, Lillicrap, Timothy, Merzic, Hamza, Muldal, Alistair, Santoro, Adam, Scully, Guy, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, Yan, Chen, and Zhu, Rui
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. We show that imitation learning of human-human interactions in a simulated world, in conjunction with self-supervised learning, is sufficient to produce a multimodal interactive agent, which we call MIA, that successfully interacts with non-adversarial humans 75% of the time. We further identify architectural and algorithmic techniques that improve performance, such as hierarchical action selection. Altogether, our results demonstrate that imitation of multi-modal, real-time human behaviour may provide a straightforward and surprisingly effective means of imbuing agents with a rich behavioural prior from which agents might then be fine-tuned for specific purposes, thus laying a foundation for training capable agents for interactive robots or digital assistants. A video of MIA's behaviour may be found at https://youtu.be/ZFgRhviF7mY
Published: 2021
Full Text: View/download PDF

6. Imitating Interactive Intelligence

Author: Abramson, Josh, Ahuja, Arun, Barr, Iain, Brussee, Arthur, Carnevale, Federico, Cassin, Mary, Chhaparia, Rachita, Clark, Stephen, Damoc, Bogdan, Dudzik, Andrew, Georgiev, Petko, Guy, Aurelia, Harley, Tim, Hill, Felix, Hung, Alden, Kenton, Zachary, Landon, Jessica, Lillicrap, Timothy, Mathewson, Kory, Mokrá, Soňa, Muldal, Alistair, Santoro, Adam, Savinov, Nikolay, Varma, Vikrant, Wayne, Greg, Williams, Duncan, Wong, Nathaniel, Yan, Chen, and Zhu, Rui
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Science - Multiagent Systems, Machine Learning (cs.LG), Multiagent Systems (cs.MA)
Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central challenges of artificial intelligence (AI) research: complex visual perception and goal-directed physical control, grounded language comprehension and production, and multi-agent social interaction. To build agents that can robustly interact with humans, we would ideally train them while they interact with humans. However, this is presently impractical. Therefore, we approximate the role of the human with another learned agent, and use ideas from inverse reinforcement learning to reduce the disparities between human-human and agent-agent interactive behaviour. Rigorously evaluating our agents poses a great challenge, so we develop a variety of behavioural tests, including evaluation by humans who watch videos of agents or interact directly with them. These evaluations convincingly demonstrate that interactive training and auxiliary losses improve agent behaviour beyond what is achieved by supervised learning of actions alone. Further, we demonstrate that agent capabilities generalise beyond literal experiences in the dataset. Finally, we train evaluation models whose ratings of agents agree well with human judgement, thus permitting the evaluation of new agent models without additional effort. Taken together, our results in this virtual environment provide evidence that large-scale human behavioural imitation is a promising tool to create intelligent, interactive agents, and the challenge of reliably evaluating such agents is possible to surmount.
Published: 2020
Full Text: View/download PDF

7. StarCraft II: A New Challenge for Reinforcement Learning

Author: Vinyals, Oriol, Ewalds, Timo, Bartunov, Sergey, Georgiev, Petko, Vezhnevets, Alexander Sasha, Yeo, Michelle, Makhzani, Alireza, Küttler, Heinrich, Agapiou, John, Schrittwieser, Julian, Quan, John, Gaffney, Stephen, Petersen, Stig, Simonyan, Karen, Schaul, Tom, van Hasselt, Hado, Silver, David, Lillicrap, Timothy, Calderone, Kevin, Keet, Paul, Brunasso, Anthony, Lawrence, David, Ekermo, Anders, Repp, Jacob, and Tsing, Rodney
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, ComputingMilieux_PERSONALCOMPUTING, Machine Learning (cs.LG)
Abstract: This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the StarCraft II game. This domain poses a new grand challenge for reinforcement learning, representing a more difficult class of problems than considered in most prior work. It is a multi-agent problem with multiple players interacting; there is imperfect information due to a partially observed map; it has a large action space involving the selection and control of hundreds of units; it has a large state space that must be observed solely from raw input feature planes; and it has delayed credit assignment requiring long-term strategies over thousands of steps. We describe the observation, action, and reward specification for the StarCraft II domain and provide an open source Python-based interface for communicating with the game engine. In addition to the main game maps, we provide a suite of mini-games focusing on different elements of StarCraft II gameplay. For the main game maps, we also provide an accompanying dataset of game replay data from human expert players. We give initial baseline results for neural networks trained from this data to predict game outcomes and player actions. Finally, we present initial baseline results for canonical deep reinforcement learning agents applied to the StarCraft II domain. On the mini-games, these agents learn to achieve a level of play that is comparable to a novice player. However, when trained on the main game, these agents are unable to make significant progress. Thus, SC2LE offers a new and challenging environment for exploring deep reinforcement learning algorithms and architectures.
Comment: Collaboration between DeepMind & Blizzard. 20 pages, 9 figures, 2 tables
Published: 2017
Full Text: View/download PDF

8. The Call of the Crowd: Event Participation in Location-based Social Services

Author: Georgiev, Petko, Noulas, Anastasios, Mascolo, Cecilia [0000-0001-9614-4380], and Apollo - University of Cambridge Repository
Subjects: Social and Information Networks (cs.SI), FOS: Computer and information sciences, Physics - Physics and Society, physics.soc-ph, FOS: Physical sciences, Computer Science - Social and Information Networks, Physics and Society (physics.soc-ph), cs.SI
Abstract: Understanding the social and behavioral forces behind event participation is not only interesting from the viewpoint of social science, but also has important applications in the design of personalized event recommender systems. This paper takes advantage of data from a widely used location-based social network, Foursquare, to analyze event patterns in three metropolitan cities. We put forward several hypotheses on the motivating factors of user participation and confirm that social aspects play a major role in determining the likelihood of a user to participate in an event. While an explicit social filtering signal accounting for whether friends are attending dominates the factors, the popularity of an event proves to also be a strong attractor. Further, we capture an implicit social signal by performing random walks in a high dimensional graph that encodes the place type preferences of friends and that proves especially suited to identify relevant niche events for users. Our findings on the extent to which the various temporal, spatial and social aspects underlie users' event preferences lead us to further hypothesize that a combination of factors better models users' event interests. We verify this through a supervised learning framework. We show that for one in three users in London and one in five users in New York and Chicago it identifies the exact event the user would attend among the pool of suggestions.
Published: 2014

9. Where businesses thrive: Predicting the impact of the olympic games on local retailers through location-based services data

Author: Georgiev, Petko, Noulas, Anastasios, Mascolo, Cecilia [0000-0001-9614-4380], and Apollo - University of Cambridge Repository
Subjects: Social and Information Networks (cs.SI), FOS: Computer and information sciences, Physics - Physics and Society, physics.soc-ph, FOS: Physical sciences, Computer Science - Social and Information Networks, Physics and Society (physics.soc-ph), cs.SI
Abstract: The Olympic Games are an important sporting event with notable consequences for the general economic landscape of the host city. Traditional economic assessments focus on the aggregated impact of the event on the national income, but fail to provide micro-scale insights on why local businesses will benefit from the increased activity during the Games. In this paper we provide a novel approach to modeling the impact of the Olympic Games on local retailers by analyzing a dataset mined from a large location-based social service, Foursquare. We hypothesize that the spatial positioning of businesses as well as the mobility trends of visitors are primary indicators of whether retailers will raise their popularity during the event. To confirm this we formulate a retail winners prediction task in the context of which we evaluate a set of geographic and mobility metrics. We find that the proximity to stadiums, the diversity of activity in the neighborhood, the nearby area sociability, as well as the probability of customer flows from and to event places such as stadiums and parks are all vital factors. Through supervised learning techniques we demonstrate that the success of businesses hinges on a combination of both geographic and mobility factors. Our results suggest that location-based social networks, where crowdsourced information about the dynamic interaction of users with urban spaces becomes publicly available, present an alternative medium to assess the economic impact of large scale events in a city.
Published: 2014
