Lifelong transfer learning with an option hierarchy

Majd Hawasly, Subramanian Ramamoorthy

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

6 Citations (Scopus)

Abstract

Many applications require autonomous agents to achieve quick responses to task instances drawn from a rich family of qualitatively-related tasks. We address the setting where the tasks share a state-action space and have the same qualitative objective but differ in dynamics. We adopt a transfer learning approach where common structure in previously-learnt policies, in the form of shared subtasks, is exploited to accelerate learning in subsequent ones. We use a probabilistic mixture model to describe regions in state space which are common to successful trajectories in different instances. Then, we extract policy fragments from previously-learnt policies that are specialised to these regions. These policy fragments are options, whose initiation and termination sets are automatically extracted from data by the mixture model. In novel task instances, these options are used in an SMDP learning process and option learning repeats over the resulting policy library. The utility of this method is demonstrated through experiments in a standard navigation environment and then in the RoboCup simulated soccer domain with opponent teams of different skill.
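The SMDP learning step mentioned in the abstract can be illustrated with a minimal sketch of standard SMDP Q-learning over options. This is not the authors' implementation. The corridor environment, the three hand-written options, and every name below (`Option`, `run_option`, `smdp_q_update`, `learn`, `corridor_step`) are hypothetical. In particular, the initiation and termination sets here are plain predicates written by hand, whereas the paper extracts them from data with a mixture model.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Option:
    """A policy fragment with hand-written (assumed) initiation/termination sets."""
    name: str
    initiation: Callable[[int], bool]   # states where the option may be started
    policy: Callable[[int], int]        # primitive action chosen in each state
    terminates: Callable[[int], bool]   # states where the option stops

def corridor_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy 1-D corridor (hypothetical): states 0..6, goal at 6, reward -1 per step."""
    nxt = max(0, min(6, state + action))
    return nxt, -1.0, nxt == 6

def run_option(state, option, step, gamma):
    """Run an option to termination; return (state, discounted reward, duration, done)."""
    total, discount, k = 0.0, 1.0, 0
    while True:
        state, reward, done = step(state, option.policy(state))
        total += discount * reward
        discount *= gamma
        k += 1
        if done or option.terminates(state):
            return state, total, k, done

def smdp_q_update(q: Dict, s, o, r, k, s_next, available: List[str], alpha, gamma, done):
    """Q(s,o) <- Q(s,o) + alpha * (r + gamma^k * max_o' Q(s',o') - Q(s,o))."""
    best_next = 0.0
    if not done and available:
        best_next = max(q.get((s_next, name), 0.0) for name in available)
    old = q.get((s, o), 0.0)
    q[(s, o)] = old + alpha * (r + gamma ** k * best_next - old)

def learn(options, step, start=0, episodes=200, alpha=0.5, gamma=0.95,
          epsilon=0.1, seed=0):
    """Epsilon-greedy SMDP Q-learning over a fixed option library."""
    rng = random.Random(seed)
    q: Dict = {}
    for _ in range(episodes):
        s, done = start, False
        while not done:
            avail = [o for o in options if o.initiation(s)]
            if rng.random() < epsilon:
                o = rng.choice(avail)
            else:
                o = max(avail, key=lambda opt: q.get((s, opt.name), 0.0))
            s_next, r, k, done = run_option(s, o, step, gamma)
            names = [n.name for n in options if n.initiation(s_next)]
            smdp_q_update(q, s, o.name, r, k, s_next, names, alpha, gamma, done)
            s = s_next
    return q

# Hypothetical option library for the toy corridor.
OPTIONS = [
    Option("advance_to_mid", lambda s: s < 3, lambda s: +1, lambda s: s >= 3),
    Option("advance_to_goal", lambda s: s >= 3, lambda s: +1, lambda s: s == 6),
    Option("retreat", lambda s: s > 0, lambda s: -1, lambda s: s == 0),
]

if __name__ == "__main__":
    q_values = learn(OPTIONS, corridor_step)
    for key in sorted(q_values):
        print(key, round(q_values[key], 3))
```

The agent chooses among temporally extended options rather than primitive actions, and the discount `gamma ** k` accounts for the `k` primitive steps an option took before it terminated.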

Original language: English
Title of host publication: IROS 2013
Subtitle of host publication: New Horizon, Conference Digest - 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems
Pages: 1341-1346
Number of pages: 6
DOIs
Publication status: Published - 2013
Externally published: Yes
Event: 2013 26th IEEE/RSJ International Conference on Intelligent Robots and Systems: New Horizon, IROS 2013 - Tokyo, Japan
Duration: 3 Nov 2013 to 8 Nov 2013

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2013 26th IEEE/RSJ International Conference on Intelligent Robots and Systems: New Horizon, IROS 2013
Country/Territory: Japan
City: Tokyo
Period: 3/11/13 to 8/11/13
