Title: Learning from Humans: Beyond Classical Imitation Learning
Authors: Spencer, Jonathan Cullinan
Advisors: Ramadge, Peter J
Chiang, Mung
Contributors: Electrical and Computer Engineering Department
Keywords: Imitation Learning
Machine Learning
Reinforcement Learning
Subjects: Robotics
Electrical engineering
Issue Date: 2022
Publisher: Princeton, NJ : Princeton University
Abstract: As robotic tasks become more complex and the environments they operate in become more dynamic and unpredictable, roboticists increasingly turn to machine learning-based controllers. Imitation learning is an especially attractive method that trains a robot learner on a set of recorded demonstrations of an expert performing a task. Imitation learning is convenient in settings like autonomous driving where it is difficult for the expert to precisely assign numerical costs but easy for them to provide a demonstration; they prefer to show rather than tell. The primary failure point in imitation learning occurs in settings where small learner mistakes compound and introduce the learner to states never visited by the expert. Classical imitation learning methods account for this either by requiring a prohibitively large number of offline expert demonstrations or by requiring access to an online expert who can provide action labels in novel learner states. In this work, we present two new methods for addressing distributional shift that make imitation learning more efficient and practical in both the offline and the online expert setting. Our offline method uses a simulator to bootstrap existing expert demonstrations and explicitly encourages the learner to match the expert state distribution rather than the expert actions. When the environment belongs to a class of moderately well-behaved environments, we show that error compounding can be provably mitigated. When the expert is available during training, we consider the much more natural setting of supervisory expert intervention. We show that expert interventions provide rich information about the expert's value function, and we introduce a framework to leverage both the explicit and the implicit feedback gleaned from that interaction. This setting enjoys robust theoretical guarantees while being much more natural, efficient, and scalable than classical online methods. Both of our methods mitigate distributional shift and achieve an error that in the worst case grows only linearly in trajectory length. We demonstrate our techniques using a miniature autonomous driving platform, which learns technical driving skills from scratch. We show experimentally that our methods require an order of magnitude fewer expert samples than classical imitation techniques.
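Note on the linear-in-trajectory-length claim: the comparison below is a sketch of the standard compounding-error bounds from the imitation learning literature (Ross & Bagnell, 2010; Ross, Gordon & Bagnell, 2011), written in that literature's conventional notation rather than the dissertation's own. Let T denote the trajectory length and \epsilon the per-state probability that the learner's action disagrees with the expert's under the training distribution. Offline behavior cloning then satisfies, in the worst case,

    J(\hat{\pi}) \le J(\pi^*) + T^2 \epsilon,

since a single early mistake can push the learner off the expert's state distribution for the remaining steps, whereas interactive methods such as DAgger achieve

    J(\hat{\pi}) \le J(\pi^*) + O(T \epsilon),

which is the linear worst-case growth regime that both methods in this dissertation target, with cheaper expert access.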
URI: http://arks.princeton.edu/ark:/88435/dsp01ws859j82g
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering

Files in This Item:
File: Spencer_princeton_0181D_14168.pdf
Size: 9.18 MB
Format: Adobe PDF


Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.