AvE: Assistance via Empowerment

Yuqing Du, Stas Tiomkin, Emre Kıcıman, Daniel Polani, Pieter Abbeel, Anca Dragan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

One difficulty in using artificial agents for human-assistive applications lies in the challenge of accurately assisting with a person’s goal(s). Existing methods tend to rely on inferring the human’s goal, which is challenging when there are many potential goals or when the set of candidate goals is difficult to identify. We propose a new paradigm for assistance by instead increasing the human’s ability to control their environment, and formalize this approach by augmenting reinforcement learning with human empowerment. This task-agnostic objective preserves the person’s autonomy and ability to achieve any eventual state. We test our approach against assistance based on goal inference, highlighting scenarios where our method overcomes failure modes stemming from goal ambiguity or misspecification. As existing methods for estimating empowerment in continuous domains are computationally hard, precluding its use in real-time learned assistance, we also propose an efficient empowerment-inspired proxy metric. Using this, we are able to successfully demonstrate our method in a shared autonomy user study for a challenging simulated teleoperation task with human-in-the-loop training.
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Editors: H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, H. Lin
Number of pages: 11
Publication status: Published - 12 Dec 2020
Event: NeurIPS 2020: Thirty-fourth Conference on Neural Information Processing Systems - Online, Vancouver, Canada
Duration: 6 Dec 2020 to 12 Dec 2020
Conference number: 34
https://neurips.cc/Conferences/2020/

Conference

Conference: NeurIPS 2020
Country/Territory: Canada
City: Vancouver
Period: 6/12/20 to 12/12/20
