
Abstract

We propose the use of kernel-based methods as the underlying
function approximator in the least-squares policy-evaluation
framework of LSPE(λ) and LSTD(λ). In particular, we present the
'kernelization' of model-free LSPE(λ). The kernelization is made
computationally feasible by the subset of regressors approximation,
which approximates the kernel using a greatly reduced number of
basis functions. The core of our proposed solution is an efficient
recursive implementation with automatic supervised selection of the
relevant basis functions. The LSPE method is well suited to
optimistic policy iteration and can therefore be used in the
context of online reinforcement learning. We demonstrate this on
the high-dimensional Octopus benchmark.
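The subset of regressors idea mentioned in the abstract can be illustrated with a minimal sketch: instead of placing a basis function on every sample, the function is represented only over a small dictionary of m ≪ n points, and the weights are fitted by regularized least squares. The kernel choice (Gaussian RBF), the regression setting, and all names below (`subset_of_regressors_fit`, `gamma`, `lam`) are illustrative assumptions, not the paper's actual algorithm, which applies this approximation inside a recursive LSPE(λ) update with supervised dictionary selection.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian RBF kernel matrix between the rows of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def subset_of_regressors_fit(X, y, dictionary, lam=1e-6, gamma=1.0):
    # Subset of regressors (illustrative): represent f only over the
    # m dictionary points and solve the regularized normal equations
    #   (K_nm^T K_nm + lam * K_mm) alpha = K_nm^T y,
    # so the cost scales with m, not with the full sample count n.
    K_nm = rbf_kernel(X, dictionary, gamma)            # n x m
    K_mm = rbf_kernel(dictionary, dictionary, gamma)   # m x m
    A = K_nm.T @ K_nm + lam * K_mm
    return np.linalg.solve(A, K_nm.T @ y)

def subset_of_regressors_predict(Xq, dictionary, alpha, gamma=1.0):
    # Evaluate f(x) = sum_i alpha_i k(x, d_i) over the dictionary only.
    return rbf_kernel(Xq, dictionary, gamma) @ alpha

# Usage: fit sin(x) from 200 samples using a 20-point dictionary.
X = np.linspace(0.0, 2.0 * np.pi, 200)[:, None]
y = np.sin(X).ravel()
D = X[::10]                                            # every 10th point
alpha = subset_of_regressors_fit(X, y, D)
pred = subset_of_regressors_predict(X, D, alpha)
```

The computational point is that all linear algebra involves m × m systems, which is what makes a recursive, online implementation of the kind the paper proposes tractable.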
Original language: English
Title of host publication: Procs of the 2007 Symposium on Approximate Dynamic Programming & Reinforcement Learning (ADPRL 2007)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 338-345
Volume: 2007
ISBN (Print): 1-4244-0706-0
Publication status: Published - 2007
