Abstract
In this paper I investigate methods of applying reinforcement learning to continuous state- and action-space problems without a policy function. I compare the performance of four methods: one is discretisation of the action space, and the other three are optimisation techniques applied to finding the greedy action without discretisation. The optimisation methods I apply are gradient descent, Nelder-Mead and Newton's method. The action selection methods are applied in conjunction with the SARSA algorithm, with a multilayer perceptron used to approximate the value function. The approaches are applied to two simulated continuous state- and action-space control problems: Cart-Pole and double Cart-Pole. The results are compared in terms of both action selection time and the number of trials required to train on the benchmark problems.
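As a rough illustration of the continuous action selection described above, the sketch below picks a greedy action by maximising an approximate Q(s, a) over the action, comparing a coarse action-space grid with Nelder-Mead via scipy.optimize.minimize. The `q_value` function, the state vector and the action bounds are placeholders of my own for illustration only; they are not the paper's trained multilayer perceptron, its Cart-Pole dynamics, or its exact optimiser settings.

```python
import numpy as np
from scipy.optimize import minimize

def q_value(state, action):
    # Hypothetical stand-in for a trained MLP value function Q(s, a):
    # a smooth toy function of state and action, purely for illustration.
    return -np.sum((action - 0.3 * state[:len(action)]) ** 2) + 0.01 * np.dot(state, state)

def greedy_action_discretised(state, action_low, action_high, n_bins=11):
    # Baseline approach: evaluate Q on a uniform grid over the action space
    # and return the best grid point.
    candidates = np.linspace(action_low, action_high, n_bins)
    values = [q_value(state, np.atleast_1d(a)) for a in candidates]
    return np.atleast_1d(candidates[int(np.argmax(values))])

def greedy_action_nelder_mead(state, initial_action):
    # Derivative-free approach: maximise Q(s, a) over a directly,
    # without discretising, by minimising -Q with Nelder-Mead.
    result = minimize(lambda a: -q_value(state, a),
                      x0=np.atleast_1d(initial_action),
                      method="Nelder-Mead")
    return result.x

state = np.array([0.1, -0.05, 0.2, 0.0])  # e.g. a Cart-Pole-like state vector
print(greedy_action_discretised(state, action_low=-1.0, action_high=1.0))
print(greedy_action_nelder_mead(state, initial_action=0.0))
```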
| Original language | English |
| --- | --- |
| Title of host publication | 2016 International Joint Conference on Neural Networks (IJCNN) |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Pages | 3785-3792 |
| Number of pages | 8 |
| ISBN (Print) | 978-1-5090-0621-2 |
| DOIs | |
| Publication status | Published - 29 Jul 2016 |
| Event | 2016 International Joint Conference on Neural Networks (IJCNN) - Vancouver, BC, Canada. Duration: 24 Jul 2016 → 29 Jul 2016 |
Conference
| Conference | 2016 International Joint Conference on Neural Networks (IJCNN) |
| --- | --- |
| Period | 24/07/16 → 29/07/16 |
Keywords
- reinforcement learning
- artificial neural networks
- optimization methods
- action selection
- continuous state- and action-space