Using Genetic Programming to Improve Data Collection for Offline Reinforcement Learning

Research output: Working paper › Preprint

Abstract

Offline Reinforcement Learning (RL) learns policies from fixed, pre-collected datasets, making it applicable to use cases where data collection is risky. Consequently, the performance of these offline learners depends heavily on the dataset. Still, the questions of how this data should be collected and which characteristics it needs have not been thoroughly investigated. Simultaneously, evolutionary methods have re-emerged as a promising alternative to classic RL, leading to the field of evolutionary RL (EvoRL), which combines the two learning paradigms to exploit their complementary strengths. This study joins these research directions and examines the effects of Genetic Programming (GP) on dataset characteristics in RL and its potential to enhance the performance of offline algorithms. A comparative approach was employed, evaluating Deep Q-Networks (DQN) and GP for data collection across multiple environments and modes. The exploration and exploitation capabilities of both methods were quantified, and a comparative analysis was conducted to determine whether data collected through GP led to superior performance in multiple offline learners. The findings indicate that GP demonstrates strong and stable performance in generating high-quality experiences with competitive exploration. GP exhibited lower uncertainty in experience generation than DQN and produced datasets of high trajectory quality across all environments. More offline learners showed statistically significant performance gains when trained on GP-collected data than when trained on DQN-collected trajectories. Furthermore, their performance was less dependent on the environment, as GP consistently generated high-quality datasets. This study showcases the effective combination of GP with offline learners, suggesting a promising avenue for future research into optimizing data collection for RL.
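To make the pipeline described above concrete, the sketch below shows the generic offline-RL loop the abstract refers to: a behaviour policy collects a fixed dataset of transitions, and an offline learner is then trained only on that data. The toy chain environment, the random stand-in for the GP/DQN collector, and the batch Q-learning learner are illustrative assumptions, not the paper's actual setup.

```python
# A minimal sketch (assumed for illustration, not the paper's code) of the
# offline-RL pipeline: a behaviour policy collects a fixed dataset of
# transitions, and an offline learner is trained purely on that dataset.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, GAMMA = 10, 2, 0.95


def step(state, action):
    """Toy chain environment: action 1 moves right, action 0 moves left."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1


def collect_dataset(behaviour_policy, n_episodes=200, max_steps=50):
    """Roll out the behaviour policy (GP or DQN in the study) into a fixed dataset."""
    data = []
    for _ in range(n_episodes):
        state = 0
        for _ in range(max_steps):
            action = behaviour_policy(state)
            nxt, reward, done = step(state, action)
            data.append((state, action, reward, nxt, done))
            state = nxt
            if done:
                break
    return data


def offline_q_learning(data, n_epochs=50, lr=0.1):
    """A simple offline learner: repeated Q-learning sweeps over the fixed data."""
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(n_epochs):
        for state, action, reward, nxt, done in data:
            target = reward if done else reward + GAMMA * q[nxt].max()
            q[state, action] += lr * (target - q[state, action])
    return q


# A uniformly random policy stands in here for the GP- or DQN-based collectors.
dataset = collect_dataset(lambda s: int(rng.integers(N_ACTIONS)))
q_values = offline_q_learning(dataset)
print("Greedy offline policy per state:", q_values.argmax(axis=1))
```

In the study itself, the collector is a DQN or a GP-evolved policy and the offline learners are full offline-RL algorithms; the coverage and trajectory quality of the resulting dataset then bound what the offline learner can achieve, which is the effect the paper quantifies.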
Original language: English
Publisher: Social Science Research Network (SSRN), Elsevier
Pages: 1-73
Number of pages: 73
DOIs
Publication status: Submitted - 8 Oct 2024

Keywords

  • Offline Reinforcement Learning
  • Genetic Programming
  • Evolutionary Reinforcement Learning
  • Evolutionary Algorithms
  • Data Efficiency
