This video demonstrates reinforcement learning on our self-made real cartpole, built with Raspberry Pi and Arduino microcontrollers and presented at CeBIT 2018. All control policies were trained on the same batch of data, which consists of 10 minutes of manual interaction with the system using a SNES USB gamepad. The control interval is 20 ms.

Reinforcement learning algorithms applied:

  1. Neural Fitted Q Iteration (discrete actions)
  2. Interpretable Policy using Genetic Programming (continuous actions)
  3. Bayesian Neural Network based Reinforcement Learning (continuous actions)
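
To illustrate what training on a fixed batch means for the first algorithm in the list, here is a minimal sketch of fitted Q iteration over recorded transitions. It is not the code used for the demonstrator: the neural network is swapped for a ridge-regularized linear model so the example stays short, and the feature map, discount factor, iteration count, and all function names are assumptions made for illustration. The only values taken from the description are the 20 ms control interval and the 10 minutes of data, which together give roughly 30,000 transitions.

```python
import numpy as np

CONTROL_INTERVAL_S = 0.02   # 20 ms control interval (from the description)
BATCH_DURATION_S = 10 * 60  # 10 minutes of manual interaction
N_TRANSITIONS = round(BATCH_DURATION_S / CONTROL_INTERVAL_S)  # about 30,000 steps


def features(states):
    """Hypothetical feature map: raw state, squared state, and a bias term."""
    return np.hstack([states, states ** 2, np.ones((len(states), 1))])


def fitted_q_iteration(states, actions, rewards, next_states, n_actions,
                       gamma=0.95, n_iterations=50, ridge=1e-3):
    """Fit one linear Q-function per discrete action from a fixed batch.

    Assumption: a linear least-squares regressor stands in for the neural
    network of Neural Fitted Q Iteration. No new data is collected during
    training; every iteration reuses the same recorded transitions.
    """
    phi = features(states)
    phi_next = features(next_states)
    n_features = phi.shape[1]
    weights = np.zeros((n_actions, n_features))
    for _ in range(n_iterations):
        # Bootstrapped regression targets: r + gamma * max_a' Q(s', a')
        q_next = phi_next @ weights.T
        targets = rewards + gamma * q_next.max(axis=1)
        for a in range(n_actions):
            mask = actions == a
            if not mask.any():
                continue  # action never taken in the batch; keep old weights
            gram = phi[mask].T @ phi[mask] + ridge * np.eye(n_features)
            weights[a] = np.linalg.solve(gram, phi[mask].T @ targets[mask])
    return weights


def greedy_action(weights, state):
    """Return the index of the action with the highest estimated Q-value."""
    return int(np.argmax(features(state[None, :]) @ weights.T))
```

With arrays of recorded states, discrete action indices, rewards, and successor states, `fitted_q_iteration` returns one weight vector per action, and `greedy_action` would then be evaluated once per control interval.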

Credits: We built this demonstrator at Siemens AG, Department: Corporate Technology / Learning Systems. The project was supported with funds from the German Federal Ministry of Education and Research under project number 01IB15001A (ALICE II).