Dominik Baumann: Safe reinforcement learning: global exploration and discrete contexts

Abstract: Using reinforcement learning algorithms to control dynamical systems has become increasingly popular in recent years. An important difference between dynamical systems and, for instance, gaming environments is that failures in dynamical systems are often critical: while a game can simply be restarted, a failure in a dynamical system often damages expensive hardware. Thus, algorithms have emerged that guarantee, with high probability, that the system will not incur any failures during exploration. In this talk, I will present two recent approaches that fall into this category. First, in exchange for their guarantees, safe learning algorithms can often only explore locally around an initially given safe policy, and may therefore fail to find the global optimum. To address this, I will present a recent approach that allows for global exploration while retaining probabilistic safety guarantees. Second, most algorithms focus on regression from continuous sensor inputs to actions of the system. In reality, system dynamics are often affected by discrete “context” variables, such as whether the surface is frozen or wet, which cannot be measured directly. Thus, I will present an approach for multi-class classification that provides frequentist guarantees and can therefore be used to classify discrete contexts in safe learning algorithms while still providing probabilistic guarantees. Apart from theoretical guarantees, I will also show results from hardware experiments for both approaches. Finally, I will give a brief outlook on some very recent research on the role of ergodicity in reinforcement learning.

Bio: Dominik received a diploma in electrical engineering from TU Dresden, Germany, in 2016. He was then a joint PhD student with the Max Planck Institute for Intelligent Systems in Germany and KTH Stockholm, Sweden, and obtained his PhD from KTH in 2020. He subsequently held one-year postdoctoral positions at RWTH Aachen University, Germany, and Uppsala University, Sweden. In January, he joined Aalto as an assistant professor. Dominik’s research interests revolve around learning and control for networked multi-agent systems. More information at https://baumanndominik.github.io/.

Place of seminar: Aalto CS building, room T5 (in person) and Zoom: https://aalto.zoom.us/j/68729326149