Reinforcement Learning Control with Probabilistic Safety
When controlling a system, we typically aim to make it carry out specific tasks, such as remaining in a set of states, reaching a set of states, or both. Recent advances make it possible to formulate controllers via dynamic programming that optimally trade off such specifications against costs, such as energy consumption. However, these methods rely on full model knowledge; the aim of this project is to explore model-free approaches to achieving these objectives. The approach will be validated on the Ball-on-a-Plate system, a mechanically actuated plate that balances a ball on its surface.
Keywords: Machine Learning, Reinforcement Learning, Control Theory, Safety, Stochastic Systems
The problem of finding controllers that maximize the probability of remaining in a set (invariance), reaching specific states (reachability), or both (reach-avoidance) has a rich history in dynamic programming (https://arxiv.org/pdf/2211.07544.pdf). Furthermore, most systems also feature a physical cost objective, and trading off this cost against the probability of achieving one of the above specifications yields a problem formulation that has often been considered intractable. While recent developments propose a computationally effective approach, they require full knowledge of the stochastic model dynamics (https://arxiv.org/pdf/2312.10495v1.pdf, https://arxiv.org/pdf/2402.19360v1.pdf). This project aims to alleviate this restriction using learning-based approaches, e.g., reinforcement learning. The algorithm will be validated on the Ball-on-a-Plate system: a mechanically actuated plate that balances a ball at its center. Recently equipped with a simple-to-use Python interface, the system is an ideal testbed with simple yet challenging dynamics.
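To give a concrete feel for the reach-avoid formulation, the following is a minimal sketch of maximum-probability reach-avoid value iteration on a hypothetical toy Markov decision process (a 1-D chain, not the Ball-on-a-Plate system; all states, actions, and probabilities here are illustrative assumptions). The value of a state is the maximal probability of reaching the target set before entering the unsafe set; this is the model-based baseline that the project's learning-based methods would approximate without access to the transition probabilities.

```python
import numpy as np

# Hypothetical toy MDP: states 0..4 on a chain.
# State 4 is the target, state 0 is unsafe.
# P[a, x, x'] is the probability of moving from x to x' under action a.
n_states, n_actions = 5, 2
P = np.zeros((n_actions, n_states, n_states))
for x in range(n_states):
    # Action 0 ("cautious"): stay with prob 0.6, step right with prob 0.4.
    P[0, x, x] += 0.6
    P[0, x, min(x + 1, n_states - 1)] += 0.4
    # Action 1 ("aggressive"): step right with prob 0.7, slip left with prob 0.3.
    P[1, x, min(x + 1, n_states - 1)] += 0.7
    P[1, x, max(x - 1, 0)] += 0.3

target, unsafe = {4}, {0}

# Reach-avoid value iteration over a finite horizon:
# V[x] = max_a E[V(x')], with V pinned to 1 on the target and 0 on the
# unsafe set, so V[x] is the maximal probability of reaching the target
# before hitting the unsafe set.
V = np.array([1.0 if x in target else 0.0 for x in range(n_states)])
for _ in range(50):                 # horizon N = 50 steps
    Q = P @ V                       # Q[a, x] = E[V(x') | x, a]
    V = Q.max(axis=0)               # greedy over actions
    for x in target:
        V[x] = 1.0                  # target is absorbing with value 1
    for x in unsafe:
        V[x] = 0.0                  # unsafe set is absorbing with value 0

policy = (P @ V).argmax(axis=0)     # greedy reach-avoid policy
print(np.round(V, 3))
```

Note how the recursion needs the full tensor `P`; a model-free variant would instead estimate the action values `Q[a, x]` from sampled transitions, which is exactly the gap this project targets. The cost trade-off would add a second objective on top of this probability.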
The goal is to
1. Design learning-based algorithms to trade off invariance, reachability and reach-avoid specifications against cost objectives.
2. Implement and validate the algorithm on the Ball-on-a-Plate system, including creating a good visual presentation of the approach.
Please send your resume/CV (including lists of relevant publications/projects) and transcript of records via email to nikschmid@ethz.ch, mfochesato@ethz.ch.