Deploying Locomotion Policies Trained in Differentiable Simulation on Real Hardware
In recent years, deep Reinforcement Learning (RL) for robotic motion policies has demonstrated impressive performance and unprecedented robustness on real hardware. Current sim2real approaches rely on large-scale pre-training with domain randomization to make policies robust, but they struggle with high-dimensional spaces, and current RL methods are primarily limited by their low sample efficiency. Differentiable simulators provide first-order gradient information, which has shown great results for improving sample efficiency. Although simulation results are promising, policies trained this way are rarely deployed on hardware. The goal of this thesis is to train quadrupedal locomotion policies in a differentiable simulation framework and then enable real-world deployment by modifying the simulation, the policy training, or the learning algorithm. Ideally, the properties of differentiable simulators can be leveraged in this process to improve sim2real transfer by fitting the simulation to real data.
Keywords: Deep Reinforcement Learning, Differentiable Simulation, Quadrupedal Locomotion Control, Sim2Real
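To make the idea of fitting a simulation to real data with first-order gradients concrete, the following is a minimal, dependency-free sketch and not the project's actual method. It assumes a hypothetical 1D point mass with a single unknown damping coefficient c, uses synthetic data as a stand-in for hardware measurements, and propagates hand-derived sensitivities through the rollout instead of relying on an autodiff framework. All names (rollout, loss_and_grad, fit_damping) are illustrative.

```python
# Minimal sketch (all names hypothetical): fit one simulation parameter to
# "real" data using first-order gradients propagated through a rollout.

DT = 0.01      # integration step [s]
STEPS = 200    # rollout length (2 s)
V0 = 1.0       # initial velocity [m/s]


def rollout(c):
    """Simulate a damped point mass; return positions and d(position)/dc."""
    x, v = 0.0, V0
    dx_dc, dv_dc = 0.0, 0.0
    xs, sens = [], []
    for _ in range(STEPS):
        # Sensitivity of the velocity update w.r.t. c (uses v before update).
        dv_dc = dv_dc + DT * (-v - c * dv_dc)
        # Velocity update with viscous damping: dv/dt = -c * v.
        v = v + DT * (-c * v)
        # Position update and its sensitivity (semi-implicit Euler).
        dx_dc = dx_dc + DT * dv_dc
        x = x + DT * v
        xs.append(x)
        sens.append(dx_dc)
    return xs, sens


def loss_and_grad(c, x_real):
    """Mean squared position error and its exact gradient w.r.t. c."""
    xs, sens = rollout(c)
    n = len(xs)
    loss = sum((x - xr) ** 2 for x, xr in zip(xs, x_real)) / n
    grad = sum(2.0 * (x - xr) * s for x, xr, s in zip(xs, x_real, sens)) / n
    return loss, grad


def fit_damping(x_real, c_init=0.2, lr=0.5, iters=300):
    """Plain gradient descent on the damping coefficient."""
    c = c_init
    for _ in range(iters):
        _, grad = loss_and_grad(c, x_real)
        c -= lr * grad
    return c


C_TRUE = 0.8                      # assumed ground truth for this toy example
x_real, _ = rollout(C_TRUE)       # synthetic stand-in for hardware data
c_fit = fit_damping(x_real)
print(f"fitted damping: {c_fit:.4f}")
```

In a full differentiable simulator, the sensitivities would typically come from automatic differentiation rather than hand-written update rules, and the fitted quantities would be physical parameters of the robot and its contacts rather than a single scalar.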
Related literature:
Xu, Jie, et al. "Accelerated policy learning with parallel differentiable simulation." arXiv preprint arXiv:2204.07137 (2022).
https://adaptive-horizon-actor-critic.github.io/
Rudin, Nikita, et al. "Learning to walk in minutes using massively parallel deep reinforcement learning." Conference on Robot Learning. PMLR, 2022.
Literature research
Train quadrupedal locomotion policies leveraging differentiable simulation
Modify the simulation, policy training, or the learning algorithm for better sim2real
Validate on real hardware
Highly motivated student eager to test on real hardware
Familiar with basic concepts of RL and ML
Comfortable with Python, PyTorch, C++, and ROS stacks
Familiar with physics simulation and optimization for robotics
Victor Klemm (RSL), vklemm@ethz.ch
Ignat Georgiev (Georgia Tech), ignat@imgeorgiev.com