- Given a set of independently trained RL skills, enabling quick transitions between them becomes a challenge, especially in the real world. We introduce a method that lets quadruped robots transition between multiple learned locomotion skills.
- We implement a meta-controller that leverages a transition scoring model conditioned on the latent state representations of underlying policy experts.
Method
- Each gait is learned independently using motion imitation from animation data.
- Domain randomization ensures robust zero-shot sim2real deployment.
Transition-Net:
- An MLP trained to predict the success of a transition between two policies, conditioned on specific policy pairs, a target phase, and the latent state representations of the active policy.
- The latent representation consists of the activations from the last hidden layer of each policy; it encodes the current state of the robot under that policy.
- Trained as a binary classifier: given a transition configuration (source policy, destination policy, source latent, target phase), it predicts success or failure.
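The classifier described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the one-hot encoding of the policy pair, the sine/cosine encoding of the target phase, the layer sizes, and the names `encode_query`, `TinyMLP`, and `bce_loss` are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random


def one_hot(index, size):
    """Return a list of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def encode_query(src, dst, latent, target_phase, num_policies):
    """Build the classifier input for one transition configuration.

    Assumed layout: one-hot source policy, one-hot destination policy,
    cyclic encoding of the target phase (assumed normalized to [0, 1)),
    then the latent vector of the active (source) policy.
    """
    angle = 2.0 * math.pi * target_phase
    return (one_hot(src, num_policies) + one_hot(dst, num_policies)
            + [math.sin(angle), math.cos(angle)] + list(latent))


class TinyMLP:
    """One hidden ReLU layer and a sigmoid output read as P(success)."""

    def __init__(self, in_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
        self.b2 = 0.0

    def predict(self, x):
        hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        logit = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return 1.0 / (1.0 + math.exp(-logit))


def bce_loss(p, label):
    """Binary cross-entropy for one example; label is 1 (success) or 0 (failure)."""
    eps = 1e-7
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


NUM_POLICIES = 3   # hypothetical library size
LATENT_DIM = 8     # hypothetical width of the last hidden layer
latent = [0.1] * LATENT_DIM
x = encode_query(src=0, dst=2, latent=latent, target_phase=0.25,
                 num_policies=NUM_POLICIES)
net = TinyMLP(in_dim=len(x), hidden_dim=16)
score = net.predict(x)          # untrained, so the value itself is meaningless
loss = bce_loss(score, label=1)  # the quantity a training loop would minimize
```

In practice the network would be fitted on the collected dataset of labeled transition outcomes; only the input layout and the binary-classification objective are shown here.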
Diagram detailing the whole process: training the library of experts, collecting the dataset and training the Transition-Net classifier, and running the meta-controller during deployment.
- At runtime, a meta-controller queries this network in real time to determine when and how to switch gaits without destabilizing the robot.
The meta-controller queries the Transition-Net at every time step. Once the predicted success score is high enough, the queued policy takes control of the robot.
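The runtime behaviour can be sketched as a small loop. This is an illustrative sketch under stated assumptions, not the published controller: the fixed threshold of 0.9, the search over a list of candidate target phases, and the names `MetaController`, `request`, `step`, and `toy_score` are hypothetical, and `toy_score` merely stands in for the trained classifier.

```python
class MetaController:
    """Hands control to a queued policy once the predicted score clears a threshold."""

    def __init__(self, score_fn, initial_policy, candidate_phases, threshold=0.9):
        self.score_fn = score_fn                  # stand-in for the trained classifier
        self.active = initial_policy
        self.queued = None
        self.candidate_phases = candidate_phases  # assumed discrete phase options
        self.threshold = threshold                # assumed fixed decision threshold

    def request(self, policy):
        """Queue a transition; it executes only when predicted to succeed."""
        if policy != self.active:
            self.queued = policy

    def step(self, latent):
        """Call once per control step. Returns (active policy, chosen phase or None)."""
        if self.queued is None:
            return self.active, None
        scored = [(self.score_fn(self.active, self.queued, latent, phase), phase)
                  for phase in self.candidate_phases]
        best_score, best_phase = max(scored)
        if best_score >= self.threshold:
            self.active, self.queued = self.queued, None
            return self.active, best_phase
        return self.active, None


def toy_score(src, dst, latent, phase):
    """Toy scoring function: higher latent[0] and earlier phase score higher."""
    return latent[0] * (1.0 - 0.1 * phase)


controller = MetaController(toy_score, "trot", candidate_phases=[0.0, 0.5])
controller.request("pace")
# The queued policy waits through low-scoring steps and takes over at the third step.
history = [controller.step([v])[0] for v in (0.2, 0.5, 0.95, 0.1)]
```

Keeping the request queued rather than rejecting it means the switch happens at the first state the classifier judges favourable, which matches the "when" part of the description above.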
Cite
@inproceedings{christmann2023expanding,
title={Expanding versatility of agile locomotion through policy transitions using latent state representation},
author={Christmann, Guilherme and Luo, Ying-Sheng and Soeseno, Jonathan Hans and Chen, Wei-Chao},
booktitle={2023 IEEE International Conference on Robotics and Automation (ICRA)},
pages={5134--5140},
year={2023},
organization={IEEE}
}