Xiaoyu Huang, Yufeng Chi, Ruofeng Wang, Zhongyu Li, Xue Bin Peng, Sophia Shao, Borivoje Nikolic, Koushil Sreenath
This work introduces DiffuseLoco, a framework for training multi-skill diffusion-based policies for dynamic legged locomotion from offline datasets, enabling real-time control of diverse skills on robots in the real world. Offline learning at scale has led to breakthroughs in computer vision, natural language processing, and robotic manipulation. However, scaling up learning for legged robot locomotion, especially with multiple skills in a single policy, presents significant challenges for prior online reinforcement learning methods. To address this challenge, we propose a novel, scalable framework that leverages diffusion models to learn directly from offline multimodal datasets with a diverse set of locomotion skills. With design choices tailored for real-time control in dynamical systems, including receding horizon control and delayed inputs, DiffuseLoco can reproduce multimodality in performing various locomotion skills, transfer zero-shot to real quadrupedal robots, and run on edge computing devices. Furthermore, DiffuseLoco demonstrates free transitions between skills and robustness against environmental variations. Through extensive benchmarking in real-world experiments, DiffuseLoco exhibits better stability and velocity tracking performance than prior reinforcement learning and non-diffusion-based behavior cloning baselines. The design choices are validated via comprehensive ablation studies. This work opens new possibilities for scaling up learning-based legged locomotion controllers through the scaling of large, expressive models and diverse offline datasets.
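
To illustrate the receding horizon control mentioned above, here is a minimal sketch in which a policy predicts a short sequence of future actions, only the first few are executed, and the policy is then queried again from the new observation. This is not the DiffuseLoco implementation: `receding_horizon_rollout`, `toy_policy`, `toy_step`, and the scalar state are hypothetical stand-ins, and the toy policy is a simple proportional rule rather than a diffusion model.

```python
from typing import Callable, List, Tuple


def receding_horizon_rollout(
    policy: Callable[[float], List[float]],
    step: Callable[[float, float], float],
    obs: float,
    total_steps: int,
    execute_steps: int,
) -> Tuple[List[float], float]:
    """Run `total_steps` control steps, replanning every `execute_steps` steps.

    `policy` returns a planned action sequence for the current observation;
    only its first `execute_steps` actions are applied before replanning.
    """
    if execute_steps < 1:
        raise ValueError("execute_steps must be at least 1")
    executed: List[float] = []
    while len(executed) < total_steps:
        plan = policy(obs)  # predicted action sequence (the horizon)
        if not plan:
            raise ValueError("policy returned an empty plan")
        remaining = total_steps - len(executed)
        for action in plan[: min(execute_steps, remaining)]:
            obs = step(obs, action)  # apply one action, observe the new state
            executed.append(action)
    return executed, obs


def toy_policy(obs: float) -> List[float]:
    # Hypothetical stand-in for a learned policy: plans 4 identical actions,
    # each sized to halve the error that was observed at planning time.
    return [-0.5 * obs] * 4


def toy_step(obs: float, action: float) -> float:
    # Hypothetical 1-D dynamics: the action is added to the state.
    return obs + action


# Replan after every action (receding horizon) vs. execute the whole plan open loop.
closed_loop = receding_horizon_rollout(toy_policy, toy_step, 8.0, total_steps=4, execute_steps=1)
open_loop = receding_horizon_rollout(toy_policy, toy_step, 8.0, total_steps=4, execute_steps=4)
```

In this toy setting, replanning after every action shrinks the state toward zero, while executing the full four-step plan without replanning overshoots, because the later actions were computed from an observation that is no longer current.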

| Goal (Task) | Metric | AMP | AMP w/ H | TF | TF w/ RHC | DiffuseLoco (Ours) |
|---|---|---|---|---|---|---|
| 0.3 m/s Forward | Stability (%) | 100 | 100 | 80 | 100 | 100 |
| | Ev | 90.44 ± 1.87 | 90.63 ± 4.79 | 75.75 ± 6.07 | 39.28 ± 2.34 | 33.22 ± 12.48 |
| 0.5 m/s Forward | Stability (%) | 100 | 100 | 100 | 100 | 100 |
| | Ev | 50.44 ± 1.97 | 46.29 ± 2.55 | 54.35 ± 2.66 | 37.46 ± 5.31 | 12.91 ± 6.84 |
| 0.7 m/s Forward | Stability (%) | 0 | 20 | 0 | 40 | 100 |
| | Ev | | 54.96 ± 0.00 | | 39.36 ± 5.02 | 24.80 ± 8.91 |
| Turn Left | Stability (%) | 20 | 100 | 0 | 100 | 100 |
| | Ev | 20.96 ± 0.00 | 33.39 ± 6.96 | | 13.41 ± 5.02 | 12.79 ± 5.64 |
| Turn Right | Stability (%) | 100 | 100 | 100 | 80 | 100 |
| | Ev | 18.61 ± 2.40 | 33.39 ± 6.96 | 25.86 ± 1.47 | 8.69 ± 5.04 | 2.22 ± 1.03 |

| Goal (Task) | Metric | DL w/o RHC | DL w/o Rand | DDIM-100/10 | DDIM-10/5 | U-Net | DiffuseLoco (Ours) |
|---|---|---|---|---|---|---|---|
| 0.3 m/s Forward | Stability (%) | 100 | 100 | 100 | 100 | 100 | 100 |
| | Ev | 75.09 ± 18.98 | 50.45 ± 2.70 | 56.89 ± 2.43 | 47.09 ± 2.40 | 81.31 ± 1.90 | 33.22 ± 12.48 |
| 0.5 m/s Forward | Stability (%) | 100 | 80 | 80 | 100 | 100 | 100 |
| | Ev | 64.49 ± 1.87 | 41.07 ± 6.12 | 41.00 ± 3.18 | 37.92 ± 1.59 | 74.52 ± 2.83 | 12.91 ± 6.84 |
| 0.7 m/s Forward | Stability (%) | 0 | 40 | 80 | 80 | 100 | 100 |
| | Ev | | 44.30 ± 4.21 | 47.71 ± 6.63 | 42.58 ± 2.08 | 71.71 ± 2.93 | 24.80 ± 8.91 |
| Turn Left | Stability (%) | 100 | 100 | 100 | 100 | 20 | 100 |
| | Ev | 20.96 ± 18.22 | 10.17 ± 5.86 | 22.22 ± 4.29 | 13.27 ± 2.63 | 18.93 ± 23.28 | 12.79 ± 5.64 |
| Turn Right | Stability (%) | 100 | 100 | 100 | 100 | 100 | 100 |
| | Ev | 18.61 ± 2.40 | 8.18 ± 3.94 | 6.47 ± 2.49 | 7.42 ± 2.90 | 89.63 ± 3.36 | 2.22 ± 1.03 |

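
One reading consistent with the table entries (stability values in multiples of 20, and a spread of ± 0.00 where stability is 20) is that each task was run for five trials and Ev is summarized over the successful trials only. The sketch below shows how such a summary could be computed under that assumption. The function `summarize_trials`, the use of `None` to mark a failed trial, the choice of population standard deviation, and the per-trial numbers are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev
from typing import List, Optional, Tuple


def summarize_trials(errors: List[Optional[float]]) -> Tuple[float, Optional[str]]:
    """Summarize one table cell pair from per-trial results.

    `errors` holds one tracking-error value per trial, with None marking a
    trial that did not finish (hypothetical encoding). Returns the stability
    percentage and a "mean ± std" string over successful trials, or None for
    the string when no trial succeeded.
    """
    if not errors:
        raise ValueError("need at least one trial")
    successful = [e for e in errors if e is not None]
    stability = 100.0 * len(successful) / len(errors)
    if not successful:
        return stability, None  # nothing to average: the cell stays empty
    # Population std is used so that a single successful trial gives 0.00.
    return stability, f"{mean(successful):.2f} ± {pstdev(successful):.2f}"


# Made-up per-trial values, for illustration only.
one_success = summarize_trials([None, None, 50.0, None, None])
no_success = summarize_trials([None, None, None, None, None])
two_trials = summarize_trials([10.0, 14.0])
```

Under these assumptions, a single successful trial out of five yields a stability of 20 with zero spread, and a task with no successful trials yields a stability of 0 with no Ev entry.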
@misc{huang2024diffuseloco,
title={DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets},
author={Xiaoyu Huang and Yufeng Chi and Ruofeng Wang and Zhongyu Li and Xue Bin Peng and Sophia Shao and Borivoje Nikolic and Koushil Sreenath},
year={2024},
eprint={2404.19264},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2404.19264},
}