Audio-Driven Expressive Humanoid Locomotion
Analysis
Key Takeaways
- Proposes RoboPerform, a novel framework for direct audio-to-locomotion generation.
- Eliminates the need for explicit motion reconstruction, reducing latency and improving fidelity.
- Enables humanoid robots to perform music-driven dance and speech-driven gestures.
- Employs a ResMoE teacher policy and a diffusion-based student policy for audio style injection.
“RoboPerform, the first unified audio-to-locomotion framework that can directly generate music-driven dance and speech-driven co-speech gestures from audio.”
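
The takeaways name a ResMoE teacher policy but do not describe its internals. As a rough illustration only, the sketch below assumes "ResMoE" means a residual mixture of experts: a base action plus a gate-weighted sum of per-expert corrections. The class name `ResidualMoE`, the linear layers, the dimensions, and the random weights are all hypothetical and are not taken from RoboPerform.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


class ResidualMoE:
    """Hypothetical residual mixture-of-experts action head (illustrative only)."""

    def __init__(self, obs_dim, act_dim, num_experts, seed=0):
        rng = np.random.default_rng(seed)
        # Assumed linear maps; a real policy would use trained networks.
        self.w_base = rng.normal(scale=0.1, size=(act_dim, obs_dim))
        self.w_experts = rng.normal(scale=0.1, size=(num_experts, act_dim, obs_dim))
        self.w_gate = rng.normal(scale=0.1, size=(num_experts, obs_dim))

    def gate(self, obs):
        """Mixing weights over experts; non-negative and summing to 1."""
        return softmax(self.w_gate @ obs)

    def act(self, obs):
        """Base action plus the gate-weighted sum of expert residuals."""
        base = self.w_base @ obs               # shape: (act_dim,)
        residuals = self.w_experts @ obs       # shape: (num_experts, act_dim)
        weights = self.gate(obs)               # shape: (num_experts,)
        return base + weights @ residuals      # shape: (act_dim,)


policy = ResidualMoE(obs_dim=8, act_dim=3, num_experts=4)
observation = np.ones(8)
action = policy.act(observation)
```

In this sketch the experts only correct a shared base action, so zeroing every expert leaves the base behaviour unchanged. Whether the paper's teacher is structured this way is not stated in the summary.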