Music2Play: Audio-Driven Instrumental Animation

Ruijian Jia, Shanmin Pang

2023 China Automation Congress (CAC), 2023

Abstract
Sound is produced by the vibration of a medium driven by the motion of objects. Inspired by the human ability to visually infer the trajectory of objects from their sound, we propose a framework, the Audio-Driven Instrumental Animation network (ADIA). Given a music clip, ADIA animates the instrumental player in a corresponding image and produces a complete, matched instrumental video. This task is very challenging, as it requires dynamically transferring information from noisy, low-dimensional audio signals to high-dimensional visual representations. To accomplish it, we design three dedicated modules in ADIA: an audio module, a flow module, and an image module. Specifically, the audio module drives the initial pose auto-regressively to produce a sequence of poses, where a novel limb loss constrains the location of each key-point of the generated poses. The flow module then estimates a dense flow field from each pair of poses. Finally, the image module fuses multi-modal information (audio, flow, and image) to synthesize the output frame. In our experiments, we demonstrate the effectiveness of our method through comparisons with several closely related works. In addition, a user study shows that it synthesizes realistic, diverse, and rhythm-matching videos from music. The supplementary video is available at https://youtu.be/F1rZxgu4B_A
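The abstract does not give the exact form of the limb loss; as an illustration only, the sketch below assumes it combines a per-keypoint coordinate penalty with a limb-length consistency term over connected joints (the skeleton edges, the L1 penalties, and the weighting factor `alpha` are all assumptions, not the paper's definition).

```python
import numpy as np

# Assumed skeleton connectivity: pairs of key-point indices forming limbs.
LIMBS = [(0, 1), (1, 2), (2, 3)]

def limb_loss(pred, gt, limbs=LIMBS, alpha=1.0):
    """Hypothetical limb loss for generated poses.

    pred, gt: (K, 2) arrays of 2-D key-point coordinates.
    Combines an L1 penalty on key-point locations with an L1 penalty
    on the lengths of connected limbs (both terms are assumptions).
    """
    # Per-keypoint location term.
    point_term = np.abs(pred - gt).mean()
    # Limb-length consistency term between connected joints.
    pred_len = np.array([np.linalg.norm(pred[i] - pred[j]) for i, j in limbs])
    gt_len = np.array([np.linalg.norm(gt[i] - gt[j]) for i, j in limbs])
    limb_term = np.abs(pred_len - gt_len).mean()
    return point_term + alpha * limb_term
```

A loss of this shape would penalize not only absolute key-point drift but also anatomically implausible limb stretching in the auto-regressively generated pose sequence.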
Keywords
image animation,cross-modal learning,video synthesis