Deep Reinforcement Learning For Audio-Visual Gaze Control

Stéphane Lathuilière, Benoit Massé, Pablo Mesejo, Radu Horaud

2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2018)

Abstract

We address the problem of audio-visual gaze control in the specific context of human-robot interaction, namely how controlled robot motions are combined with visual and acoustic observations in order to direct the robot head towards targets of interest. The paper makes the following contributions: (i) a novel audio-visual fusion framework that is well suited for controlling the gaze of a robotic head; (ii) a reinforcement learning (RL) formulation of the gaze control problem, using a reward function based on the available temporal sequence of camera and microphone observations; and (iii) several deep architectures that allow experimentation with early and late fusion of audio and visual data. We introduce a simulated environment that enables us to learn the proposed deep RL model without spending hours on tedious interaction. By thoroughly experimenting on a publicly available dataset and on a real robot, we provide empirical evidence that our method achieves state-of-the-art performance.

Keywords

deep reinforcement, audio-visual gaze control, human-robot interaction, controlled robot motions, visual observations, acoustic observations, robot head, robotic head, reinforcement learning formulation, gaze control problem, audio data, visual data, audio-visual fusion framework, RL, microphone observations, deep architectures
