ISNN: Impact Sound Neural Network for Audio-Visual Object Classification

Computer Vision - ECCV 2018, Pt. 15 (2018)

Abstract
3D object geometry reconstruction remains a challenge when working with transparent, occluded, or highly reflective surfaces. While recent methods classify shape features using raw audio, we present a multimodal neural network optimized for estimating an object's geometry and material. Our networks use spectrograms of recorded and synthesized object impact sounds and voxelized shape estimates to extend the capabilities of vision-based reconstruction. We evaluate our method on multiple datasets of both recorded and synthesized sounds. We further present an interactive application for real-time scene reconstruction in which a user can strike objects, producing sound that can instantly classify and segment the struck object, even if the object is transparent or visually occluded.
Keywords
Impact Sounds,Object Geometry,Striking Object,Vision-based Reconstruction,VoxNet