Exploiting Stereo Sound Channels to Boost Performance of Neural Network-Based Music Transcription.

ICMLA (2019)

Citations: 4
Abstract
In recent years, deep learning has begun to show great potential for automatic music transcription, which recovers MIDI-like composition information, such as note pitches and onset and offset times, from music recordings. In the literature, without exception, the two stereo sound channels of a recording have been averaged into a single channel to reduce the computational overhead, which, from an entropy standpoint, necessarily sacrifices information. In this paper we propose a method to properly combine the two sound channels for deep learning-based pitch detection. In particular, by modifying the loss function, the network is forced to focus on the worse-performing sound channel. This method achieves state-of-the-art frame-wise pitch detection performance on the MAPS dataset.
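The abstract's core idea, forcing the network to focus on the worse-performing channel via the loss, could be sketched as follows. This is a minimal illustrative reading, not the paper's actual implementation: the function names, the choice of binary cross-entropy, and the use of a per-channel maximum are all assumptions.

```python
import numpy as np

def channel_losses(pred_l, pred_r, target):
    """Frame-wise binary cross-entropy per stereo channel.

    pred_l, pred_r: pitch-activation probabilities predicted from the
    left and right channels; target: binary ground-truth piano roll.
    (Hypothetical helper, not from the paper.)
    """
    eps = 1e-7  # avoid log(0)
    def bce(p):
        return -np.mean(target * np.log(p + eps)
                        + (1.0 - target) * np.log(1.0 - p + eps))
    return bce(pred_l), bce(pred_r)

def worse_channel_loss(pred_l, pred_r, target):
    """Combined loss that emphasizes the worse-performing channel by
    taking the maximum of the two per-channel losses -- one plausible
    way to "focus on the worse channel"; the paper's exact weighting
    scheme may differ.
    """
    loss_l, loss_r = channel_losses(pred_l, pred_r, target)
    return max(loss_l, loss_r)
```

Training against such a loss penalizes whichever channel currently fits the targets worse, so neither channel's information is discarded by early averaging.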
Keywords
automatic music transcription (AMT), deep learning, pitch detection