AudioDiffusion: Generating High-Quality Audio from EEG Signals

Dianyuan Qi, Ling Kong, Lei Yang, Congsheng Li

2023 4th International Symposium on Computer Engineering and Intelligent Communications (ISCEIC) (2023)

Abstract
This study proposes a new model, AudioDiffusion, for generating high-quality audio directly from the brain's electroencephalogram (EEG) signals. AudioDiffusion uses pre-trained text-to-speech models together with temporally masked signal modelling to pre-train the EEG encoder for effective and robust EEG representations. In addition, the method uses a Mel-frequency spectrum encoder to provide additional supervision, better aligning EEG and speech embeddings given the limited number of EEG-audio pairs. Overall, the proposed method overcomes the challenges of generating audio from EEG signals, such as noise, limited information, and individual differences, and achieves promising results. Quantitative and qualitative results demonstrate the effectiveness of the proposed method, an important step towards portable and low-cost EEG-to-audio systems, with potential applications in neuroscience and natural language processing.
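The abstract does not describe the pre-training procedure in detail, but "temporally masked signal modelling" conventionally means hiding random temporal patches of the multichannel EEG signal and training the encoder to reconstruct them, with the loss computed only over the masked positions. The sketch below illustrates that masking-and-loss setup on a toy NumPy array; the patch length, mask ratio, and function names are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def temporal_mask(eeg, mask_ratio=0.5, patch_len=10, rng=rng):
    """Zero out a random subset of fixed-length temporal patches.

    eeg: array of shape (channels, samples).
    Returns the masked signal and a boolean mask (True = masked sample).
    Patch length and ratio are illustrative choices, not from the paper.
    """
    _, samples = eeg.shape
    n_patches = samples // patch_len
    n_masked = int(round(n_patches * mask_ratio))
    chosen = rng.choice(n_patches, size=n_masked, replace=False)
    mask = np.zeros(samples, dtype=bool)
    for i in chosen:
        mask[i * patch_len:(i + 1) * patch_len] = True
    masked = eeg.copy()
    masked[:, mask] = 0.0  # hide the selected patches from the encoder
    return masked, mask

def masked_reconstruction_loss(pred, target, mask):
    """MSE over masked samples only -- the masked-modelling objective."""
    return float(np.mean((pred[:, mask] - target[:, mask]) ** 2))

# Toy example: 8 EEG channels, 100 samples.
eeg = rng.standard_normal((8, 100))
masked, mask = temporal_mask(eeg, mask_ratio=0.5, patch_len=10)

# A trivial "predictor" that outputs zeros incurs a positive loss;
# a perfect reconstruction drives the loss to zero.
loss_zeros = masked_reconstruction_loss(masked, eeg, mask)
loss_perfect = masked_reconstruction_loss(eeg, eeg, mask)
```

In a real pipeline the zero-filled patches would be replaced by the EEG encoder's predictions, and this loss would drive the pre-training before the encoder's embeddings are aligned with the Mel-spectrogram space.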
Keywords
electroencephalogram, audio reconstruction, diffusion model, Mel-frequency spectrum