ISPA: Inter-Species Phonetic Alphabet for Transcribing Animal Sounds
CoRR (2024)
Abstract
Traditionally, bioacoustics has relied on spectrograms and continuous,
per-frame audio representations to analyze animal sounds, and these also serve
as input to machine learning models. Meanwhile, the International Phonetic
Alphabet (IPA) system has provided an interpretable, language-independent
method for transcribing human speech sounds. In this paper, we introduce ISPA
(Inter-Species Phonetic Alphabet), a precise, concise, and interpretable system
designed for transcribing animal sounds into text. We compare acoustics-based
and feature-based methods for transcribing and classifying animal sounds,
demonstrating performance comparable to that of baseline methods that use
continuous, dense audio representations. By representing animal sounds with
text, we effectively treat them as a "foreign language," and we show that
established human language ML paradigms and models, such as language models,
can be successfully applied to improve performance.