A pipeline for identification of bird and frog species in tropical soundscape recordings using a convolutional neural network
- Source :
- Ecological Informatics. 59:101113
- Publication Year :
- 2020
- Publisher :
- Elsevier BV, 2020.
Abstract
- Automated acoustic recorders can collect long-term soundscape data containing species-specific signals in remote environments. Ecologists have increasingly used them for studying diverse fauna around the globe. Deep learning methods have gained recent attention for automating the process of species identification in soundscape recordings. We present an end-to-end pipeline for training a convolutional neural network (CNN) for multi-species multi-label classification of soundscape recordings, starting from raw, unlabeled audio. Training data for species-specific signals are collected using a semi-automated procedure consisting of an efficient template-based signal detection algorithm and a graphical user interface for rapid detection validation. A CNN is then trained based on mel-spectrograms of sound to predict the set of species present in a recording. Transfer learning of a pre-trained model is employed to reduce the necessary training data and time. Furthermore, we define a loss function that allows for using true and false template-based detections to train a multi-class multi-label audio classifier. This approach leverages relevant absence (negative) information in training, and reduces the effort in creating multi-label training data by allowing weak labels. We evaluated the pipeline using a set of soundscape recordings collected across 749 sites in Puerto Rico. A CNN model was trained to identify 24 regional species of birds and frogs. The semi-automated training data collection process greatly reduced the manual effort required for training. The model was evaluated on an excluded set of 1000 randomly sampled 1-min soundscapes from 17 sites in the El Yunque National Forest. The test recordings contained an average of ~3 present target species per recording, and a maximum of 8. The test set also showed a large class imbalance with most species being present in less than 5% of recordings, and others present in >25%. The model achieved a mean-average-precision of 0.893 across the 24 species. Across all predictions, the total average-precision was 0.975.
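The abstract does not give the form of the loss function, so the following is only a minimal sketch of one common way to train a multi-label classifier from weak labels: a binary cross-entropy that is averaged over the (recording, species) pairs that were actually validated and ignores the rest. The function name `masked_bce`, the argument layout, and the masking scheme are assumptions made for illustration and are not taken from the paper.

```python
import math

def masked_bce(logits, labels, mask):
    """Binary cross-entropy averaged over labelled entries only.

    Hypothetical helper, not the paper's loss. Each argument is a list of
    rows (one per recording) with one value per species:
      logits: raw model outputs before the sigmoid
      labels: 1 for a validated true detection, 0 for a validated false one
      mask:   1 where a validated label exists, 0 where the species is unlabelled
    """
    total, count = 0.0, 0
    for row_x, row_y, row_m in zip(logits, labels, mask):
        for x, y, m in zip(row_x, row_y, row_m):
            if not m:
                continue  # unlabelled species contribute nothing to the loss
            # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)], s = sigmoid(x)
            total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
            count += 1
    return total / count if count else 0.0

# One recording, three species: species 0 validated present,
# species 1 validated absent, species 2 never checked.
loss = masked_bce([[2.0, -1.0, 0.5]], [[1, 0, 0]], [[1, 1, 0]])
print(loss)
```

Under this scheme a validated false detection acts as an explicit negative for that species, while species that were never checked for a recording leave the loss unchanged, which is one way to read the abstract's point about using absence information without requiring complete multi-label annotation.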
- Subjects :
- 0106 biological sciences
Soundscape
Ecology
Computer science
business.industry
010604 marine biology & hydrobiology
Applied Mathematics
Ecological Modeling
Deep learning
Pattern recognition
010603 evolutionary biology
01 natural sciences
Convolutional neural network
Computer Science Applications
Computational Theory and Mathematics
Modeling and Simulation
Test set
Detection theory
Artificial intelligence
business
Transfer of learning
Classifier (UML)
Ecology, Evolution, Behavior and Systematics
Graphical user interface
Details
- ISSN :
- 1574-9541
- Volume :
- 59
- Database :
- OpenAIRE
- Journal :
- Ecological Informatics
- Accession number :
- edsair.doi...........00eb92707da9ffecc265097b0dbad186
- Full Text :
- https://doi.org/10.1016/j.ecoinf.2020.101113