A Distributed Approach to Speaker Count Problem in an Open-Set Scenario by Clustering Pitch Features
- Source :
- Information, Vol 12, Iss 4, p 157 (2021)
- Publication Year :
- 2021
- Publisher :
- MDPI AG, 2021.
Abstract
- Counting the number of speakers in an audio sample can lead to innovative applications, such as a real-time ranking system. Researchers have studied advanced machine learning approaches for solving the speaker count problem. However, these solutions are not efficient in real-time environments, because they require pre-processing of a finite set of data samples. Another way to solve the problem is through unsupervised learning or audio processing techniques. The research in this category is limited and does not consider the large-scale open-set environment. In this paper, we propose a distributed clustering approach to address the speaker count problem. The separability of speakers is computed using statistical pitch parameters. The proposed solution uses the microphones of multiple smartphones spread over a large geographical area to capture audio samples and extract statistical pitch features from them. These features are shared between the nodes to estimate the number of speakers in the neighborhood. One of the major challenges is to reduce the count error that arises from the proximity of the users and the presence of multiple microphones. We evaluate the algorithm's performance using real smartphones in a multi-group arrangement by capturing parallel conversations between the users in both indoor and outdoor scenarios. The average error count distance is 1.667 in a multi-group scenario. The average error count distance in indoor environments is 16%, which is better than in the outdoor environment.
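The idea in the abstract, clustering statistical pitch features collected from several nodes so that the number of clusters approximates the number of speakers, can be illustrated with a minimal sketch. This is not the paper's algorithm: the feature pair (mean and standard deviation of pitch), the leader-style clustering rule, the 20 Hz distance threshold, the convention that 0 marks an unvoiced frame, and the names `pitch_features` and `count_speakers` are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def pitch_features(pitch_track):
    """Return (mean, std) of the voiced pitch values in Hz.

    Assumption: a value of 0 marks an unvoiced frame and is ignored.
    """
    voiced = [f for f in pitch_track if f > 0]
    return (mean(voiced), pstdev(voiced))


def count_speakers(features, threshold=20.0):
    """Estimate the speaker count by leader clustering of pitch features.

    Each feature joins the first cluster whose centroid lies within
    `threshold` (Euclidean distance, hypothetical value); otherwise it
    starts a new cluster. The number of clusters is the estimate.
    """
    clusters = []  # each cluster is a list of (mean, std) tuples
    for feat in features:
        for members in clusters:
            # Centroid is the per-dimension mean of the cluster members.
            centroid = tuple(mean(col) for col in zip(*members))
            if math.dist(feat, centroid) <= threshold:
                members.append(feat)
                break
        else:
            clusters.append([feat])
    return len(clusters)


# Made-up pitch tracks (Hz) reported by three nodes; nodes A and C
# are assumed to have captured the same nearby speaker.
tracks = [
    [110, 115, 0, 112, 108],  # node A
    [210, 0, 205, 215, 220],  # node B
    [113, 109, 0, 111, 114],  # node C
]
estimated = count_speakers([pitch_features(t) for t in tracks])
```

Merging features that fall close together is one simple way to avoid counting a speaker twice when several microphones pick up the same voice, which is the kind of error the abstract attributes to user proximity and multiple microphones.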
- Subjects :
- Computer science
Open set
Sample (statistics)
02 engineering and technology
030507 speech-language pathology & audiology
03 medical and health sciences
node clustering
0202 electrical engineering, electronic engineering, information engineering
Audio signal processing
Cluster analysis
Finite set
speaker count
feature clustering
prosodic parameters
lcsh:T58.5-58.64
lcsh:Information technology
Statistical parameter
020206 networking & telecommunications
distributed architecture
Unsupervised learning
Data mining
statistical parameters
0305 other medical science
Information Systems
Details
- ISSN :
- 20782489
- Volume :
- 12
- Database :
- OpenAIRE
- Journal :
- Information
- Accession number :
- edsair.doi.dedup.....8e83eb2f037342727fb2b3e4ba51e2dd
- Full Text :
- https://doi.org/10.3390/info12040157