1. Content-based recommender systems for spoken documents
- Author
Daniel Garcia-Romero, Aren Jansen, Jonathan Wintrode, Alan V. McCree, Gregory Sell, and Michelle Fox
- Subjects
Search engine, Thesaurus (information retrieval), Identification (information), Information retrieval, business.industry, Computer science, Information needs, The Internet, Relevance (information retrieval), Document retrieval, Recommender system, business, Task (project management)
- Abstract
Content-based recommender systems use preference ratings and features that characterize media to model users' interests or information needs for making future recommendations. While such systems have previously been developed in the music and text domains, we present an initial exploration of content-based recommendation for spoken documents using a corpus of public domain internet audio. Unlike the familiar speech technologies of topic identification and spoken document retrieval, our recommendation task requires a more comprehensive notion of document relevance than bags-of-words would supply. Inspired by music recommender systems, we automatically extract a wide variety of content-based features to characterize non-linguistic aspects of the audio such as speaker, language, gender, and environment. To combine these heterogeneous information sources into a single relevance judgement, we evaluate feature, score, and hybrid fusion techniques. Our study provides an essential first exploration of the task and clearly demonstrates the value of a multisource approach over a bag-of-words baseline.
- Published
2015