Patrick Susini, Daniel Pressnitzer, Stephen McAdams, Bennett K. Smith, Nicolas Misdariis. Equipe Perception et cognition musicales, Sciences et Technologies de la Musique et du Son (STMS), Institut de Recherche et Coordination Acoustique/Musique (IRCAM)-Université Pierre et Marie Curie - Paris 6 (UPMC)-Centre National de la Recherche Scientifique (CNRS)
Several studies of musical timbre perception have found significant correlations between acoustical parameters of sounds and their subjective dimensions. Drawing on the conclusions of some of these studies, a method for calculating the perceptual distance between two sounds has been developed. Initially, four parameters are considered: spectral centroid, irregularity of the spectral envelope, attack time, and degree of variation of the spectral envelope over time. For each of these, a transformation factor between the physical axis and the corresponding subjective dimension is obtained by linear regression. After normalization of the data, the four resulting coefficients serve as the weights of a linear combination that gives the final distance values. Because this model is based on numerical results from experiments that mostly used synthesized sounds, its application to a database of recorded musical instrument sounds requires a rigorous validation procedure. This procedure involves adjusting the coefficients of the first four parameters and possibly introducing new parameters, in order to attain a perceptually relevant distance between two musical sounds. The progress of this research and the results of the database search engine built on this similarity model will be presented and discussed.
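The distance calculation outlined above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the normalization or the exact form of the linear combination, so the z-score normalization, the weighted sum of absolute differences, and all names (`PARAMS`, `fit_slope`, `normalize`, `distance`) are assumptions made for illustration only. Any coefficient values passed to `distance` would have to come from the regressions and validation described in the text.

```python
from statistics import mean, pstdev

# Hypothetical keys for the four descriptors named in the abstract.
PARAMS = (
    "spectral_centroid",
    "spectral_irregularity",
    "attack_time",
    "spectral_variation",
)


def fit_slope(physical, perceptual):
    """Least-squares slope relating a physical parameter to a perceptual axis.

    Stands in for the "transformation factor obtained by linear regression".
    """
    mx, my = mean(physical), mean(perceptual)
    sxx = sum((x - mx) ** 2 for x in physical)
    sxy = sum((x - mx) * (y - my) for x, y in zip(physical, perceptual))
    return sxy / sxx


def normalize(sounds):
    """Z-score each parameter across the database (assumed normalization).

    `sounds` maps a sound name to a dict of raw parameter values.
    Each parameter must vary across the database (non-zero spread).
    """
    stats = {}
    for p in PARAMS:
        values = [s[p] for s in sounds.values()]
        stats[p] = (mean(values), pstdev(values))
    return {
        name: {p: (s[p] - stats[p][0]) / stats[p][1] for p in PARAMS}
        for name, s in sounds.items()
    }


def distance(a, b, coeffs):
    """Weighted sum of absolute per-parameter differences (assumed form)."""
    return sum(coeffs[p] * abs(a[p] - b[p]) for p in PARAMS)
```

With normalized descriptors for two sounds and one coefficient per parameter, `distance(a, b, coeffs)` returns a single non-negative value that is zero for identical descriptors and symmetric in its two arguments. Adjusting the coefficients or adding parameters, as the validation procedure envisages, only changes `coeffs` and `PARAMS`.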