Vocal Typology: The Science of Voice Typing: About this Research Project

A spoken voice carries both linguistic and non-linguistic information in the same signal. As listeners, our job is to process that signal to recover its linguistic content. Indexical properties of speech, such as gender, age, dialect, sexual orientation, emotion, and pathology, allow us to identify a particular speaker and quickly place their signal in context, which in turn helps us receive and process the linguistic content. This study deals with the indexical property of talker (speaker) identity, within which it is posited that there are natural categories based on similarity.
These categories yield a system of vocal typology. Vocal typology is a novel method of classifying talker voices by degree of similarity (encompassing all aspects of the speech signal) into types that have utility for models of linguistic processing, forensic or voice identification, security applications, commercial uses of voice, and personal interest. In addition, any automatic (machine) processing of the speech signal would be enhanced by factoring out speaker-specific acoustic attributes. Because such methods all rely on the automatic extraction of a common set or subset of acoustic cues, reducing the vast number of speaker identities to a manageable number of voice types should improve their success.
A series of perception experiments using an existing database of 150 American English voices will determine the number and characteristics of the basic voice categories. More specifically, perceptual data from both naive and expert listeners will be integrated to reduce the large set of individual voices to a smaller, workable set of voice types. These voice types and the data themselves will be provided online and in the published thesis as an initial model for vocal typology, one that may be revised through future experimentation and may eventually prove useful in the public, academic, and private sectors.
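To make the idea of reducing many voices to a few types more concrete, here is a minimal sketch of one way such a reduction could be done, assuming listeners' judgments have already been summarized as a pairwise dissimilarity score between voices. The project description does not name an analysis method, so the average-linkage clustering, the function name `cluster_voices`, the toy scores, and the fixed number of types are all illustrative assumptions rather than the study's actual procedure.

```python
from itertools import combinations


def cluster_voices(dissim, n_types):
    """Group voices by average-linkage agglomerative clustering.

    dissim  -- symmetric matrix; dissim[i][j] is a hypothetical perceived
               dissimilarity between voice i and voice j (0 = identical).
    n_types -- number of voice types to stop at (an assumption here; the
               study aims to discover this number, not fix it in advance).
    """
    # Start with every voice in its own cluster.
    clusters = [[i] for i in range(len(dissim))]
    while len(clusters) > n_types:
        best = None
        # Find the pair of clusters with the smallest mean dissimilarity.
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(dissim[i][j] for i in clusters[a] for j in clusters[b])
            mean = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or mean < best[0]:
                best = (mean, a, b)
        _, a, b = best
        # Merge the closest pair (a < b, so deleting b keeps a valid).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Made-up scores for six hypothetical voices: 0-2 sound alike, 3-5 sound alike.
toy = [
    [0.00, 0.10, 0.20, 0.90, 0.80, 0.90],
    [0.10, 0.00, 0.15, 0.85, 0.90, 0.80],
    [0.20, 0.15, 0.00, 0.90, 0.85, 0.90],
    [0.90, 0.85, 0.90, 0.00, 0.10, 0.20],
    [0.80, 0.90, 0.85, 0.10, 0.00, 0.15],
    [0.90, 0.80, 0.90, 0.20, 0.15, 0.00],
]

voice_types = cluster_voices(toy, 2)
print(voice_types)
```

With real data, the same function would take a 150-by-150 matrix, and the choice of where to stop merging would itself be one of the questions the experiments are meant to answer.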
