Exploration of Acoustic Features for Automatic Vowel Discrimination in Spontaneous Speech

Tyson, Na'im R.

2012, Doctor of Philosophy, Ohio State University, Linguistics.

To understand which acoustic/auditory feature sets led transcribers to particular labeling decisions, I built machine learning models capable of discriminating between canonical and non-canonical vowels excised from the Buckeye Corpus. Specifically, I wanted to model when the dictionary form and the transcribed form of a vowel match. I defined the transcribed form of a vowel as an intended production from a speaker X labeled as Y by a transcriber. Given specific acoustic/auditory feature sets extracted from a vowel, a pattern recognizer produced a result indicating whether the transcribed form is an example of the vowel's citation form.

The second purpose was to compare the discrimination performance of models using static vowel measures with that of models using measurements taken along a trajectory, consisting of measurements at 20%, 50% and 80% of a vowel's duration. The hypothesis was that trajectory-based measures would yield notable performance gains over static vowel measures.
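The contrast between static and trajectory-based measures can be illustrated with a minimal Python sketch. It assumes a vowel is available as a list of evenly spaced frame-level measurements (for example, F1 in Hz); the function names and the sample values are hypothetical and are not taken from the dissertation.

```python
def sample_at_proportions(track, proportions=(0.2, 0.5, 0.8)):
    """Return the frame values nearest to each proportional time point.

    `track` is assumed to hold evenly spaced measurements covering the
    whole vowel, so index 0 is vowel onset and the last index is offset.
    """
    if not track:
        raise ValueError("track must contain at least one frame")
    last = len(track) - 1
    return [track[round(p * last)] for p in proportions]


def static_feature(track):
    """Static measure: a single value taken at the vowel midpoint."""
    return sample_at_proportions(track, (0.5,))[0]


# Hypothetical F1 track (Hz) with 11 frames spanning one vowel.
f1_track = [420.0, 450.0, 480.0, 510.0, 530.0, 540.0,
            535.0, 520.0, 500.0, 470.0, 440.0]

trajectory = sample_at_proportions(f1_track)  # values at 20%, 50%, 80%
static = static_feature(f1_track)             # value at 50% only
```

The trajectory vector keeps some information about how the vowel changes over time, which the single midpoint value discards.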

Static and trajectory-based measurements were then divided into formant-based and cepstral measurements. The hypothesis was that cepstral representations of vowels should outperform the resonant frequencies of the vocal tract (formants) simply because cepstral representations encode more acoustic/auditory information than formants do, thereby facilitating vowel discrimination in spontaneous speech.

To model this type of vowel discrimination process, I used a Support Vector Machine (SVM) and Discriminant Analysis, since such pattern recognition models showed encouraging results in classifying vowel data, as shown by Clarkson and Moreno (1999) for SVMs and Hillenbrand et al. (1995) for Discriminant Analysis. Input parameters took the form of either formant measures (and transformations of formants into log and Bark scales) or cepstral measures such as Mel Frequency Cepstral Coefficients (MFCCs) and Perceptual Linear Predictive (PLP) coefficients. Both were computed at the midpoint and at distinct time points of a vowel's duration (20%, 50% and 80%) for CVC syllables, where I chose only the stop consonants /p, b, t, d, k, g/ and one of four vowels, selected because of their high numbers of mismatches between the canonical and transcribed forms.
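The log and Bark transformations of formants mentioned above can be sketched as follows. This is only an illustration: the abstract does not say which Bark formula was used, so the sketch assumes Traunmüller's (1990) approximation, and the function names and input values are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark.

    Assumption: Traunmueller's (1990) approximation; the dissertation
    may have used a different Hz-to-Bark formula.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def formant_features(formants_hz):
    """Return raw, log-scaled, and Bark-scaled versions of formant values."""
    return {
        "hz": list(formants_hz),
        "log": [math.log(f) for f in formants_hz],
        "bark": [hz_to_bark(f) for f in formants_hz],
    }


# Hypothetical F1 and F2 values (Hz) measured at a vowel midpoint.
features = formant_features([500.0, 1500.0])
```

Each of the three representations could then serve as an alternative input vector to a classifier such as an SVM.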

Results substantiated the hypothesis that trajectory-based, cepstral measures would have the highest accuracies for both male and female speakers. Auditory features such as Bark transformations and PLP coefficients were also the most effective for both classifiers, with percentages of agreement (with the human transcribers) upwards of 80%. However, the differences in performance between formant-based and cepstral-based features were not as substantial as I had originally postulated. This finding suggests that formant transformations to the auditory scale were sufficient for the task of vowel discrimination within the Buckeye Corpus.

Cynthia Clopper (Committee Chair)
Eric Fosler-Lussier (Committee Member)
Mark Pitt (Committee Member)
187 p.

Recommended Citations

  • Tyson, N. R. (2012). Exploration of Acoustic Features for Automatic Vowel Discrimination in Spontaneous Speech [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1339695879

    APA Style (7th edition)

  • Tyson, Na'im. Exploration of Acoustic Features for Automatic Vowel Discrimination in Spontaneous Speech. 2012. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1339695879.

    MLA Style (8th edition)

  • Tyson, Na'im. "Exploration of Acoustic Features for Automatic Vowel Discrimination in Spontaneous Speech." Doctoral dissertation, Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1339695879

    Chicago Manual of Style (17th edition)