Exploration of Acoustic Features for Automatic Vowel Discrimination in Spontaneous Speech

Tyson, Na'im R.

osu1339695879.pdf (4.31 MB)

Permalink:

http://rave.ohiolink.edu/etdc/view?acc_num=osu1339695879

Year and Degree

2012, Doctor of Philosophy, Ohio State University, Linguistics.

Abstract

In an attempt to understand which acoustic/auditory feature sets motivated transcribers towards certain labeling decisions, I built machine learning models capable of discriminating between canonical and non-canonical vowels excised from the Buckeye Corpus. Specifically, I wanted to model when the dictionary form and the transcribed form of a vowel would match one another. I defined the transcribed form of a vowel as a speaker's intended production X labeled as Y by a transcriber. With specific acoustic/auditory feature sets extracted from a vowel, a pattern recognizer was used to produce a result indicating whether the transcribed form is an example of a citation form of a vowel.

The second purpose was to compare the discrimination performance of models with static vowel measures to that of models with measurements taken along a trajectory, which consisted of measurements from 20%, 50% and 80% of a vowel's duration. The hypothesis was that trajectory-based measures would have notable performance gains over static vowel measures.
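
As a minimal sketch of the two sampling schemes, the following Python code picks the static (midpoint) and trajectory (20%, 50%, 80%) measurement times for a vowel and reads the nearest frame from a per-frame track. The function names, the nearest-frame lookup, the 10 ms frame step, and the F1 values are illustrative assumptions, not details taken from the dissertation.

```python
def measurement_times(onset, offset, proportions=(0.2, 0.5, 0.8)):
    """Return the absolute times (s) at the given proportions of a vowel."""
    if offset <= onset:
        raise ValueError("offset must be later than onset")
    duration = offset - onset
    return [onset + p * duration for p in proportions]


def sample_track(track, frame_step, times):
    """Read the frame nearest to each time from a per-frame track.

    track: per-frame values (e.g. F1 in Hz); frame i sits at i * frame_step.
    """
    last = len(track) - 1
    indices = [min(last, max(0, round(t / frame_step))) for t in times]
    return [track[i] for i in indices]


# Hypothetical vowel from 1.00 s to 1.10 s, with an invented F1 track
# sampled every 10 ms (the first 100 frames precede the vowel).
f1_track = [0.0] * 100 + [500, 520, 560, 600, 640, 660, 650, 620, 580, 540, 510]

trajectory_times = measurement_times(1.00, 1.10)
static_times = measurement_times(1.00, 1.10, proportions=(0.5,))

trajectory_f1 = sample_track(f1_track, 0.01, trajectory_times)  # three values
static_f1 = sample_track(f1_track, 0.01, static_times)          # one value
```

The static measure keeps a single value per feature, while the trajectory measure keeps three, so a trajectory-based model sees how the feature moves across the vowel.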

Static and trajectory-based measurements were then divided into formant-based and cepstral measurements. The hypothesis was that cepstral representations of vowels should outperform formants (the resonant frequencies of the vocal tract) simply because more acoustic/auditory information is encoded in cepstral representations than in formants, thereby facilitating vowel discrimination in spontaneous speech.

To model this type of vowel discrimination process, I used a Support Vector Machine (SVM) and Discriminant Analysis, since such pattern recognition models showed encouraging results in classifying vowel data: Clarkson and Moreno (1999) for SVMs and Hillenbrand et al. (1995) for Discriminant Analysis. Input parameters came in the form of either formant measures (and transformations of formants into log and Bark scales) or cepstral measures such as Mel Frequency Cepstral Coefficients and Perceptual Linear Predictive (PLP) Coefficients. Both were computed at the midpoint and at distinct time points of a vowel's duration (20%, 50% and 80%) for CVC syllables, where I chose only stop consonants /p, b, t, d, k, g/ and one of the vowels /…, …, …, …/ because of their high numbers of mismatches between the canonical and transcribed forms.
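
The formant transformations mentioned above can be sketched as follows. The abstract does not say which Bark formula was used; this example assumes Traunmüller's (1990) approximation, and it assumes the natural logarithm for the log scale. The function names and the formant values are hypothetical.

```python
import math


def hz_to_bark(freq_hz):
    """Convert Hz to Bark with Traunmueller's (1990) approximation (assumed)."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def transform(freq_hz, scale):
    """Map one formant frequency onto the requested scale."""
    if scale == "hz":
        return freq_hz
    if scale == "log":
        return math.log(freq_hz)  # natural log, an assumption
    if scale == "bark":
        return hz_to_bark(freq_hz)
    raise ValueError(f"unknown scale: {scale}")


def feature_vector(trajectory, scale="bark"):
    """Flatten (F1, F2) pairs from each time point into one feature list."""
    return [transform(f, scale) for point in trajectory for f in point]


# Invented (F1, F2) values in Hz at 20%, 50% and 80% of a vowel's duration.
trajectory_hz = [(500.0, 1500.0), (600.0, 1400.0), (550.0, 1450.0)]

bark_features = feature_vector(trajectory_hz, scale="bark")
log_features = feature_vector(trajectory_hz, scale="log")
```

A vector like `bark_features` is the kind of input that could then be passed to an SVM or a discriminant analysis classifier.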

Results substantiated my hypothesis that trajectory-based, cepstral measures had the highest accuracies for both male and female speakers. Auditory features like Bark transformations and PLP coefficients were also the most effective for both classifiers, with percentages of agreement (with the human transcribers) upwards of 80%. However, the differences in performance between formant-based and cepstral-based features were not as substantial as I had originally postulated. This finding suggests that formant transformations to the auditory scale were sufficient for the task of vowel discrimination within the Buckeye Corpus.
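
For readers unfamiliar with the agreement measure, a minimal sketch of how a percentage of agreement between a classifier and human transcribers could be computed is shown here. The label names and the five tokens are invented for illustration and are not results from the dissertation.

```python
def percent_agreement(predicted, transcribed):
    """Percentage of tokens where the model label matches the transcriber's."""
    if len(predicted) != len(transcribed):
        raise ValueError("label sequences must have the same length")
    if not predicted:
        raise ValueError("at least one token is required")
    matches = sum(p == t for p, t in zip(predicted, transcribed))
    return 100.0 * matches / len(predicted)


# Hypothetical labels for five vowel tokens (not data from the study).
model_labels = ["canonical", "canonical", "non-canonical", "canonical", "non-canonical"]
human_labels = ["canonical", "non-canonical", "non-canonical", "canonical", "non-canonical"]

agreement = percent_agreement(model_labels, human_labels)
```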

Committee

Cynthia Clopper (Committee Chair)
Eric Fosler-Lussier (Committee Member)
Mark Pitt (Committee Member)

Pages

187 p.

Subject Headings

Computer Science; Linguistics; Psychology

Keywords

vowel discrimination; spontaneous speech; conversational speech; discriminant analysis; support vector machines; buckeye corpus

Recommended Citations

APA Style (7th edition)
Tyson, N. R. (2012). Exploration of Acoustic Features for Automatic Vowel Discrimination in Spontaneous Speech [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1339695879

MLA Style (8th edition)
Tyson, Na'im. Exploration of Acoustic Features for Automatic Vowel Discrimination in Spontaneous Speech. 2012. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1339695879.

Chicago Manual of Style (17th edition)
Tyson, Na'im. "Exploration of Acoustic Features for Automatic Vowel Discrimination in Spontaneous Speech." Doctoral dissertation, Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1339695879

Document number:

osu1339695879

Download Count:

821
