Inference of string mappings for speech technology

Jansche, Martin

2003, Doctor of Philosophy, Ohio State University, Linguistics.
Mappings between formal languages play an important role in speech and language processing. This thesis explores issues related to inductive inference or learning of string-to-string mappings. The kinds of mappings considered fall within the larger class of rational transductions realized by finite state machines. Such mappings have applications in speech synthesis, speech recognition, and information retrieval and extraction. The present work takes its examples from speech synthesis, and is in particular concerned with the task of predicting the pronunciation of words from their spelling. When applied to this task, deterministic mappings are also known as letter-to-sound rules. The three most commonly used metrics for evaluating letter-to-sound rules are prediction error, which is not generally applicable; string error, which can only distinguish between perfect and flawed pronunciations and is therefore too coarse; and symbol error, which is based on string edit distance and subsumes string error. These three performance measures are independent in the sense that they may prefer different models for the same data set. The use of an evaluation measure based on some version of string edit distance is recommended. Existing proposals for learning deterministic letter-to-sound rules are systematized and formalized. Most formal problems underlying the learning task are shown to be intractable, even when they are severely restricted. The traditional approaches based on aligned data and prediction error are tractable, but have other undesirable properties. Approximate and heuristic methods are recommended. The formalization of learning problems also reveals a number of new open problems. Recent probabilistic approaches based on stochastic transducers are discussed and extended. A simple proposal due to Ristad and Yianilos is reviewed and recast in an algebraic framework for weighted transducers. 
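As an illustration of the string error and symbol error measures described above, here is a minimal sketch. It assumes unit-cost Levenshtein distance and normalization by the total reference length; the thesis may define the costs and the normalization differently, and the function names and toy pronunciations are hypothetical.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance with unit costs (an assumption of this sketch)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference symbols
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def string_error_rate(refs: List[Sequence[str]], hyps: List[Sequence[str]]) -> float:
    """Fraction of predicted strings that are not exactly correct."""
    wrong = sum(1 for r, h in zip(refs, hyps) if list(r) != list(h))
    return wrong / len(refs)


def symbol_error_rate(refs: List[Sequence[str]], hyps: List[Sequence[str]]) -> float:
    """Total edit distance divided by total reference length (assumed normalization)."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return edits / sum(len(r) for r in refs)


# Hypothetical reference and predicted pronunciations for two words.
refs = [["k", "ae", "t"], ["d", "ao", "g"]]
hyps = [["k", "ae", "t"], ["d", "aa", "g"]]
print(string_error_rate(refs, hyps))  # one of two words is flawed
print(symbol_error_rate(refs, hyps))  # one of six symbols is wrong
```

A prediction has edit distance zero exactly when it matches the reference, which is the sense in which symbol error subsumes string error: the finer measure can always recover the perfect-versus-flawed distinction, but not the other way around.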
Simple models based on memoryless transducers are generalized to stochastic finite transducers without any restrictions on their state graphs. Four fundamental problems for stochastic transducers (evaluation, parameter estimation, derivation of marginal and conditional models, and decoding) are identified and discussed for memoryless and unrestricted machines. An empirical evaluation demonstrates that stochastic transducers perform better on a letter-to-sound conversion task than deterministic mappings.
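The evaluation problem for a memoryless stochastic transducer can be sketched with a forward-style dynamic program. The sketch below is in the spirit of the Ristad and Yianilos proposal mentioned above, not a reproduction of the thesis's algebraic formulation: it assumes a single-state model whose substitution, insertion, deletion, and stop probabilities sum to one, and every parameter value and name is made up for illustration.

```python
from typing import Dict, Sequence, Tuple


def joint_probability(
    x: Sequence[str],
    y: Sequence[str],
    sub: Dict[Tuple[str, str], float],  # P(read a, write b)
    ins: Dict[str, float],              # P(write b, read nothing)
    dele: Dict[str, float],             # P(read a, write nothing)
    stop: float,                        # P(terminate)
) -> float:
    """Sum over all edit sequences that map x to y, then terminate."""
    # alpha[i][j] = total probability of generating the prefixes x[:i] and y[:j]
    alpha = [[0.0] * (len(y) + 1) for _ in range(len(x) + 1)]
    alpha[0][0] = 1.0
    for i in range(len(x) + 1):
        for j in range(len(y) + 1):
            if i > 0:
                alpha[i][j] += alpha[i - 1][j] * dele.get(x[i - 1], 0.0)
            if j > 0:
                alpha[i][j] += alpha[i][j - 1] * ins.get(y[j - 1], 0.0)
            if i > 0 and j > 0:
                alpha[i][j] += alpha[i - 1][j - 1] * sub.get((x[i - 1], y[j - 1]), 0.0)
    return alpha[len(x)][len(y)] * stop


# Hypothetical toy model: one letter "a", one phoneme "A"; 0.5 + 0.1 + 0.1 + 0.3 = 1.
sub = {("a", "A"): 0.5}
ins = {"A": 0.1}
dele = {"a": 0.1}
stop = 0.3

print(joint_probability(["a"], ["A"], sub, ins, dele, stop))
```

Because the model is memoryless, the same parameter tables are used at every position; an unrestricted stochastic transducer would additionally index the forward table by machine state.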
Chris Brew (Advisor)
284 p.

Recommended Citations

  • Jansche, M. (2003). Inference of string mappings for speech technology [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1061209163

    APA Style (7th edition)

  • Jansche, Martin. Inference of string mappings for speech technology. 2003. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1061209163.

    MLA Style (8th edition)

  • Jansche, Martin. "Inference of string mappings for speech technology." Doctoral dissertation, Ohio State University, 2003. http://rave.ohiolink.edu/etdc/view?acc_num=osu1061209163

    Chicago Manual of Style (17th edition)