Bean Soup Translation: Flexible, Linguistically-motivated Syntax for Machine Translation

Mehay, Dennis Nolan

2012, Doctor of Philosophy, Ohio State University, Linguistics.

Machine translation (MT) systems attempt to translate texts from one language into another by translating words from a source language and rearranging them into fluent utterances in a target language. When the two languages organize concepts in very different ways, knowledge of their general sentence structure, or syntax, is crucial. The syntax of the target language is particularly useful, because it provides a means of testing whether the reorderings that a system might try are grammatically licensed. This thesis presents two novel syntactic techniques that aid in producing correct and grammatical translations. The first technique controls target language reordering using syntactic categories that span multiple words. The second technique complements the first by assessing the well-formedness of sequences formed by these reorderings using the same syntactic categories. These innovations are implemented in the context of statistical phrase-based machine translation [Zens et al., 2002; Koehn et al., 2003], which is the prevailing modern translation paradigm.

The main contribution of this thesis is to use the flexible syntax of Combinatory Categorial Grammar [CCG, Steedman, 2000] as the basis for deriving syntactic constituent labels for target strings in phrase-based systems, providing CCG labels for many target strings that traditional syntactic theories struggle to describe. These CCG labels are used to train novel syntax-based reordering and language models, which efficiently describe translation reordering patterns, as well as assess the grammaticality of target translations. The models are easily incorporated into phrase-based systems with minimal disruption to existing technology and achieve superior automatic metric scores and human evaluation ratings over a strong phrase-based baseline, as well as over syntax-based techniques that do not use CCG.

William Schuler, PhD (Committee Chair)
Michael White, PhD (Committee Member)
Christopher Brew, PhD (Committee Member)
Eric Fosler-Lussier, PhD (Committee Member)
154 p.

Recommended Citations

  • Mehay, D. N. (2012). Bean Soup Translation: Flexible, Linguistically-motivated Syntax for Machine Translation [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1345433807

    APA Style (7th edition)

  • Mehay, Dennis. Bean Soup Translation: Flexible, Linguistically-motivated Syntax for Machine Translation. 2012. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1345433807.

    MLA Style (8th edition)

  • Mehay, Dennis. "Bean Soup Translation: Flexible, Linguistically-motivated Syntax for Machine Translation." Doctoral dissertation, Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1345433807

    Chicago Manual of Style (17th edition)