Parsing with Local Context

Pate, John Kenton

2009, Master of Arts, Ohio State University, Linguistics.

Treebanks, as a quantitative extension of decades of syntactic theorizing, typically use annotation schemes with a small set of well-motivated phrasal categories. For constituency-based treebanks, these phrasal categories are selected to describe distributional regularities. These treebanks are often used as a data set for estimating Probabilistic Context Free Grammars (PCFGs) for parsing, but the phrasal category sets that are best for constituency description may be suboptimal for constituency parsing. Specifically, phrasal categories may exhibit a probabilistic bias towards different expansions in different parts of the overall tree, and there may be unanticipated but useful correlations between constituency annotation and other levels of linguistic annotation.

In this thesis, the symbol-splitting technique of Johnson (1998) is extended to enrich syntactic categories with information about local syntactic context on the English Penn Treebank and the German Verbmobil II Treebank. The split symbols are then subjected to two different clustering techniques to preserve only relevant category distinctions, forming linguistically motivated generalizations and mitigating data sparsity. The symbol-splitting and clustering techniques are then employed, on the Verbmobil treebank, to enrich syntactic categories with information about implicit prosodic break strength, first alone and then together with information about local context.
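
As an illustration of what symbol splitting does, the following is a minimal sketch of parent annotation, the classic form of the technique, followed by relative-frequency estimation of PCFG rule probabilities. It is not the thesis's implementation: the tuple-based tree encoding, the function names, the `^` separator, the choice to leave part-of-speech tags unsplit, and the toy sentence are all assumptions made for this example, and the thesis's extension to richer local context and its clustering step are not shown.

```python
from collections import Counter

# Assumed encoding: a tree is (label, child, child, ...); a leaf is a string.

def annotate_parent(tree, parent=None):
    """Split each phrasal label by appending its parent's original label."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    # A preterminal (part-of-speech tag) dominates only words; this sketch
    # leaves preterminals and the root unsplit (an illustrative choice).
    is_preterminal = all(isinstance(c, str) for c in children)
    if parent is None or is_preterminal:
        new_label = label
    else:
        new_label = f"{label}^{parent}"
    return (new_label, *(annotate_parent(c, label) for c in children))

def count_rules(tree, counts):
    """Accumulate counts of every local rule (lhs -> rhs) in a tree."""
    if isinstance(tree, str):
        return
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        count_rules(child, counts)

def estimate_pcfg(trees):
    """Relative-frequency estimate: P(lhs -> rhs) = count(rule) / count(lhs)."""
    counts = Counter()
    for tree in trees:
        count_rules(tree, counts)
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}

# Toy tree (invented for illustration, not taken from either treebank).
toy = ("S",
       ("NP", ("PRP", "she")),
       ("VP", ("VBD", "saw"),
              ("NP", ("DT", "the"), ("NN", "dog"))))

split = annotate_parent(toy)
plain_grammar = estimate_pcfg([toy])
split_grammar = estimate_pcfg([split])
```

In the unsplit grammar the single category `NP` divides its probability mass between the pronoun expansion and the determiner-noun expansion. After splitting, `NP^S` and `NP^VP` each receive their own distribution, which is the kind of position-dependent expansion bias the abstract describes. Splitting also multiplies the number of categories, which is why a clustering step that merges unhelpful distinctions is needed to limit data sparsity.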

Local syntactic context is found to be helpful on both treebanks examined. Experiments on the German Verbmobil II Treebank then show that information about implicit prosodic break strength yields a slightly larger gain than information about local syntactic context, and that combining both sorts of information leads to the largest increase in parse accuracy. This research shows that implicit prosody, as imposed by the annotators of the Verbmobil project, does vary with syntactic structure in a useful way outside of a laboratory setting. It also suggests that prosody is worth exploring as a cue to grammar learning in children.

Chris Brew, PhD (Advisor)
Laura Wagner, PhD (Advisor)
Shari Speer, PhD (Committee Member)
55 p.

Recommended Citations

  • Pate, J. K. (2009). Parsing with Local Context [Master's thesis, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1243880542

    APA Style (7th edition)

  • Pate, John. Parsing with Local Context. 2009. Ohio State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1243880542.

    MLA Style (8th edition)

  • Pate, John. "Parsing with Local Context." Master's thesis, Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1243880542

    Chicago Manual of Style (17th edition)