Broad-domain Quantifier Scoping with RoBERTa

Rasmussen, Nathan Ellis


2022, Doctor of Philosophy, Ohio State University, Linguistics.
This thesis reports the development of a new, broad-domain quantifier scope corpus that covers all of the properties of lexis, syntax, and semantics that correlate with human scoping judgements ('scoping factors'), for use in training and testing an automatic scope prediction system. Training materials, a work process, and the annotator-facing data format were each designed to reduce barriers to entry and safeguard accuracy, and were revised following an inter-annotator agreement study and error analysis.
The thesis discusses appropriate measures of agreement for scope annotations, both between human annotators and between predicted and gold labels. To calculate chance-corrected agreement between human annotators appropriately, an inter-annotation distance metric is introduced and justified. Automated predictions, unlike human annotations, are not forced to satisfy human-like constraints on the structure of a set of predictions, so they are evaluated both for small-scale accuracy and for compliance with these holistic constraints.
The scoping data of the corpus are developed into a natural language understanding task suitable for automatic prediction, framed as a span pair classification problem in which outscoping is treated as a semantic dependency between words. This thesis reports the application of the RoBERTa language model to this task. The model encodes the scoping factors. Previously published scope-annotated corpora and scope prediction systems either do not cover all of the scoping factors, do not apply them to the full set of quantifiers, or do not represent the full range of subject-matter domains in which humans routinely predict quantifier scope. Predictions from the RoBERTa system are shown to be more accurate than the majority-prediction baseline, to a degree not attributable to chance. The system also complies with the holistic constraints.
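As a rough illustration of what it means for predictions to beat a majority-prediction baseline "to a degree not due to chance", the sketch below compares a toy set of model predictions against a baseline that always emits the most frequent training label. The abstract does not say which significance test the thesis uses; the exact two-sided sign test on discordant items shown here is an assumption, and the labels `"first"` and `"second"` and all function names are hypothetical.

```python
from collections import Counter
from math import comb


def majority_baseline(train_labels):
    """Return the single most frequent label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(pred, gold):
    """Fraction of items whose predicted label equals the gold label."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def sign_test_p(model_pred, base_pred, gold):
    """Two-sided exact sign test (an assumed choice of test).

    Only discordant items count: those where exactly one of the
    two systems matches the gold label.
    """
    triples = list(zip(model_pred, base_pred, gold))
    model_only = sum(m == g and b != g for m, b, g in triples)
    base_only = sum(b == g and m != g for m, b, g in triples)
    n = model_only + base_only
    if n == 0:
        return 1.0  # the systems never disagree in correctness
    k = min(model_only, base_only)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data: hypothetical scope labels, not taken from the corpus.
gold = ["first"] * 5 + ["second"] * 5
base_label = majority_baseline(["first", "first", "second"])
base_pred = [base_label] * len(gold)
model_pred = list(gold)  # a toy model that happens to be right everywhere

print(accuracy(base_pred, gold), accuracy(model_pred, gold))
p_value = sign_test_p(model_pred, base_pred, gold)
print(p_value)
```

With only five discordant items, even a perfect toy model does not reach a conventional 0.05 threshold, which is one reason an evaluation of this kind needs a reasonably large test set.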
The system's principal shortcomings are its relatively small improvement over the baseline, its dependence on some other system to screen pairs of scope-bearers for the presence of scopal interaction, and the inability thus far of its architecture to serve as that screener. Further steps to address these are proposed. Data and code are available via the Schuler Computational Cognitive Modeling Lab.
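The chance-corrected agreement between human annotators mentioned in the abstract depends on a distance between annotations. The abstract names neither the agreement coefficient nor the thesis's distance metric, so the sketch below is only an assumption: it uses the general form of Krippendorff's alpha (one minus observed disagreement over expected disagreement) with a pluggable distance function. The label names and `toy_scope_distance` are hypothetical placeholders, not the metric introduced in the thesis.

```python
from itertools import permutations


def alpha(units, distance):
    """Krippendorff-style alpha for complete or partial annotation data.

    units: list of lists; each inner list holds the labels that the
           annotators assigned to one item.
    distance: function (label, label) -> non-negative number.
    """
    units = [u for u in units if len(u) >= 2]  # unpaired labels carry no information
    values = [v for u in units for v in u]
    n = len(values)
    # Observed disagreement: distances within each item.
    observed = sum(
        sum(distance(a, b) for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in units
    ) / n
    # Expected disagreement: distances across the whole pool of labels.
    expected = sum(
        distance(a, b) for a, b in permutations(values, 2)
    ) / (n * (n - 1))
    if expected == 0:
        raise ValueError("alpha is undefined when all labels are identical")
    return 1.0 - observed / expected


def nominal(a, b):
    """All-or-nothing distance: any mismatch counts fully."""
    return 0.0 if a == b else 1.0


def toy_scope_distance(a, b):
    """Hypothetical graded distance: 'none' (no scopal interaction)
    is treated as half as far from either ordering as the two
    orderings are from each other."""
    if a == b:
        return 0.0
    return 0.5 if "none" in (a, b) else 1.0


# Toy annotations from two annotators over four items.
annotations = [["1>2", "1>2"], ["2>1", "none"], ["none", "none"], ["1>2", "2>1"]]
print(alpha(annotations, nominal))
print(alpha(annotations, toy_scope_distance))
```

The point of a graded distance is that near-misses between annotators are penalized less than outright reversals, which an all-or-nothing measure cannot express.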
William Schuler (Advisor)
Micha Elsner (Committee Member)
Michael White (Committee Member)
171 p.

Recommended Citations

  • Rasmussen, N. E. (2022). Broad-domain Quantifier Scoping with RoBERTa [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu164157527012067

    APA Style (7th edition)

  • Rasmussen, Nathan. Broad-domain Quantifier Scoping with RoBERTa. 2022. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu164157527012067.

    MLA Style (8th edition)

  • Rasmussen, Nathan. "Broad-domain Quantifier Scoping with RoBERTa." Doctoral dissertation, Ohio State University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=osu164157527012067

    Chicago Manual of Style (17th edition)