Files
OhioLink.pdf (566.98 KB)
Broad-domain Quantifier Scoping with RoBERTa
Author Info
Rasmussen, Nathan Ellis
ORCID® Identifier
http://orcid.org/0000-0002-0665-1707
Permalink: http://rave.ohiolink.edu/etdc/view?acc_num=osu164157527012067
Abstract Details
Year and Degree
2022, Doctor of Philosophy, Ohio State University, Linguistics.
Abstract
This thesis reports the application of the RoBERTa language model to the task of predicting quantifier scope. The model encodes properties of lexis, syntax, and semantics that correlate with human scoping judgements ('scoping factors'). Previously published scope-annotated corpora and scope prediction systems either do not cover all of the scoping factors, do not apply them to the full set of quantifiers, or do not represent the full range of subject-matter domains in which humans routinely predict quantifier scope.
To train and test the system, the thesis develops a new, broad-domain quantifier scope corpus that includes all of the scoping factors. Training materials, a work process, and the annotator-facing data format were each designed to reduce barriers to entry and safeguard accuracy, with revisions resulting from an inter-annotator agreement study and error analysis. The thesis discusses appropriate measures of agreement for scope annotations, both between human annotators and between predicted and gold labels. For appropriate calculation of chance-corrected agreement between human annotators, an inter-annotation distance metric is introduced and justified. For evaluation of automated predictions, where human-like constraints on the structure of a set of predictions are not enforced, results are evaluated both for small-scale accuracy and for compliance with these holistic constraints.
The scoping data of the corpus are developed into a natural language understanding task suitable for automatic prediction, framed as a span pair classification problem, with outscoping treated as a semantic dependency between words. Predictions from the RoBERTa system are shown to be more accurate than the majority-prediction baseline, to a degree not due to chance. The system successfully complies with the holistic constraints.
The system's principal shortcomings are its relatively small improvement over the baseline, its dependence on some other system to screen pairs of scope-bearers for the presence of scopal interaction, and the inability thus far of its architecture to serve as that screener. Further steps to address these are proposed. Data and code are available via the Schuler Computational Cognitive Modeling Lab.
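As a rough illustration of the span pair framing and the majority-prediction baseline mentioned in the abstract, the sketch below turns one annotated sentence into pair examples and scores a majority-label baseline. It is not the thesis's code or data format: the class names, the label strings, and the `wide_to_narrow` encoding are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations

# NOTE: every name here is hypothetical and chosen for illustration only;
# none of it is taken from the thesis's corpus or code.

@dataclass(frozen=True)
class Span:
    start: int  # index of the first token in the span
    end: int    # index one past the last token in the span

@dataclass(frozen=True)
class PairExample:
    tokens: tuple
    first: Span
    second: Span
    label: str  # assumed label set: "first_outscopes" / "second_outscopes"

def make_pair_examples(tokens, spans, wide_to_narrow):
    """Build one classification example per interacting pair of spans.

    wide_to_narrow is a set of (i, j) index pairs meaning spans[i]
    outscopes spans[j]. Pairs with no annotated interaction are skipped,
    which stands in for the separate screening step the abstract describes.
    """
    examples = []
    for i, j in combinations(range(len(spans)), 2):
        if (i, j) in wide_to_narrow:
            label = "first_outscopes"
        elif (j, i) in wide_to_narrow:
            label = "second_outscopes"
        else:
            continue  # no scopal interaction annotated for this pair
        examples.append(PairExample(tuple(tokens), spans[i], spans[j], label))
    return examples

def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for gold in test_labels if gold == majority)
    return correct / len(test_labels)

# Usage: "Every student" (tokens 0-1) outscopes "a book" (tokens 3-4).
tokens = "Every student read a book".split()
spans = [Span(0, 2), Span(3, 5)]
examples = make_pair_examples(tokens, spans, {(0, 1)})
```

In a full system, each `PairExample` would be encoded by a language model and passed to a classifier; the baseline function gives the reference accuracy that such a classifier would need to beat.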
Committee
William Schuler (Advisor)
Micha Elsner (Committee Member)
Michael White (Committee Member)
Pages
171 p.
Subject Headings
Linguistics
Keywords
quantifiers; quantifier scope disambiguation; explanatory text; Simple English Wikipedia; corpus annotation; inter-annotator agreement; RoBERTa; self-trained language models; transfer learning; span pair classification
Recommended Citations
APA Style (7th edition)
Rasmussen, N. E. (2022). Broad-domain Quantifier Scoping with RoBERTa [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu164157527012067
MLA Style (8th edition)
Rasmussen, Nathan. Broad-domain Quantifier Scoping with RoBERTa. 2022. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu164157527012067.
Chicago Manual of Style (17th edition)
Rasmussen, Nathan. "Broad-domain Quantifier Scoping with RoBERTa." Doctoral dissertation, Ohio State University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=osu164157527012067
Document number: osu164157527012067
Download Count: 212
Copyright Info
© 2022, some rights reserved.
Broad-domain Quantifier Scoping with RoBERTa by Nathan Ellis Rasmussen is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Based on a work at etd.ohiolink.edu.
This open access ETD is published by The Ohio State University and OhioLINK.