Files
paper_draft.pdf (404.45 KB)
The Influence of Syntactic Frequencies on Human Sentence Processing
Author Info
van Schijndel, Marten
ORCID® Identifier
http://orcid.org/0000-0002-9858-5881
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929
Abstract Details
Year and Degree
2017, Doctor of Philosophy, Ohio State University, Linguistics.
Abstract
Humans are sensitive to the frequency of events, and this sensitivity is reflected in a wide range of behavioral and neural measures. This thesis focuses on the ways in which syntactic co-occurrence frequencies affect human language comprehension. Previous psycholinguistic findings seemed to show that humans are not sensitive to verbal subcategorization frequencies. Instead, this work demonstrates that sensitivity to fine-grained syntactic frequencies provides a confounding explanation for those findings. A left-corner parser is defined that can be used to compute a variety of psycholinguistic complexity metrics in order to better control for such syntactic influences in future studies. One of the strongest and most commonly used psycholinguistic measures output by the parser is surprisal (Hale, 2001; Levy, 2008), which estimates frequency-based comprehension difficulty based on the probability of an observation conditioned on the observations that preceded it. When used to predict reading times, however, this work shows that surprisal is mathematically inconsistent since it conditions on the immediately adjacent lexical material despite the fact that reading proceeds via saccades over non-adjacent material. This mathematical problem with surprisal can be corrected by summing surprisal over each saccade region to enable the measure to account for the probability of each new span of text conditioned on the preceding material that was actually observed. The corrected version of lexical (n-gram) surprisal, cumulative n-gram surprisal, obtains a better fit to reading times than the uncorrected version, though the correction does not work for surprisal over syntactic (probabilistic context-free; PCFG) structure. In addition to the frequency of observed events, this work explores the influence of frequency in how humans predict upcoming events.
In particular, uncertainty about upcoming material (entropy) is shown to influence reading times, corroborating previous results in the literature (Roark et al., 2009; Angele et al., 2015). Unfortunately, the entropy over upcoming material is very expensive to compute, and so can be difficult to control for in psycholinguistic experiments. This work shows that the surprisal (n-gram and PCFG) of upcoming words, which is inexpensive to compute, can approximate the influence of that uncertainty on self-paced reading times. The results in this thesis indicate that humans are sensitive to both lexical sequence frequencies and syntactic frequencies, and this work concludes by providing a proof-of-concept model of syntactic acquisition that links the two types of frequencies. The acquisition model demonstrates how a learner that is sensitive to linear ordering frequencies could end up acquiring long-distance dependencies, typically conceived as a hallmark of hierarchical syntax, in a fashion that replicates the acquisition timeline of children.
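As a rough illustration of the cumulative surprisal idea described in the abstract, the sketch below sums per-word bigram surprisal over each saccade region. It is not the thesis's implementation: the add-alpha bigram model, the function names (`train_bigram`, `surprisal`, `cumulative_surprisal`), the toy sentence, and the fixation indices are all assumptions made for illustration, and regressions (backward saccades) are not handled.

```python
import math
from collections import Counter


def train_bigram(tokens, alpha=1.0):
    """Return prob(prev, word): an add-alpha smoothed bigram estimate.

    Hypothetical toy model; the thesis uses its own n-gram and PCFG models.
    """
    vocab_size = len(set(tokens))
    context_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))

    def prob(prev, word):
        return (bigram_counts[(prev, word)] + alpha) / (
            context_counts[prev] + alpha * vocab_size
        )

    return prob


def surprisal(prob, prev, word):
    """Surprisal in bits: -log2 P(word | prev)."""
    return -math.log2(prob(prev, word))


def cumulative_surprisal(prob, tokens, fixations):
    """Sum per-word surprisal over each saccade region.

    tokens[0] is assumed to be a start symbol. `fixations` is a strictly
    increasing list of fixated token indices (all > 0). The region for a
    fixation covers every token after the previous fixation up to and
    including the fixated token, so skipped words still contribute.
    """
    # per_word[i - 1] holds the surprisal of tokens[i].
    per_word = [
        surprisal(prob, tokens[i - 1], tokens[i]) for i in range(1, len(tokens))
    ]
    totals = []
    previous = 0
    for fixation in fixations:
        totals.append(sum(per_word[previous:fixation]))
        previous = fixation
    return totals


tokens = "<s> the dog chased the cat".split()
prob = train_bigram(tokens)
# Assume the reader fixated "dog" (index 2) and "cat" (index 5) only.
print(cumulative_surprisal(prob, tokens, [2, 5]))
```

Because the per-word surprisals are negative log conditional probabilities, their sum over a region equals the negative log probability of the whole region given the preceding text under the same model, which matches the abstract's description of conditioning each new span on the material actually observed.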
Committee
William Schuler (Advisor)
Micha Elsner (Committee Member)
Shari Speer (Committee Member)
Shravan Vasishth (Committee Member)
Pages
143 p.
Subject Headings
Computer Science; Linguistics; Psychology
Keywords
syntax; computational linguistics; psycholinguistics; frequency effects; text complexity; prediction
Recommended Citations
Citations
APA Style (7th edition)
van Schijndel, M. (2017). The Influence of Syntactic Frequencies on Human Sentence Processing [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929
MLA Style (8th edition)
van Schijndel, Marten. The Influence of Syntactic Frequencies on Human Sentence Processing. 2017. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929.
Chicago Manual of Style (17th edition)
van Schijndel, Marten. "The Influence of Syntactic Frequencies on Human Sentence Processing." Doctoral dissertation, Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929
Document number:
osu1502452939626929
Download Count:
775
Copyright Info
© 2017, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.