Computational Modeling of Syntax Acquisition with Cognitive Constraints

2020, Doctor of Philosophy, Ohio State University, Linguistics.
Syntactic structures are unobserved theoretical constructs that are useful in explaining a wide range of linguistic and psychological phenomena. Language acquisition research studies how human learners acquire such latent structures through a variety of hypothesized learning mechanisms and apparatuses, which may be genetically endowed or of general cognitive use. Through computational modeling, this thesis aims to understand how such latent structures can be learned in a bottom-up fashion, starting from a position that makes the fewest possible assumptions about what prior knowledge learners have to facilitate learning. The learning technique used in all models is distributional learning, in which statistical regularities in surface forms (words, characters, and images) are used to induce clusters of words and phrases, as well as hierarchical structures over such clusters that generate the observed linear linguistic sequences with maximum likelihood. The central question these models try to answer is how much of syntax can be learned with distributional learning alone. This thesis proposes novel models of syntax acquisition, ranging from Bayesian grammar induction models to neural grammar induction models; from models without any constraints to models with psycholinguistically inspired constraints; and from models with words as input to models with distributed representations of words, characters, and images as input. These models achieve high consistency between the induced latent structures and the syntactic structures posited by linguistic theories.
Through evaluation and comparison of the proposed models with models from previous work on unsupervised parsing and grammar induction, the results presented in this thesis paint a relatively complete picture of state-of-the-art grammar induction performance on a large set of languages with different typological features. They support the generality of the proposed algorithms and provide crosslinguistic performance data for analyzing the interaction between distributional learning and linguistic typology. The models also offer valuable insights into properties of language and cognition, providing evidence of the degree to which statistical information about words and characters can guide syntax learning. Many properties considered essential to syntax acquisition, such as categories, head directionality, case, and verb valency, are shown to be inducible through distributional learning with computational models. The incorporation of memory constraints into grammar induction models, as well as into supervised neural left-corner parsers, strengthens the claim that performance interacts with and constrains the formation of linguistic competence. Multilingual induction shows how high-frequency markers guide grammar induction models during the induction process. Character sequences provide useful information when grammatical relations are expressed through affixes, and images provide information for languages in which high-frequency markers are not sufficient for the formation of syntactic categories. Analyses of results from these models also present evidence of phenomena that are not easily learned through distributional learning, such as prepositional phrase attachment, tense, and grammatical categories marked by affixes.
William Schuler (Advisor)
Mike White (Committee Member)
Micha Elsner (Committee Member)

Recommended Citations

  • Jin, L. (2020). Computational Modeling of Syntax Acquisition with Cognitive Constraints [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1594934826359118

    APA Style (7th edition)

  • Jin, Lifeng. Computational Modeling of Syntax Acquisition with Cognitive Constraints. 2020. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1594934826359118.

    MLA Style (8th edition)

  • Jin, Lifeng. "Computational Modeling of Syntax Acquisition with Cognitive Constraints." Doctoral dissertation, Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1594934826359118

    Chicago Manual of Style (17th edition)