Files
Honors_Project_newest.pdf (154.96 KB)
Pretraining Deep Learning Models for Natural Language Understanding
Author
Shao, Han
ORCID® Identifier
http://orcid.org/0000-0002-8398-1531
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=oberlin158955297757398
Abstract Details
Year and Degree
2020, BA, Oberlin College, Computer Science.
Abstract
Since BERT, the first bidirectional deep learning model for natural language understanding, emerged in 2018, researchers have studied and used pretrained bidirectional autoencoding and autoregressive models to solve language problems. In this project, I conducted research to fully understand BERT and XLNet and applied their pretrained models to two language tasks: reading comprehension (RACE) and part-of-speech tagging (the Penn Treebank). After experimenting with those released models, I implemented my own version of ELECTRA, which pretrains a text encoder as a discriminator rather than a generator to improve compute efficiency, with BERT as its underlying architecture. To reduce the number of parameters, I replaced BERT with ALBERT in ELECTRA and named the new model ALE (A Lite ELECTRA). I compared the performance of BERT, ELECTRA, and ALE on the GLUE benchmark dev set after pretraining them with the same datasets for the same number of training FLOPs.
Committee
John L. Donaldson (Advisor)
Pages
9 p.
Subject Headings
Computer Science
Keywords
Machine learning; NLP; Deep learning
Recommended Citations
APA Style (7th edition)
Shao, H. (2020). Pretraining Deep Learning Models for Natural Language Understanding [Undergraduate thesis, Oberlin College]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=oberlin158955297757398

MLA Style (8th edition)
Shao, Han. Pretraining Deep Learning Models for Natural Language Understanding. 2020. Oberlin College, Undergraduate thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=oberlin158955297757398.

Chicago Manual of Style (17th edition)
Shao, Han. "Pretraining Deep Learning Models for Natural Language Understanding." Undergraduate thesis, Oberlin College, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=oberlin158955297757398
Document number:
oberlin158955297757398
Download Count:
522
Copyright Info
© 2020, all rights reserved.
This open access ETD is published by Oberlin College Honors Theses and OhioLINK.