Part-of-Speech Tagging of Source Code Identifiers using Programming Language Context Versus Natural Language Context

AlSuhaibani, Reem Saleh

2015, MS, Kent State University, College of Arts and Sciences / Department of Computer Science.
An approach to identify the part of speech for program identifiers is proposed. The approach does not rely on natural language part-of-speech tagging; rather, it uses rules based on the programming language syntax, along with static analysis, to determine the grammatical usage of identifiers. A set of new grammatical rules for identifier part of speech, analogous to natural language part of speech, is defined. These rules are based on the syntactic context of an identifier’s usage in the source code. Additionally, static analysis is used to automatically determine the stereotype of methods in the source code. The stereotype information is also used to infer the part of speech of identifiers that represent method calls. The approach is fully automated and is built upon an existing source code format and parsing technology (i.e., srcML). The tagging process adds the part-of-speech metadata directly into the srcML format. This allows for seamless interoperability with other program understanding tools and methods that utilize the part-of-speech information. The approach is evaluated by comparing the results of the tagging on ten open source software projects. The results of the evaluation demonstrate that the proposed part-of-speech approach consistently models how identifiers are used in software systems.
Jonathan Maletic, Dr. (Advisor)
Gwenn Volkert, Dr. (Committee Member)
Kambiz Ghazinour, Dr. (Committee Member)
82 p.

Recommended Citations

  • AlSuhaibani, R. S. (2015). Part-of-Speech Tagging of Source Code Identifiers using Programming Language Context Versus Natural Language Context [Master's thesis, Kent State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=kent1448502094

    APA Style (7th edition)

  • AlSuhaibani, Reem. Part-of-Speech Tagging of Source Code Identifiers using Programming Language Context Versus Natural Language Context. 2015. Kent State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=kent1448502094.

    MLA Style (8th edition)

  • AlSuhaibani, Reem. "Part-of-Speech Tagging of Source Code Identifiers using Programming Language Context Versus Natural Language Context." Master's thesis, Kent State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=kent1448502094

    Chicago Manual of Style (17th edition)