Automatic Generation and Assessment of Source-code Method Summaries

Abid, Nahla Jamal

2017, PhD, Kent State University, College of Arts and Sciences / Department of Computer Science.
Reading and comprehending source code is an essential and time-consuming task that programmers perform during software maintenance. Natural-language documentation and code summaries have been found to be critical to improving code comprehension. To this end, the dissertation presents an approach to automatically generate natural-language documentation summaries for C++ methods. First, each method is automatically assigned one or more stereotypes based on static analysis and a set of heuristics. Then, the approach uses the stereotype information, static analysis, and predefined templates to generate a natural-language summary for each method. This documentation is automatically added to the code base as a comment on each method. Two studies are conducted to evaluate the approach. The results of the two studies show that the generated documentation is accurate, does not include unnecessary information, and does a reasonable job of describing what the method does. To further improve automatic documentation, an eye-tracking study of 18 developers reading and summarizing Java methods is presented. The study is conducted within an environment that allows eye-gaze data to be collected during file scrolling and window switching. Each developer provides a written summary for 15 methods assigned to them. In total, 63 methods from five different systems are used. In contrast to prior studies, the methods are not presented in isolation; rather, the entire source code of each system is available for the developer to navigate while writing the summary. The data collected include eye gazes on source code, written summaries, and the time taken to complete each summary. When the data are analyzed at the term level, the eye-movement behavior of a developer is closely related to their level of expertise. Experts tend to revisit control-flow terms rather than focus on them for a long period. Novices tend to devote a significant amount of gaze time and many visits to call and control-flow terms. At the line level, considering the 30% of lines in each method with the most gaze time predicts approximately 70% of the lines that experts use to write their summaries. Lastly, it is shown that activity types can be predicted from developer reading behavior and mapped to program comprehension models.
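The line-level result in the abstract can be illustrated with a minimal sketch. It assumes that the "top 30%" of lines means the lines ranked highest by total gaze time, and that the prediction is scored as the share of summary-relevant lines falling in that set. The function names, the tie-breaking rule, and all numbers below are hypothetical and are not taken from the dissertation.

```python
def top_gaze_lines(gaze_ms, percent=30):
    """Return the line numbers with the highest total gaze time.

    gaze_ms maps a source line number to gaze duration in milliseconds.
    The cut-off is rounded up, so a non-empty method always yields at
    least one line.
    """
    if not gaze_ms:
        return set()
    # Integer ceiling of len * percent / 100 (avoids floating-point rounding).
    k = max(1, (len(gaze_ms) * percent + 99) // 100)
    # Sort by gaze time, longest first; ties go to the earlier line (assumption).
    ranked = sorted(gaze_ms, key=lambda line: (-gaze_ms[line], line))
    return set(ranked[:k])


def coverage(predicted, used_in_summary):
    """Fraction of the lines used in a summary that appear in the predicted set."""
    if not used_in_summary:
        return 0.0
    return len(predicted & used_in_summary) / len(used_in_summary)


# Made-up gaze durations (ms) for a 10-line method; not data from the study.
gaze = {1: 120, 2: 900, 3: 80, 4: 450, 5: 60,
        6: 700, 7: 30, 8: 200, 9: 50, 10: 10}
# Hypothetical set of lines an expert drew on when writing the summary.
summary_lines = {2, 6, 8}

predicted = top_gaze_lines(gaze)
print(sorted(predicted), coverage(predicted, summary_lines))
```

In this toy example the three most-viewed lines cover two of the three lines used in the summary. The figure of approximately 70% reported in the abstract would correspond to this kind of coverage measured on the study's own data.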
Jonathan Maletic, Prof. (Advisor)
196 p.

Recommended Citations

  • Abid, N. J. (2017). Automatic Generation and Assessment of Source-code Method Summaries [Doctoral dissertation, Kent State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=kent1492993506814839

    APA Style (7th edition)

  • Abid, Nahla. Automatic Generation and Assessment of Source-code Method Summaries. 2017. Kent State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=kent1492993506814839.

    MLA Style (8th edition)

  • Abid, Nahla. "Automatic Generation and Assessment of Source-code Method Summaries." Doctoral dissertation, Kent State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=kent1492993506814839

    Chicago Manual of Style (17th edition)