An Investigation of Routine Repetitiveness in Open-Source Projects

2018, Master of Science (MS), Bowling Green State University, Computer Science.
Many programming languages provide a way to package a portion of source code that performs a specific, often independent, behavior. Depending on the language, this unit is called a (sub-)routine, method, function, procedure, etc. One of the main purposes of creating a routine is to enable reuse: as devised, routines are intended to be called from multiple places within a program. Sometimes, however, the same code is repeated within a project or across projects. In this work, we investigate how often routines are repeated in a large-scale corpus of open-source software. We attempt to independently reproduce a prior research result by Nguyen et al., building the analysis framework from the ground up and analyzing a different and very large set of open-source projects. We use the Boa infrastructure to investigate routine repetitiveness across more than 300k open-source projects from GitHub. As in the prior work, we first compute the program dependence graph (PDG) of each routine in the dataset, normalize the PDGs, and look for repetitions both within and across projects. Our experiment shows that about 16.4% of routines repeat within a project and approximately 11% of routines repeat across at least two different projects. We then perform static program slicing on the PDGs, slicing the graph on each routine argument to obtain subroutines, and look for repetitiveness once again. We observe that approximately 17% of all subroutines repeat within a project and 11% repeat across projects. Finally, we investigate whether the size of the PDG or the number of control nodes has any impact on the repetitiveness of routines. Overall, our results confirm the trends shown in the prior study, though the magnitudes of the results differ.
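
The two repetition measures described in the abstract (within a project, and across at least two projects) can be illustrated with a minimal Python sketch. This is not the thesis's Boa implementation and it does not build PDGs: as a loudly stated assumption, each routine is reduced to a canonical string by a toy `normalize` function that renames identifiers positionally, standing in for PDG normalization. The names `normalize`, `repetition_rates`, and the sample data are hypothetical.

```python
from collections import defaultdict

# Assumption: a tiny keyword set for the toy tokenized language used below.
KEYWORDS = frozenset({"if", "else", "while", "for", "return"})


def normalize(tokens):
    """Rename identifiers to positional placeholders (v0, v1, ...).

    A stand-in for PDG normalization: two routines that differ only in
    variable names map to the same canonical string.
    """
    mapping = {}
    out = []
    for tok in tokens:
        if tok.isidentifier() and tok not in KEYWORDS:
            # First occurrence of a name gets the next free placeholder.
            mapping.setdefault(tok, f"v{len(mapping)}")
            out.append(mapping[tok])
        else:
            out.append(tok)
    return " ".join(out)


def repetition_rates(routines):
    """Return (within_rate, across_rate) for (project, name, form) triples.

    within: fraction of routines whose canonical form occurs more than
            once in the same project.
    across: fraction of routines whose canonical form occurs in at
            least two different projects.
    """
    routines = list(routines)
    if not routines:
        return 0.0, 0.0
    count_in_project = defaultdict(int)   # (project, form) -> occurrences
    projects_with_form = defaultdict(set)  # form -> set of projects
    for project, _name, form in routines:
        count_in_project[(project, form)] += 1
        projects_with_form[form].add(project)
    within = sum(1 for p, _n, f in routines if count_in_project[(p, f)] > 1)
    across = sum(1 for _p, _n, f in routines if len(projects_with_form[f]) > 1)
    return within / len(routines), across / len(routines)


# Hypothetical sample: two projects, five routines given as token lists.
sample = [
    ("A", "add", ["return", "a", "+", "b"]),
    ("A", "sum2", ["return", "x", "+", "y"]),   # same as "add" after renaming
    ("A", "neg", ["return", "-", "a"]),
    ("B", "minus", ["return", "-", "n"]),       # same as "neg" after renaming
    ("B", "twice", ["return", "2", "*", "n"]),
]
rates = repetition_rates((p, n, normalize(t)) for p, n, t in sample)
```

In this sample, two of the five routines repeat within project A and two repeat across projects A and B, so both rates are 0.4. Exact string matching on a canonical form is a deliberate simplification; comparing real PDGs requires a graph canonicalization or matching step that this sketch omits.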
Robert Dyer, Dr. (Advisor)
Robert Green, Dr. (Committee Member)
Raymon Kresman, Dr. (Committee Member)
47 p.

Recommended Citations

  • Arafat, M. (2018). An Investigation of Routine Repetitiveness in Open-Source Projects [Master's thesis, Bowling Green State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1530525754458504

    APA Style (7th edition)

  • Arafat, Mohd. An Investigation of Routine Repetitiveness in Open-Source Projects. 2018. Bowling Green State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1530525754458504.

    MLA Style (8th edition)

  • Arafat, Mohd. "An Investigation of Routine Repetitiveness in Open-Source Projects." Master's thesis, Bowling Green State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1530525754458504

    Chicago Manual of Style (17th edition)