Files
Ye, Xin accepted dissertation 05-04-16 Su 16.pdf (1.47 MB)
Automated Software Defect Localization
Author Info
Ye, Xin
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1462374079
Year and Degree
2016, Doctor of Philosophy (PhD), Ohio University, Electrical Engineering & Computer Science (Engineering and Technology).
Abstract
In software development, developers regularly receive bug reports that describe abnormal behavior of software products. When a new bug report arrives, developers usually need to reproduce the bug and review code to find its cause, a process that can be tedious and time-consuming. To reduce the manual effort of locating the bug, this dissertation presents a learning-to-rank approach that automatically ranks all the source code files for a given bug report. To improve ranking performance, it also introduces word-embedding-based text similarities that bridge the lexical gap between the natural language of bug reports and the code in source files.
First, a tool that ranks all the source files by how likely they are to contain the cause of the bug would enable developers to narrow their search and improve productivity. This dissertation introduces an adaptive ranking approach that leverages project knowledge through functional decomposition of source code, API descriptions of library components, bug-fixing history, code change history, and the file dependency graph. Given a bug report, the ranking score of each source file is computed as a weighted combination of an array of features, where the weights are trained automatically on previously solved bug reports using a learning-to-rank technique. We evaluate the ranking system on six large-scale open source Java projects, using the before-fix version of the project for every bug report. The experimental results show that the learning-to-rank approach outperforms three recent state-of-the-art methods. In particular, our method makes correct recommendations within the top 10 ranked source files for over 70% of the bug reports in the Eclipse Platform and Tomcat projects.
Second, we propose bridging the lexical gap by projecting natural language statements and code snippets as meaning vectors in a shared representation space. In the proposed architecture, word embeddings are first trained on API documents, tutorials, and reference documents, and then aggregated to estimate semantic similarities between documents. Empirical evaluations show that the learned vector space embeddings lead to improvements in the report-oriented bug localization task.
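The two ideas in the abstract, scoring each file as a weighted combination of features and deriving one such feature from aggregated word embeddings, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not say how embeddings are aggregated (plain averaging is used below), and the function names, toy embeddings, file names, feature values, and weights are all hypothetical. In the dissertation the weights are learned from previously solved bug reports, not set by hand.

```python
import math

def average_vector(tokens, embeddings):
    """Average the embeddings of the tokens that have one (assumed aggregation)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is missing or zero."""
    if u is None or v is None:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_files(feature_table, weights):
    """Return file names sorted by the weighted sum of their features, best first."""
    def score(name):
        return sum(w * f for w, f in zip(weights, feature_table[name]))
    return sorted(feature_table, key=score, reverse=True)

# Hypothetical toy data: two-dimensional embeddings and two source files.
embeddings = {"crash": [1.0, 0.0], "exception": [0.9, 0.1], "render": [0.0, 1.0]}
report_tokens = ["crash"]
file_tokens = {"Parser.java": ["exception"], "View.java": ["render"]}
fix_history = {"Parser.java": 0.1, "View.java": 0.5}  # made-up second feature

report_vec = average_vector(report_tokens, embeddings)
feature_table = {
    name: [cosine(report_vec, average_vector(tokens, embeddings)), fix_history[name]]
    for name, tokens in file_tokens.items()
}
weights = [0.8, 0.2]  # hand-picked here; a learning-to-rank model would fit these
ranking = rank_files(feature_table, weights)
```

With this toy data, `Parser.java` ranks above `View.java` even though the report and the file share no word, because "crash" and "exception" lie close together in the embedding space. That is the lexical gap the abstract describes.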
Committee
Chang Liu (Advisor)
Razvan Bunescu (Advisor)
Pages
134 p.
Subject Headings
Computer Science
Keywords
Software maintenance; bug reports; learning to rank; word embeddings; API documents
Recommended Citations
Ye, X. (2016). Automated Software Defect Localization [Doctoral dissertation, Ohio University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1462374079
APA Style (7th edition)
Ye, Xin. Automated Software Defect Localization. 2016. Ohio University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1462374079.
MLA Style (8th edition)
Ye, Xin. "Automated Software Defect Localization." Doctoral dissertation, Ohio University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1462374079
Chicago Manual of Style (17th edition)
Document number: ohiou1462374079
Download Count: 1,254
Copyright Info
© 2016, all rights reserved.
This open access ETD is published by Ohio University and OhioLINK.