Full text of this paper is not available in the ETD Center. Copies may be available for inter-library loan from Case Western Reserve University or may be available for purchase from Proquest/UMI

Causal Basis of Value-Based Statistical Fault Localization

2022, Doctor of Philosophy, Case Western Reserve University, EECS - Computer and Information Sciences.
Statistical fault localization (SFL) techniques use execution profiles and success/failure information from software executions, in conjunction with statistical inference, to automatically score program elements based on how likely they are to be faulty. SFL techniques typically employ one type of profile data: coverage data, predicate outcomes, or variable values. In the relatively short history of automated SFL, most proposed techniques have been based on statistical relationships between the elements of coverage profiles/spectra and the occurrences of program failures. Valuable information carried by variables has usually been ignored, and far fewer techniques are based on the analysis of variable values. In addition, the vast majority of subject programs used to evaluate techniques have been small programs containing artificial faults (faults created by mutations in code), rather than large programs containing real faults (faults logged by developers), which undermines the credibility of these techniques. Most SFL techniques measure correlation, rather than causation, between profile values and success/failure, so they are subject to confounding bias that distorts the scores they produce. In recent years, techniques that use sound causal inference methodology (causal statistical fault localization) to adjust properly for confounding bias have been successfully applied to a wide range of programs and profile data. While these first attempts at causal statistical fault localization have yielded promising results, they have lacked cohesion between the different types of profile data, flexible machine learning models, and large subject programs containing real faults in their empirical evaluations. In my work, I present new insights into the properties and limitations inherent in the SFL literature and their effect on the performance of SFL.
Furthermore, I present novel algorithms that use causal inference techniques and flexible machine learning models to integrate information from various data profiles, including predicates, objects, and different variable types, to estimate the real failure-causing effect of program statements more accurately. Observations about a critical condition, covariate balance of faulty predicates, provide useful intuitions about the use of causal inference and value-based techniques in fault localization. Experiments on the well-known Defects4J repository, which consists of large programs containing real faults, together with theoretical results, show the proposed techniques' efficacy and usability. This work helps programmers locate faults in software programs by applying a sound causal-inference-based approach, combined with robust machine learning models, to the problem of fault localization.
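
The contrast the abstract draws between correlation-based and causal scoring can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the run data are invented, the function names are hypothetical, and the adjustment shown is plain stratification on a single binary confounder (for example, whether a controlling predicate was true). Each run is a tuple `(covered, confounder, failed)`.

```python
from collections import defaultdict


def failure_rate(runs):
    """Fraction of failing runs; each run is (covered, confounder, failed)."""
    return sum(failed for _, _, failed in runs) / len(runs)


def naive_score(runs):
    """Correlation-style score: failure rate when covered minus when not covered."""
    covered = [r for r in runs if r[0]]
    uncovered = [r for r in runs if not r[0]]
    return failure_rate(covered) - failure_rate(uncovered)


def adjusted_score(runs):
    """Stratified estimate: compare covered vs. uncovered runs within each
    confounder stratum, then weight each difference by the stratum's share of runs."""
    strata = defaultdict(list)
    for r in runs:
        strata[r[1]].append(r)
    effect = 0.0
    for stratum in strata.values():
        covered = [r for r in stratum if r[0]]
        uncovered = [r for r in stratum if not r[0]]
        if not covered or not uncovered:
            continue  # no overlap in this stratum, so it contributes nothing
        weight = len(stratum) / len(runs)
        effect += weight * (failure_rate(covered) - failure_rate(uncovered))
    return effect


# Invented data: the confounder drives both coverage of the statement and failure,
# while the statement itself has no effect on failure within either stratum.
runs = (
    [(1, 1, 1)] * 4 + [(1, 1, 0)] * 4    # confounder true, statement covered
    + [(0, 1, 1)] * 1 + [(0, 1, 0)] * 1  # confounder true, statement not covered
    + [(1, 0, 0)] * 2                    # confounder false, statement covered
    + [(0, 0, 0)] * 8                    # confounder false, statement not covered
)

print(round(naive_score(runs), 3))     # about 0.3: the statement looks suspicious
print(round(adjusted_score(runs), 3))  # 0.0: no effect once the confounder is held fixed
```

In this toy data the statement receives a positive naive score only because it tends to execute in the same runs as the confounder; holding the confounder fixed removes the apparent effect.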
Andy Podgurski (Advisor)
Soumya Ray (Committee Member)
Xusheng Xiao (Committee Member)
Gurkan Bebek (Committee Member)
143 p.

Recommended Citations

  • Kucuk, Y. (2022). Causal Basis of Value-Based Statistical Fault Localization [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1630333432023167

    APA Style (7th edition)

  • Kucuk, Yigit. Causal Basis of Value-Based Statistical Fault Localization. 2022. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1630333432023167.

    MLA Style (8th edition)

  • Kucuk, Yigit. "Causal Basis of Value-Based Statistical Fault Localization." Doctoral dissertation, Case Western Reserve University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=case1630333432023167

    Chicago Manual of Style (17th edition)