
Amplifying Domain Expertise in Medical Data Pipelines

Rahman, Protiva

2020, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Digitization of medical documents has led to increased availability of data for analysis. This has led many domains to incorporate data-driven decision-making. However, going from data to decision-making involves a pipeline that can be broken into three stages: collection, cleaning, and analysis. The specialized nature of certain datasets, especially in the medical field, requires domain expertise at every pipeline step. Domain experts are individuals who are not necessarily trained in computational fields but are experts in the data domain. These experts have different requirements from other end-users.

In part one of this dissertation, we motivate the need for a separate class of systems that amplify expertise. To this end, we present a framework for amplifying expertise, which includes summarization, guidance, interaction, and acceleration. We demonstrate that expertise can be amplified by employing one or more of these dimensions at every pipeline stage.

Amplification during data collection involves accelerating domain expert data entry by optimizing the form interface to reduce input effort. This is addressed by our system, TRANSFORMER, in part two. TRANSFORMER models the cost of human input as a weighted sum of the interactions required to fill the form. It then optimizes the cost by leveraging the schema and data of the form's database. Our results show that the transformed forms are 50% quicker to complete than the original ones, effectively accelerating expert input.

In part three, we address expertise amplification at the data cleaning stage. Filling in unreported values can be tedious if experts are unable to effectively interact with the data. To address this, we present ICARUS, which guides experts by showing informative subsets for interactive updates. It uses the database structure to generalize the expert's edit to rules, thus accelerating data augmentation. ICARUS summarizes the impact of a rule before it is applied. Using ICARUS, experts were, on average, able to fill in 56,000 values in just 148 edits, while in its absence, they required weeks. However, the subjective nature of experts' rules often requires multiple experts to come to a consensus. This involves removing conflicts and redundancies between rules. The complexity of the rules and data requires informative visual summaries. This is tackled by DELPHI, an interactive decision consolidation system. We conducted a design study to find an effective rule representation for experts. DELPHI summarizes rule relationships and their impact on the data. It allows experts to interactively edit the rule set and accelerates their task by automatically removing redundant rules.

In part four, we address amplification in data analysis through DEEDEE. DEEDEE aids experts in making treatment guidelines. It creates a decision tree to classify patients based on their antibiotic susceptibility. DEEDEE guides the expert by highlighting interesting nodes. It summarizes the data at each node. It allows linked interaction so that the expert can see correlated attributes. Experts can accept paths in the tree as guideline recommendations, thus accelerating their task. Through case studies and empirical evaluation, we show how applying our framework amplifies domain expertise throughout the data pipeline.
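
The weighted-sum cost model that the abstract attributes to TRANSFORMER can be illustrated with a minimal sketch. This is not the dissertation's implementation: the interaction types, the weights, the `Field` class, and the example forms are all hypothetical, chosen only to show how two form layouts might be compared under such a cost.

```python
from dataclasses import dataclass

# Hypothetical relative effort per interaction type (assumed values,
# not taken from the dissertation).
WEIGHTS = {"click": 1.0, "keystroke": 0.25, "scroll": 0.5}


@dataclass
class Field:
    """One form field and the interactions expected to fill it (illustrative)."""
    name: str
    widget: str
    interactions: dict  # interaction type -> expected count


def field_cost(field, weights=WEIGHTS):
    # Weighted sum of the interactions needed for a single field.
    return sum(weights[kind] * count for kind, count in field.interactions.items())


def form_cost(fields, weights=WEIGHTS):
    # Cost of the whole form is the sum of its field costs.
    return sum(field_cost(f, weights) for f in fields)


# Two hypothetical layouts of the same form.
original = [
    Field("diagnosis", "free_text", {"click": 1, "keystroke": 20}),
    Field("ward", "long_dropdown", {"click": 2, "scroll": 4}),
]
transformed = [
    Field("diagnosis", "autocomplete", {"click": 2, "keystroke": 4}),
    Field("ward", "radio_buttons", {"click": 1}),
]

if __name__ == "__main__":
    print("original cost:", form_cost(original))
    print("transformed cost:", form_cost(transformed))
```

Under these assumed weights, the layout with the lower `form_cost` would be preferred. A real optimizer would also need to search over candidate widgets using the database schema and data, which this sketch leaves out.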
Arnab Nandi, Ph.D. (Advisor)
Courtney Hebert, M.D. (Committee Member)
Srinivasan Parthasarathy, Ph.D. (Committee Member)
Huan Sun, Ph.D. (Committee Member)
212 p.

Recommended Citations

  • Rahman, P. (2020). Amplifying Domain Expertise in Medical Data Pipelines [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu159823123519264

    APA Style (7th edition)

  • Rahman, Protiva. Amplifying Domain Expertise in Medical Data Pipelines. 2020. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu159823123519264.

    MLA Style (8th edition)

  • Rahman, Protiva. "Amplifying Domain Expertise in Medical Data Pipelines." Doctoral dissertation, Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu159823123519264

    Chicago Manual of Style (17th edition)