Full text release has been delayed at the author's request until May 19, 2025

COHORTFINDER: A DATA-DRIVEN, OPEN-SOURCE, TOOL FOR PARTITIONING PATHOLOGY AND IMAGING COHORTS TO YIELD ROBUST MACHINE LEARNING MODELS

2023, Master of Sciences, Case Western Reserve University, EECS - Electrical Engineering.
Batch effects (BEs) are systematic technical differences in data collection that are unrelated to biological variation and have been shown to negatively impact machine learning (ML) model generalizability. Here we develop CohortFinder (CF), an informed data partitioning algorithm that efficiently provides data-driven cohort partitioning aimed at mitigating BEs, and we demonstrate the resulting improvement in ML model performance on downstream medical image processing tasks. The effectiveness of CF was demonstrated via three use cases: (a) tubule segmentation, (b) colon adenocarcinoma detection, and (c) rectal tumor segmentation. Precision, recall, accuracy, IoU, and F1-score were calculated to evaluate performance. All five metrics consistently improved, in both average value and standard deviation, when CohortFinder was employed (an investment of a few seconds) versus the WC and AC. This study demonstrates that CohortFinder can help ameliorate BEs, yielding increased performance and generalizability in both digital pathology and radiology imaging use cases.
Andrew Janowczyk (Advisor)
56 p.
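
The five evaluation metrics named in the abstract have standard definitions in terms of true/false positives and negatives. The sketch below is illustrative only and is not taken from the thesis or from the CohortFinder code: the function names `confusion_counts` and `evaluate`, the binary-label inputs, and the toy data are all assumptions made for the example.

```python
from typing import Dict, Sequence


def confusion_counts(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            counts["tp"] += 1
        elif p == 1 and t == 0:
            counts["fp"] += 1
        elif p == 0 and t == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def evaluate(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Return precision, recall, accuracy, IoU, and F1 for binary labels."""
    c = confusion_counts(pred, truth)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]

    def ratio(num: int, den: int) -> float:
        # Convention chosen for this sketch: an empty denominator yields 0.0.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "iou": ratio(tp, tp + fp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy example (made-up labels, not results from the thesis).
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
scores = evaluate(pred, truth)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For segmentation tasks, the same counts would be accumulated over pixels rather than over samples; IoU and F1 ignore true negatives, which makes them less sensitive to large background regions than accuracy.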

Recommended Citations

  • Fan, F. (2023). COHORTFINDER: A DATA-DRIVEN, OPEN-SOURCE, TOOL FOR PARTITIONING PATHOLOGY AND IMAGING COHORTS TO YIELD ROBUST MACHINE LEARNING MODELS [Master's thesis, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case168071584744799

    APA Style (7th edition)

  • Fan, Fan. COHORTFINDER: A DATA-DRIVEN, OPEN-SOURCE, TOOL FOR PARTITIONING PATHOLOGY AND IMAGING COHORTS TO YIELD ROBUST MACHINE LEARNING MODELS. 2023. Case Western Reserve University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case168071584744799.

    MLA Style (8th edition)

  • Fan, Fan. "COHORTFINDER: A DATA-DRIVEN, OPEN-SOURCE, TOOL FOR PARTITIONING PATHOLOGY AND IMAGING COHORTS TO YIELD ROBUST MACHINE LEARNING MODELS." Master's thesis, Case Western Reserve University, 2023. http://rave.ohiolink.edu/etdc/view?acc_num=case168071584744799

    Chicago Manual of Style (17th edition)