Methods in Text Mining for Diagnostic Radiology

Johnson, Eamon B.

2016, Doctor of Philosophy, Case Western Reserve University, EECS - Computer and Information Sciences.
Information extraction from clinical medical text is the computing challenge of bringing structure to the prose produced for communication in medical practice. In diagnostic radiology, prose reports are the primary means of communicating image interpretation to patients and other physicians, yet secondary use of a report requires either costly review by another radiologist or machine interpretation. In this work, we present mechanisms for improving machine interpretation of domain-specific text with large-scale semantic analysis, using a corpus of 726,000 real-world radiology reports as a basis for experimentation. We examine the abstract conceptual problem of detecting incidental findings (uncertain or unexpected results) in imaging study reports. We demonstrate that classifiers incorporating semantic metrics can exceed the F-measure of prior methods for follow-up classification and also the F-measure of incidental findings classification by physicians in-clinic (0.689 versus 0.648). Further, we propose two semantic metrics, focus and divergence, calculated over the SNOMED-CT ontology graph, for summarizing discrete report concepts and projecting them into 2-dimensional space, which enables both machine classification and physician interpretation of classifications. Having established the utility of semantic metrics for classification, we present methods for enhancing extraction of semantic information from clinical corpora. First, we construct a zero-knowledge method for imputing the semantic class of unlabeled terms by maximizing a confidence factor computed from pairwise co-occurrence statistics and rules limiting recall. Experiments with our method on corpora of reduced Mandelbrot information temperature produce accurate labeling of up to 25% of terms not labeled by prior methods. Second, we propose a method for context-sensitive quantification of relative concept salience and an algorithm capable of increasing both salience and diversity of concepts in document summaries in 28% of reports.
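The abstract compares classifiers by F-measure (0.689 versus 0.648) without stating how the score is weighted. As a rough illustration only, the sketch below computes the general F-beta score from confusion counts, assuming the balanced case (beta = 1) by default. The function name `f_measure` and the example counts are hypothetical and are not taken from the dissertation.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true-positive, false-positive and false-negative counts.

    beta = 1 gives the balanced F1 score (an assumption; the abstract
    does not say which weighting was used).
    """
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for an incidental-findings classifier (illustrative only).
score = f_measure(tp=60, fp=30, fn=20)
print(f"F1 = {score:.3f}")
```

Because the F-measure combines precision and recall in a harmonic mean, a classifier cannot reach a high score by maximizing one of the two at the expense of the other.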
Gultekin Ozsoyoglu (Committee Chair)
Marc Buchner (Committee Member)
Adam Perzynski (Committee Member)
Andy Podgurski (Committee Member)
125 p.

Recommended Citations

  • Johnson, E. B. (2016). Methods in Text Mining for Diagnostic Radiology [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1459514073

    APA Style (7th edition)

  • Johnson, Eamon. Methods in Text Mining for Diagnostic Radiology. 2016. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1459514073.

    MLA Style (8th edition)

  • Johnson, Eamon. "Methods in Text Mining for Diagnostic Radiology." Doctoral dissertation, Case Western Reserve University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1459514073

    Chicago Manual of Style (17th edition)