Files
PhD_Dissertation_Amanuel_Alambo.pdf (3.03 MB)
Semantics-driven Abstractive Document Summarization
Author Info
Alambo, Amanuel
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=wright1655834036433693
Abstract Details
Year and Degree
2022, Doctor of Philosophy (PhD), Wright State University, Computer Science and Engineering PhD.
Abstract
The evolution of the Web over the last three decades has led to a deluge of scientific and news articles on the Internet. Harnessing these publications across different fields of study is critical to effective end-user information consumption. Similarly, in the domain of healthcare, one of the key challenges with the adoption of Electronic Health Records (EHRs) for clinical practice has been the tremendous volume of clinical notes generated; unless these notes are summarized, clinical decision making and communication will be inefficient and costly. In spite of the rapid advances in information retrieval and deep learning techniques towards abstractive document summarization, the results of these efforts continue to resemble extractive summaries, achieving promising results predominantly on lexical metrics but performing poorly on semantic metrics. Thus, abstractive summarization that is driven by the intrinsic and extrinsic semantics of documents has not been adequately explored. Resources that can be used for generating semantics-driven abstractive summaries include:
• Abstracts of multiple scientific articles published in a given technical field of study, used to generate an abstractive summary for topically related abstracts within the field, thus reducing the load of having to read semantically duplicate abstracts on a given topic.
• Citation contexts from different authoritative papers citing a reference paper, used to generate a utility-oriented abstractive summary for a scientific article.
• Biomedical articles and the named entities characterizing them, along with background knowledge bases, used to generate entity- and fact-aware abstractive summaries.
• Clinical notes of patients and clinical knowledge bases, used for abstractive clinical text summarization with knowledge-driven multi-objective optimization.
In this dissertation, we develop semantics-driven abstractive models based on intra-document and inter-document semantic analyses, along with facts about named entities retrieved from domain-specific knowledge bases, to produce summaries. Concretely, we propose a sequence of frameworks that leverage semantics at various levels of granularity (e.g., word, sentence, document, topic, citations, and named entities) by utilizing external resources. The proposed frameworks have been applied to a range of tasks, including:
1. Abstractive summarization of topic-centric multi-document scientific articles and news articles.
2. Abstractive summarization of scientific articles using crowd-sourced citation contexts.
3. Abstractive summarization of biomedical articles clustered based on entity-relatedness.
4. Abstractive summarization of clinical notes of patients with heart failure and chest X-ray recordings.
The proposed approaches perform well at preserving semantics while paraphrasing in abstractive summarization. For summarization of topic-centric multiple scientific or news articles, we propose a three-stage approach in which abstracts of scientific articles or news articles are first clustered based on their topical similarity, determined from topics generated using Latent Dirichlet Allocation (LDA), followed by an extractive phase and an abstractive phase. Then, in the next stage, we focus on abstractive summarization of biomedical literature, where we use named entities in biomedical articles to 1) cluster related articles and 2) guide abstractive summarization.
Finally, in the last stage, we turn to external resources: citation contexts pointing to a scientific article, to generate a comprehensive and utility-centric abstractive summary of that article; domain-specific knowledge bases, to fill gaps in information about entities in a biomedical article being summarized; and clinical notes, to guide abstractive summarization of clinical text. Thus, the bottom-up progression of exploring semantics towards abstractive summarization in this dissertation starts with (i) Semantic Analysis of Latent Topics; builds on (ii) Internal and External Knowledge-I (gleaned from abstracts and citation contexts); and extends this to make it comprehensive using (iii) Internal and External Knowledge-II (named entities and knowledge bases).
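The topical clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact clustering rule, so the rule used here (assign each document to its single highest-weighted topic) is an assumption, and the document names and topic weights are hypothetical stand-ins for what an LDA model might output. This is not the dissertation's implementation.

```python
from collections import defaultdict


def dominant_topic(topic_weights):
    """Return the index of the highest-weighted topic for one document."""
    return max(range(len(topic_weights)), key=lambda k: topic_weights[k])


def cluster_by_topic(doc_topic):
    """Group document ids by their dominant topic (an assumed clustering rule)."""
    clusters = defaultdict(list)
    for doc_id, weights in doc_topic.items():
        clusters[dominant_topic(weights)].append(doc_id)
    return dict(clusters)


# Hypothetical document-topic weights, standing in for LDA output.
doc_topic = {
    "abstract_a": [0.80, 0.15, 0.05],
    "abstract_b": [0.10, 0.85, 0.05],
    "abstract_c": [0.70, 0.20, 0.10],
    "abstract_d": [0.05, 0.15, 0.80],
}

clusters = cluster_by_topic(doc_topic)
for topic, docs in sorted(clusters.items()):
    print(topic, docs)
```

Each resulting cluster would then be passed to the extractive and abstractive phases; those phases are not sketched here because the abstract gives no detail about them.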
Committee
Tanvi Banerjee, Ph.D. (Committee Co-Chair)
Krishnaprasad Thirunarayan, Ph.D. (Committee Co-Chair)
Michael Raymer, Ph.D. (Committee Member)
Vijayan Asari, Ph.D. (Committee Member)
Pages
144 p.
Subject Headings
Computer Engineering; Computer Science
Keywords
abstractive document summarization; semantics-driven abstractive summaries
Recommended Citations
Citations
Alambo, A. (2022). Semantics-driven Abstractive Document Summarization [Doctoral dissertation, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1655834036433693
APA Style (7th edition)
Alambo, Amanuel. Semantics-driven Abstractive Document Summarization. 2022. Wright State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1655834036433693.
MLA Style (8th edition)
Alambo, Amanuel. "Semantics-driven Abstractive Document Summarization." Doctoral dissertation, Wright State University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=wright1655834036433693
Chicago Manual of Style (17th edition)
Document number:
wright1655834036433693
Download Count:
357
Copyright Info
© 2022, all rights reserved.
This open access ETD is published by Wright State University and OhioLINK.