Retrieval and Labeling of Documents Using Ontologies: Aided by a Collaborative Filtering

2023, PhD, University of Cincinnati, Engineering and Applied Science: Computer Science and Engineering.
Information retrieval is a common task today, and retrieval systems are aided by various text mining and analysis methods. The objective of retrieval is to obtain, from a collection, the information resources that are relevant to a specified query. The retrieval process begins with a query provided by a user, and a search engine then looks for the relevant resources. Typically, queries are formed using the same terms (words) that occur within the resources. This work addresses situations in which a document must be matched to terms that do not occur in it, as illustrated by the following examples. First, we want to retrieve documents relevant to query terms that do not explicitly occur in the documents but are relevant to their contents. Second, we want to retrieve documents using queries that contain labels from an ontology tree, and these labels may not explicitly occur in the documents. Third, an organization may have a large collection of documents, and various user communities may want to refer to those documents using their community-specific ontologies.
Several information retrieval methods cluster the documents and then determine a signature for each cluster that describes the terms predominantly present in it. We have designed and implemented a clustering algorithm that partitions the data space in a step-wise manner and seeks clusters with good-quality signatures representing their documents. The algorithm is modeled on a bi-clustering strategy: it applies the spectral co-clustering method at each step and then optimizes towards clusters that have strong representative signatures. We have shown that this clustering algorithm performs better than other known clustering algorithms such as K-Means and Latent Dirichlet Allocation (LDA).
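As a rough illustration of what a cluster signature can look like, the sketch below picks, for each cluster, the terms that appear in the most of its documents. It is not the dissertation's algorithm: the cluster labels are assumed to be given already (here they are hard-coded rather than produced by spectral co-clustering), and the function name `cluster_signatures`, the `top_k` parameter, and the toy documents are all hypothetical.

```python
from collections import Counter


def cluster_signatures(docs, labels, top_k=3):
    """Return, per cluster, the top_k terms with the highest document frequency."""
    counts = {}
    for tokens, label in zip(docs, labels):
        # Count each term once per document (document frequency within the cluster).
        counts.setdefault(label, Counter()).update(set(tokens))
    signatures = {}
    for label, counter in counts.items():
        # Sort by descending frequency, then alphabetically so ties are deterministic.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        signatures[label] = [term for term, _ in ranked[:top_k]]
    return signatures


# Toy tokenized documents with assumed (hand-assigned) cluster labels.
docs = [
    ["gene", "protein", "cell"],
    ["gene", "cell", "expression"],
    ["query", "index", "ranking"],
    ["query", "ranking", "document"],
]
labels = [0, 0, 1, 1]
signatures = cluster_signatures(docs, labels, top_k=2)
print(signatures)
```

A signature built this way favors terms shared across a cluster's documents; a real system would likely also weight terms by how rare they are outside the cluster.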
To improve the capabilities and performance of information retrieval systems, we present a new method that generates predicted terms for documents using Singular Value Decomposition (SVD) based collaborative filtering. We have shown that retrievals made using such recommended terms retrieve the correct documents with reasonably high accuracy. In addition, including predicted terms in the clustering process improves the purity of clusters and the quality of retrieval. We integrate ontological labels with information retrieval by adding terms from ontologies to a document and using a collaborative filtering approach to associate ontology labels with other relevant documents. We have tested the performance of our method on several cases of integrating ontologies: a single ontology label, a single large ontology with all the complexities of an ontology tree, and multiple ontology trees. We have tested this method on our document collections and have obtained promising results. Our method performs better than other existing methods.
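The idea of predicting terms with SVD-based collaborative filtering can be sketched as follows, treating documents as "users" and terms as "items" in a binary document-term matrix. This is a minimal sketch under stated assumptions, not the dissertation's implementation: the function name `predict_terms`, the parameters `rank`, `top_k`, and `min_score`, and the toy matrix are hypothetical, and a plain truncated SVD from NumPy stands in for whatever factorization the author used.

```python
import numpy as np


def predict_terms(doc_term, vocab, rank=2, top_k=1, min_score=0.1):
    """Suggest absent terms per document from a rank-limited SVD reconstruction."""
    # Truncated SVD: keep only the `rank` largest singular triplets.
    u, s, vt = np.linalg.svd(doc_term, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    suggestions = []
    for row, scores in zip(doc_term, approx):
        # Mask terms already present so that only new terms are ranked.
        masked = np.where(row > 0, -np.inf, scores)
        order = np.argsort(-masked)[:top_k]
        # Keep a candidate only if its reconstructed score clears the threshold.
        suggestions.append([vocab[j] for j in order if masked[j] > min_score])
    return suggestions


# Toy binary document-term matrix (rows: documents, columns: terms in `vocab`).
vocab = ["gene", "protein", "cell", "query", "index"]
doc_term = np.array(
    [
        [1, 1, 1, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0],
    ],
    dtype=float,
)
suggestions = predict_terms(doc_term, vocab)
print(suggestions)
```

On this toy matrix the second document, which shares "gene" and "protein" with the first, is offered "cell", and the fourth is offered "index"; documents with no plausible missing term receive an empty list. An ontology label can be handled the same way by adding it as an extra column that is set for the documents known to carry it.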
Raj Bhatnagar, Ph.D. (Committee Chair)
Gowtham Atluri, Ph.D. (Committee Member)
Ali Minai, Ph.D. (Committee Member)
Anil Jegga, DVM MRes (Committee Member)
Yizong Cheng, Ph.D. (Committee Member)
95 p.

Recommended Citations

  • Alshammari, A. (2023). Retrieval and Labeling of Documents Using Ontologies: Aided by a Collaborative Filtering [Doctoral dissertation, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin16847705803491

    APA Style (7th edition)

  • Alshammari, Asma. Retrieval and Labeling of Documents Using Ontologies: Aided by a Collaborative Filtering. 2023. University of Cincinnati, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin16847705803491.

    MLA Style (8th edition)

  • Alshammari, Asma. "Retrieval and Labeling of Documents Using Ontologies: Aided by a Collaborative Filtering." Doctoral dissertation, University of Cincinnati, 2023. http://rave.ohiolink.edu/etdc/view?acc_num=ucin16847705803491

    Chicago Manual of Style (17th edition)