Developing a Semantic Web Crawler to Locate OWL Documents

2012, Master of Science (MS), Wright State University, Computer Science.

The Semantic Web and the Web Ontology Language (OWL) are relatively new and growing concepts in the World Wide Web. Because they are so new, few applications and tools exist that exploit their potential. Although the Semantic Web has many components, this thesis focuses on one research question: "How do we go about developing a web crawler for the Semantic Web that locates and retrieves OWL documents?" Specifically, we hypothesize that giving priority to URIs of OWL documents, including all URIs found within those OWL documents, over other types of references will locate more OWL documents than any other type of traversal. We reason that OWL documents contain proportionally more references to other OWL documents than non-OWL documents do, so prioritizing them should yield more OWL files by the time the crawl terminates than any other traversal method.
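The prioritized traversal described above can be illustrated with a minimal sketch. It is not the thesis implementation: the `Frontier` class, the `crawl` function, the `fetch` callback, and the two priority levels are hypothetical names chosen for illustration, and the `.owl` suffix check stands in for whatever prediction the crawler actually uses.

```python
import heapq
from itertools import count

OWL_PRIORITY = 0    # URIs predicted to be OWL, or found inside an OWL document
OTHER_PRIORITY = 1  # every other reference


class Frontier:
    """Crawl frontier that serves OWL-priority URIs first (FIFO within a level)."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._order = count()  # tie-breaker that preserves insertion order

    def push(self, uri, owl_priority):
        if uri in self._seen:  # never enqueue the same URI twice
            return
        self._seen.add(uri)
        level = OWL_PRIORITY if owl_priority else OTHER_PRIORITY
        heapq.heappush(self._heap, (level, next(self._order), uri))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def crawl(seeds, fetch, max_documents):
    """Crawl until max_documents are retrieved; return the OWL URIs found.

    fetch(uri) is a hypothetical callback returning (is_owl, outgoing_uris).
    """
    frontier = Frontier()
    for uri in seeds:
        frontier.push(uri, owl_priority=False)
    owl_found = []
    retrieved = 0
    while len(frontier) and retrieved < max_documents:
        uri = frontier.pop()
        is_owl, links = fetch(uri)
        retrieved += 1
        if is_owl:
            owl_found.append(uri)
        for link in links:
            # Assumption: links from an OWL document, or links that look
            # like OWL by extension, are moved to the front of the queue.
            likely_owl = is_owl or link.lower().endswith(".owl")
            frontier.push(link, owl_priority=likely_owl)
    return owl_found
```

With a document limit in place, the order in which URIs leave the frontier decides how many OWL documents are retrieved before the crawl stops, which is the effect the hypothesis relies on.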

To build such an OWL priority queue, we needed heuristics that predict whether a URI refers to an OWL document during real-time parsing of Semantic Web documents. These heuristics are based on filename extensions and OWL language constructs, neither of which can determine a document's type with certainty before retrieval. However, if our reasoning is correct, URIs found in an OWL document will likely lead to more OWL documents, so that when the crawl ends upon reaching a maximum document limit, we will have retrieved more OWL documents than with other methods such as breadth-first or load-balanced traversal. We conclude with an evaluation of our results to test the validity of our hypothesis and to determine whether it merits future research.
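A prediction heuristic of the kind described above might look like the following sketch. The abstract does not list the actual rules, so the extension tuple, the set of constructs, and the `predict_owl` function are assumptions made for illustration; the constructs named are standard OWL ontology properties whose values refer to other ontologies.

```python
from urllib.parse import urlparse

# Assumption: filename extensions treated as a hint that a URI is an OWL file.
OWL_EXTENSIONS = (".owl",)

# Assumption: OWL constructs whose object is expected to be another ontology.
OWL_LINKING_CONSTRUCTS = {
    "owl:imports",
    "owl:priorVersion",
    "owl:backwardCompatibleWith",
    "owl:incompatibleWith",
}


def predict_owl(uri, construct=None):
    """Guess, before retrieval, whether a URI refers to an OWL document.

    uri:       the reference found while parsing a document
    construct: the qualified name of the property it appeared under, if known
    """
    # Check only the path, so query strings and fragments are ignored.
    path = urlparse(uri).path.lower()
    if path.endswith(OWL_EXTENSIONS):
        return True
    return construct in OWL_LINKING_CONSTRUCTS
```

A positive result is only a prediction used to order the queue; the document type can be confirmed only after the document has been retrieved and parsed.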

Pascal Hitzler, PhD (Committee Chair)
Guozhu Dong, PhD (Committee Member)
Krishnaprasad Thirunarayan, PhD (Committee Member)
90 p.

Recommended Citations

  • Koron, R. D. (2012). Developing a Semantic Web Crawler to Locate OWL Documents [Master's thesis, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1347937844

    APA Style (7th edition)

  • Koron, Ronald. Developing a Semantic Web Crawler to Locate OWL Documents. 2012. Wright State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1347937844.

    MLA Style (8th edition)

  • Koron, Ronald. "Developing a Semantic Web Crawler to Locate OWL Documents." Master's thesis, Wright State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=wright1347937844

    Chicago Manual of Style (17th edition)