CONCEPT BASED INFORMATION ORGANIZATION AND RETRIEVAL

YARDI, APARNA ARVIND


2006, MS, University of Cincinnati, Engineering : Computer Science.
Information Retrieval, an area of Computer Science, concerns the retrieval of information from a collection of documents. Various techniques are available for fetching relevant subsets from a large document collection, and keyword based information retrieval is one of the most popular in use today. Popular search engines such as Google, Accoona, and Yahoo can retrieve information from large amounts of data. However, because they are general purpose search engines, they tend to overwhelm the user with a large result set. Domain specific search engines such as ACM and IEEE Xplore, on the other hand, work within the restricted search space of specific technical fields. Keyword based searches rely solely on the presence of keywords in a document and not on the underlying concepts that the document is trying to express. Hence, they tend to over-emphasize certain documents and completely miss some relevant ones.
The Concept based Information Retrieval System uses a concept lattice as the underlying data model for information retrieval. Each document must be indexed by the important conceptual primitives present in it, and this indexing must be done manually. Some primitive concepts may also be determined from the presence of syntactic entities such as certain keywords, data tables, or line graphs, and this part of the indexing may be automated. The concepts, based on the theory of Formal Concept Analysis, group together documents that share identical subsets of primitive indexing attributes. The concepts are stored in a lattice as combinations of these primitive attributes. The hierarchical nature of the lattice lends itself to basic as well as high level queries that are not easily possible with conventional keyword based systems. The retrieval system provides a scalable, modular framework for creating basic and high level queries.
Based on these ideas, we have implemented a system populated with documents from the domains of “Data Mining”, “Dynamic Programming”, “Graph Theory”, and “Network Theory”. Using this example dataset, we have implemented and tested various types of complex queries and demonstrated their usefulness, complexity, and efficacy in this thesis.
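
The grouping of documents by shared primitive attributes described in the abstract can be illustrated with a minimal Python sketch of Formal Concept Analysis. This is not the thesis implementation: the document names, the attribute names, and the helper functions (`extent`, `intent`, `formal_concepts`, `query`) are hypothetical and chosen only for illustration, and the brute-force enumeration is suitable for toy data only.

```python
from itertools import combinations

# Hypothetical toy index: document -> set of primitive indexing attributes.
INDEX = {
    "doc1": {"clustering", "data-table"},
    "doc2": {"clustering", "line-graph"},
    "doc3": {"shortest-path", "line-graph"},
    "doc4": {"clustering", "data-table", "line-graph"},
}


def extent(attrs, index):
    """Return the documents that carry every attribute in attrs."""
    return frozenset(doc for doc, doc_attrs in index.items() if attrs <= doc_attrs)


def intent(docs, index):
    """Return the attributes shared by every document in docs.

    For an empty set of documents this is the full attribute set.
    """
    all_attrs = set().union(*index.values())
    return frozenset(all_attrs.intersection(*(index[doc] for doc in docs)))


def formal_concepts(index):
    """Enumerate all formal concepts as (extent, intent) pairs.

    Every attribute subset is closed by taking its extent and then the
    intent of that extent. The loop is exponential in the number of
    attributes, so this is an assumption-laden toy, not a scalable method.
    """
    all_attrs = sorted(set().union(*index.values()))
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for combo in combinations(all_attrs, size):
            ext = extent(set(combo), index)
            concepts.add((ext, intent(ext, index)))
    return concepts


def query(attrs, index):
    """Return the concept (extent, intent) generated by the queried attributes."""
    ext = extent(set(attrs), index)
    return ext, intent(ext, index)


if __name__ == "__main__":
    # List concepts from the most general (largest extent) to the most specific.
    for ext, itt in sorted(formal_concepts(INDEX), key=lambda c: -len(c[0])):
        print(sorted(ext), sorted(itt))
    print(query({"data-table"}, INDEX))
```

In this toy index, a query for `data-table` returns the concept whose documents are `doc1` and `doc4`, and whose intent also contains `clustering`, because every document with a data table here is also indexed under clustering. Ordering concepts by extent inclusion gives the hierarchy that supports moving from a concept to its more general or more specific neighbors.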
Dr. Raj Bhatnagar (Advisor)
148 p.

Recommended Citations


  • YARDI, A. A. (2006). CONCEPT BASED INFORMATION ORGANIZATION AND RETRIEVAL [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1152832274

    APA Style (7th edition)

  • YARDI, APARNA. CONCEPT BASED INFORMATION ORGANIZATION AND RETRIEVAL. 2006. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1152832274.

    MLA Style (8th edition)

  • YARDI, APARNA. "CONCEPT BASED INFORMATION ORGANIZATION AND RETRIEVAL." Master's thesis, University of Cincinnati, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1152832274

    Chicago Manual of Style (17th edition)