Files
Dissertation-20130718-final.pdf (4.15 MB)
Result Diversification on Spatial, Multidimensional, Opinion, and Bibliographic Data
Author Info
Kucuktunc, Onur
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1374148621
Abstract Details
Year and Degree
2013, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Abstract
Similarity search methods in the literature produce results ranked by their degree of similarity to the query. However, the results are typically unsatisfactory, especially if the query is ambiguous or the search space includes redundant, near-duplicate documents. Diversity in query results is preferred by a variety of applications, since diverse results may give a complete view of the queried topic. In this study, we investigate the result diversification task in various application areas, such as opinion retrieval and paper recommendation, with different types of data, such as spatial and high-dimensional data, opinions, citation graphs, and other networks. Although the definition of diversity differs from field to field, we propose techniques that address the general objective of result diversification, which is to maximize the similarity of search results to the query while minimizing the pairwise similarity between the results, without neglecting efficiency. For diversity on spatial and high-dimensional data, we draw an analogy with the concept of natural neighbors and propose geometric methods. We also introduce a diverse browsing method based on the popular distance browsing feature of R-tree index structures. Next, we focus on search and retrieval of opinion data on certain entities, and start our analysis by looking at direct correlations between the sentiments of opinions and the demographics (e.g., gender, age, education level) of the people who generate those opinions. Based on the analysis, we argue that opinion diversity can be achieved by diversifying the sources of opinions. Recommendation tasks on academic networks also suffer from the ambiguity and redundancy issues mentioned above. To observe those effects, we present a paper recommendation framework called theadvisor (http://theadvisor.osu.edu), which recommends new papers to researchers using only the reference-citation relationships between academic papers.
We introduce a direction-awareness property to the recommendation process, which allows users to reach either old, foundational (possibly well-cited and well-known) research papers or recent (most likely less-known) ones. We also present different implementations and ordering techniques for reducing the query processing time. Finally, we enhance various result diversification techniques with the direction-awareness property for paper recommendation, propose new algorithms based on vertex selection and query refinement, and compare them in this domain. Based on our findings on diversifying citation recommendations, we further extend the diversity of graph-based recommendation algorithms to other types of graphs, such as social and collaboration networks and web and product co-purchasing graphs. Although the diversification problem is understandably addressed as a bi-criteria optimization problem over relevance and diversity, the sufficiency of the evaluations of such methods is questionable, since a query-oblivious algorithm that returns most of its recommendations without considering the query may still perform the best on these commonly used measures. We show the deficiencies of commonly preferred evaluation techniques for diversification methods and propose a new measure called expanded relevance, which combines relevance and diversity. Finally, we present a novel algorithm that optimizes the expanded relevance of the diversified results.
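The general objective stated in the abstract (high similarity to the query, low pairwise similarity among the results) can be illustrated with a small greedy sketch. This is not one of the dissertation's algorithms: it follows the well-known maximal marginal relevance (MMR) heuristic, and the names `cosine`, `greedy_diversify`, and `trade_off`, as well as the choice of cosine similarity over plain vectors, are assumptions made only for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_diversify(query, items, k, trade_off=0.5):
    """Greedily pick up to k item indices (MMR-style, illustrative only).

    trade_off is a hypothetical knob: 1.0 ranks purely by relevance to the
    query, lower values penalize similarity to already selected items.
    """
    selected = []
    candidates = list(range(len(items)))

    def score(i):
        relevance = cosine(query, items[i])
        # Redundancy: highest similarity to anything already selected.
        redundancy = max(
            (cosine(items[i], items[j]) for j in selected), default=0.0
        )
        return trade_off * relevance - (1 - trade_off) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)  # ties go to the earliest index
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: items 0 and 1 are exact duplicates, item 2 is equally relevant
# to the query but points in a different direction.
query = [1.0, 1.0]
items = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
pure_relevance = greedy_diversify(query, items, k=2, trade_off=1.0)
diversified = greedy_diversify(query, items, k=2, trade_off=0.5)
```

On this toy data, ranking by relevance alone returns the two duplicates (indices 0 and 1), while the diversified run replaces the duplicate with item 2. The sketch recomputes similarities on every pass, so it ignores the efficiency concerns that the abstract emphasizes.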
Committee
Umit V. Catalyurek (Advisor)
Srinivasan Parthasarathy (Committee Member)
Arnab Nandi (Committee Member)
Pages
254 p.
Subject Headings
Computer Science
Keywords
diversity; relevance; graph mining; result diversification; indexes; nearest neighbor search; spatial databases; collaborative question answering; prediction; sentiment analysis; literature search; graph; random walks; paper recommendation; web service
Recommended Citations
APA Style (7th edition)
Kucuktunc, O. (2013). Result Diversification on Spatial, Multidimensional, Opinion, and Bibliographic Data [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1374148621
MLA Style (8th edition)
Kucuktunc, Onur. Result Diversification on Spatial, Multidimensional, Opinion, and Bibliographic Data. 2013. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1374148621.
Chicago Manual of Style (17th edition)
Kucuktunc, Onur. "Result Diversification on Spatial, Multidimensional, Opinion, and Bibliographic Data." Doctoral dissertation, Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1374148621
Document number: osu1374148621
Download Count: 1,274
Copyright Info
© 2013, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.