Skip to Main Content
Frequently Asked Questions
Submit an ETD
Global Search Box
Need Help?
Keyword Search
Participating Institutions
Advanced Search
School Logo
Files
File List
PhD_Thesis_Musafa_Coskun.pdf (1014.33 KB)
ETD Abstract Container
Abstract Header
ALGEBRAIC METHODS FOR LINK PREDICTION IN VERY LARGE NETWORKS
Author Info
Coskun, Mustafa, Coskun
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=case1499436242956926
Abstract Details
Year and Degree
2017, Doctor of Philosophy, Case Western Reserve University, EECS - Computer Engineering.
Abstract
Link prediction is at the heart of a large class of network analytics and information retrieval techniques, including recommendation systems, threat detection, and disease gene prioritization, among others. A commonly utilized feature in link prediction is network proximity, which is assessed using a variety of algorithms, ranging from effective resistance to random walks. Network proximity is useful in link prediction, since it has been repeatedly shown in many contexts that nodes that have many shared connections or are proximate to each other are likely to interact with each other in the future. Many network proximity measures are based on algebraic formulations and their computations utilize iterative methods. Owing to their importance, significant effort has been devoted to accelerating the iterative processes that underlie network proximity computations. These techniques rely on numerical properties of power iterations, as well as structural properties of the networks to reduce the runtime of iterative algorithms. However, the acceleration provided by existing techniques is usually not sufficient to enable real-time processing of proximity queries on networks with millions of nodes and tens of millions of edges. In this thesis, we present several algebraic approaches to the acceleration of network proximity queries, with a view to facilitating real-time query processing on very large networks. We first develop an algorithm to speed up the iterative computation of random walk based proximity, using Chebyshev polynomials for undirected networks. We show that our approach, called Chopper, yields provable improvements in convergence rate, and significantly reduced convergence times in practice. We then focus on the same problem for directed networks, and develop an indexing scheme that exploits the sparsity of real-life networks. We show that our algorithm, I-Chopper, significantly outperforms existing methods and it offers both scalability and efficiency for processing network proximity queries on very large directed graphs. Specifically, {\sc I-Chopper} drastically reduces the convergence time of the iterative procedure, while also minimizing the storage requirements for indexing. Using a number of large real-world networks, and top-k proximity queries as the benchmark problem, we show that our methods outperform existing methods for wide ranges of parameter values. Lastly, we apply these methods to link prediction and propose two proximity-based link prediction techniques that can capture network topology by utilizing network modularity and sparsity through network proximity queries. We comprehensively test the performance of the resulting algorithms and compare them against state-of-the art link prediction techniques. Our results show that the resulting algorithms drastically improve the performance of unsupervised link prediction methods.
Committee
Mehmet Koyuturk (Advisor)
Pages
121 p.
Subject Headings
Computer Science
Keywords
Random Walk with Restarts, Link Prediction, Network Proximity Queries
Recommended Citations
Refworks
EndNote
RIS
Mendeley
Citations
Coskun, Coskun, M. (2017).
ALGEBRAIC METHODS FOR LINK PREDICTION IN VERY LARGE NETWORKS
[Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1499436242956926
APA Style (7th edition)
Coskun, Coskun, Mustafa.
ALGEBRAIC METHODS FOR LINK PREDICTION IN VERY LARGE NETWORKS.
2017. Case Western Reserve University, Doctoral dissertation.
OhioLINK Electronic Theses and Dissertations Center
, http://rave.ohiolink.edu/etdc/view?acc_num=case1499436242956926.
MLA Style (8th edition)
Coskun, Coskun, Mustafa. "ALGEBRAIC METHODS FOR LINK PREDICTION IN VERY LARGE NETWORKS." Doctoral dissertation, Case Western Reserve University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=case1499436242956926
Chicago Manual of Style (17th edition)
Abstract Footer
Document number:
case1499436242956926
Download Count:
378
Copyright Info
© 2017, all rights reserved.
This open access ETD is published by Case Western Reserve University School of Graduate Studies and OhioLINK.