Skip to Main Content
 

Global Search Box

 
 
 
 

ETD Abstract Container

Abstract Header

Semidefinite Programming Approaches to Network Clustering and Smoothing

Abstract Details

2017, Doctor of Philosophy, Ohio State University, Statistics.
Community detection and link prediction are two important problems in network analysis. In this dissertation, we propose semidefinite programming approaches to network community detection and edge probabilities estimation. Interestingly, despite different motivations, it turns out that the proposed approaches to the two problems are based on the same semidefinite program (SDP) with appropriately chosen input matrices. Our SDP relies on a semidefinite relaxation of the set of normalized clustering matrices. We derive optimality conditions of the SDP from a geometric point of view and deal with a convex body, called the Fantope. This enables us to analyze the SDP by making use of the properties of the Fantope. We design an efficient alternating direction method of multipliers algorithm to solve the SDP, which only requires computation of a small number of leading eigenvectors of symmetric matrices. For community detection, the SDP is derived from the partition criterion of maximizing the sum of average intra-cluster similarities over all clusters. The feasible set of our SDP is contained in the Fantope, which enables us to connect our SDP to sparse PCA and spectral clustering. Unlike previously proposed SDPs for community detection which use the adjacency matrix as the input matrix, we also consider using the symmetric normalized graph Laplacian matrix and the squared adjacency matrix as the input matrix to the SDP. The former choice of the input matrix is motivated by the connection between our SDP and spectral clustering and the latter builds an interesting connection between network clustering and multivariate data clustering. Using optimality conditions, we analyze the population consistency of the SDP using different input matrices under various network models. We show the exact recovery property of our SDP under strongly assortative stochastic block models when the input matrix is chosen to be the adjacency matrix. For network edge probability estimation, the SDP is derived from a least squares criterion. Since the solution to our SDP is a symmetric doubly stochastic matrix, it can be used as a smoother matrix to smooth the adjacency matrix. The estimated edge probability matrix is the smoothed adjacency matrix. We also connect our SDP-based method of estimating edge probabilities to a recently proposed neighborhood smoothing method. Our numerical experiments show that the proposed SDP has good empirical performances for community detection, edge probability estimation, and link prediction and it outperforms other state-of-the-art methods on Facebook collegiate networks for both community detection and link prediction.
Vincent Vu (Advisor)
Yoonkyung Lee (Committee Member)
Yunzhang Zhu (Committee Member)
97 p.

Recommended Citations

Citations

  • Yan, Z. (2017). Semidefinite Programming Approaches to Network Clustering and Smoothing [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1503180139155502

    APA Style (7th edition)

  • Yan, Zhifei. Semidefinite Programming Approaches to Network Clustering and Smoothing. 2017. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1503180139155502.

    MLA Style (8th edition)

  • Yan, Zhifei. "Semidefinite Programming Approaches to Network Clustering and Smoothing." Doctoral dissertation, Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1503180139155502

    Chicago Manual of Style (17th edition)