From Pixels to People: Graph Based Methods for Grouping Problems in Computer Vision

2010, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.

In this dissertation, we study grouping problems in computer vision using graph-based machine learning techniques. Grouping problems abound in computer vision, and they are typically challenging because the results must be perceptually and semantically consistent. In the context of this dissertation, we strive to (1) group image pixels into meaningful objects and backgrounds, and (2) group interacting people present in a video into sound social communities. Traditionally, in a graph-based formulation, the entities (e.g., image pixels) are treated as graph vertices and their interrelations are encoded in a weighted adjacency matrix of the graph. In this dissertation, we go beyond standard graph construction methods by building on probabilistic image hypergraphs and learned social graphs (or social networks) for the two parts of the work, respectively. Learning on graphs results in a labeling of the entities. In our work, graph-based smoothness and modularity measures are examined and adapted to the problems under study.
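The standard pairwise formulation mentioned above can be illustrated with a small sketch. It is not the dissertation's hypergraph model: the Gaussian affinity, the `sigma` value, the function names, and the toy intensities are all assumptions chosen for illustration. The sketch builds a weighted adjacency matrix W over four "pixels" and evaluates the usual graph smoothness measure, 0.5 * sum_ij W_ij (f_i - f_j)^2, which is small when similar vertices receive the same label.

```python
import math

def gaussian_affinity(features, sigma=0.1):
    """Dense adjacency matrix with W[i][j] = exp(-(x_i - x_j)^2 / (2 sigma^2))."""
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops
                diff = features[i] - features[j]
                W[i][j] = math.exp(-(diff * diff) / (2.0 * sigma * sigma))
    return W

def smoothness(W, f):
    """Return 0.5 * sum_ij W[i][j] * (f[i] - f[j])^2, i.e. f^T (D - W) f."""
    n = len(f)
    return 0.5 * sum(
        W[i][j] * (f[i] - f[j]) ** 2 for i in range(n) for j in range(n)
    )

# Toy "pixels": two dark and two bright intensities (made-up values).
intensities = [0.10, 0.12, 0.90, 0.88]
W = gaussian_affinity(intensities)
consistent = [0, 0, 1, 1]  # labels agree with intensity similarity
mixed = [0, 1, 0, 1]       # labels cut across similar pixels
print(smoothness(W, consistent), smoothness(W, mixed))
```

A labeling that separates the dark pixels from the bright ones only cuts near-zero weights, so its smoothness penalty is far lower than that of a labeling that splits similar pixels.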

Under this general graph-based framework, the first direction we pursue is interactive image segmentation, that is, the problem of grouping image pixels into meaningful objects and their backgrounds given a limited number of user-supplied seeds. Our contributions in this direction include the probabilistic hypergraph image model (PHIM), which addresses higher-order relations among pixels in segment labels that are commonly ignored in competing approaches. To further reduce the dependence of interactive segmentation on user-supplied seeds, we introduce diffusion signatures derived from salient boundaries and present a framework for automatically introducing new seeds at critical image locations in order to enhance segmentation results. Both proposed frameworks are extensively tested on a standard image dataset and achieve excellent quantitative and qualitative segmentation results.
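To make the seeded setting concrete, here is a minimal sketch of generic seed propagation on an ordinary pairwise graph. It is a stand-in technique, not PHIM and not the diffusion-signature framework: seeds are held fixed and every other vertex is repeatedly replaced by the weighted average of its neighbors. The one-dimensional "image", the Gaussian edge weights, and all names are hypothetical.

```python
import math

def chain_affinity(intensities, sigma=0.1):
    """Adjacency matrix linking each pixel only to its left and right neighbor."""
    n = len(intensities)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        diff = intensities[i] - intensities[i + 1]
        w = math.exp(-(diff * diff) / (2.0 * sigma * sigma))
        W[i][i + 1] = w
        W[i + 1][i] = w
    return W

def propagate_seeds(W, seeds, iters=200):
    """Clamp seed scores and average neighbors elsewhere (Jacobi iteration).

    seeds maps a vertex index to 0.0 (background) or 1.0 (object).
    """
    n = len(W)
    f = [seeds.get(i, 0.5) for i in range(n)]
    for _ in range(iters):
        new_f = list(f)
        for i in range(n):
            if i in seeds:
                continue  # user-supplied seeds never change
            degree = sum(W[i])
            if degree > 0:
                new_f[i] = sum(W[i][j] * f[j] for j in range(n)) / degree
        f = new_f
    return f

# One image row with a sharp edge between pixels 2 and 3 (made-up values).
row = [0.10, 0.15, 0.12, 0.85, 0.90, 0.88]
scores = propagate_seeds(chain_affinity(row), seeds={0: 0.0, 5: 1.0})
labels = [1 if s > 0.5 else 0 for s in scores]
print(labels)
```

With one background seed on the left and one object seed on the right, the weak edge across the intensity jump blocks propagation, so the thresholded labels split at the boundary.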

In the second direction, we contribute an automatic framework to infer relations among actors from videos. In particular, we propose a principled graph-based affinity learning method, which synthesizes both co-occurrence information among actors and local grouping cue estimates at the scene level in order to make informed decisions. Once the pairwise affinities between actors are learned from the video content using visual and auditory features, we perform social network analysis based on modularity measures to detect communities, which are groups of actors. Experiments on a dataset of ten movies that we collected have shown promising results. Moreover, the proposed framework has considerably outperformed baseline methods not using visual or auditory features, suggesting the importance of audiovisual cues in high-level relational understanding tasks.
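As a rough illustration of the modularity measure used for community detection, the sketch below scores a candidate partition of a small weighted affinity graph with the standard Newman modularity, Q = (1/2m) * sum_ij (A_ij - k_i k_j / 2m) * [c_i = c_j]. The affinity values and actor grouping are invented for the example, and the way the dissertation adapts modularity to learned affinities is not reproduced here.

```python
def modularity(A, communities):
    """Newman modularity of a partition for a symmetric weighted adjacency matrix A.

    communities[i] is the community id assigned to vertex i.
    """
    n = len(A)
    degree = [sum(row) for row in A]  # weighted degree k_i
    two_m = sum(degree)               # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += A[i][j] - degree[i] * degree[j] / two_m
    return q / two_m

# Hypothetical pairwise affinities among four actors:
# actors 0-1 and 2-3 interact strongly, actors 1-2 only weakly.
A = [
    [0.0, 5.0, 0.0, 0.0],
    [5.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 5.0],
    [0.0, 0.0, 5.0, 0.0],
]
print(modularity(A, [0, 0, 1, 1]))  # partition following the strong ties
print(modularity(A, [0, 1, 0, 1]))  # partition cutting the strong ties
```

The partition that keeps strongly connected actors together scores higher than one that separates them, which is the signal a modularity-based community detector searches for.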

In summary, built on a graph-based learning framework, this dissertation makes contributions to grouping problems in computer vision. Specifically, we have proposed effective techniques to solve problems in both low-level analysis of images (segmentation) and high-level understanding of videos (relational inference).

Mikhail Belkin (Committee Chair)
Alper Yilmaz (Committee Co-Chair)
DeLiang Wang (Committee Member)
Simon Dennis (Committee Member)

Recommended Citations

  • Ding, L. (2010). From Pixels to People: Graph Based Methods for Grouping Problems in Computer Vision [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1289845859

    APA Style (7th edition)

  • Ding, Lei. From Pixels to People: Graph Based Methods for Grouping Problems in Computer Vision. 2010. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1289845859.

    MLA Style (8th edition)

  • Ding, Lei. "From Pixels to People: Graph Based Methods for Grouping Problems in Computer Vision." Doctoral dissertation, Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1289845859

    Chicago Manual of Style (17th edition)