Summarization Of Real Valued Biclusters

Subramanian, Hema

2011, MS, University of Cincinnati, Engineering and Applied Science: Computer Science.

With an explosion in database sizes, there is an increasing need for mining relevant information from them. Subspace clustering has been applied in various fields for discovering patterns, and many such algorithms have been investigated for finding interesting biclusters from binary-valued datasets. Mining biclusters from real-valued datasets has gained significant importance in many of the recently emerging applications. The algorithms devised for mining such biclusters generally minimize an objective function, and hence the biclusters generated by each algorithm vary depending on the objective function used.

Due to the inherent size and density of the datasets, the algorithms generate a very large number of biclusters, making it difficult to select the useful ones. To overcome this problem, it is important to design strategies that summarize these biclusters into a few representatives of the main ideas embedded in the dataset. The objective of this thesis is to use statistical properties of the generated biclusters to identify distinguished clusters that summarize the large number of biclusters into a few representative ones.

To achieve this objective, similarity measures based on mutual information and standard deviation d between biclusters are used to identify similar biclusters. These measures quantify the information shared by two biclusters (that is, their similarity), which helps identify biclusters that could be merged. The algorithm has been applied to one synthetic and two real-world datasets, and the results are presented. The information content and the variance in a bicluster are analyzed as the biclusters are progressively merged. The methodologies proposed in this thesis are compared to a baseline method to verify the quality of the biclusters and to validate that the approach performs well and has merit.
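The abstract does not give the exact definitions of the similarity measures or the merging procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a bicluster is a pair (row set, column set) over a real-valued matrix, measures mutual information between the cell-membership indicators of two biclusters, and greedily merges the overlapping pair with the highest mutual information as long as the merged bicluster's standard deviation stays below a threshold. All names (`mutual_information`, `summarize`, `max_std`) and the toy data are hypothetical and are not taken from the thesis.

```python
import math
from itertools import combinations
from statistics import pstdev

def cells(bic):
    """All (row, col) positions covered by a bicluster (rows, cols)."""
    rows, cols = bic
    return {(r, c) for r in rows for c in cols}

def std_dev(data, bic):
    """Population standard deviation of the values inside a bicluster."""
    return pstdev([data[r][c] for r, c in cells(bic)])

def mutual_information(a, b, n_rows, n_cols):
    """MI (in bits) between the cell-membership indicators of a and b.

    Assumption: this indicator-based definition is one common way to compare
    two clusters; the thesis may define its measure differently.
    """
    total = n_rows * n_cols
    ca, cb = cells(a), cells(b)
    both = len(ca & cb)
    joint = {(1, 1): both, (1, 0): len(ca) - both,
             (0, 1): len(cb) - both, (0, 0): total - len(ca | cb)}
    pa = {1: len(ca) / total, 0: 1 - len(ca) / total}
    pb = {1: len(cb) / total, 0: 1 - len(cb) / total}
    mi = 0.0
    for (x, y), count in joint.items():
        if count:
            pxy = count / total
            mi += pxy * math.log2(pxy / (pa[x] * pb[y]))
    return mi

def merge(a, b):
    """Merged bicluster: union of the rows and union of the columns."""
    return (a[0] | b[0], a[1] | b[1])

def summarize(data, biclusters, max_std):
    """Greedily merge overlapping biclusters while the merged std is small."""
    n_rows, n_cols = len(data), len(data[0])
    current = list(biclusters)
    while True:
        best = None
        for i, j in combinations(range(len(current)), 2):
            # Only pairs that share cells are considered, because the
            # indicator-based MI is also positive for disjoint pairs.
            if not cells(current[i]) & cells(current[j]):
                continue
            candidate = merge(current[i], current[j])
            if std_dev(data, candidate) > max_std:
                continue
            score = mutual_information(current[i], current[j], n_rows, n_cols)
            if best is None or score > best[0]:
                best = (score, i, j, candidate)
        if best is None:
            return current
        _, i, j, candidate = best
        current = [b for k, b in enumerate(current) if k not in (i, j)]
        current.append(candidate)

# Toy data (hypothetical): a near-constant 3x3 block plus unrelated values.
data = [
    [1.0, 1.1, 0.9, 7.0],
    [1.0, 0.9, 1.1, 3.0],
    [1.1, 1.0, 1.0, 9.0],
    [5.0, 8.0, 2.0, 6.0],
]
a = (frozenset({0, 1}), frozenset({0, 1, 2}))
b = (frozenset({1, 2}), frozenset({0, 1, 2}))
c = (frozenset({3}), frozenset({3}))
summary = summarize(data, [a, b, c], max_std=0.2)
print(summary)
```

In this toy run, `a` and `b` overlap inside the near-constant block, so they are merged into one representative, while `c` is left alone. The threshold `max_std` is an arbitrary illustrative choice; a real summarization would need a principled stopping criterion.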

Raj Bhatnagar, PhD (Committee Chair)
Yizong Cheng, PhD (Committee Member)
John Schlipf, PhD (Committee Member)
72 p.

Recommended Citations

  • Subramanian, H. (2011). Summarization Of Real Valued Biclusters [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1307442728

    APA Style (7th edition)

  • Subramanian, Hema. Summarization Of Real Valued Biclusters. 2011. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1307442728.

    MLA Style (8th edition)

  • Subramanian, Hema. "Summarization Of Real Valued Biclusters." Master's thesis, University of Cincinnati, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1307442728

    Chicago Manual of Style (17th edition)