Skip to Main Content
 

Global Search Box

 
 
 
 

Files

ETD Abstract Container

Abstract Header

Enhanced Bitmap Indexes for Large Scale Data Management

Canahuate, Guadalupe M.

Abstract Details

2009, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Advances in technology have enabled the production of massive volumes of data through observations and simulations in many application domains.These new data sets and the associated queries pose a new challenge for efficient storage and data retrieval that requires novel indexing structures and algorithms. We propose a series of enhancements to bitmap indexes to make them account for the inherent characteristics of large scale datasets and to efficiently support the type of queries needed to analyze the data. First, we formalize how missing data should be handled and how queries should be executed in the presence of missing data. Then, we propose an adaptive code ordering as a hybrid between Gray code and lexicographic orderings to reorganize the data and further reduce the size of the already compressed bitmaps. We address the inability of the compressed bitmaps to directly access a given row by proposing an approximate encoding that compresses the bitmap in a hash structure. We also extend the existing run-length encoders of bitmap indexes by adding an extra word to represent future rows with zeros and minimize the insertion overhead of new data. We propose a comprehensive framework to execute similarity searches over the bitmap indexes without changes to the current bitmap structure, without accessing the original data, and using a similarity function that is meaningful in high dimensional spaces. Finally, we propose a new encoding and query execution for non-clustered bitmap indexes that combines several attributes in one existence bitmap, reduces the storage requirement, and improves query execution and update time for low cardinality attributes.
Ferhatosmanoglu Hakan (Advisor)
Gagan Agrawal (Committee Member)
P Sadayappan (Committee Member)
Timothy Long (Committee Member)
220 p.

Recommended Citations

Citations

  • Canahuate, G. M. (2009). Enhanced Bitmap Indexes for Large Scale Data Management [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153

    APA Style (7th edition)

  • Canahuate, Guadalupe. Enhanced Bitmap Indexes for Large Scale Data Management. 2009. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153.

    MLA Style (8th edition)

  • Canahuate, Guadalupe. "Enhanced Bitmap Indexes for Large Scale Data Management." Doctoral dissertation, Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153

    Chicago Manual of Style (17th edition)