Smith-Waterman Sequence Alignment For Massively Parallel High-Performance Computing Architectures

Steinfadt, Shannon Irene

2010, PhD, Kent State University, College of Arts and Sciences / Department of Computer Science.

This research addresses sequence alignment, one of the most frequently used tools in bioinformatics. It is motivated by the growth and increasing complexity of high-performance computing and by the rapid growth of data in the bioinformatics field. At the center of the dissertation is an associative algorithm that performs high-quality sequence alignments faster and more efficiently. SWAMP, or Smith-Waterman using Associative Massive Parallelism, is the parallel algorithm designed and implemented for the ASC associative SIMD computing model. The theoretical parallel speedup for the algorithm is optimal: it reduces the compute time from O(mn) to O(m+n), where m and n are the lengths of the input sequences. When m = n, the running time becomes O(n) with a very small constant.
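The abstract does not spell out how the O(m+n) bound arises. One common way to reach it for Smith-Waterman is to process the scoring matrix by anti-diagonals, since every cell on the same anti-diagonal depends only on cells from the two previous ones. The following is a minimal sequential sketch of that ordering in Python, not the ASC or ClearSpeed implementation from the dissertation. The function name, the linear gap penalty, and the scoring values (match=2, mismatch=-1, gap=-1) are assumptions chosen for illustration. The inner loop stands in for work that a SIMD machine with enough processing elements could perform in a single step, so the returned step count is m+n-1.

```python
def smith_waterman_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_local_score, wavefront_steps) for sequences a and b.

    Illustrative sketch only: scoring values and the linear gap
    penalty are assumptions, not taken from the dissertation.
    """
    m, n = len(a), len(b)
    if m == 0 or n == 0:
        return 0, 0

    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    steps = 0

    # d = i + j indexes an anti-diagonal. Cells with the same d are
    # mutually independent, so each outer iteration is one parallel step.
    for d in range(2, m + n + 1):
        steps += 1
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # local alignment may restart here
                H[i - 1][j - 1] + s,  # match or mismatch
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            best = max(best, H[i][j])
    return best, steps
```

For example, `smith_waterman_wavefront("ACGT", "ACGT")` returns `(8, 7)`: four matches at 2 points each, found in 4 + 4 - 1 = 7 wavefront steps rather than 16 sequential cell updates.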

Using the capabilities of ASC, new algorithms that extend SWAMP increase the information returned by the alignment without decreasing its accuracy. Known as SWAMP+, these algorithms provide a highly sensitive parallelized approach that extends traditional pairwise sequence alignment. They are useful for in-depth exploration of sequences, including research in expressed sequence tags, regulatory regions, and evolutionary relationships.

Although the SWAMP suite of algorithms was designed for the associative computing platform, it was implemented on the ClearSpeed CSX 620 parallel SIMD accelerator to obtain realistic metrics. The compute-intensive matrix calculation showed a speedup of roughly 96 using ClearSpeed's 96 processing elements, which supports the feasibility of achieving the theoretical speedup mentioned above. Additional parallel hardware implementations were explored, and a cluster-based approach was used to test the memory-intensive Smith-Waterman algorithm across multiple nodes within a cluster. This work used a tool called JumboMem, which allowed us to store the huge matrix of computations completely in memory for what we believe to be one of the largest instances of Smith-Waterman ever run.

Johnnie W. Baker, PhD (Advisor)
Kenneth Batcher, PhD (Committee Member)
Paul Farrell, PhD (Committee Member)
James Blank, PhD (Committee Member)
134 p.

Recommended Citations

  • Steinfadt, S. I. (2010). Smith-Waterman Sequence Alignment For Massively Parallel High-Performance Computing Architectures [Doctoral dissertation, Kent State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=kent1271656353

    APA Style (7th edition)

  • Steinfadt, Shannon. Smith-Waterman Sequence Alignment For Massively Parallel High-Performance Computing Architectures. 2010. Kent State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=kent1271656353.

    MLA Style (8th edition)

  • Steinfadt, Shannon. "Smith-Waterman Sequence Alignment For Massively Parallel High-Performance Computing Architectures." Doctoral dissertation, Kent State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=kent1271656353

    Chicago Manual of Style (17th edition)