Abstract: Sequence alignment is an important and challenging task in bioinformatics. An alignment may be defined as an arrangement of two or more deoxyribonucleic acid (DNA) or protein sequences that highlights their regions of similarity. Sequence alignment is used to infer the evolutionary relationships among a set of protein or DNA sequences. An accurate alignment can provide valuable information for experimentation on newly found sequences. It is indispensable in basic research as well as in practical applications such as pharmaceutical development, drug discovery, disease prevention, and criminal forensics. Many algorithms and methods, such as dot plot, Needleman-Wunsch, Smith-Waterman, FAST-All (FASTA), the Basic Local Alignment Search Tool (BLAST), and ClustalW, have been proposed to perform and accelerate sequence alignment. However, with the ever-increasing volume of data in bioinformatics databases, the time needed for biological sequence alignment keeps growing. The rapidly developing fields of high-performance computing algorithms and architectures could be used more efficiently to speed up the sequence alignment process, which would help in identifying significant functional and structural similarities of proteins and in finding important regions in a genome. The main aim of the research presented in this thesis is to explore and analyze the existing sequence alignment methods and to propose better, optimized solutions. This thesis presents two systems that solve the sequence alignment problem based on the most accurate and computationally intensive algorithm. The first proposed system is a hybrid system based on a cluster of Symmetric Multi-Processing (SMP) machines. The system exploits both coarse-grain parallelism, through the Message Passing Interface (MPI) at the cluster level, and fine-grain parallelism, through Open Multi-Processing (OpenMP) at the node level.
The system shows better performance than the pure MPI, pure OpenMP, and serial models. The second proposed system is a Field Programmable Gate Array (FPGA) based linear systolic array. It speeds up the alignment of DNA sequences because its processing elements perform their tasks in parallel, which reduces execution time. The system is considered a step towards a complete parallel processing architecture for computationally intensive bioinformatics applications.
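
To illustrate why dynamic-programming alignment lends itself to parallel processing elements, here is a minimal Python sketch. It rests on several assumptions not stated in the abstract: it uses Smith-Waterman local alignment with a linear gap penalty (the abstract does not name the algorithm the thesis builds on), the scoring values are arbitrary, and the function names are hypothetical. The first function fills the score matrix row by row. The second fills it one anti-diagonal at a time, which is the order a wavefront or systolic-array design would typically follow, since all cells on one anti-diagonal are independent of each other.

```python
def sw_score_rowwise(a, b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score, filling the matrix row by row (serial order)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            best = max(best, H[i][j])
    return best


def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Same score, but cells are visited one anti-diagonal (i + j = d) at a time.

    Each cell on anti-diagonal d reads only cells on anti-diagonals d-1 and d-2,
    so the inner loop has no dependencies between its iterations and could be
    distributed across parallel workers or processing elements.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for d in range(2, rows + cols - 1):
        i_lo = max(1, d - cols + 1)
        i_hi = min(rows - 1, d - 1)
        for i in range(i_lo, i_hi + 1):   # independent iterations
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

Both functions return the same score for any input; only the visiting order differs. In this sketch the inner loop of `sw_score_wavefront` is still executed serially. A parallel implementation would assign those iterations to threads or hardware processing elements, but how the thesis maps the computation onto MPI, OpenMP, or the FPGA array is not described in the abstract.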