Abstract An integrated methodology is proposed that combines the major areas of protein analysis. It can serve as a guideline for analyzing and modeling a protein. The methodology is applied to the analysis and modeling of a viral protein, the non-structural protein 5a (NS5a) of Hepatitis C virus (HCV). The proposed integrative methodology has been successful in understanding mechanisms of interaction and in relating the sequence to the structural features and functions of the NS5a protein. In addition, the 3D structure of the second domain is successfully predicted, and the study using the integrative approach confirmed the NS5a protein as a promiscuous hub protein. Moreover, the SemiBoost-Fold Recognition (SB-FR) algorithm is proposed for predicting protein folds. SB-FR uses a semi-supervised boosting combination to achieve a better multi-class classification model. A well-known, challenging dataset (the Ding and Dubchak dataset) is used for training and testing the proposed algorithm. To benchmark its performance, different parameters are applied to the same random sets of labeled and unlabeled sequences during the training and testing of SB-FR. The "TreeTest" testing method is also introduced to improve the overall accuracy of the SB-FR algorithm with lower computational time. The proposed "TreeTest" method is benchmarked against the All-versus-All (AvA) testing method. Using the SB-FR algorithm with the "TreeTest" method improved overall accuracy by 5.6% for the 3-class problem and 8.2% for the 5-class problem compared to the base classifier.
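The abstract names All-versus-All (AvA) testing as the baseline for "TreeTest" but does not describe either procedure. As a rough illustration of AvA in its usual sense, one binary classifier per pair of classes with a majority vote at test time, the sketch below may help. It does not implement SB-FR or "TreeTest". The class labels, the one-dimensional feature, the centroids, and the helper names `make_pairwise_classifier` and `ava_predict` are all hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def make_pairwise_classifier(centroid_a, centroid_b, label_a, label_b):
    """Toy binary classifier (an assumption for illustration):
    returns the label whose centroid is nearer to the input."""
    def classify(x):
        if abs(x - centroid_a) <= abs(x - centroid_b):
            return label_a
        return label_b
    return classify


def ava_predict(x, pairwise_classifiers):
    """All-versus-All testing: every pairwise classifier casts one vote,
    and the class with the most votes is predicted."""
    votes = Counter(clf(x) for clf in pairwise_classifiers)
    top = max(votes.values())
    # Break ties deterministically by taking the smallest label.
    return min(label for label, count in votes.items() if count == top)


# Hypothetical 3-class problem with made-up 1-D class centroids.
centroids = {"fold_A": 0.0, "fold_B": 5.0, "fold_C": 10.0}

# K classes need K * (K - 1) / 2 pairwise classifiers (3 for K = 3).
classifiers = [
    make_pairwise_classifier(centroids[a], centroids[b], a, b)
    for a, b in combinations(sorted(centroids), 2)
]

if __name__ == "__main__":
    for sample in (-1.0, 4.2, 9.0):
        print(sample, ava_predict(sample, classifiers))
```

Because AvA evaluates every pairwise classifier for each test sample, its cost grows quadratically with the number of classes, which is the kind of computational overhead a cheaper testing scheme would aim to reduce.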