Prediction of protein phenotype based on protein interaction network by coupling genetic algorithm and K-nearest neighbor algorithm†
Abstract
Quick and accurate identification of protein phenotype is a key step toward understanding life at the molecular level, and has significant implications for biomedicine and pharmacy. However, as a result of genome and other sequencing projects, there is a huge gap between the number of discovered proteins and the number of phenotype-annotated proteins. An automated and reliable method for predicting protein phenotype is therefore needed. In this paper, a novel method for identifying protein phenotype is proposed. It couples a genetic algorithm with the K-nearest neighbor algorithm, and uses a feature vector that encodes information about both the protein itself and its neighboring proteins in the protein interaction network. As a demonstration, the method is evaluated with five-fold cross-validation and on an independent test set, and the results indicate that it may serve as an important complementary tool to existing algorithms in this area. The MATLAB source code is freely available on request from the authors.
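To make the general idea concrete, the following is a minimal Python sketch of one way a genetic algorithm and a K-nearest neighbor classifier could be coupled over network-derived feature vectors. It is not the authors' MATLAB implementation. The toy proteins, descriptors, interaction edges and labels are all hypothetical. The abstract does not say how the two algorithms are coupled, so the sketch assumes the genetic algorithm tunes per-feature weights in the KNN distance. It also assumes the neighbor information is summarized as the mean of the neighbors' descriptors, and it uses leave-one-out accuracy in place of five-fold cross-validation because the toy data set is tiny.

```python
import math
import random

# Toy data (all hypothetical): per-protein descriptors, interaction edges, phenotype labels.
FEATURES = {
    "P1": [0.9, 0.1], "P2": [0.8, 0.2], "P3": [0.7, 0.3],
    "P4": [0.2, 0.9], "P5": [0.1, 0.8], "P6": [0.3, 0.7],
}
NEIGHBORS = {
    "P1": ["P2", "P3"], "P2": ["P1"], "P3": ["P1", "P4"],
    "P4": ["P3", "P5"], "P5": ["P4", "P6"], "P6": ["P5"],
}
LABELS = {"P1": "A", "P2": "A", "P3": "A", "P4": "B", "P5": "B", "P6": "B"}


def build_vector(protein):
    """Concatenate a protein's own descriptors with the mean of its neighbors' descriptors."""
    own = FEATURES[protein]
    nbrs = [FEATURES[n] for n in NEIGHBORS[protein]]
    mean = [sum(col) / len(nbrs) for col in zip(*nbrs)]
    return own + mean


def knn_predict(query, train, weights, k=3):
    """Majority vote among the k nearest training vectors (weighted Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    nearest = sorted(train, key=lambda item: dist(query, item[0]))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    # Ties are broken alphabetically so the result is deterministic.
    return max(sorted(votes), key=votes.get)


def fitness(weights, data):
    """Leave-one-out accuracy of the weighted KNN classifier."""
    hits = 0
    for i, (vec, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(vec, rest, weights) == label
    return hits / len(data)


def evolve(data, n_genes, pop_size=12, generations=15, seed=0):
    """Search for feature weights with elitism, tournament selection, crossover and mutation."""
    rng = random.Random(seed)
    # Include the unweighted baseline so that elitism can never end up worse than it.
    pop = [[1.0] * n_genes]
    pop += [[rng.random() for _ in range(n_genes)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda w: fitness(w, data), reverse=True)
        nxt = ranked[:2]  # elitism: carry the two best individuals over unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents, each the best of three random individuals.
            a, b = (max(rng.sample(ranked, 3), key=lambda w: fitness(w, data))
                    for _ in range(2))
            # Uniform crossover followed by clipped Gaussian mutation.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1))) if rng.random() < 0.2 else g
                     for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=lambda w: fitness(w, data))


DATA = [(build_vector(p), LABELS[p]) for p in sorted(FEATURES)]
baseline = fitness([1.0] * 4, DATA)
best = evolve(DATA, n_genes=4)
print("baseline accuracy:", baseline)
print("GA-weighted accuracy:", fitness(best, DATA))
```

Because the initial population contains the all-ones weight vector and the best individuals are always carried over, the weighted classifier in this sketch can only match or improve on the unweighted baseline under the chosen fitness measure. On real data, the fitness should be computed on held-out folds to avoid overfitting the weights.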