Volume 2, Issue 2
Fall 2024
With growing computational power and the rapid expansion of biological data, bioinformatics tools have emerged as the primary approach to tackling biological challenges. Accurate determination of protein function with these tools is pivotal for both biomedical research and drug discovery, which makes it a focal point of investigation. In this article, we classify bioinformatics-based protein function prediction methods into three main categories: methods based on protein sequences, methods based on protein structures, and methods based on protein interaction networks. We examine specific algorithms in each category, emphasize recent research progress, and offer insights for applying bioinformatics-based protein function prediction in biomedical research and drug discovery.