

School of Computer Science, Sun Yat-sen University, Guangzhou 510000, Guangdong, China
Received: 31 December 2022
Revised: 7 March 2023
Published: 30 June 2023
SONG Yidong, YUAN Qianmu, YANG Yuedong. Application of deep learning in protein function prediction[J]. Synthetic Biology Journal, 2023, 4(3): 488-506 DOI: 10.12211/2096-8280.2022-078.
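Protein function prediction is an important task in bioinformatics and plays an important role in areas such as elucidating disease mechanisms and discovering drug targets. Because traditional biochemical experiments for determining protein function are usually costly, time-consuming, and low-throughput, it is important to develop efficient and accurate computational methods for protein function prediction. Protein function prediction can be divided into residue-level binding site prediction and protein-level gene ontology (GO) prediction. This review first introduces the databases and protein features commonly used in the field, and then summarizes the latest protein function prediction methods. For binding site prediction, it covers, by ligand type, the latest methods for predicting protein-protein, protein-peptide, protein-nucleic acid, and protein-small molecule or ion binding sites; for GO prediction, it covers, by method category, recent sequence-based, structure-based, and protein interaction network-based methods. Finally, it summarizes current protein function prediction methods, analyzes their strengths and weaknesses, and discusses future directions for the field.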
Protein function prediction is essential for bioinformatics analysis and benefits a wide range of biological studies, such as understanding the functions of metagenomes, uncovering the mechanisms underlying diseases, and finding new drug targets. With the rapid development of high-throughput sequencing technology, protein sequence data have grown quickly, but the functions of most proteins have not yet been identified. Since traditional biochemical experiments to determine protein functions are usually expensive, time-consuming, and inefficient, developing more efficient and effective computational methods for protein function prediction is of great significance. Deep learning has made breakthroughs in many fields, including image recognition, natural language processing, genomic analysis, and drug discovery. In this review, we address applications of deep learning in protein function prediction, which can be divided into residue-level binding site prediction and protein-level gene ontology (GO) prediction. Protein binding sites are regions that bind to specific ligands; they play an important role in signal transduction and metabolism, and identifying them helps reveal the molecular mechanisms underlying diseases and design new drugs. Gene ontology is a standard function classification system for genes that provides a set of annotations to comprehensively describe the properties of genes and gene products. First, we introduce commonly used large-scale protein structure and function databases. Second, we describe discriminative protein sequence and structure features. Third, we summarize the latest protein function prediction methods: for binding site prediction, we introduce the latest methods by ligand type, including protein, peptide, nucleic acid, and small-molecule or ion ligands; for GO prediction, we highlight the latest sequence-based, structure-based, and protein interaction network-based methods. Finally, we comment on the advantages and disadvantages of current protein function prediction methods and discuss future developments in this field.