SCIENCE CHINA Information Sciences, Volume 59, Issue 7: 070106(2016) https://doi.org/10.1007/s11432-016-5583-z

Identifying essential proteins based on dynamic protein-protein interaction networks and RNA-Seq datasets

  • Received: Apr 6, 2016
  • Accepted: Apr 18, 2016
  • Published: Jun 6, 2016

Abstract

The identification of essential proteins is important not only for understanding organism structure at the molecular level, but also for drug-target detection and genetic disease prevention. Traditional methods often use centrality indices of static protein-protein interaction (PPI) networks, gene expression profiles, or both to predict essential proteins. However, the prediction accuracy of most of these methods can still be improved. In this study, we propose a strategy that increases the accuracy of essential protein identification in three ways. First, RNA-Seq datasets are employed to construct integrated dynamic PPI networks; an RNA-Seq dataset is expected to give more accurate predictions than microarray gene expression profiles. Second, a novel integrated dynamic PPI network is constructed by considering both the co-expression pattern and the co-expression level of the RNA-Seq data. Third, a novel two-step strategy is proposed to identify essential proteins from two known centrality indices. Numerical experiments show that the proposed strategy increases the prediction accuracy dramatically and that it can be generalized to many existing methods and centrality indices.
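To make the terms above concrete, the following is a minimal sketch of the traditional baseline the abstract refers to: ranking proteins by degree centrality in a static PPI network, and then restricting the network to one time point of an expression series. It is not the integrated dynamic network or the two-step strategy proposed in the paper, whose details the abstract does not give. The function names, the toy proteins `P1` to `P4`, the expression values, and the simple "both proteins above a threshold" activity rule are all assumptions made for illustration.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {protein: number of distinct interaction partners}."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def active_subnetwork(edges, expression, t, threshold):
    """Keep an edge only if both proteins exceed `threshold` at time point `t`.

    This thresholding rule is an illustrative assumption, not the
    construction used in the paper.
    """
    return [
        (a, b)
        for a, b in edges
        if expression[a][t] > threshold and expression[b][t] > threshold
    ]


def top_k(scores, k):
    """Return the k highest-scoring proteins (ties broken by name)."""
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]


# Hypothetical toy data: four proteins, two time points.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
expression = {
    "P1": [5.0, 0.2],
    "P2": [4.0, 3.0],
    "P3": [0.1, 2.5],
    "P4": [3.0, 0.0],
}

static_scores = degree_centrality(edges)
candidates = top_k(static_scores, 2)

# Networks restricted to the proteins expressed at each time point.
network_t0 = active_subnetwork(edges, expression, t=0, threshold=1.0)
network_t1 = active_subnetwork(edges, expression, t=1, threshold=1.0)
```

In this toy example the static ranking favors `P1`, but `P1` is not active at the second time point, so it has no interactions in that snapshot. This is the kind of information a static network discards and a dynamic network retains.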


Funded by

National Natural Science Foundation of China (61272121)

Fundamental Research Funds for the Central Universities (3102015JSJ0011)

National Natural Science Foundation of China (61332014)

Fundamental Research Funds for the Central Universities (3102015QD029)


Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61272121, 61332014) and the Fundamental Research Funds for the Central Universities (Grant Nos. 3102015JSJ0011, 3102015QD029).

