National Science Review, Volume 7, Issue 6: 1012-1023 (2020) https://doi.org/10.1093/nsr/nwaa036

On the origin and continuing evolution of SARS-CoV-2

  • Received: Feb 25, 2020
  • Accepted: Mar 3, 2020
  • Published: Mar 3, 2020


The SARS-CoV-2 epidemic started in late December 2019 in Wuhan, China, and has since impacted a large portion of China and raised major global concern. Herein, we investigated the extent of molecular divergence between SARS-CoV-2 and other related coronaviruses. Although we found only 4% variability in genomic nucleotides between SARS-CoV-2 and a bat SARS-related coronavirus (SARSr-CoV; RaTG13), the difference at neutral sites was 17%, suggesting that the divergence between the two viruses is much larger than previously estimated. Our results suggest that the new variations at functional sites in the receptor-binding domain (RBD) of the spike protein, seen in SARS-CoV-2 and in pangolin SARSr-CoVs, are likely caused by natural selection in addition to recombination. Population genetic analyses of 103 SARS-CoV-2 genomes indicated that these viruses comprise two major lineages (designated L and S), which are well defined by two SNPs that show nearly complete linkage across the viral strains sequenced to date. We found that the L lineage was more prevalent than the S lineage within the limited patient samples we examined. The implications of these evolutionary changes for disease etiology remain unclear. These findings strongly underscore the urgent need for further comprehensive studies that combine viral genomic data with epidemiological studies of coronavirus disease 2019 (COVID-19).
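To make the phrase "nearly complete linkage" concrete, the following is a minimal sketch of how linkage between two biallelic SNPs can be quantified with the r² statistic across haploid genomes. It is an illustration only and not the authors' pipeline: the function name `linkage_r2` and the toy allele columns are hypothetical and are not data from the study.

```python
def linkage_r2(site_a, site_b):
    """Return r^2 between two biallelic sites observed in the same haploid genomes.

    site_a, site_b: equal-length sequences of alleles, one entry per genome.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    # Use the allele carried by the first genome as the reference allele at each site.
    ref_a, ref_b = site_a[0], site_b[0]
    p_a = sum(x == ref_a for x in site_a) / n
    p_b = sum(y == ref_b for y in site_b) / n
    # Frequency of the haplotype carrying both reference alleles.
    p_ab = sum(x == ref_a and y == ref_b for x, y in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Toy allele columns (invented): ten genomes, two sites.
complete = linkage_r2("TTTTTTCCCC", "CCCCCCTTTT")   # alleles always co-occur
one_break = linkage_r2("TTTTTTCCCC", "CCCCCTTTTT")  # one genome breaks the pattern
print(complete, one_break)
```

An r² close to 1 means that knowing the allele at one site almost determines the allele at the other, which is the sense in which two SNPs can define two lineages; a single discordant genome in this toy set already lowers the value noticeably.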
