
SCIENCE CHINA Life Sciences, Volume 62, Issue 4: 526-534 (2019) https://doi.org/10.1007/s11427-018-9454-7

Gain of transcription factor binding sites is associated to changes in the expression signature of human brain and testis and is correlated to genes with higher expression breadth

  • Received Sep 20, 2018
  • Accepted Oct 15, 2018
  • Published Mar 22, 2019

Abstract

The gain of transcription factor binding sites (TFBS) is believed to represent one of the major causes of biological innovation. Here we used strategies based on comparative genomics to identify 21,822 TFBS specific to the human lineage (TFBS-HS), when compared to chimpanzee and gorilla genomes. More than 40% (9,206) of these TFBS-HS are in the vicinity of 1,283 genes. A comparison of the expression pattern of these genes and the corresponding orthologs in chimpanzee and gorilla identified genes differentially expressed in human tissues. These genes show a more divergent expression pattern in the human testis and brain, suggesting a role for positive selection in the fixation of TFBS gains. Genes associated with TFBS-HS were enriched in gene ontology categories related to transcriptional regulation, signaling, differentiation/development and nervous system. Furthermore, genes associated with TFBS-HS present a higher expression breadth when compared to genes in general. This biased distribution is due to a preferential gain of TFBS in genes with higher expression breadth rather than a shift in the expression pattern after the gain of TFBS.
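Two quantities in the abstract lend themselves to a short illustration: the assignment of TFBS-HS to nearby genes and the expression breadth of a gene. The following is a minimal sketch, not the authors' pipeline. It assumes that the "6 kb window flanking the TSS" mentioned in the Figure 1 caption means 3 kb on each side, that a TFBS is located by its midpoint, and that breadth is the number of tissues whose expression reaches a cutoff. The function names, the `flank` and `threshold` parameters, and all toy data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def genes_near_tfbs(tfbs, tss, flank=3000):
    """Map each gene to the TFBS midpoints lying within `flank` bp of its TSS.

    tfbs: iterable of (chrom, start, end) intervals
    tss:  dict of gene -> (chrom, position)
    flank=3000 is an assumed reading of the 6 kb window (3 kb per side).
    """
    # Index sorted TFBS midpoints per chromosome for binary search.
    mids = {}
    for chrom, start, end in tfbs:
        mids.setdefault(chrom, []).append((start + end) // 2)
    for positions in mids.values():
        positions.sort()

    hits = {}
    for gene, (chrom, pos) in tss.items():
        positions = mids.get(chrom, [])
        lo = bisect_left(positions, pos - flank)
        hi = bisect_right(positions, pos + flank)
        if hi > lo:
            hits[gene] = positions[lo:hi]
    return hits


def expression_breadth(expr_by_tissue, threshold=1.0):
    """Count tissues in which expression reaches `threshold` (assumed cutoff)."""
    return sum(1 for level in expr_by_tissue.values() if level >= threshold)


# Toy data, invented for illustration only.
tfbs = [("chr1", 1000, 1020), ("chr1", 9000, 9015), ("chr2", 500, 512)]
tss = {"GENE_A": ("chr1", 2500), "GENE_B": ("chr1", 20000), "GENE_C": ("chr2", 400)}
hits = genes_near_tfbs(tfbs, tss)
breadth = expression_breadth({"brain": 5.0, "liver": 0.2, "testis": 1.0})

# Share of TFBS-HS near genes, using the counts reported in the abstract.
share = 9206 / 21822
print(hits, breadth, round(share * 100, 1))
```

With real data, a single TFBS can fall in the window of more than one gene, so counts of associated sites should be taken over unique sites rather than over gene-site pairs.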


Funded by

CAPES Ph.D. fellowships; the Ludwig Institute for Cancer Research; CAPES (23038.004629/2014-19)


Acknowledgment

The authors are indebted to Jorge E.S. de Souza for discussions on the gene expression analysis. VLS and AMRS were supported by CAPES Ph.D. fellowships. This work was supported by the Ludwig Institute for Cancer Research and by CAPES (23038.004629/2014-19).


Interest statement

The author(s) declare that they have no conflict of interest.


Supplement

SUPPORTING INFORMATION

Table S1 Enrichment analysis of transcription factors within the set of TFBS-HS

Table S2 Differential expression of genes associated to TFBS-HS

The supporting information is available online at http://life.scichina.com and http://link.springer.com. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.


  • Figure 1

    Workflow for the analysis presented here (A). For more details see Methods. Genome coordinates of TFBS mapped to human-specific regions (TFBS-HS) were compared to the coordinates of human Reference Sequences (B). TFBS-HS associated with human genes were defined based on the location of a given TFBS-HS within a 6 kb window flanking the TSS (C). The resulting set of 9,206 TFBS and 1,283 genes was then evaluated for differential expression, ontology and expression breadth.

  • Figure 2

    Evolution of gene expression in genes associated with TFBS-HS. Pace of expression change in human and chimpanzee brain (A), testis (B) and kidney (C). Expression of the corresponding genes in gorilla was used as a reference. All human genes are shown as a bold line, while TFBS-HS-associated genes are shown as dashed lines.

  • Figure 3

    Gene ontology enrichment analysis for human genes associated to TFBS-HS. Bar color refers to the significance of enrichment.

  • Figure 4

    Expression breadth of genes associated with TFBS-HS. A, Pattern of expression breadth for all human genes (red line) and genes associated with TFBS-HS (blue line) using data from the Human Body Map. B, Same analysis as in (A) using data from the Mammalian Project. C, Pattern of expression breadth for all chimpanzee genes (red line) and for the genes orthologous to the human genes associated with TFBS-HS (blue line). D, Pattern of expression breadth for all gorilla genes (red line) and for the genes orthologous to the human genes associated with TFBS-HS (blue line).

  • Table 1   Genes differentially represented in human tissues

    Gene name        Human (%)  Chimpanzee (%)  Gorilla (%)  Tissue
    RPS6KA5          100.0      57.7            41.0         Brain
    GAREML           37.3       69.3            91.8         Brain
    PTCHD1           58.7       94.7            96.2         Brain
    CAMTA1           7.7        73.3            53.6         Brain
    WNT7B            57.3       96.2            90.9         Brain
    FAM72B           100.0      13.3            21.4         Brain
    C17orf67         15.5       56.2            63.6         Brain
    FAM228B          69.3       13.8            12.3         Brain
    SALL3            100.0      32.3            22.4         Brain
    STON1-GTF2A1L    66.7       17.9            0.1          Heart
    C11orf21         93.4       44.4            33.3         Heart
    SCUBE3           63.2       26.2            0.0          Heart
    MB21D1           54.2       19.6            23.5         Heart
    MMP23B           100.0      55.0            58.3         Heart
    CXorf40A         50.0       15.6            10.8         Heart
    CFD              60.3       17.5            7.5          Heart
    SNORD15B         41.0       0.0             0.0          Heart
    POLR2J2          100.0      10.9            14.0         Heart
    C17orf75         88.9       10.9            18.2         Kidney
    F2RL3            11.0       72.7            71.4         Kidney
    CRIM1            16.1       52.4            49.3         Kidney
    IHH              66.9       21.6            9.0          Kidney
    GBA              0.0        62.4            40.3         Kidney
    ODF3B            4.9        44.0            35.5         Kidney
    MRGPRX3          30.3       0.0             0.0          Kidney
    SALL3            0.0        61.5            36.7         Kidney
    SYT7             32.9       0.7             1.0          Liver
    RAP2C            62.1       31.5            27.5         Liver
    RALGPS1          36.7       0.7             2.9          Liver
    MGAT4B           83.2       28.9            32.6         Liver
    IHH              26.8       77.7            90.5         Liver
    CD52             2.5        37.3            53.1         Liver
    PUSL1            100.0      25.0            33.3         Liver
    GBA              66.7       13.5            15.3         Liver
    ODF3B            14.7       46.6            49.1         Liver
    TCF3             83.4       14.1            46.9         Testis
    SLC35E4          63.7       6.2             17.5         Testis
    TNNC2            89.9       17.5            55.3         Testis
    KCNN4            57.9       12.1            25.0         Testis
    C17orf75         0.0        40.0            48.5         Testis
    VAX1             61.5       9.1             0.0          Testis
    NOD2             33.6       2.3             0.0          Testis
    CD52             96.2       4.3             4.7          Testis
    ELP5             92.0       59.1            35.4         Testis
    IRX2             44.9       14.3            12.8         Testis
    SHCBP1           50.3       90.2            98.4         Testis
    LRRC37A          69.0       37.6            20.3         Testis
    ODF3B            44.2       3.1             5.5          Testis
    MRGPRX3          36.4       80.0            100.0        Testis
    FAM72B           0.0        80.0            78.6         Testis
    FAM228B          0.0        77.3            80.2         Testis

    Values correspond to the proportional level of expression of the corresponding gene in the corresponding tissue.

  • Table 2   Expression divergence between species used in this study

                                             Brain   Heart   Liver   Kidney  Testis
    Median distance (TFBS-HS)                5.4     3.6     2.00    4.00    8.00
    Median distance (all orthologous genes)  2.66    2.00    1.33    3.33    5.33
    % increase                               103     80      50.38   20.12   50.1

    Values correspond to the median distance of orthologous genes to a hypothetical line representing equal expression in the same tissue in all three species.
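
The arithmetic behind Table 2 can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code. It assumes the "hypothetical line representing equal expression" is the diagonal human = chimpanzee = gorilla in a three-dimensional expression space, that distance is Euclidean, and that the "% increase" row is the relative difference between the two medians. The expression scale used in the study is not given here, and the function names and toy expression values are hypothetical.

```python
import math
from statistics import median


def distance_to_equal_expression(human, chimp, gorilla):
    """Euclidean distance from the point (human, chimp, gorilla) to the line x = y = z.

    The closest point on that line has all three coordinates equal to the mean.
    """
    mean = (human + chimp + gorilla) / 3.0
    return math.sqrt(
        (human - mean) ** 2 + (chimp - mean) ** 2 + (gorilla - mean) ** 2
    )


def percent_increase(subset_median, background_median):
    """Relative increase of the subset median over the background median, in percent."""
    return (subset_median / background_median - 1.0) * 100.0


# Medians from Table 2: (TFBS-HS genes, all orthologous genes).
table2 = {
    "Brain": (5.4, 2.66),
    "Heart": (3.6, 2.00),
    "Liver": (2.00, 1.33),
    "Kidney": (4.00, 3.33),
    "Testis": (8.00, 5.33),
}
increase = {tissue: percent_increase(a, b) for tissue, (a, b) in table2.items()}

# Toy expression triplets (human, chimp, gorilla), invented for illustration.
toy_genes = [(3.0, 0.0, 0.0), (2.0, 2.0, 2.0), (1.0, 2.0, 3.0)]
toy_median = median(distance_to_equal_expression(*g) for g in toy_genes)

for tissue, value in increase.items():
    print(f"{tissue}: {value:.2f}%")
```

Under these assumptions the computed increases agree with the "% increase" row of the table after rounding, with brain showing the largest relative gain.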
