PLoS Comput Biol
2019 Aug 26;15(8):e1006940. doi: 10.1371/journal.pcbi.1006940.
CNEr: A toolkit for exploring extreme noncoding conservation.
Tan G, Polychronopoulos D, Lenhard B.
Abstract
Conserved Noncoding Elements (CNEs) are elements exhibiting extreme noncoding conservation in Metazoan genomes. They cluster around developmental genes and act as long-range enhancers, yet nothing that we know about their function explains the observed conservation levels. Clusters of CNEs coincide with topologically associating domains (TADs), indicating ancient origins and stability of TAD locations. This has suggested further hypotheses about the still elusive origin of CNEs, and has provided a comparative genomics-based method of estimating the position of TADs around developmentally regulated genes in genomes where chromatin conformation capture data is missing. To enable researchers in gene regulation and chromatin biology to start deciphering this phenomenon, we developed CNEr, an R/Bioconductor toolkit for large-scale identification of CNEs and for studying their genomic properties. We apply CNEr to two novel genome comparisons (fruit fly vs tsetse fly, and two sea urchin genomes) and report novel insights gained from their analysis. We also show how to reveal interesting characteristics of CNEs by coupling CNEr with existing Bioconductor packages. CNEr is available at Bioconductor (https://bioconductor.org/packages/CNEr/) and maintained at GitHub (https://github.com/ge11232002/CNEr).
Fig 1. CNEr workflow. (A) A typical pipeline for the identification and visualisation of CNEs. (B) Illustration of scanning an alignment for CNEs. The scanning window moves along the alignment to find conserved regions. Exon and repeat regions are skipped during scanning by default.
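The window scan described in Fig 1B can be illustrated with a minimal Python sketch. This is not CNEr's implementation (CNEr is an R/Bioconductor package); the function name `scan_alignment`, its parameters, and the handling of masked columns are assumptions made for illustration. Here a column flagged in `skip` (standing in for exons or repeats) is simply treated as a non-match, which is a simplification of "skipping" those regions.

```python
def scan_alignment(ref, qry, window=50, min_matches=48, skip=None):
    """Return merged [start, end) column intervals of a pairwise alignment
    in which some window of `window` columns holds at least `min_matches`
    identical bases. Hypothetical helper, not part of CNEr."""
    if len(ref) != len(qry):
        raise ValueError("aligned sequences must have equal length")
    if skip is None:
        skip = [False] * len(ref)
    # 1 where the column is an identical, ungapped, unmasked match.
    match = [
        int(a == b and a != "-" and not masked)
        for a, b, masked in zip(ref.upper(), qry.upper(), skip)
    ]
    hits = []
    if len(match) < window:
        return hits
    count = sum(match[:window])
    for start in range(len(match) - window + 1):
        if start > 0:
            # Slide by one column: add the entering column, drop the leaving one.
            count += match[start + window - 1] - match[start - 1]
        if count >= min_matches:
            end = start + window
            if hits and start <= hits[-1][1]:
                # Overlapping or adjacent windows are merged into one region.
                hits[-1] = (hits[-1][0], end)
            else:
                hits.append((start, end))
    return hits
```

With `window=50` and `min_matches=48` this corresponds to the "48/50" (96% identity over 50 bp) notation used in the figure captions; `window=30, min_matches=21` corresponds to "21/30" (70% over 30 bp).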
Fig 2. Horizon plot of CNE density around key developmental genes, with D. melanogaster as reference. (A) The H15 and mid genes are spanned by arrays of CNEs. Despite the much lower CNE density in the D. melanogaster vs Glossina comparison, a CNE cluster boundary shows up that is consistent with CNEs from other Drosophila species. (B) The CNE cluster around the ct gene is missing in the comparison of D. melanogaster and Glossina since no CNEs are detected. This implies that this region undergoes a higher CNE turnover rate. (C, D) The same loci as in (A, B) are shown on the Ancora browser in order to compare the normal CNE density plot with the horizon plot. Notations: ensGene, Ensembl gene track; Glossina 21/30, G. morsitans 70% identity over 30 bp; droAna2 49/50, D. ananassae 98% identity over 50 bp; dp3 48/50, D. pseudoobscura 96% identity over 50 bp; droMoj2 48/50, D. mojavensis 96% identity over 50 bp; droVir2 48/50, D. virilis 96% identity over 50 bp.
Fig 3. The number of CNEs between D. melanogaster and several other species, including G. morsitans. CNEs at 70%, 80% and 90% identity over 30 bp vs. D. ananassae, and at 70% identity over 30 bp vs. D. pseudoobscura, are not counted because these thresholds are too low for such closely related species.
Fig 4. Over-represented GO biological process terms ranked by GeneRatio. The gene ratio is defined as the number of genes associated with the term in our selected genes divided by the number of selected genes. The universe for the over-representation test is all Drosophila genes with GO annotations. The p-values are adjusted using "BH" (Benjamini-Hochberg) correction and the cutoff is 0.05. The visualisation is done by clusterProfiler [35]. (A) GO enrichment for genes nearest to Drosophila and Glossina CNEs. (B) GO enrichment for genes in the missing CNE clusters compared between Drosophila and Glossina.
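The two quantities named in the Fig 4 caption, the gene ratio and the Benjamini-Hochberg adjustment, can be sketched in a few lines of Python. This is an illustration only and not clusterProfiler's code; the function names `gene_ratio` and `bh_adjust` are hypothetical.

```python
def gene_ratio(term_genes, selected_genes):
    """Fraction of the selected genes that are annotated with the term."""
    selected = set(selected_genes)
    if not selected:
        raise ValueError("selected_genes must not be empty")
    return len(selected & set(term_genes)) / len(selected)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A term would then be reported as over-represented when its adjusted p-value falls below the 0.05 cutoff stated in the caption.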
Fig 6. Horizon plot of CNE density at Meis loci in the sea urchin Strongylocentrotus purpuratus. The density plots of CNEs detected in the comparison with Lytechinus variegatus at similarity thresholds of 96% (48/50), 98% (49/50) and 100% (50/50) over a 50 bp sliding window are shown in three horizon plot tracks. The boundaries of CNE clusters from the various thresholds are mutually consistent.
Fig 7. Pairwise overlap analysis of CNEs demonstrates association with (A) Sox2, (B) Pou5f1 and (C) Nanog binding regions. In all three cases, permutation tests with 1000 permutations of CNEs are shown. The number of overlaps of the randomized regions with the test regions of interest (in this case, Sox2, Pou5f1 and Nanog) is depicted in grey. The overlaps of the randomized regions cluster around the black bar that represents the mean. The number of overlaps of the actual regions (CNEs in this case) with the test regions is shown in green and is much larger than expected in all cases. The red line denotes the significance limit.
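The logic of the permutation test summarised in Fig 7 can be sketched as follows. This is a deliberately simplified Python illustration, not regioneR's implementation: it assumes a single linear chromosome of length `genome_length`, places each region uniformly at random while keeping its width, and uses hypothetical function names (`count_overlaps`, `permutation_test`).

```python
import random


def count_overlaps(regions, targets):
    """Number of regions overlapping at least one target.
    Intervals are half-open (start, end) tuples."""
    return sum(
        any(s < t_end and t_start < e for t_start, t_end in targets)
        for s, e in regions
    )


def permutation_test(regions, targets, genome_length, n_perm=1000, seed=0):
    """Return (observed overlaps, mean of permuted overlaps, empirical p-value)."""
    rng = random.Random(seed)
    observed = count_overlaps(regions, targets)
    null = []
    for _ in range(n_perm):
        shuffled = []
        for s, e in regions:
            width = e - s
            # Assumption: uniform random placement, width preserved.
            new_start = rng.randint(0, genome_length - width)
            shuffled.append((new_start, new_start + width))
        null.append(count_overlaps(shuffled, targets))
    # One-sided empirical p-value with the usual +1 correction.
    p_value = (sum(n >= observed for n in null) + 1) / (n_perm + 1)
    return observed, sum(null) / n_perm, p_value
```

In terms of the figure, `observed` corresponds to the green line, the mean of the permuted counts to the black bar, and the p-value cutoff to the red significance limit.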
Anderson, Mapping the Shh long-range regulatory domain. 2014, Pubmed
Ayad, CNEFinder: finding conserved non-coding elements in genomes. 2018, Pubmed
Bagadia, Evolutionary Loss of Genomic Proximity to Conserved Noncoding Elements Impacted the Gene Expression Dynamics During Mammalian Brain Development. 2019, Pubmed
Bejerano, Ultraconserved elements in the human genome. 2004, Pubmed
Cameron, SpBase: the sea urchin genome database and web site. 2009, Pubmed, Echinobase
Cantarel, MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. 2008, Pubmed
Chiang, Ultraconserved elements: analyses of dosage sensitivity, motifs and boundaries. 2008, Pubmed
Chiang, Ultraconserved elements in the human genome: association and transmission analyses of highly constrained single-nucleotide polymorphisms. 2012, Pubmed
de la Calle-Mustienes, A functional survey of the enhancer activity of conserved non-coding sequences from vertebrate Iroquois cluster gene deserts. 2005, Pubmed
Diao, A new class of temporarily phenotypic enhancers identified by CRISPR/Cas9-mediated genetic screening. 2016, Pubmed
Engström, Genomic regulatory blocks underlie extensive microsynteny conservation in insects. 2007, Pubmed
Engström, Ancora: a web resource for exploring highly conserved noncoding elements and their association with developmental regulatory genes. 2008, Pubmed
Gel, regioneR: an R/Bioconductor package for the association analysis of genomic regions based on permutation tests. 2016, Pubmed
Hahne, Visualizing Genomic Data Using Gviz and Bioconductor. 2016, Pubmed
Harmston, Topologically associating domains are ancient features that coincide with Metazoan clusters of extreme noncoding conservation. 2017, Pubmed
Harmston, The mystery of extreme non-coding conservation. 2013, Pubmed
Hubisz, PHAST and RPHAST: phylogenetic analysis with space/time models. 2011, Pubmed
International Glossina Genome Initiative, Genome sequence of the tsetse fly (Glossina morsitans): vector of African trypanosomiasis. 2014, Pubmed
Jaillon, Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. 2004, Pubmed
Kent, The human genome browser at UCSC. 2002, Pubmed
Kent, BLAT--the BLAST-like alignment tool. 2002, Pubmed
Kent, Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. 2003, Pubmed
Kiełbasa, Adaptive seeds tame genomic sequence comparison. 2011, Pubmed
Kikuta, Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates. 2007, Pubmed
Kolder, A full-body transcriptome and proteome resource for the European common carp. 2016, Pubmed
Lowe, Three periods of regulatory innovation during vertebrate evolution. 2011, Pubmed
McCole, Ultraconserved Elements Occupy Specific Arenas of Three-Dimensional Mammalian Genome Organization. 2018, Pubmed
Montalbano, High-Throughput Approaches to Pinpoint Function within the Noncoding Genome. 2017, Pubmed
Mumbach, Enhancer connectome in primary human cells identifies target genes of disease-associated DNA elements. 2017, Pubmed
Nepveu, Role of the multifunctional CDP/Cut/Cux homeodomain transcription factor in regulating differentiation, cell growth and development. 2001, Pubmed
Pennacchio, In vivo enhancer analysis of human conserved non-coding sequences. 2006, Pubmed
Polychronopoulos, Conserved non-coding elements: developmental gene regulation meets genome organization. 2017, Pubmed
Reim, Tbx20-related genes, mid and H15, are required for tinman expression, proper patterning, and normal differentiation of cardioblasts in Drosophila. 2005, Pubmed
Royo, Identification and analysis of conserved cis-regulatory regions of the MEIS1 gene. 2012, Pubmed
Sandelin, Arrays of ultraconserved non-coding regions span the loci of key developmental genes in vertebrate genomes. 2004, Pubmed
Sanjana, High-resolution interrogation of functional elements in the noncoding genome. 2016, Pubmed
Schwartz, Human-mouse alignments with BLASTZ. 2003, Pubmed
Sheffield, LOLA: enrichment analysis for genomic region sets and regulatory elements in R and Bioconductor. 2016, Pubmed
Walter, Striking nucleotide frequency pattern at the borders of highly conserved vertebrate non-coding sequences. 2005, Pubmed
Woolfe, Highly conserved non-coding sequences are associated with vertebrate development. 2005, Pubmed
Wright, CRISPR Screens to Discover Functional Noncoding Elements. 2016, Pubmed
Yates, Ensembl 2016. 2016, Pubmed
Yu, clusterProfiler: an R package for comparing biological themes among gene clusters. 2012, Pubmed