Genomics Proteomics Bioinformatics
2008 Dec 01;6(3-4):144-54. doi: 10.1016/S1672-0229(09)60002-4.
Comparative genomic study reveals a transition from TA richness in invertebrates to GC richness in vertebrates at CpG flanking sites: an indication for context-dependent mutagenicity of methylated CpG sites.
Abstract
Vertebrate genomes are characterized by CpG deficiency, particularly in GC-poor regions. This GC content-related CpG deficiency is probably caused by context-dependent deamination of methylated CpG sites. We examined this hypothesis by comparing nucleotide frequencies at CpG flanking positions among invertebrate and vertebrate genomes. The main finding is a transition in nucleotide preference from 5′ T to 5′ A at the invertebrate-vertebrate boundary, indicating that a large number of CpG sites with a 5′ T were depleted by the global DNA methylation that developed in vertebrates. At the genome level, we investigated CpG observed/expected (obs/exp) values in 500 bp fragments and found that higher CpG obs/exp values occur in GC-poor regions of invertebrate genomes (except sea urchin) but in GC-rich regions of vertebrate genomes. We next compared the GC content at CpG flanking positions with the genomic average, showing that it is lower than the average in invertebrate genomes but higher than the average in vertebrate genomes. These results indicate that although 5′ T and 5′ A differ in how strongly they induce deamination of methylated CpG sites, GC content is even more important in determining the deamination rate. In all the tests, the results for sea urchin are similar to those for vertebrates, perhaps because of its fractional DNA methylation. CpG deficiency is therefore suggested to be mainly a result of high mutation rates of methylated CpG sites in GC-poor regions.
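The fragment-level measurement described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used ratio obs/exp = (number of CpG × fragment length) / (number of C × number of G); the abstract does not state the exact formula the authors used. The function names, the rule of skipping fragments that contain N (masked) bases, and the handling of group boundaries are likewise assumptions made for illustration. The three GC groups follow the ranges given in the figure captions.

```python
def cpg_obs_exp(fragment):
    """CpG obs/exp = (#CpG * N) / (#C * #G); None when C or G is absent."""
    seq = fragment.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return None
    return seq.count("CG") * len(seq) / (c * g)


def gc_content(fragment):
    """Fraction of G and C bases in the fragment (0.0 for an empty string)."""
    seq = fragment.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_group(gc):
    """Assign a GC fraction to one of the three groups used in the figures.

    Boundary handling (0.35 goes to the middle group, 0.50 stays in it)
    is an assumption; fragments below 20% GC are left unclassified.
    """
    if 0.20 <= gc < 0.35:
        return "20-35%"
    if 0.35 <= gc <= 0.50:
        return "35-50%"
    if gc > 0.50:
        return ">50%"
    return None


def summarize(sequence, size=500):
    """Collect CpG obs/exp values of non-overlapping fragments per GC group."""
    groups = {}
    for start in range(0, len(sequence) - size + 1, size):
        frag = sequence[start:start + size]
        if "N" in frag.upper():  # assumption: skip fragments with masked bases
            continue
        ratio = cpg_obs_exp(frag)
        group = gc_group(gc_content(frag))
        if ratio is not None and group is not None:
            groups.setdefault(group, []).append(ratio)
    return groups
```

For example, cpg_obs_exp("ACGT") returns 4.0, since the fragment has one CpG, one C, one G and a length of 4.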
Fig. 1. Nucleotide frequency at CpG flanking positions. Nucleotide frequencies at CpG flanking positions were obtained from repeat-masked genomic sequences. Six 5′ flanking positions are labeled as −1 to −6; six 3′ flanking positions are labeled as 1 to 6.
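The position labeling in this caption can be made concrete with a short sketch that tallies base frequencies at the six positions on either side of each CpG. It is an illustration under stated assumptions, not the authors' pipeline: the function name is hypothetical, only the given strand is scanned, CpG sites without a full flank on both sides are skipped, and non-ACGT characters (for example N from masking) are ignored.

```python
from collections import Counter


def flanking_frequencies(sequence, flank=6):
    """Base frequencies at positions -flank..-1 and 1..flank around each CpG.

    Position -1 is the base immediately 5' of the C; position 1 is the base
    immediately 3' of the G.
    """
    seq = sequence.upper()
    positions = list(range(-flank, 0)) + list(range(1, flank + 1))
    counts = {pos: Counter() for pos in positions}

    start = seq.find("CG")
    while start != -1:
        # Assumption: use only CpG sites with complete flanks on both sides.
        if start - flank >= 0 and start + 2 + flank <= len(seq):
            for k in range(1, flank + 1):
                upstream = seq[start - k]
                downstream = seq[start + 1 + k]
                if upstream in "ACGT":
                    counts[-k][upstream] += 1
                if downstream in "ACGT":
                    counts[k][downstream] += 1
        start = seq.find("CG", start + 1)

    freqs = {}
    for pos, counter in counts.items():
        total = sum(counter.values())
        freqs[pos] = {b: counter[b] / total for b in "ACGT"} if total else {}
    return freqs
```

Comparing the frequency of T and A at position −1 across genomes is the kind of contrast the abstract describes at the invertebrate-vertebrate boundary.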
Fig. 2. Percentage of genomic fragments in different CpG obs/exp (o/e) ranges. CpG obs/exp values were calculated for all genomic fragments in 500 bp units. The fragments were classified into three groups according to GC content. The x axis shows the CpG obs/exp ranges; the y axis shows the proportion of fragments falling in each range. The three lines represent the proportions of genomic fragments in the three GC groups: G+C=20%–35% (solid black line), 35%–50% (dashed line) and >50% (gray solid line).
Fig. 3. Cumulative percent of genomic fragments in CpG obs/exp (o/e) ranges. All 500 bp genomic fragments from 8 organisms were classified according to their GC level (G+C=20%–35%, 35%–50% and >50%), and their CpG obs/exp values were then measured. In each GC level group, the percentage of fragments belonging to each CpG obs/exp range (interval of 0.05) was measured as shown in Figure 2, and the cumulative percentage was then calculated for each range. The numbers on the x axis represent the starting value of each CpG obs/exp range; the y axis denotes the cumulative percentage of genomic fragments.
Fig. 4. G+C percent difference between real and Markov artificial sequences. Positions −6 to +6 are the 12 flanking positions of CpG sites. GC content (in percent) was measured in real and Markov artificial sequences, and the difference (GCreal − GCartificial) is shown for each position. The artificial sequences were first-order Markov sequences created from repeat-masked real genomic sequences.
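A first-order Markov control of the kind named in this caption can be sketched as follows. The sketch estimates base-to-base transition counts from a real sequence, samples an artificial sequence from them, and computes GCreal − GCartificial at one flanking offset. All function names, the seeding, the fallback to overall base composition when a base has no observed successor, and the choice to ignore non-ACGT characters are assumptions for illustration; the caption does not describe how the authors generated their sequences.

```python
import random
from collections import Counter

BASES = "ACGT"


def transition_counts(real):
    """Count adjacent base pairs (a -> b), skipping pairs with non-ACGT bases."""
    seq = real.upper()
    counts = {a: Counter() for a in BASES}
    for a, b in zip(seq, seq[1:]):
        if a in BASES and b in BASES:
            counts[a][b] += 1
    return counts


def markov1_sequence(real, length, seed=0):
    """Sample a first-order Markov sequence with transitions estimated from `real`."""
    if length <= 0:
        return ""
    totals = Counter(b for b in real.upper() if b in BASES)
    if not totals:
        raise ValueError("real sequence contains no A, C, G or T")
    rng = random.Random(seed)
    counts = transition_counts(real)
    base_weights = [totals[b] for b in BASES]
    current = rng.choices(BASES, weights=base_weights)[0]
    out = [current]
    while len(out) < length:
        weights = [counts[current][b] for b in BASES]
        if sum(weights) == 0:
            # Assumption: with no observed successor, fall back to base composition.
            weights = base_weights
        current = rng.choices(BASES, weights=weights)[0]
        out.append(current)
    return "".join(out)


def gc_percent_at(sequence, offset):
    """GC percent at one flanking offset (-6..-1 before the C, 1..6 after the G)."""
    seq = sequence.upper()
    gc = total = 0
    start = seq.find("CG")
    while start != -1:
        idx = start + offset if offset < 0 else start + 1 + offset
        if 0 <= idx < len(seq) and seq[idx] in BASES:
            total += 1
            gc += seq[idx] in "GC"
        start = seq.find("CG", start + 1)
    return 100.0 * gc / total if total else None


def gc_difference(real, offset, seed=0):
    """GCreal - GCartificial at one offset; None if either side has no CpG data."""
    artificial = markov1_sequence(real, len(real), seed)
    r = gc_percent_at(real, offset)
    a = gc_percent_at(artificial, offset)
    return None if r is None or a is None else r - a
```

Because a first-order model reproduces dinucleotide composition but not longer-range context, differences from the real sequence at positions beyond the immediate neighbors reflect context that the model cannot capture.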