Mob DNA
2011 Oct 19;2(1):12. doi: 10.1186/1759-8753-2-12.
Crypton transposons: identification of new diverse families and ancient domestication events.
Kojima KK, Jurka J.
Abstract
BACKGROUND: "Domestication" of transposable elements (TEs) led to evolutionary breakthroughs such as the origin of telomerase and the vertebrate adaptive immune system. These breakthroughs were accomplished by the adaptation of molecular functions essential for TEs, such as reverse transcription, DNA cutting and ligation or DNA binding. Cryptons represent a unique class of DNA transposons using tyrosine recombinase (YR) to cut and rejoin the recombining DNA molecules. Cryptons were originally identified in fungi and later in the sea anemone, sea urchin and insects.
RESULTS: Herein we report new Cryptons from animals, fungi, oomycetes and diatom, as well as widely conserved genes derived from ancient Crypton domestication events. Phylogenetic analysis based on the YR sequences supports four deep divisions of Crypton elements. We found that the domain of unknown function 3504 (DUF3504) in eukaryotes is derived from Crypton YR. DUF3504 is similar to YR but lacks most of the residues of the catalytic tetrad (R-H-R-Y). Genes containing the DUF3504 domain are potassium channel tetramerization domain containing 1 (KCTD1), KIAA1958, zinc finger MYM type 2 (ZMYM2), ZMYM3, ZMYM4, glutamine-rich protein 1 (QRICH1) and "without children" (WOC). The DUF3504 genes are highly conserved and are found in almost all jawed vertebrates. The sequence, domain structure, intron positions and synteny blocks support the view that ZMYM2, ZMYM3, ZMYM4, and possibly QRICH1, were derived from WOC through two rounds of genome duplication in early vertebrate evolution. WOC is observed widely among bilaterians. There could be four independent events of Crypton domestication, and one of them, generating WOC/ZMYM, predated the birth of bilaterian animals. This is the third-oldest domestication event known to date, following the domestication generating telomerase reverse transcriptase (TERT) and Prp8. Many Crypton-derived genes are transcriptional regulators with additional DNA-binding domains, and the acquisition of the DUF3504 domain could have added new regulatory pathways via protein-DNA or protein-protein interactions.
CONCLUSIONS: Cryptons have contributed to animal evolution through domestication of their YR sequences. The DUF3504 domains are domesticated YRs of animal Crypton elements.
PubMed: 22011512
PMC: PMC3212892
Figure 1. Schematic structures of Cryptons. Crypton-Cn1 and MarCry-1_FO belong to the CryptonF group. YR = tyrosine recombinase; GCR1_C = DNA-binding domain; DDE = DDE-transposase; C48 = C48 peptidase; HTH = helix-turn-helix motif.
Figure 2. Phylogeny of Cryptons, DUF3504 genes and other eukaryotic tyrosine recombinases. The numbers at nodes are bootstrap values over 40. Open circles indicate the clusters of Cryptons, and filled circles show the clusters of DUF3504 genes. YR = tyrosine recombinase. Prefixes of names are as follows. Cry = Crypton; 1958 = KIAA1958. Accession numbers of DUF3504 genes are shown in Additional file 5. Sequences of the transposable elements are deposited in Repbase http://www.girinst.org/repbase/. Other abbreviations and accession numbers are as follows. FLP = FLP recombinase of the 2-micron plasmid in Saccharomyces cerevisiae (NP_040488); FLP_Klac = FLP recombinase of the plasmid pKD1 in Kluyveromyces lactis (YP_355327); CRE = Cre recombinase of the enterobacteria phage P1 (YP_006472); Vlf1_AcNPV = very late expression factor 1 from the Autographa californica nucleopolyhedrovirus (NP_054107); Tn916 = Tn916 transposase from Enterococcus faecalis (NP_0687929); XerD = XerD from Escherichia coli (NP_417370); Lambda = lambda phage recombinase (NP_040609); At_Ti = recombinase from the Agrobacterium tumefaciens Ti plasmid (NP_059767); SpPat1 from Strongylocentrotus purpuratus (obtained at http://biocadmin.otago.ac.nz/fmi/xsl/retrobase/home.xsl). Suffixes for species names are as follows. 
Animals: Hs = human, Homo sapiens; Oa = platypus, Ornithorhynchus anatinus; Gg = chicken, Gallus gallus; Tg = zebra finch, Taeniopygia guttata; Ac/ACa = lizard, Anolis carolinensis; Xt/XT = frog, Xenopus tropicalis; Dr/DR = zebrafish, Danio rerio; OL = medaka, Oryzias latipes; Cm = chimaera, Callorhinchus milii; SP = sea urchin, Strongylocentrotus purpuratus; SK = acorn worm, Saccoglossus kowalevskii; Dm = fruit fly, Drosophila melanogaster; Tc/TC/TCa = beetle, Tribolium castaneum; NVi = parasitic wasp, Nasonia vitripennis; CQ = southern house mosquito, Culex quinquefasciatus; AA = yellow fever mosquito, Aedes aegypti; DPu = water flea, Daphnia pulex; Acal = sea hare, Aplysia californica; Sm = bloodfluke, Schistosoma mansoni; NV = sea anemone, Nematostella vectensis. Fungi: RO = Rhizopus oryzae; CGlo = Chaetomium globosum; TS = Talaromyces stipitatus; CI = Coccidioides immitis; FO = Fusarium oxysporum. Stramenopiles: PI = Phytophthora infestans; PS = Phytophthora sojae; PU = Pythium ultimum; HAra = Hyaloperonospora arabidopsidis; ALai = Albugo laibachii; PTri = Phaeodactylum tricornutum. Plants: CR = Chlamydomonas reinhardtii.
Figure 3. Distribution and schematic structures of Crypton-derived genes in Saccharomycetaceae fungi. (A) Schematic protein structures encoded by Crypton-derived genes and Cryptons. (B) Distribution of Crypton-derived genes. Each gene identified in the haploid genome is represented by a plus symbol. (C) The phylogeny of Crypton-derived genes and Cryptons using the GCR1_C domain sequences. The numbers at nodes are bootstrap values over 50. Accession numbers of genes are shown in Additional file 2. "Cry" stands for Crypton. Suffixes for species names are as follows. Sc = Saccharomyces cerevisiae; Cg = Candida glabrata; Vp = Vanderwaltozyma polyspora; Zr = Zygosaccharomyces rouxii; Lt = Lachancea thermotolerans; Kl = Kluyveromyces lactis; Ag = Ashbya gossypii; Ct = Candida tropicalis; Ca = Candida albicans; Ps = Pichia stipitis; Pg = Pichia guilliermondii.
Figure 4. Crypton-derived sequence in an intron of ATF7IP gene. (A) Alignment of proteins coded by deuterostome Cryptons and Crypton-derived sequences. Catalytically essential residues are shown below the alignment. (B) Illustration of the conservation of ATF7IP loci. The position of the YR sequence is indicated by the open box. Black boxes represent exons of the chicken ATF7IP gene. Gray boxes indicate conserved blocks between chicken and respective species based on the Net Tracks of the UCSC Genome Browser http://genome.ucsc.edu/. Lines between gray boxes indicate that boxes are connected by unalignable sequences. (C) Alignment of nucleotide sequences of Crypton-derived sequences.
Figure 5. Schematic structures of DUF3504 proteins. KIAA1958 gene has two isoforms, each of which encodes a DUF3504 domain. The structures of KCTD1, KIAA1958, QRICH1, ZMYM2, ZMYM3 and ZMYM4 are from humans. The structure of WOC is from Drosophila melanogaster.
Figure 6. Distribution of Cryptons and Crypton-derived genes. Each gene identified in the haploid genome is represented by a plus symbol. Minus symbols indicate the absence of Cryptons or Crypton-derived genes. Asterisks indicate the presence of their disrupted fragments. The branch ages are based on TimeTree [30]. The unit of time is indicated. Crypton-derived genes listed at nodes of the tree indicate the times of their domestication based on their distribution in different species. KIAA1958L, QRICH1, ZMYM2, ZMYM3 and ZMYM4 are not shown, because they were likely derived by gene duplications.
Figure 7. Paralogous relationships of WOC/ZMYM/QRICH1 genes. (A) Two conserved intron positions among WOC, ZMYM2, ZMYM3, ZMYM4 and QRICH1. Introns are printed in lowercase letters and shaded. Protein sequences are shown below nucleotide sequences. The upper and lower intron positions correspond to the 20th and 22nd introns of human ZMYM2, respectively. (B) The synteny blocks of ZMYM2, ZMYM3 and ZMYM4. Ohnologous relationships reported by Makino and McLysaght [33] are indicated by dotted lines. GJB = gap junction protein β; GJA = gap junction protein α; DLGAP3 = discs large homolog-associated protein 3; C1orf212 = chromosome 1 open reading frame 212. Other gene names are described in the text.
Altschul,
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
1997,
Pubmed
Anisimova,
Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative.
2006,
Pubmed
Aziz,
Transposases are the most abundant, most ubiquitous genes in nature.
2010,
Pubmed
Bao,
Ginger DNA transposons in eukaryotes and their evolutionary relationships with long terminal repeat retrotransposons.
2010,
Pubmed
Bao,
New superfamilies of eukaryotic DNA transposons and their internal divisions.
2009,
Pubmed
Broach,
Replication and recombination functions associated with the yeast plasmid, 2 mu circle.
1980,
Pubmed
Bundock,
An Arabidopsis hAT-like transposase is essential for plant development.
2005,
Pubmed
Casola,
Convergent domestication of pogo-like transposases into centromere-binding proteins in fission yeast and mammals.
2008,
Pubmed
Curcio,
The outs and ins of transposition: from mu to kangaroo.
2003,
Pubmed
Ding,
The interaction of KCTD1 with transcription factor AP-2alpha inhibits its transactivation.
2009,
Pubmed
Dlakić,
Prp8, the pivotal protein of the spliceosomal catalytic center, evolved from a retroelement-encoded reverse transcriptase.
2011,
Pubmed
Doak,
Selection on the genes of Euplotes crassus Tec1 and Tec2 transposons: evolutionary appearance of a programmed frameshift in a Tec2 gene encoding a tyrosine family site-specific recombinase.
2003,
Pubmed
Dong,
PTB-associated splicing factor (PSF) functions as a repressor of STAT6-mediated Ig epsilon gene transcription by recruitment of HDAC1.
2011,
Pubmed
Dutta,
Kctd15 inhibits neural crest formation by attenuating Wnt/beta-catenin signaling output.
2010,
Pubmed
Edgar,
MUSCLE: multiple sequence alignment with high accuracy and high throughput.
2004,
Pubmed
Feng,
Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition.
1996,
Pubmed
Font-Burgada,
Drosophila HP1c isoform interacts with the zinc-finger proteins WOC and Relative-of-WOC to regulate gene expression.
2008,
Pubmed
Gillette,
Probing the Escherichia coli transcriptional activator MarA using alanine-scanning mutagenesis: residues important for DNA binding and activation.
2000,
Pubmed
Gladyshev,
Telomere-associated endonuclease-deficient Penelope-like retroelements in diverse eukaryotes.
2007,
Pubmed
Gocke,
ZNF198 stabilizes the LSD1-CoREST-HDAC1 complex on chromatin through its MYM-type zinc fingers.
2008,
Pubmed
Goh,
NDC10: a gene involved in chromosome segregation in Saccharomyces cerevisiae.
1993,
Pubmed
Golic,
The FLP recombinase of yeast catalyzes site-specific recombination in the Drosophila genome.
1989,
Pubmed
Goodwin,
The DIRS1 group of retrotransposons.
2001,
Pubmed
Goodwin,
A new group of tyrosine recombinase-encoding retrotransposons.
2004,
Pubmed
Goodwin,
Cryptons: a group of tyrosine-recombinase-encoding DNA transposons from pathogenic fungi.
2003,
Pubmed
Greider,
The telomere terminal transferase of Tetrahymena is a ribonucleoprotein enzyme with two kinds of primer specificity.
1987,
Pubmed
Guindon,
PHYML Online--a web server for fast maximum likelihood-based phylogenetic inference.
2005,
Pubmed
Guindon,
New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0.
2010,
Pubmed
Guo,
Structure of Cre recombinase complexed with DNA in a site-specific recombination synapse.
1997,
Pubmed
Hakimi,
A candidate X-linked mental retardation gene is a component of a new family of histone deacetylase-containing complexes.
2003,
Pubmed
Hammer,
Homologs of Drosophila P transposons were mobile in zebrafish but have been domesticated in a common ancestor of chicken and human.
2005,
Pubmed
Holland,
The GCR1 gene encodes a positive transcriptional regulator of the enolase and glyceraldehyde-3-phosphate dehydrogenase gene families in Saccharomyces cerevisiae.
1987,
Pubmed
Jacobs,
Tec3, a new developmentally eliminated DNA element in Euplotes crassus.
2003,
Pubmed
Jiang,
Isolation and characterization of a gene (CBF2) specifying a protein component of the budding yeast kinetochore.
1993,
Pubmed
Kapitonov,
RAG1 core and V(D)J recombination signal sequences were derived from Transib transposons.
2005,
Pubmed
Kapitonov,
Self-synthesizing DNA transposons in eukaryotes.
2006,
Pubmed
Kapitonov,
Harbinger transposons and an ancient HARBI1 gene derived from a transposase.
2004,
Pubmed
Kapitonov,
A universal classification of eukaryotic transposable elements implemented in Repbase.
2008,
Pubmed
Kasyapa,
Mass spectroscopy identifies the splicing-associated proteins, PSF, hnRNP H3, hnRNP A2/B1, and TLS/FUS as interacting partners of the ZNF198 protein associated with rearrangement in myeloproliferative disease.
2005,
Pubmed
Katoh,
MAFFT version 5: improvement in accuracy of multiple sequence alignment.
2005,
Pubmed
Kohany,
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.
2006,
Pubmed
Kojima,
An extraordinary retrotransposon family encoding dual endonucleases.
2005,
Pubmed
Kunapuli,
ZNF198 protein, involved in rearrangement in myeloproliferative disease, forms complexes with the DNA repair-associated HHR6A/6B and RAD18 proteins.
2003,
Pubmed
Kuraku,
Timing of genome duplications relative to the origin of the vertebrates: did cyclostomes diverge before or after?
2009,
Pubmed
Liu,
MCAF1/AM is involved in Sp1-mediated maintenance of cancer-associated telomerase activity.
2009,
Pubmed
Lorenzi,
The VIPER elements of trypanosomes constitute a novel group of tyrosine recombinase-enconding retrotransposons.
2006,
Pubmed
Makino,
Ohnologs in the human genome are dosage balanced and frequently associated with disease.
2010,
Pubmed
Martin,
Complex formation between activator and RNA polymerase as the basis for transcriptional activation by MarA and SoxS in Escherichia coli.
2002,
Pubmed
Miller,
P-element homologous sequences are tandemly repeated in the genome of Drosophila guanche.
1992,
Pubmed
Nunes-Düby,
Similarities and differences among 105 members of the Int family of site-specific recombinases.
1998,
Pubmed
Pancer,
Somatic diversification of variable lymphocyte receptors in the agnathan sea lamprey.
2004,
Pubmed
Pyatkov,
Reverse transcriptase and endonuclease activities encoded by Penelope-like retroelements.
2004,
Pubmed
Raffa,
The putative Drosophila transcription factor woc is required to prevent telomeric fusions.
2005,
Pubmed
Rajesh,
The splicing-factor related protein SFPQ/PSF interacts with RAD51D and is necessary for homology-directed repair and sister chromatid cohesion.
2011,
Pubmed
Rep,
Osmotic stress-induced gene expression in Saccharomyces cerevisiae requires Msn1p and the novel nuclear factor Hot1p.
1999,
Pubmed
Richards,
Evolution of filamentous plant pathogens: gene exchange across eukaryotic kingdoms.
2006,
Pubmed
Sarkar,
Molecular evolutionary analysis of the widespread piggyBac transposon family and related "domesticated" sequences.
2003,
Pubmed
Shav-Tal,
PSF and p54(nrb)/NonO--multi-functional nuclear proteins.
2002,
Pubmed
Tudor,
The pogo transposable element family of Drosophila melanogaster.
1992,
Pubmed
Volff,
Turning junk into gold: domestication of transposable elements and the creation of new genes in eukaryotes.
2006,
Pubmed
Warner,
Identification of novel Smad binding proteins.
2003,
Pubmed
Xiao,
FGFR1 is fused with a novel zinc-finger gene, ZNF198, in the t(8;13) leukaemia/lymphoma syndrome.
1998,
Pubmed
Yang,
Identification of the endonuclease domain encoded by R2 and other site-specific, non-long terminal repeat retrotransposable elements.
1999,
Pubmed
de Crozé,
Reiterative AP2a activity controls sequential steps in the neural crest gene regulatory network.
2011,
Pubmed
van der Maarel,
Cloning and characterization of DXS6673E, a candidate gene for X-linked mental retardation in Xq13.1.
1996,
Pubmed