PLoS Biol 2005 Jun 01;3(6):e181. doi: 10.1371/journal.pbio.0030181.
RAG1 core and V(D)J recombination signal sequences were derived from Transib transposons.
Kapitonov VV, Jurka J.
Abstract
The V(D)J recombination reaction in jawed vertebrates is catalyzed by the RAG1 and RAG2 proteins, which are believed to have emerged approximately 500 million years ago from transposon-encoded proteins. Yet no transposase sequence similar to RAG1 or RAG2 has been found. Here we show that the approximately 600-amino acid "core" region of RAG1 required for its catalytic activity is significantly similar to the transposase encoded by DNA transposons that belong to the Transib superfamily. This superfamily was discovered recently based on computational analysis of the fruit fly and African malaria mosquito genomes. Transib transposons also are present in the genomes of sea urchin, yellow fever mosquito, silkworm, dog hookworm, hydra, and soybean rust. We demonstrate that recombination signal sequences (RSSs) were derived from terminal inverted repeats of an ancient Transib transposon. Furthermore, the critical DDE catalytic triad of RAG1 is shared with the Transib transposase as part of conserved motifs. We also studied several divergent proteins encoded by the sea urchin and lancelet genomes that are 25%-30% identical to the RAG1 N-terminal domain and the RAG1 core. Our results provide the first direct evidence linking RAG1 and RSSs to a specific superfamily of DNA transposons and indicate that the V(D)J machinery evolved from transposons. We propose that only the RAG1 core was derived from the Transib transposase, whereas the N-terminal domain was assembled from separate proteins of unknown function that may still be active in sea urchin, lancelet, hydra, and starlet sea anemone. We also suggest that the RAG2 protein was not encoded by ancient Transib transposons but emerged in jawed vertebrates as a counterpart of RAG1 necessary for the V(D)J recombination reaction.
Figure 1. Schematic Presentation of Transib Transposons, RAG1, RAG2, and RAG1-Like Proteins in Eukaryotes
The basic timescale of the evolutionary tree is based on published literature [49–51]. Red circles mark species in which Transib TPases were found. Gray squares indicate RAG2; orange and blue ellipses show the RAG1 core and RAG1 N-terminal domain, respectively. Overall taxonomy, including common and Latin names, is reported on the right side of the figure. A question mark at the lamprey lineage indicates insufficient sequence data. A lack of any labels means that the Transib TPase and RAG1/2 are not present in the sequenced portions of the corresponding genomes. Among branches lacking Transib TPases, only lamprey and crocodile genomes are not extensively sequenced to date. In sea anemone, the RAG1 core–like protein is capped by the ring finger motif, which also forms the C-terminus in the RAG1 N-terminal domain. In fungi, the Transib TPase was detected in soybean rust only.
Figure 2. Diversity of the Transib TPases and RAG1 Core–Like Proteins in Animals
The phylogenetic tree was obtained by using the neighbor-joining algorithm implemented in MEGA [44]. Evolutionary distance for each pair of protein sequences was measured as the proportion of aa sites at which the two sequences were different. Its scale is shown by the horizontal bar. Bootstrap values higher than 60% are reported at the corresponding nodes. Species abbreviations are as follows: AA, yellow fever mosquito; AG, African malaria mosquito; BF, lancelet; CL, bull shark; DP, D. pseudoobscura fruit fly; FR, fugu fish; HM, hydra; HS, human; NV, starlet sea anemone; SP, sea urchin; XL, frog. (Transib1 through Transib5 are from the fruit fly D. melanogaster.)
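The distance measure described in the Figure 2 caption (the proportion of amino acid sites at which two sequences differ) can be illustrated with a minimal Python sketch. This is not the MEGA implementation used by the authors. The function names `p_distance` and `pairwise_distances` are hypothetical, and skipping sites where either sequence has a gap is an assumption, since the caption does not say how gaps were treated.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        # Assumption: sites with a gap in either sequence are ignored.
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return different / compared


def pairwise_distances(aligned: dict) -> dict:
    """Distance for every unordered pair of named, pre-aligned sequences."""
    names = sorted(aligned)
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }
```

A matrix of such pairwise values is the kind of input a neighbor-joining tree builder takes; the sequences must already be aligned before the distances are computed.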
Figure 3. Multiple Alignment of Ten Conserved Motifs in the RAG1 Core Proteins and Transib TPases
The motifs are underlined and numbered from 1 to 10. Starting positions of the motifs immediately follow the corresponding protein names. Distances between the motifs are indicated in numbers of aa residues. Black circles denote conserved residues that form the RAG1/Transib catalytic DDE triad. The RAG1 proteins are as follows: RAG1_XL (GenBank GI no. 2501723, Xenopus laevis, frog), RAG1_HS (4557841, Homo sapiens, human), RAG1_GG (131826, Gallus gallus, chicken), RAG1_CL (1470117, Carcharhinus leucas, bull shark), RAG1_FR (4426834, Fugu rubripes, fugu fish). Coloring scheme [43] reflects physicochemical properties of amino acids: black shading marks hydrophobic residues, blue indicates charged (white font), positively charged (red font), and negatively charged (green font); red indicates proline (blue font) and glycine (green font); gray indicates aliphatic (red font) and aromatic (blue font); green indicates polar (black font) and amphoteric (red font); and yellow indicates tiny (blue font) and small (green font). The species abbreviations for the Transib transposons are as follows: AA, yellow fever mosquito; AG, African malaria mosquito; DP, D. pseudoobscura fruit fly. (Transib1 through Transib5 are from the fruit fly D. melanogaster.)
Figure 4. Structural Similarities between the Transib TIRs and V(D)J RSS Signals
The species abbreviations are: AA, yellow fever mosquito; AG, African malaria mosquito; DM, D. melanogaster fruit fly; DP, D. pseudoobscura fruit fly; SP, sea urchin. (Transib1 through Transib5 are from the fruit fly D. melanogaster.)
(A) Frequencies of the most frequent nucleotides at each position of the consensus sequence of the 5′ TIRs of transposons that belong to 20 families of Transib transposons identified in fruit flies and mosquitoes. The RSS23 consensus sequence is shown immediately under the TIRs consensus sequence. The most conserved nucleotides in the RSS23 heptamer and nonamer, which are necessary for efficient V(D)J recombination, are highlighted. The 23 ± 1 bp variable spacer is marked by Ns.
(B) Non-gapped alignment of consensus sequences of 5′ TIRs from 21 families of Transib transposons.
(C) The 12/23 rule follows from the basic structure of TIRs of the consensus sequences of transposons that belong to the Transib5, Transib2_AG, TransibN1_AG, TransibN2_AG, and TransibN3_AG families. The 5′ TIRs of these transposons are aligned with the corresponding 3′ TIRs. Structures of the 5′ and 3′ TIRs resemble RSS12 and RSS23, respectively.
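As a rough illustration of the 12/23 rule mentioned in panel (C), the sketch below classifies a signal by the length of the spacer between its heptamer and nonamer and checks whether two signals form a permitted 12/23 pair. The function names are hypothetical. The ±1 bp tolerance is stated in the caption only for the 23 bp spacer; applying the same tolerance to the 12 bp spacer is an assumption made here.

```python
def spacer_class(spacer_len: int, tolerance: int = 1):
    """Return 12 or 23 if the spacer length fits that class, else None."""
    for nominal in (12, 23):
        # Assumption: the same +/- tolerance applies to both spacer classes.
        if abs(spacer_len - nominal) <= tolerance:
            return nominal
    return None


def obeys_12_23_rule(spacer_a: int, spacer_b: int) -> bool:
    """True only when one signal is RSS12-like and the other RSS23-like."""
    return {spacer_class(spacer_a), spacer_class(spacer_b)} == {12, 23}
```

Under this sketch, a transposon whose 5′ TIR resembles RSS12 and whose 3′ TIR resembles RSS23 would pass the check, whereas two 23 bp signals would not.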
Figure 5. Schematic Structure of the Sea Urchin RAG1-Like Sequences
Contig accession numbers are shown in the left column. Inverted complement contigs are marked by “c” followed by the contig number. In each contig, RAG1-like proteins (white rectangle) are schematically aligned with the human RAG1 core (top rectangle). Nucleotide positions of the RAG1-like sequences are shown beneath the white rectangles. Three pairs of recently duplicated sequences (nucleotide identity is higher than 95%) are underlined by red, green, and black lines, respectively. Transposable and repetitive elements detected in the flanking regions are marked by painted rectangles. Names of these elements are shown above the rectangles. Asterisks denote stop codons in the corresponding RAG1-like sequences. BLASTP E-values characterizing similarities between the sea urchin and RAG1 proteins are shown above the white rectangles. Multiple alignment of these protein sequences is reported in Figure S5.
Figure 6. Multiple Alignment of the RAG1 N-Terminal Domain and Sea Urchin Protein Sequences
RAG1_HS, RAG1_PD, RAG1_SS, RAG1_RM, and RAG1_LM mark the human (GenBank accession number NP_000439), lungfish (AAS75810), pig (BAC54968), stripe-sided rhabdornis or Rhabdornis mysticalis bird (AAQ76078), and latimeria (AAS75807) proteins, respectively. The sea urchin and lancelet proteins are marked by “_SP” and “_BF” following the identification numbers of the corresponding contigs. Protein sequences assembled from the sea urchin and lancelet WGS Trace Archives are denoted as P4-P5_SP and P1-P5_BF, respectively. Three conserved motifs are underlined and numbered. The third conserved motif is known as the ring finger. Distances from the protein N-termini are indicated by numbers.
Agrawal. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. 1998.
Aidinis. Definition of minimal domains of interaction within the recombination-activating genes 1 and 2 recombinase complex. 2000.
Akamatsu. Distinct roles of RAG1 and RAG2 in binding the V(D)J recombination signal sequences. 1998.
Akira. Two pairs of recombination signals are sufficient to cause immunoglobulin V-(D)-J joining. 1987.
Altschul. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997.
Arbuckle. Identification of two topologically independent domains in RAG1 and their role in macromolecular interactions relevant to V(D)J recombination. 2001.
Bellon. Crystal structure of the RAG1 dimerization domain reveals multiple zinc-binding motifs including a novel zinc binuclear cluster. 1997.
Cannon. The phylogenetic origins of the antigen-binding receptors and somatic diversification mechanisms. 2004.
Clatworthy. V(D)J recombination and RAG-mediated transposition in yeast. 2003.
Difilippantonio. RAG1 mediates signal sequence recognition and recruitment of RAG2 in V(D)J recombination. 1996.
Douzery. The timing of eukaryotic evolution: does a relaxed molecular clock reconcile proteins and fossils? 2004.
Gellert. V(D)J recombination: RAG proteins, repair factors, and regulation. 2002.
Hiom. DNA transposition by the RAG1 and RAG2 proteins: a possible source of oncogenic translocations. 1998.
Jurka. Repbase update: a database and an electronic journal of repetitive elements. 2000.
Kapitonov. Rolling-circle transposons in eukaryotes. 2001.
Kapitonov. Harbinger transposons and an ancient HARBI1 gene derived from a transposase. 2004.
Kapitonov. The esterase and PHD domains in CR1-like non-LTR retrotransposons. 2003.
Kapitonov. Molecular paleontology of transposable elements in the Drosophila melanogaster genome. 2003.
Kim. Mutations of acidic residues in RAG1 define the active site of the V(D)J recombinase. 1999.
Koonin. 2003.
Kumar. MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. 2004.
Landree. Mutational analysis of RAG1 and RAG2 identifies three catalytic amino acids in RAG1 critical for both cleavage steps of V(D)J recombination. 1999.
Lee. A functional analysis of the spacer of V(D)J recombination signal sequences. 2003.
Lewis. The old and the restless. 2000.
Lupas. Predicting coiled-coil regions in proteins. 1997.
Melek. Rejoining of DNA by the RAG1 and RAG2 proteins. 1998.
Messier. In vivo transposition mediated by V(D)J recombinase in human T lymphocytes. 2003.
Mo. A C-terminal region of RAG1 contacts the coding DNA during V(D)J recombination. 2001.
Notredame. T-Coffee: A novel method for fast and accurate multiple sequence alignment. 2000.
Oettinger. RAG-1 and RAG-2, adjacent genes that synergistically activate V(D)J recombination. 1990.
Peterson. Estimating metazoan divergence times with a molecular clock. 2004.
Pires-daSilva. Conservation of the global sex determination gene tra-1 in distantly related nematodes. 2004.
Ramsden. Conservation of sequence in recombination signal sequence spacers. 1994.
Rodgers. A zinc-binding domain involved in the dimerization of RAG1. 1996.
Sadofsky. Expression and V(D)J recombination activity of mutated RAG-1 proteins. 1993.
Sakano. Sequences at the somatic recombination sites of immunoglobulin light-chain genes. 1979.
Schäffer. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. 2001.
Silver. Dispensable sequence motifs in the RAG-1 and RAG-2 genes for plasmid V(D)J recombination. 1993.
Spanopoulou. The homeodomain region of Rag-1 reveals the parallel mechanisms of bacterial and V(D)J recombination. 1996.
Thompson. New insights into V(D)J recombination and its role in the evolution of the immune system. 1995.
Tonegawa. Somatic generation of antibody diversity. 1983.
Tsai. Evidence of a critical architectural function for the RAG proteins in end processing, protection, and joining in V(D)J recombination. 2002.
Venkatesh. Molecular synapomorphies resolve evolutionary relationships of extant jawed vertebrates. 2001.
Willett. Characterization and expression of the recombination activating genes (rag1 and rag2) of zebrafish. 1997.
Wootton. Analysis of compositionally biased regions in sequence databases. 1996.
Yurchenko. The RAG1 N-terminal domain is an E3 ubiquitin ligase. 2003.
Zhou. Transposition of hAT elements links transposable elements and V(D)J recombination. 2004.
van Gent. Similarities between initiation of V(D)J recombination and retroviral integration. 1996.