BMC Evol Biol
2003 Mar 24;3:4. doi: 10.1186/1471-2148-3-4.
The evolution of Runx genes I. A comparative study of sequences from phylogenetically diverse model organisms.
Rennert J, Coffman JA, Mushegian AR, Robertson AJ.
Abstract
BACKGROUND: Runx genes encode proteins defined by the highly conserved Runt DNA-binding domain. Studies of Runx genes and proteins in model organisms indicate that they are key transcriptional regulators of animal development. However, little is known about Runx gene evolution.
RESULTS: A phylogenetically broad sampling of publicly available Runx gene sequences was collected. In addition to the published sequences from mouse, sea urchin, Drosophila melanogaster and Caenorhabditis elegans, we collected several previously uncharacterised Runx sequences from public genome sequence databases. Among deuterostomes, mouse and pufferfish each contain three Runx genes, while the tunicate Ciona intestinalis and the sea urchin Strongylocentrotus purpuratus were each found to have only one Runx gene. Among protostomes, C. elegans has a single Runx gene, while Anopheles gambiae has three and D. melanogaster has four, including two genes that have not been previously described. Comparative sequence analysis reveals two highly conserved introns, one within and one just downstream of the Runt domain. All vertebrate Runx genes utilize two alternative promoters.
CONCLUSIONS: In the current public sequence database, the Runt domain is found only in bilaterians, suggesting that it may be a metazoan invention. Bilaterians appear to ancestrally contain a single Runx gene, suggesting that the multiple Runx genes in vertebrates and insects arose by independent duplication events within those respective lineages. At least two introns were present in the primordial bilaterian Runx gene. Alternative promoter usage arose prior to the duplication events that gave rise to three Runx genes in vertebrates.
Figure 1. Comparative structure of Runx genes collected from genome sequences of phylogenetically diverse model organisms. Diagram of Runx genes from selected species. Boxes indicate exons and curved lines indicate introns, which are scaled approximately to the scale bar as indicated. Absolute exon and intron sizes are listed in Supplemental Table 2 and Supplemental Figure 1. The Runt domain is represented as filled boxes ranging from light grey (N-terminus) to black (C-terminus) in order to facilitate comparison to the chordate Runt domain, which is encoded by three different exons. The black bar at the end of all the genes represents the VWRPY (or IWRPF in the case of CeRun) Groucho recruitment motif that is at the C-terminus of proteins encoded by all Runx genes, which facilitated mapping of the previously uncharacterised genes. Exon sequences representing 5' and 3' untranslated regions are not depicted in the diagram. For the vertebrate Runx genes, only exons downstream of the proximal promoters (P2) are shown. Human RUNX1 contains 3 additional alternatively spliced exons between those encoding the Runt domain and the C-terminal VWRPY motif [12], which may or may not be present in the mouse orthologue (Mm1) and are not shown here. Because of sequence gaps in the genomic sequence of the Anopheles gambiae RunxA gene (AgA), the exon structure 3' to the Runt domain of this gene is not known, as indicated by question marks. Abbreviations: Mm1, M. musculus Runx1, etc.; Tr1, T. rubripes Runx1, etc.; Ci, C. intestinalis; Sp, S. purpuratus; AgA, A. gambiae RunxA; DmA, D. melanogaster RunxA; DmB, D. melanogaster RunxB; AgL, A. gambiae Lozenge; DmL, D. melanogaster Lozenge; AgR, A. gambiae Runt; DmR, D. melanogaster Runt; Ce, C. elegans.
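The Figure 1 caption notes that the C-terminal VWRPY (or IWRPF) motif helped map previously uncharacterised genes. As a minimal sketch of that kind of check, and not the authors' actual procedure, the snippet below tests whether a translated sequence ends in one of the two motifs. The function name `has_groucho_motif` and the toy sequences are hypothetical and invented purely for illustration.

```python
# C-terminal Groucho recruitment motifs named in the Figure 1 caption.
GROUCHO_MOTIFS = ("VWRPY", "IWRPF")


def has_groucho_motif(protein: str) -> bool:
    """Return True if the protein sequence ends with one of the motifs."""
    # Normalise case, then drop surrounding whitespace and a trailing stop symbol.
    seq = protein.strip().upper().rstrip("*")
    # str.endswith accepts a tuple of candidate suffixes.
    return seq.endswith(GROUCHO_MOTIFS)


# Toy sequences, invented for illustration only (not real Runx proteins).
candidates = {
    "toy_orf_1": "MASNSAAKLVWRPY",   # ends in VWRPY
    "toy_orf_2": "MKTLLPEIWRPF*",    # ends in IWRPF before the stop symbol
    "toy_orf_3": "MKTLLPEVWRPYGG",   # motif present but not C-terminal
}

hits = [name for name, seq in candidates.items() if has_groucho_motif(seq)]
print(hits)
```

A suffix test like this only flags candidates; confirming a Runx gene would still depend on the Runt domain itself and on the exon structure described in the caption.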
Figure 2. Multiple sequence alignment of the Runt domain from collected sequences. Alignment of Runx amino acid sequences created using Clustal W [16]. The number 1 denotes the conventional beginning of the Runt domain, which is 128 amino acids long. The shading highlights areas of amino acid conservation, with yellow indicating absolute conservation among all members, and blue or green indicating high conservation of a residue among several members. The domain structure (alpha helix and beta sheet elements) determined for MmRunx1 [14,17], is shown as arrows above the alignment, and the surfaces involved in DNA contact (major and minor groove, shown in green) or interaction with specific structural motifs within the beta subunit (shown in pink) are denoted by lines above the structural motifs [14]. In cases where it is known, amino acid pairs split by introns in the primary transcripts are underlined and in boldface (i.e., all but the two spider sequences, the crayfish, and the nematode Meloidogyne hapla, for which only cDNAs are available). The arrow indicates a highly conserved intron that falls within sequence that encodes the purine nucleotide-binding consensus. Note that the Runt domains from the spider gene Cs2 and the nematode gene Mh are partial, as indicated by the x's; these partial sequences were not used in the alignment used to generate trees (Table 1 and Figure 3). Abbreviations are as in Figure 1, with the addition of Pl, Pacifastacus leniusculus; Cs1, Cupiennius salei Run-1, etc.; Mh, Meloidogyne hapla.
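The yellow shading in Figure 2 marks positions that are absolutely conserved among all members of the alignment. The sketch below illustrates that notion on a toy alignment: it reports the columns in which every sequence carries the same residue. The function name `conserved_columns` and the three short sequences are hypothetical; this is not the Clustal W output or the shading scheme used for the figure.

```python
def conserved_columns(alignment):
    """Return 0-based indices of columns where all sequences share one residue.

    Columns containing a gap character ('-') are never counted as conserved.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")

    conserved = []
    for i in range(length):
        column = {seq[i] for seq in alignment}  # distinct symbols in column i
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Toy alignment, invented for illustration only (not Runt domain sequence).
toy_alignment = [
    "MRIPV",
    "MKIPV",
    "MR-PV",
]

print(conserved_columns(toy_alignment))
```

Partial sequences, such as the two the caption marks with x's, would need to be excluded or padded before a column-wise comparison of this kind.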
Figure 3. Phylogenetic tree of the Runx protein family. Maximum likelihood tree calculated with the assumption of a molecular clock from the multiple amino acid sequence alignment shown in Figure 2. The numbers at each node are the bootstrap support values obtained by maximum likelihood (no molecular clock assumed), neighbour joining, and maximum parsimony, in that order. The dashes (-) by the branch leading to the Ciona gene (CiRunt) indicate that the maximum likelihood and maximum parsimony methods did not support that node (these methods each place CiRunt on a branch with TrRunx1, with support values of 50 and 52, respectively). Branches with unresolved topology were collapsed. Coloured branches are used to highlight the vertebrate lineage (green) and the insect lineage (red). The shading highlights the deuterostome (blue) and protostome (pink) representatives in the sample. Abbreviations for species names are the same as in Figures 1 and 2.
Figure 4. Alternative transcripts and N-termini produced by transcription from proximal and distal promoters of vertebrate Runx genes. (A) Diagram of alternative transcripts from distal (P1) and proximal (P2) promoters in vertebrate Runx genes. Solid lines represent untranslated regions of exons, while dashed lines represent introns. TrRunx1 distal transcript is theoretical and based on its similarities to other distal transcripts (i.e., its open reading frame begins with a sequence encoding MASNS), location (~14 kb from proximal transcript) and size (19 aa). TrRunx2 and TrRunx3 transcripts from P1 were derived from similarities to MmRunx2 and Danio rerio runtb, respectively. TrRunx2 was recently published [11]. M. musculus distal transcripts were gathered from GenBank entries and verified with publications (see Supplemental Table 1). Note: Unlike all of the other vertebrate Runx genes in our collection, TrRunx1 transcript from P2 has a small exon (encoding 9 amino acids) followed by a 1 kb intron (see also B, below, and Supplemental Figure 1). This small exon splices into the same location within the downstream coding sequence as does the exon from the distal transcript. (B) Alternative N-terminal amino acid sequences of various vertebrate Runx genes analysed in our survey. N-termini from the distal promoters are shown in green, while the N-termini from the proximal promoters are shown in red. The aspartate or glutamate to which the N-terminal sequences from the distal promoter are joined is highlighted in yellow. The positions and nucleotides at the termini of introns are indicated in lower case.
Altschul, Basic local alignment search tool. 1990, Pubmed
Bäckström, The RUNX1 Runt domain at 1.25A resolution: a structural switch and specifically bound chloride ions modulate DNA binding. 2002, Pubmed
Bangsow, The RUNX3 gene--sequence, structure and regulated expression. 2001, Pubmed
Canon, Runt and Lozenge function in Drosophila development. 2000, Pubmed
Coffman, Runx transcription factors and the developmental balance between cell proliferation and differentiation. 2003, Pubmed, Echinobase
Coffman, SpRunt-1, a new member of the runt domain family of transcription factors, is a positive regulator of the aboral ectoderm-specific CyIIIA gene in sea urchin embryos. 1996, Pubmed, Echinobase
Crute, Biochemical and biophysical properties of the core-binding factor alpha2 (AML1) DNA-binding domain. 1996, Pubmed
Damen, Expression patterns of hairy, even-skipped, and runt in the spider Cupiennius salei imply that these genes were segmentation genes in a basal arthropod. 2000, Pubmed
Eggers, Genomic characterization of the RUNX2 gene of Fugu rubripes. 2002, Pubmed
Holland, Gene duplications and the origins of vertebrate development. 1994, Pubmed
Huelsenbeck, MRBAYES: Bayesian inference of phylogenetic trees. 2001, Pubmed
Kagoshima, The Runt domain identifies a new family of heteromeric transcriptional regulators. 1993, Pubmed
Komori, Targeted disruption of Cbfa1 results in a complete lack of bone formation owing to maturational arrest of osteoblasts. 1997, Pubmed
Levanon, Architecture and anatomy of the genomic locus encoding the human leukemia-associated transcription factor RUNX1/AML1. 2001, Pubmed
Lund, RUNX: a trilogy of cancer genes. 2002, Pubmed
Nam, Expression pattern, regulation, and biological role of runt domain transcription factor, run, in Caenorhabditis elegans. 2002, Pubmed
Robertson, The expression of SpRunt during sea urchin embryogenesis. 2002, Pubmed, Echinobase
Tahirov, Structural analyses of DNA recognition by the AML1/Runx-1 Runt domain and its allosteric control by CBFbeta. 2001, Pubmed
Thompson, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. 1994, Pubmed
Wheeler, Mechanisms of transcriptional regulation by Runt domain proteins. 2000, Pubmed