Nucleic Acids Res 2009 Oct 01;37(18):6184-93. doi: 10.1093/nar/gkp600.
Accurate and efficient reconstruction of deep phylogenies from structured RNAs.
Stocsits RR, Letsch H, Hertel J, Misof B, Stadler PF.
Abstract
Ribosomal RNA (rRNA) genes are probably the most frequently used data source in phylogenetic reconstruction. Individual columns of rRNA alignments are not independent as a consequence of their highly conserved secondary structures. Unless explicitly taken into account, these correlations can distort the phylogenetic signal and/or lead to gross overestimates of tree stability. Maximum likelihood and Bayesian approaches are amenable to using RNA-specific substitution models that treat conserved base pairs appropriately, but they require accurate secondary structure models as input. So far, however, no accurate and easy-to-use tool has been available for computing structure-aware alignments and consensus structures that can deal with the large rRNAs. The RNAsalsa approach is designed to fill this gap. Capitalizing on the improved accuracy of pairwise consensus structures and informed by a priori knowledge of group-specific structural constraints, the tool provides both alignments and consensus structures that are of sufficient accuracy for routine phylogenetic analysis based on RNA-specific substitution models. The power of the approach is demonstrated using two rRNA data sets: a mitochondrial rRNA set of 26 Mammalia, and a collection of 28S nuclear rRNAs representative of the five major echinoderm groups.
Figure 1. Overview of the RNAsalsa workflow. Starting from an initial sequence alignment 𝔸0 and a structure constraint σ0, a relaxed consensus constraint and then constraints for each input sequence are produced. Pairwise sequence alignments and the intersections of the two individualized constraints are used to compute a collection of constrained pairwise consensus structures τij, from which a refined individualized structure constraint τi is extracted by means of a majority voting procedure. Then τi is used as constraint in minimum energy folding to obtain the final structural annotation ψi for input sequence xi. The structure-annotated input sequences are now aligned with a structure-aware alignment method, resulting in the final alignment 𝔹 and a corresponding consensus structure ω.
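The majority voting step in the workflow can be illustrated with a minimal Python sketch. It assumes that the pairwise consensus structures τij have already been projected onto the coordinates of sequence xi and are given as dot-bracket strings of equal length; the function names and the strict "more than half" threshold are illustrative assumptions, not the RNAsalsa implementation.

```python
from collections import Counter


def parse_dot_bracket(structure):
    """Return the set of base pairs (i, j), 0-based with i < j, in a dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def majority_vote(structures):
    """Keep every base pair found in more than half of the given structures.

    Hypothetical stand-in for the voting step: the real procedure may use a
    different threshold or weighting.
    """
    counts = Counter()
    for structure in structures:
        counts.update(parse_dot_bracket(structure))
    threshold = len(structures) / 2
    return {pair for pair, n in counts.items() if n > threshold}


def to_dot_bracket(pairs, length):
    """Render a set of base pairs as a dot-bracket string of the given length."""
    out = ["."] * length
    for i, j in pairs:
        out[i], out[j] = "(", ")"
    return "".join(out)


# Three hypothetical pairwise consensus structures for one sequence
votes = ["((..))", "((..))", "(....)"]
constraint = to_dot_bracket(majority_vote(votes), 6)
```

Because each input structure pairs a position with at most one partner, two conflicting pairs can never both exceed a strict majority, so the voted constraint in this sketch is itself a valid structure.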
Figure 2. Accuracy of structure prediction. (A) Fraction of correctly predicted helices (green bars) compared with the mammalian 16S rRNA consensus models (36). RNAsalsa significantly outperforms MXSCARNA and RNAalifold (default parameter settings; three-sample test for equality of proportions without continuity correction; χ² = 19.96, df = 2, P < 0.0001). (B) Average tree-edit distance (37) between predicted individual structures and the mammalian 16S rRNA reference model. RNAsalsa predictions conform to the consensus model much better (paired-sample t-test; t = 33.46, df = 1, P < 0.0001; N = 26).
Figure 3. Bayesian tree inferred from the combined mammalian 12S rRNA and 16S rRNA. (A) Analysis with GTR + Γ model in simple DNA mode. (B) Analysis with GTR + Γ model in RNA mode for paired positions and DNA mode for loop regions. Numbers indicate Bayesian posterior probabilities. The scale bar denotes the estimated number of substitutions per site.
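Treating paired positions in RNA mode and loop regions in DNA mode requires splitting the alignment columns according to the consensus structure. The following is a minimal sketch of that split, assuming the consensus structure ω is available as a dot-bracket string with one symbol per alignment column; the function name `partition_columns` is hypothetical and not part of any tool named here.

```python
def partition_columns(consensus):
    """Split alignment columns into base-paired columns and loop columns.

    `consensus` is a dot-bracket string, one symbol per alignment column.
    Returns (pairs, loops): pairs is a sorted list of 0-based column pairs
    (i, j) with i < j, loops is a sorted list of unpaired column indices.
    """
    stack, pairs = [], []
    for col, symbol in enumerate(consensus):
        if symbol == "(":
            stack.append(col)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced consensus structure")
            pairs.append((stack.pop(), col))
    if stack:
        raise ValueError("unbalanced consensus structure")
    paired_cols = {col for pair in pairs for col in pair}
    loops = [col for col in range(len(consensus)) if col not in paired_cols]
    return sorted(pairs), loops


# Toy consensus structure: one two-pair helix plus a trailing unpaired column
stem_pairs, loop_cols = partition_columns("((..)).")
```

The paired columns would then form the partition analysed with an RNA-specific (doublet) substitution model, and the loop columns the partition analysed with a standard nucleotide model.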
Figure 4. Phylogenies inferred from combined analyses of the mammalian 12S rRNA and 16S rRNA. Sequences are aligned with (A) RNAsalsa, (B) MAFFT, (C) ClustalW and (D) MXSCARNA. Tree reconstruction is based on ML analyses with the GTR + Γ model. Numbers indicate bootstrap support values (1000 replicates). The scale bar denotes the estimated number of substitutions per site.
Figure 5. Phylogenies inferred from analyses of the echinoderm 28S rRNA. Sequences are aligned with (A) RNAsalsa, (B) MAFFT, (C) ClustalW and (D) MXSCARNA. Tree reconstruction is based on ML analyses with the GTR + Γ model. Numbers indicate bootstrap support values (1000 replicates). The scale bar denotes the estimated number of substitutions per site.
Arnason, Mammalian mitogenomic relationships and the root of the eutherian tree. 2002.
Bernhart, RNAalifold: improved consensus structure prediction for RNA alignments. 2008.
Buckley, Secondary structure and conserved motifs of the frequently sequenced domains IV and V of the insect mitochondrial large subunit rRNA gene. 2000.
Collins, Use of RNA secondary structure for studying the evolution of RNase P and RNase MRP. 2000.
Dalli, STRAL: progressive alignment of non-coding RNA using base pairing probability vectors in quadratic time. 2006.
Ding, Ab initio RNA folding by discrete molecular dynamics: from structure prediction to folding mechanisms. 2008.
Doshi, Evaluation of the suitability of free-energy minimization using nearest-neighbor energy parameters for RNA secondary structure prediction. 2004.
Gardner, A benchmark of multiple sequence alignment programs upon structural RNAs. 2005.
Goodman, Toward a phylogenetic classification of Primates based on DNA evidence complemented by fossil evidence. 1998.
Gotoh, An improved algorithm for matching biological sequences. 1982.
Gowri-Shankar, On the correlation between composition and site-specific evolutionary rate: implications for phylogenetic inference. 2006.
Gruber, Strategies for measuring evolutionary conservation of RNA secondary structures. 2008.
Havgaard, Pairwise local structural alignment of RNA sequences with sequence similarity less than 40%. 2005.
Hayasaka, Molecular phylogeny and evolution of primate mitochondrial DNA. 1988.
Hillis, Ribosomal DNA: molecular evolution and phylogenetic inference. 1991.
Hofacker, Secondary structure prediction for aligned RNA sequences. 2002.
Hudelot, RNA-based phylogenetic methods: application to mammalian mitochondrial RNA sequences. 2003.
Höchsmann, Pure multiple RNA secondary structure alignments: a progressive profile approach. 2004.
Jow, Bayesian phylogenetics using an RNA substitution model applied to early mammalian evolution. 2002.
Katoh, Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework. 2008.
Katoh, MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. 2002.
Kjer, Use of rRNA secondary structure in phylogenetic studies to identify homologous positions: an example of alignment and data presentation from the frogs. 1995.
Kjer, Site specific rates of mitochondrial genomes and the phylogeny of eutheria. 2007.
Kjer, Why weight? 2007.
Kjer, Aligned 18S and insect phylogeny. 2004.
Layton, A statistical analysis of RNA folding algorithms through thermodynamic parameter perturbation. 2005.
Lindgreen, MASTR: multiple alignment and structure prediction of non-coding RNAs using simulated annealing. 2007.
Mallatt, 28S and 18S rDNA sequences support the monophyly of lampreys and hagfishes. 1998.
Mallatt, Ribosomal RNA genes and deuterostome phylogeny revisited: more cyclostomes, elasmobranchs, reptiles, and a brittle star. 2007.
Misof, A hexapod nuclear SSU rRNA secondary-structure model and catalog of taxon-specific structural variation. 2006.
Misof, A Monte Carlo approach successfully identifies randomness in multiple sequence alignments: a more objective means of data exclusion. 2009.
Murphy, Molecular phylogenetics and the origins of placental mammals. 2001.
Parsch, Comparative sequence analysis and patterns of covariation in RNA secondary structures. 2000.
Reeder, Consensus shapes: an alternative to the Sankoff algorithm for RNA consensus structure prediction. 2005.
Ronquist, MrBayes 3: Bayesian phylogenetic inference under mixed models. 2003.
Rzhetsky, Estimating substitution rates in ribosomal RNA genes. 1995.
Savill, RNA sequence evolution with secondary structure constraints: comparison of substitution rate models using maximum-likelihood methods. 2001.
Schmitz, The complete mitochondrial sequence of Tarsius bancanus: evidence for an extensive nucleotide compositional plasticity of primate mitochondrial DNA. 2002.
Schmitz, SINE insertions in cladistic analyses and the phylogenetic affiliations of Tarsius bancanus to other primates. 2001.
Schmitz, The colugo (Cynocephalus variegatus, Dermoptera): the primates' gliding sister? 2002.
Schöniger, A stochastic model for the evolution of autocorrelated DNA sequences. 1994.
Scouras, The complete mitochondrial genomes of the sea lily Gymnocrinus richeri and the feather star Phanogenia gracilis: signature nucleotide bias and unique nad4L gene rearrangement within crinoids. 2006.
Shapiro, Comparing multiple RNA secondary structures using tree comparisons. 1990.
Stephan, The rate of compensatory evolution. 1996.
Tabei, A fast structural multiple alignment method for long RNA sequences. 2008.
Thompson, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. 1994.
Tillier, High apparent rate of simultaneous compensatory base-pair substitutions in ribosomal RNA. 1998.
Torarinsson, Multiple structural alignment and clustering of RNA sequences. 2007.
Washietl, Fast and reliable prediction of noncoding RNAs. 2005.
Will, Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering. 2007.
Wilm, An enhanced RNA alignment benchmark for sequence alignment programs. 2006.
Woese, Bacterial evolution. 1987.
Yusupov, Crystal structure of the ribosome at 5.5 A resolution. 2001.
Zietkiewicz, Phylogenetic affinities of tarsier in the context of primate Alu repeats. 1999.