|
Figure 1. Primer sequences. Sequences are written using the IUPAC code. We used the following codes to describe the rules that underpinned primer design: Capital letters represent the 5' clamp (non-degenerate), whereas small letters represent the degenerate part, which is expected to contain no mismatch in any species (based on known sequences: protein for CH primers, nucleotide for D or NH primers). CHX+Y-z: CodeHop primer with a 5' clamp (non-degenerate part) of X bases and a 3' z-fold degenerate end of Y bases. CHX + Y-z + 2GT: CodeHop primer designed at the intron limit, which contains the first two bases of the intron (by mistake, we reversed the two bases in the single such case, i8F). DX-Y-mz-t: Classical degenerate primer X bases long, Y-fold degenerate, containing z to t mismatches depending on the species (despite degeneracy in primer design). NHX+Y-Z: We called this a 'Nucleotide-hop' primer, by analogy with CodeHop primers, but its design was based on a nucleotide alignment; we designed a 5' clamp (non-degenerate) and degenerated the 3' end according to the set of nucleotide sequences available, thus ignoring codons. NHX+Y-Z-mz-t: Same as above but, despite primer degeneracy, mismatches may remain in some species; in this case there are z to t mismatches depending on the species for which we have sequence data. For instance, the primer D30-1-m0-2 contains no ambiguity bases (-1: not degenerate) and has 0 to 2 mismatches depending on the species. Other symbols: * this primer was not used; # erroneous primer sequence (the subsequently corrected primer i13Rcor was not tried).
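The CHX+Y-z notation can be illustrated with a minimal sketch that derives such a label from a primer written with an upper-case clamp and a lower-case degenerate end. The function names (`degeneracy`, `split_clamp`, `codehop_label`) and the example sequence are hypothetical and are not taken from the figure; the fold values are the standard IUPAC nucleotide ambiguity codes.

```python
from math import prod

# Number of nucleotides each IUPAC symbol stands for.
IUPAC_FOLD = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(seq):
    """Return the fold degeneracy: the product of per-position ambiguities."""
    return prod(IUPAC_FOLD[base] for base in seq.upper())


def split_clamp(primer):
    """Split a primer into its leading upper-case clamp and the remainder."""
    i = 0
    while i < len(primer) and primer[i].isupper():
        i += 1
    return primer[:i], primer[i:]


def codehop_label(primer):
    """Build a CHX+Y-z style label (X clamp bases, Y degenerate bases, z-fold)."""
    clamp, core = split_clamp(primer)
    return f"CH{len(clamp)}+{len(core)}-{degeneracy(core)}"


# Hypothetical primer, for illustration only (not one of the primers in Figure 1).
example = "GGTCACatgaargcnyt"
label = codehop_label(example)
print(label)
```

For the invented sequence above, the clamp has 6 bases and the 11-base degenerate end contains r, n and y, so the sketch reports a 2 × 4 × 2 = 16-fold degeneracy.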
|
|
Figure 2. Flowchart representing the different steps and stages of the bioinformatic assessment of EPIC loci. Step 3 was not performed in stages II and III. Steps 1-5 were identical for stages II and III. Visual examination of protein alignment (part of step 7) was performed for all stages I-III.
|
|
Figure 3. Example of a graphical representation (stage III) of the multiple nucleotide alignment. This tool was introduced at stage III to help select conserved regions encompassing introns for PCR primer design. The multiple alignment of the gene family retrieved from Homolens appears at the top; dots indicate intron occurrence (intron positions are reported in gray at the bottom of the graphic). The similarity score ω (black), as well as similarity scores with Strongylocentrotus (ω1, green), Saccoglossus (ω2, blue), and Nematostella (ω3, red), are plotted at the bottom of the graphic; for better readability, ω1 ... ω3 are halved. Peaks of nucleotide conservation, with the corresponding ω values and positions on the multiple alignment, are identified by vertical lines (a colour code indicates the number of species for which additional sequences were available).
|
|
Figure 4. Agarose gel electrophoresis results for 4 primer pairs (one 96-well PCR plate). a: intron 54 (standard protocol); b: intron 21 (S-CR protocol). The size marker, labelled L, is a 100 bp ladder with the brightest band corresponding to 500 bp. (a) Four individuals of each species are presented, in the following order: P. lividus, C. eumyota, P. japonica, S. clava, A. squamata, E. cordatum, for intron 54 (primer pairs: a, b, c and d). (b) Lanes successively correspond to the following species (number of individuals in parentheses): Abatus cavernosus (4), A. agassizi (3), A. cordatus (3), A. nimrodi (1), Sterechinus neumayeri (3), S. agassizi (1), Macoma balthica (4), Cerastoderma edule (4).
|
|
Figure 5. Phylogenetic relationships [39] among the genera tested and global results for intron amplification. To the right of each tested genus (name in black), symbols reflect the level of technical effort [✧ not all primer pairs were tested, ✦ standard effort, ✦✦ more tests than standard (either PCR conditions or DNA extracts), ✦✦✦ several of the previous improvements], and the success column gives the number of introns scored as 'P' or 'I'. The taxa whose sequences or genomes most influenced primer design, either by being our models for the non-degenerate part of the codehop primers or by over-representation in gene family databases, are written in small grey letters. Major phylogenetic splits are indicated using the following abbreviations: BIL (Bilateria), P (Protostomia), D (Deuterostomia), Echi. (phylum Echinodermata), Uroc. (phylum Urochordata), Cnid. (phylum Cnidaria).
|
|
Figure 6. Correspondence analysis representing the bilaterian genera according to their results for each intron. Nulls, A, I and P were scored as 0, 1, 2 and 3, respectively. Methodological changes such as (i) adding cnidarians (therefore reducing the number of variates from 52 to 22 or 29), (ii) considering nulls (scored as 0) versus all amplifying categories (scored as 1) or (iii) changing the nature of the multivariate analysis did not change the pattern. When included in analyses, the two cnidarians appeared neither to form a tight group nor to be outliers relative to bilaterian phyla. Empty symbols represent genera of the Echinodermata (stars, circles and ovoids intuitively represent ophiuroids, regular sea urchins and irregular sea urchins, respectively), black triangles represent ascidians (Urochordata) and black squares represent the two bivalves (Mollusca).
|
|
Figure 7. Effect of the degeneracy levels of codehop primers on PCR results. Results from the 10 bilaterian genera in which both primers were "codehop" were considered here (they cannot be directly derived from Table 1 since, for some introns, not all primer pairs were "codehop"). The bars represent the total numbers of cases (species × primer pairs) displaying results 'P', 'I', 'A' or nulls, for three categories of introns: those in which no (0), one (1) or both (2) primers have more than 6-fold degeneracy. Exact tests were performed on the 4 × 3 contingency table used to build the histogram, as well as on tables derived from it after pooling some columns (for instance nulls versus "A + I + P"): all were highly significant. The pie charts display the proportion of the four categories of results within each category of degeneracy; they illustrate the increase in the proportion of null loci when primer degeneracy increases.
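The pooling step described in this caption (nulls versus "A + I + P") can be sketched as follows. The counts are invented placeholders, not the data behind the figure, and the helper names `pool` and `proportions` are hypothetical. The sketch only builds the pooled table and the per-category proportion of nulls; it does not perform the exact tests.

```python
# Hypothetical counts of cases (species x primer pairs), NOT the paper's data.
# Keys 0, 1, 2 = number of primers in the pair with more than 6-fold degeneracy.
counts = {
    0: {"P": 30, "I": 10, "A": 5, "null": 5},
    1: {"P": 20, "I": 10, "A": 10, "null": 10},
    2: {"P": 5, "I": 5, "A": 10, "null": 30},
}


def pool(table, groups):
    """Collapse result categories; `groups` maps a new label to the old labels it sums."""
    return {
        category: {new: sum(row[old] for old in olds) for new, olds in groups.items()}
        for category, row in table.items()
    }


def proportions(row):
    """Convert one row of counts into proportions of that row's total."""
    total = sum(row.values())
    return {label: value / total for label, value in row.items()}


# Pool the three amplifying categories against nulls.
pooled = pool(counts, {"null": ("null",), "A+I+P": ("A", "I", "P")})

# Share of null results within each degeneracy category (the quantity the pie charts show).
null_share = {category: proportions(row)["null"] for category, row in pooled.items()}

for category, share in null_share.items():
    print(f"{category} highly degenerate primer(s): {share:.0%} nulls")
```

With these placeholder counts the share of nulls rises with the number of highly degenerate primers, which is the kind of trend the pie charts are meant to show.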
|