|
Fig. 1. Chromosome evolution in echinoderms.a, Phylogenetic relationships of the five echinoderm classes (orange), with the position of the Eleutherozoa ancestor highlighted, and hemichordates and chordates as outgroups. Classes with available chromosome-scale genome assembly are shown in dark orange. Divergence times among echinoderms and with hemichordates were extracted from ref. 13, divergence with chordates from TimeTree155. b, Synteny comparison between the 22 chromosomes of spiny sea star and the 23 chromosomes of the black sea cucumber. The single macrosyntenic rearrangement between the two genomes is indicated with arrows. c, Synteny comparison between the 22 chromosomes of spiny sea star and the 20 chromosomes of brittle star. The three brittle star chromosomes with a one-to-one relationship with sea star chromosomes are shown with a colour matching its orthologous counterpart in spiny sea star (Fisher’s exact test Padj < 10−5). d, Chromosome evolution in Eleutherozoa. We named the ancestral ELG using established naming conventions proposed for the 24 bilaterian ancestral linkage groups defined previously23,52. B2 + C2 corresponds to a fusion of bilaterian B2 and C2 present in the Eleutherozoa ancestor. e, Repeat landscapes for the brittle star and the three selected echinoderm genomes, with the y axis representing the genomic coverage and the x axis the CpG-corrected Kimura divergence to the repeat consensus. Species are presented in the same order as in d. The dashed red line indicates the repeat burst in the brittle star.
|
|
Fig. 2. Hox and ParaHox clusters organization across echinoderms.a, Phylogenetic relationships among the five classes of echinoderms, with hemichordates as the outgroup. b, Genomic organization of the brittle star A. filiformis Hox cluster. Significantly expanded repeats at the Hox cluster location are represented in their respective tracks below Hox genes, with the average sequence divergence to consensus indicated (div., %). Divergence to consensus is a proxy for repeat age, where higher divergence indicates older repeat insertions. Vertical grey rectangles indicate breakpoint locations. c, Schematic representation of Hox cluster organization across echinoderms and outgroups, based on organization reported in Saccoglossus kowalevskii and Ptychodera flava156 for Hemichordata, feather star Anneissia japonica50 for Crinoidea, brittle star A. filiformis for Ophiuroidea, crown-of-thorns sea star A. planci18,30 for Asteroidea, Japanese sea cucumber Apostichopus japonicus22,66 for Holothuroidea and purple sea urchin S. purpuratus26 for Echinoidea. d, ParaHox gene cluster organization, based on the same genomes as in b. Double slashes indicate non-consecutive genes, all separated by distances >5 Mb on the same chromosome or scaffold. e, Expression of Hox and ParaHox genes throughout 4 brittle star developmental time points and in the adult arm. hpf, hours post fertilization. Expression data from refs. 61–63 were normalized across samples using the TMM method146 on the full set of brittle star genes, and shown as log2(TPM + 1).
|
|
Fig. 3. Gene family evolution in echinoderms.a, Number of significantly expanded (red) and contracted (blue) gene families throughout echinoderm evolution, from a total of 10,367 tested gene families (Methods). b, Gene Ontology (GO) functional enrichment tests (biological process) for expanded and contracted families in the different echinoderm classes. We selected the top 15 representative GO terms enriched in expanded brittle star gene families and 10 in contracted families (Methods). In the heat map, colours indicate GO terms significantly enriched in expanded or contracted families in other echinoderm classes (FDR < 0.05). Complete GO enrichment test results are provided in Supplementary Table 6, including P values, enrichment ratios, background and foreground gene families and genes. c, Gene copy number variation across echinoderms for regeneration gene families with significant expansion in A. filiformis (>1 brittle star gene in the family annotated with the GO term ‘regeneration’). Gene families were named according to the S. purpuratus gene name. Red and blue colours denote significantly expanded and contracted families, respectively.
|
|
Fig. 4. Gene expression during brittle star arm regeneration.a, Soft-clustering of gene expression profiles throughout regeneration time points, yielding 9 main temporal co-expression clusters (A1–A9) (Methods and Extended Data Fig. 6). Co-expression clusters are temporally ordered (from top to bottom) on the basis of their first expression time point. Barplots on the right indicate the number of genes assigned to each cluster. The RNA sampling procedure for each stage is illustrated at the bottom. Early stages are sampled at 48 and 72 hpa, when wound healing followed by regenerative bud formation occurs. Subsequent stages are defined by morphological landmarks: stage 3 corresponds to the appearance of the radial water canal and nerve (∼6 dpa), stage 4 is the appearance of the first regenerated metameric units (∼8 dpa), stage 5 corresponds to advanced arm extension and differentiation onset (∼9 dpa), 50% stages correspond to when 50% of the regenerated arm has differentiated (∼2–3 weeks post amputation) sampled at the distal (D, less differentiated) and proximal (P, more differentiated) ends42,47. b, GO enrichment for each co-expression cluster (Methods, see Extended Data Fig. 7 and Supplementary Table 10 for exhaustive GO results). c, Curated gene list enrichment for each co-expression cluster (hypergeometric test, Benjamini–Hochberg Padj < 0.05; Methods and Supplementary Table 2). d, Transcription factor (TF)-binding motifs enriched around the TSS (5 kb upstream to 1 kb downstream) of genes from co-expression clusters (hypergeometric test Padj < 0.05; Methods).
|
|
Fig. 5. Gene expression throughout appendage regeneration across animals.a, Gene age enrichments for brittle star arm regeneration clusters (hypergeometric test, Benjamini–Hochberg Padj < 0.05). Clusters are ordered by the time of expression onset. b, Comparison of co-expressed gene clusters deployed during appendage regeneration in axolotl, brittle star and Parhyale (left to right: axolotl vs brittle star, brittle star vs Parhyale, Parhyale vs axolotl). Clusters in Parhyale (clusters P1–P8) correspond to the clustering reported previously48, but clusters were renamed to follow temporal activation and homogenize with respect to brittle star and axolotl clusters (Methods). Co-expression clusters in each species are shown in order of their temporal expression (from top to bottom), except for brittle star cluster A9 and Parhyale clusters P6–P8 which are expressed throughout several regeneration time points and shown at the bottom. Clusters are represented by vertical rectangles whose sizes are proportional to the number of homologous genes in the cluster, and coloured according to enriched GO terms (Methods, Fig. 4 and Extended Data Fig. 8; see a for legend). Links between clusters of the two compared species indicate cluster membership of homologous genes, with coloured links indicating significant overlaps (permutation-based P values with Benjamini–Hochberg correction <0.05; Methods). Credits for Parhyale silhouette: Collin Gross (CC BY 3.0). c, Most genes identified as co-expressed in the brittle star–Parhyale and brittle star–axolotl comparisons are not recovered in the direct Parhyale–axolotl comparison. Most genes co-expressed in the axolotl and brittle star have no identified homologues in Parhyale (54%, left pie chart). Genes co-expressed in Parhyale and the brittle star have a divergent expression in the axolotl, that is, they are not found in matched co-expression clusters (55%, right pie chart). d, Gene list enrichment and depletion tests performed for the set of brittle star genes with conserved temporal expression during animal regeneration (Methods). e, GO enrichment tests, as in d.
|
|
Fig. 6. Comparison of gene expression during wound closure and regeneration in brittle star explant experiments.a, Experimental setup. Brittle star arms are amputated at the proximal (cut 1) and distal (cut 2) ends. Proximal, distal and medial (control) segments are sampled for RNA-seq at 3 and 5 dpa, using 3–4 replicates each (Supplementary Table 1). We identify DEGs in proximal (wound closure only, not followed by regeneration) segments and distal (regenerative) segments, compared to control medial segments. b, Comparison of DEGs from explant experiments with brittle star arm regeneration time-course clusters (Fig. 4; hypergeometric enrichment test, BH-corrected P < 0.05). c, Overlap between DEGs genes in distal and proximal segments. Bars in the UpSet plot are coloured to highlight (i) segment-specific DEGs, for DEGs unique to distal or proximal segments, (ii) shared proximal and distal segments, for DEGs shared between proximal and distal, and (iii) opposite proximal and distal segments, for DEGs upregulated in proximal and downregulated in distal (or vice-versa).
|
|
Extended Data Fig. 1. Genome assembly and repeat classification.A. K-mer spectrum of the diploid assembly, that is before haplotype removal. Read-only k-mers (black curve) correspond to sequencing errors, and are not represented in the assembly. The 1-copy k-mer peak at 15X coverage (1n, red) corresponds to reads from heterozygous regions, whereas the 2-copy k-mer peak at 30X coverage (2n, blue) corresponds to homozygous regions. B. K-mer spectrum of the primary assembly, that is after haplotype removal. Following the collapse of haplotypes, half of the k-mers of the heterozygous peaks are accordingly not represented in the assembly anymore and homozygous regions are present as single copies only. C. Hi-C contact map showing the density of interactions between binned genomic regions in the proximity ligation data. The high contact regions are consistent with a 20 chromosome A. filiformis karyotype. D. Validation accuracy of a new DeepTE model124, trained to classify repeats into 5 main classes: LTR, SINE, DNA, LINE and Rolling Circle (RC). The vertical dotted line corresponds to the calibrated 0.55 threshold that we used on the DeepTE scores to classify repetitive elements. E. Accuracy of the newly-trained and the default Metazoa DeepTE models on the test set of A. filiformis repeats. The accuracy of the new model is superior to the default model and can classify repeats into 5 as opposed to 3 classes (repeats of ClassI, ClassII and ClassIII).
|
|
Extended Data Fig. 2. Reconstruction of the ancestral Euleterozoa linkage groups (ELG).A. Synteny comparison between spiny starfish and black sea cucumber reveals one macrosyntenic rearrangement (red boxes). ELGs colours are indicated at the top and correspond to colours on Fig. 1. Pairwise synteny comparisons with Amphioxus and Sea Scallop are similarly displayed on B., C., D., E. and F, with red boxes highlighting that B3, and O2 are all on distinct chromosomes in Amphioxus and Sea Scallop, thus confirming that the sea star B3-O2 fusion is a sea star-specific derived rearrangement.
|
|
Extended Data Fig. 3. Inter-chromosomal macrosyntenic rearrangements since the Eleutherozoa ancestor in sequenced echinoderms.A. Synteny comparison between ELGs and available chromosome-scale sea star genomes 53,55,56. All examined sea star genomes are marked by the single B3 + O2 fusion. B. Synteny comparison between ELGs and available chromosome-scale sea urchin genomes 14,20,21,23,57. All examined sea urchin genomes are marked by the (B2 + C2) + E and B3 + J1 fusion. L. variegatus underwent the additional (B3 + J1) + J2 fusion and D + G. P. lividus underwent the additional A1 + A2, O1 + O2 and Q + R fusions (note that an additional fission of ELG D may have occurred if the large unplaced scaffold noted “Scaf.” is not an assembly artefact.) C. Synteny comparison between ELGs and brittle star chromosomes reveals a total of 26 macrosyntenic inter-chromosomal rearrangements, in the most parsimonious scenario involving fusion, fission and translocation events. The rearrangements can be inferred from the oxford grid plot: 1 [fusion + mixing + fission] of 3 ELGs = 3 inter-chromosomal rearrangements (B3-G-O1), 3 [fusion + mixing + fission] of 2 ELGs = 6 inter-chromosomal rearrangements (B1-J1, C1-Q, J2-O2) and 17 translocations (A2-N, B1J1-H, B3GO1-N, D-G, DG-L, E-I, G-H, B2 + C2-H, B2 + C2H-I, B2 + C2HI-L, B1J1-I, J2O2-R, K-L, K-P, M-R, J2O2-M, B3GO1-P).
|
|
Extended Data Fig. 4. Molecular phylogeny of echinoderm Hox genes.The phylogenetic tree is shown as an unrooted tree, with clades of Hox genes indicated with the same colours as in Fig. 2. The phylogenetic position of each identified Hox gene in the brittle star (“Ampfil”) is highlighted in pink.
|
|
Extended Data Fig. 5. Evolution of the pmar1/phb and luciferase-like genes by tandem duplications.A. Molecular phylogeny of the pmar1/phb genes in echinoderms. The tree was reconstructed with RAxML-NG136 (10 starting parsimony trees, 1000 bootstraps, LG + G4 + F model), lowly supported nodes (bootstrap < 60) were subsequently corrected with Treerecs137 to maximise the parsimony of duplications and losses. Species are indicated by abbreviations (Ptyfla = P. flava, Sackow = S. kowalevskii, Annjap = A. japonica, Parliv = P. lividus, Strpur = S. purpuratus, Apojap = A. japonicus, Acapla = A. planci, Ak = A. kochii, Ampfil = A. filiformis). Inferred duplication nodes are shown in red. pmar1/phb full gene sequences were identified based on ref. 23,70 (Dataset_s4127). B. Phylogeny of luciferase genes in echinoderms, as in A. Luciferase-like genes were identified based on sequences from8 (Dataset_s4127). C. Genomic location of tandem-duplicated A. filiformis phb genes. D. Genomic location of tandem-duplicated A. filiformis luciferase genes. E. phb expression throughout 4 brittle star developmental time points and in the adult arm, showing the early developmental expression of phb genes (hpf: hours post-fertilization). Expression across samples was normalised using the TMM method146 on the full set of brittle star genes, and is shown as log2(TPM + 1). F. Luciferase-like gene expression during brittle star arm regeneration, showing that most luciferase-like genes are expressed in differentiated arms only: control arms and the latest regeneration time point (hpa: hours post-amputation, see Fig. 4 for staging details). Expression normalisation as in E.
|
|
Extended Data Fig. 6. Clustering of gene expression during the brittle star arm regeneration.A. Optimal number of clusters estimated using the centroid distance. After n = 19 clusters, there is no continuous decrease of the centroid distance. B. Normalised expression profiles (expression of the centroid) for each of the n = 19 clusters. Clusters with genes expressed over a single regeneration time point (or one regeneration point + control) were defined as minor clusters and not presented in the main text as these typically do not display significant enrichments and may be driven by noisy gene expression. C. Signalling pathways enrichment for each co-expression cluster (hypergeometric test, Benjamini-Hochberg adjusted p-values < 0.05, Methods). D. Expression of brittle star genes previously implicated in arm regeneration (gene names are from previous studies, see Supplementary Table 8). Co-expression clusters are shown on the left, gene names on the right, with red indicating availability of published in situ data. E. Expression of core genes in each co-expression cluster. Genes were filtered based on their cluster membership score (Supplementary Table 9, “acore” score) to retain the top 5% core genes in each cluster, and the five genes with the highest expression were selected for the heatmap. Gene names starting with ‘Unchar’ indicate genes without significant blast hits in the swissprot database. F. Expression of key TF genes during regeneration, as identified by binding motifs overrepresentation analysis (Fig. 4d). TF genes were identified by reciprocal blasts with mouse and swissprot blast hits; several copies were reported where blast results were ambiguous. TF genes with consistent expression and binding motifs overrepresentation are shown in red. No homologue for ZNF268 could be identified in brittle star and the expression of the identified p53 homologue does not match motif enrichment results (but p53 pathway activation is consistent with p53 motif enrichments, see C). G. Expression throughout arm regeneration of genes in the expanded gene families annotated with the GO term ‘regeneration’ (see Fig. 3b,c). Gene family membership (correspondence with Fig. 3c) are indicated with colours on the right of the expression heatmap, clusters are shown on the left. H. Duplicated genes from expanded ‘regeneration’ gene families significantly associate with specific regeneration co-expression clusters (hypergeometric test, Benjamini-Hochberg adjusted p-values). Significant associations (FDR < 0.05) are presented in colour, non-significant enrichments (enrichment ratio > 1 but FDR > 0.05) in grey (Supplementary Table 7). I. Expression throughout arm regeneration of the brittle star genes in the expanded and contracted gene families annotated with the GO term ‘keratan sulfate metabolism’ (see Fig. 3b). Representation is as in G. Note that one identified contracted gene family contains no brittle star genes (ST3GAL1-like) and is thus absent from the figure. J. Genes from expanded and contracted keratan sulfate gene families are associated with specific regeneration clusters (Supplementary Table 7). Representation is as in H.
|
|
Extended Data Fig. 7. Gene ontology enrichment results for brittle star arm regeneration co-expression clusters.GO enrichment tests were performed on each co-expression cluster and summarised using REVIGO (Methods). The complete list of enriched terms is presented in Supplementary Table 10.
|
|
Extended Data Fig. 8. Clustering and functional enrichments for the axolotl and Parhyale limb regeneration gene expression time series.A. Normalised expression profiles (expression of the centroid) for each of the n = 12 axolotl limb regeneration co-expression clusters. Raw expression data were re-processed from Stewart et al.49 (Methods). Barplots on the right indicate the number of genes assigned to each cluster. Clusters with genes expressed over a single regeneration time point were defined as minor clusters and not presented in the main text as they may be driven by noisy gene expression. B. Gene ontology enrichment for each co-expression cluster (Methods, Supplementary Table 6). C. TF binding motifs enriched around the TSS of genes from axolotl co-expression clusters (hypergeometric test adjusted p-value < 0.05, Methods). Note that only TFBS motifs enriched in brittle star clusters are represented. D. Optimal number of clusters estimated using the centroid distance. We selected n = 12 clusters since further increase of the number of clusters does not result in a significant decrease of the centroid distance until n = 16, which, on the basis of functional enrichment tests, over-clusters the data. E. TF binding motifs enriched around the TSS of genes from Parhyale co-expression clusters as in C. Parhyale clusters were renamed from Sinigaglia et al.48 as follows: P1 is R4 in the notation of Sinigaglia et al., P2 is R1, P3 is R8, P4 is R2, P5 is R6, P6 is R3, P7 is R5 and P8 is R7.
|
|
Extended Data Fig. 9. Comparison of co-expression gene clusters during regeneration and development.A. Clustering of the brittle star development time series. Normalised expression profiles for each of the n = 8 development co-expression clusters. Processing, clustering procedure and representation is as in (Fig. 4, Extended Data Fig. 8). RNA-seq source listed in Supplementary Table 1. B. Gene ontology enrichment for each co-expression cluster. C. Curated gene lists enrichment for each co-expression cluster (hypergeometric test, Benjamini-Hochberg adjusted p-values < 0.05). D. Comparison of co-expressed gene clusters deployed during embryonic development and appendage regeneration in the brittle star. Note that the embryonic development in brittle star does not produce appendages and is thus less informative than Parhyale development data. Clusters are represented by vertical rectangles whose sizes are proportional to the number of homologous genes in the cluster, and coloured according to enriched GO terms. Genes are linked across clusters, with coloured links indicating significant overlaps (hypergeometric test with the Benjamini-Hochberg correction <0.01, darker shades indicate p-values < 10-15). E. Comparison of co-expressed gene clusters deployed during appendage regeneration in the brittle star and leg development in Parhyale. Clusters in Parhyale (clusters PE1 to PE4) correspond to the clustering reported in Sinigaglia et al.48, but clusters were renamed to follow temporal activation (PE1 corresponds to E2, PE2 to E4, PE3 to E1, PE4 to E3). Coloured links indicating significant overlaps (permutation-based over-representation p-values with Benjamini-Hochberg correction <0.05, Methods). F. Comparison of co-expressed gene clusters deployed during development in the brittle star and leg development in Parhyale, as in E. G-K. Gene list enrichment tests, for genes with a conserved expression profile during appendage regeneration, as in Fig. 5d, but sub-divided by cluster and species comparisons (hypergeometric tests, p-values corrected for multiple testing with the BH procedure, * p-values < 0.05, ** p-values < 0.01, *** p-values < 0.001).
|
|
Extended Data Fig. 10. Differential transcriptional activity of repetitive elements in the immune and proliferative phases of brittle star arm regeneration.A. Differentially expressed repetitive elements in early regeneration (immune phase: 48 hpa and 72 hpa samples) versus middle regeneration (proliferation: Stage3, Stage4, Stage5 samples). Coloured dots represent repeat families with significant up-expression in immune (blue) or proliferation phases (orange) (absolute log fold change > 1, FDR < 0.001, two-sided Wald test p-values corrected for multiple testing using the BH procedure, Methods). B. Immune up-expressed repeat families (n = 80) have a higher genomic coverage than proliferation up repeat families (n = 23), regardless of repeat class. Coverage is shown subdivided by repeat class, where classification was performed first using the homology-based approach of RepeatModeler, then with DeepTE for repeats that could not be classified by RepeatModeler (Methods). We note that the DeepTE classification has higher false positives than the RepeatModeler classification. C. Immune up-expressed repeat families have significantly lower divergence to their consensus (Kimura distance, Methods) than proliferation up-expressed repeat families (Mann–Whitney U test, one-sided p-value corrected for multiple testing with the BH procedure, * p-values < 0.05), indicating they are younger repeats with a higher potential to still be active mobilisable transposable elements. Distribution details [minima, bottom whisker, q1, median, q3, top whiskers and maxima] are as follows: not DE [0, 0, 5.02, 10.48, 17.75, 36.81, 49.97], up immune [0.4, 0.4, 4.62, 8.76, 15.44, 31.43, 33.7], up proliferation [0.88, 0.88, 6.56, 15.12, 22.26, 35.93, 35.93]. D. Immune up-expressed repeat families with low divergence from their consensus have significantly higher fraction of intergenic repeat instances, suggesting up-expression is less likely to be a side-effect of host gene transcription. P-values and boxplot colours are as in C (grey = no significant differential expression, blue = up in immune, orange = up in proliferation). Repeat families were subdivided in 4 balanced categories based on their divergence to consensus (Kimura distance, d): d < 5.02 (very low), 5.02 < d < 10.48 (low), 10.48 < d < 17.73 (medium), 17.73 < d (high). Distribution details as in C are as follows (boxes from left to right): [0.0, 0.19, 0.46, 0.56, 0.64, 0.91, 1.0], [0.34, 0.34, 0.47, 0.53, 0.61, 0.66, 0.66], [0.18, 0.35, 0.39, 0.47, 0.5, 0.65, 0.65], [0.0, 0.2, 0.46, 0.56, 0.64, 0.9, 1.0], [0.18, 0.41, 0.52, 0.6, 0.66, 0.76, 0.76], [0.41, 0.41, 0.44, 0.57, 0.62, 0.62, 0.94], [0.0, 0.19, 0.46, 0.56, 0.64, 0.9, 1.0], [0.2, 0.57, 0.6, 0.63, 0.66, 0.71, 0.78], [0.31, 0.31, 0.36, 0.43, 0.47, 0.47, 0.64], [0.0, 0.22, 0.48, 0.58, 0.66, 0.91, 1.0], [0.33, 0.47, 0.58, 0.64, 0.67, 0.79, 0.79], [0.29, 0.29, 0.37, 0.43, 0.48, 0.56, 0.56].
|