|
Fig. 1. Conceptual overview of molecular mechanisms that could produce differences in chromatin accessibility between species. While several distinct molecular processes can modulate chromatin configuration, these can be grouped into 2 broad categories: those that are genetically based near the OCR of interest (cis) and those based elsewhere in the genome (trans). a) Cis-based changes (magenta here and in subsequent figures) are likely caused by a local mutation that alters the binding or interaction of an already-present protein (the alternative is a trans-generational nongenetic influence on chromatin that differs between species; such influences are likely to be uncommon based on their incidence within well-studied species). Depending on whether the mutation raises or lowers binding affinity, and on the protein's biochemical function, the consequence could be either an increase or a decrease in accessibility at a specific genomic location (bidirectional arrows). Trans-based changes (light blue here and in subsequent figures) can be caused by a mutation that alters either the amino acid sequence or the post-translational processing of a protein and thereby modifies its function, or by a change in the presence or concentration of a protein. Again, depending on the specific nature of these changes, chromatin accessibility at a specific location in the genome could either increase or decrease (bidirectional arrows). b) Model of cis-based change. In the He same-species cross (left panels), DNA is tightly wound around nucleosomes, and thus trans-acting factors (proteins) are unable to interact with it, resulting in very small peaks on both alleles. In the Ht same-species cross (right panels), DNA is not wrapped around nucleosomes, leaving it accessible to trans-acting factors, and thus generating 2 large peaks in the corresponding browser tracks.
In hybrids (center), 1 allele is inherited from each parent, yielding 1 large peak and 1 small peak because the 2 alleles differ in their ability to interact with trans-acting factors. c) Model of trans-based change. In the He same-species cross (left panels), DNA is tightly wound around nucleosomes, and trans-acting factors are unable to interact with it, resulting in very small peaks on both alleles. In the Ht same-species cross (right panels), trans-acting factors (which differ from He trans-acting factors) are able to open up the chromatin and interact with it, thus generating 2 large peaks in the corresponding browser tracks. In the hybrid cross (center), trans-acting factors from each parent are present and able to interact with alleles inherited from either parent. Thus, the Ht trans-acting factors are able to open up the chromatin on both alleles, generating 2 equal-sized peaks in the browser tracks for the hybrid cross. Right: real examples of browser tracks corresponding to distinct regulatory modes.
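The cis and trans models in this caption imply a simple decision rule, sketched below under stated assumptions. The function name and its two boolean inputs are hypothetical; they stand in for the outcomes of whatever statistical tests are applied to the same-species crosses and to the two alleles in the hybrid. The rule is deliberately coarse: it ignores mixed categories (for example, combined cis and trans effects), which need additional tests.

```python
def classify_regulatory_mode(parents_differ: bool, hybrid_alleles_differ: bool) -> str:
    """Assign a coarse regulatory mode to one peak (illustrative only).

    parents_differ: accessibility differs between the He and Ht same-species crosses.
    hybrid_alleles_differ: the He and Ht alleles differ in accessibility in the hybrid.
    Both flags are assumed to come from significance tests run elsewhere.
    """
    if not parents_differ and not hybrid_alleles_differ:
        # Same accessibility everywhere: no evolutionary change detected.
        return "conserved"
    if parents_differ and hybrid_alleles_differ:
        # The difference persists in a shared trans environment, so it is allele-linked.
        return "cis"
    if parents_differ and not hybrid_alleles_differ:
        # The difference disappears when both alleles see the same trans-acting factors.
        return "trans"
    # Alleles differ in the hybrid although the parents do not; not modeled here.
    return "other"


for flags in [(False, False), (True, True), (True, False), (False, True)]:
    print(flags, classify_regulatory_mode(*flags))
```

The last branch is left as "other" because the caption does not describe that case.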
|
|
Fig. 2. Chromatin configuration in parents and hybrids. a) Experimental design and workflow. Samples from 3 biological replicates of 3 genetic crosses (He × He, the maternal same-species cross; Ht × Ht, the paternal same-species cross; and He (female) × Ht (male), the hybrid cross) were collected at 3 timepoints (12, 18, and 24 hpf; hpf, hours post fertilization). b) Venn diagram (not area-proportional) of peaks that are unique versus shared among the same-species crosses and the hybrid cross. The reported peak count is the number of peaks following low-count removal. c) PCA of ATAC-seq results generated from the counts table of reads in OCRs. Throughout this study, orange indicates He origin; green, Ht origin; and olive, hybrid origin.
|
|
Fig. 3. Contrasts between the genetic basis for evolutionary changes in chromatin configuration and in transcript abundance, and real examples of peaks with various regulatory modes. a) Line plots of inheritance mode classification for all OCRs (left) and for all genes (right). See supplementary Fig. S6, Supplementary Material online for models of peak accessibility that exemplify these inheritance modes in the epigenome. b) Line plots of regulatory mode classification for all OCRs (left) and for all genes (right). Transcript abundance data from Wang et al. (2020). See Fig. 2 and supplementary Fig. S4, Supplementary Material online for models of peak accessibility that exemplify these regulatory modes. c) Example from the data of a browser track for a peak with a “cis-based” change in regulation. d) Example from the data of a browser track for a peak with a “trans-based” change in regulation. In both cases, total accessibility is shown for the same-species crosses, while for the hybrid, the browser track has been broken out into the accessibility of the 2 different alleles. e) Example from the data of a browser track for a peak with “conserved” regulation, indicating no statistically significant difference in accessibility for any of the crosses and, in the case of the hybrid, no statistically significant difference in the accessibility of either allele compared to the accessibility of the respective same-species cross. Note that for the same-species crosses, total accessibility is shown, while for the hybrid, the browser track has been broken out into the accessibility of the 2 different alleles.
|
|
Fig. 4. Relationship between evolutionary change in chromatin configuration and transcript abundance. Chi-squared tests for independence were used to test for association between evolutionary changes in OCRs and expression of nearby genes. Heatmaps show residuals for tests carried out in 3 different contexts (see supplementary Fig. S9, Supplementary Material online for a mechanistic explanation of what each test measures). Larger residual values (darker red squares) indicate an enrichment and suggest that there are more of the given event than expected by chance. Tests were carried out separately at each of the 3 developmental stages. a) Conceptual illustration of the tests shown in b, c, and d, illustrating what each quadrant of the test represents. b) The “peaks-focused” tests ask whether the nearest gene to a DA peak is itself differentially expressed more often than expected by chance. The chi-squared tests were significant (test statistic P < 0.1) at gastrula and larva. c) The “gene-focused” tests ask whether there is at least one DA peak within 25 kb of a DE gene more often than expected by chance. The chi-squared tests were significant (test statistic P < 0.05) at gastrula and larva. Note that “peaks-focused” and “gene-focused” tests are not redundant, due to the 1-to-many relationship between genes and regulatory elements. d) The “regulatory mode-focused” tests were carried out for genes that are DE between species. These tests ask whether cis- and/or trans-based differential expression of genes is enriched for DA peaks within 25 kb more often than expected by chance. The chi-squared tests were significant (test statistic P < 0.05) at gastrula and larva, with cis-based differential expression enriched for nearby DA peaks and trans-based differential expression depleted for nearby DA peaks.
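To make the residuals in these heatmaps concrete, the sketch below computes Pearson residuals, (observed − expected) / √expected, for a contingency table, along with the uncorrected chi-squared statistic. It assumes Pearson residuals; the caption does not say which residual type was plotted. The counts are hypothetical, not data from the study, and the function name is illustrative.

```python
from math import sqrt


def pearson_residuals(table):
    """Return (chi-squared statistic, Pearson residuals) for a table of observed counts.

    table: list of rows, e.g. [[DA & DE, DA & not DE], [not DA & DE, not DA & not DE]].
    No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            r = (observed - expected) / sqrt(expected)
            residual_row.append(r)
            chi2 += r * r
        residuals.append(residual_row)
    return chi2, residuals


# Hypothetical counts for illustration only.
observed = [[30, 70], [20, 180]]
chi2, residuals = pearson_residuals(observed)
print(round(chi2, 3))
for row in residuals:
    print([round(r, 3) for r in row])
```

A positive residual marks a cell with more observations than expected under independence (enrichment); a negative residual marks depletion.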
|
|
Fig. 5. Distinct evolutionary trends in proximal and distal OCRs. a) Line plots showing the proportion of OCRs in each regulatory mode classification for proximal OCRs (<500 bp from the TLS of the nearest gene) versus distal OCRs (between 500 bp and 25 kb from the TLS). Trans-based differences dominate proximal peaks, while cis-based differences are more common than trans-based differences in the (much more abundant) distal peaks. b) Violin plots contrasting the effect size for distal versus proximal peaks. At each stage, the mean effect size for proximal OCRs was significantly greater than the mean effect size for distal OCRs (Welch's t-test: P = 1.781e-12 for blastula, P = 8.638e-08 for gastrula, P < 2.23e-16 for larva). For a more detailed breakdown of the regulatory modes and effect sizes of distal peaks, see supplementary Fig. S10, Supplementary Material online.
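As a reminder of what Welch's t-test computes, the sketch below derives the t statistic and the Welch–Satterthwaite degrees of freedom for two samples with unequal variances. The effect-size values are hypothetical, not data from the study, and the function name is illustrative. Obtaining a P value additionally requires the t distribution, which is omitted here to keep the example dependency-free.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Per-group squared standard errors, using sample (n - 1) variances.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical effect sizes for illustration only.
proximal = [1.8, 2.4, 2.1, 3.0, 2.7, 2.2]
distal = [1.1, 1.6, 0.9, 1.4, 1.3, 1.7, 1.2, 1.0]
t_stat, dof = welch_t(proximal, distal)
print(round(t_stat, 3), round(dof, 2))
```

Unlike Student's t-test, this form does not pool the two variances, which suits groups of very different sizes such as proximal and distal peaks.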
|
|
Fig. 6. Distinct evolutionary trends in open chromatin near developmental regulatory genes. Plots present results for genes and OCRs in 3 classes of interest (“all genes”, “transcription factors”, and “GRN genes”). a) Smoothed histograms of the proportion of genes with a given number of nearby peaks. The distributions for “transcription factors” and “GRN genes” were both significantly different from the distribution for “all genes” by a Kolmogorov–Smirnov test (P ≪ 0.01), but not significantly different from each other. The x-axis was truncated at 25 for illustration purposes (all 3 distributions are heavily right-skewed, with a tiny proportion of values >25). b) Line plots of regulatory mode classification, as a proportion of the total number of peaks, for all peaks (left) versus peaks within 25 kb of a GRN gene (right). c) Violin plots of effect size for cis-based peaks in 3 classes of interest. The mean effect size for cis-based peaks near GRN genes was significantly greater than the mean effect size for cis-based peaks near any transcription factor, and also significantly greater than the mean effect size for cis-based peaks in general (Welch's t-test and Scheffé test, both P < 0.05). The mean effect sizes of the latter 2 categories did not significantly differ from each other. d) Violin plots as in (c) but for trans-based peaks in the same 3 classes of interest. Here, the mean effect size for trans-based peaks near GRN genes was smaller than the mean effect size for trans-based peaks near any transcription factor, but the difference was not significant; however, the mean effect size for trans-based peaks near GRN genes was significantly smaller than the mean effect size for trans-based peaks in general (Welch's t-test and Scheffé test, both P < 0.05).
|