|
Figure 1. Clustal alignment of representative PGBD5 orthologs including Petromyzon marinus (sea lamprey) and Branchiostoma floridae (lancelet or amphioxus). For simplicity, many complete or partial vertebrate PGBD5s have been omitted from the alignment. The N-terminal motifs encoded by exon 1 are moderately conserved in all species including zebra finch, suggesting that chicken exon 1 lies within an unsequenced 770 bp gap located 29 kb upstream of exon 2 and immediately downstream of the sole substantial CpG island in the vicinity of the gene. Although human PGBD5 lacks 3 of the 4 catalytically active aspartates that are often conserved among diverse piggyBac elements, the positions of these four aspartates in a ClustalW alignment of piggyBac proteins most closely related to the active cabbage looper moth (Trichoplusia ni) transposase including human PGBD1, 2, 3, 4, and 5 [21] are highlighted in yellow. The Pfam homology with eubacterial and archaeal IS4 transposases of the RNase H clan spans almost all of human PGBD5 exons 2–7 (residues 121–487).
Figure key: black carets, vertebrate introns; blue caret, lamprey intron apparently orthologous to a lancelet intron although shifted by 3 residues; red carets, lancelet introns; red underline, lancelet protein sequence derived from genomic tandem repeats (Additional file 1); red dashes, 13-residue deletion resulting from exclusion of predicted lancelet exon 5, which is embedded within the 108 bp genomic tandem repeats and would, if included, result in the 57-residue insertion; magenta underline, predicted nuclear localization signal not conserved in active Trichoplusia ni transposase; yellow highlight, position of four conserved, catalytic aspartates in active piggyBac transposases and homologs including human PGBD1, 2, 3, 4, and 5; gray XXX, regions of known length but undetermined sequence arbitrarily positioned in the Clustal alignment; zfish, zfinch, xenopu, coelac, lampre, and lancel are zebrafish, zebra finch, Xenopus tropicalis, coelacanth, lamprey, and lancelet, respectively. Amino acid residues are colored according to the EBI Clustal convention for side chains (red, AVFPMILW; blue, DE; magenta, RK; green, STYHCNGQ; others, gray). To avoid prejudicial judgments regarding the relationship between highly divergent sequences, we refrained from assigning a similarity or homology score to each residue.
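The residue coloring in the figure key can be written as a simple lookup. The following is a minimal sketch that encodes only the four residue groups listed in the key; the function and variable names are hypothetical and are not part of Clustal or any EBI tool.

```python
# Residue groups exactly as listed in the figure key (EBI Clustal side-chain convention).
CLUSTAL_COLORS = {
    "red": "AVFPMILW",
    "blue": "DE",
    "magenta": "RK",
    "green": "STYHCNGQ",
}

# Invert the table into a per-residue lookup.
RESIDUE_TO_COLOR = {
    aa: color for color, group in CLUSTAL_COLORS.items() for aa in group
}


def residue_color(residue: str) -> str:
    """Return the color class for a one-letter residue code.

    Anything not listed in the key (gaps, X, ambiguity codes) falls
    through to gray, matching the "others" entry of the key.
    """
    return RESIDUE_TO_COLOR.get(residue.upper(), "gray")


def color_sequence(seq: str) -> list[str]:
    """Color every position of an aligned sequence (hypothetical helper)."""
    return [residue_color(aa) for aa in seq]
```

For example, `color_sequence("DKA-")` assigns the aspartate to blue, the lysine to magenta, the alanine to red, and the gap to gray.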
|
|
Figure 2. Simplified phylogenetic tree of organisms examined. PGBD5 homologs are found in cephalochordates and all vertebrates examined, but nowhere else. This cladogram does not imply either the timing or degree of evolutionary divergence.
|
|
Figure 3. Lancelet, lamprey, and human PGBD5 are orthologous. Only shared syntenic genes are shown. The schematic is not drawn to scale, introns are not shown, genomic gaps are unsequenced, and distances are indicated in kb (red). The PGBD5 orthologs are oriented for clarity; the order and orientation of the two lamprey scaffolds is arbitrary. The 5′ and 3′ ends of lamprey GALNT2 are joined in the UCSC browser based on homology to the 5′ and 3′ ends of vertebrate GALNT2. Although >500 families of transposable elements constitute approximately 30% of the lancelet genome [32], no other PGBD5 homologs or fragments are found in the lancelet v2.0 draft genome; very similar PGBD5 sequences are present in both scaffolds 83 and 84 of the v1.0 draft genome (genome.jgi-psf.org/Brafl1/Brafl1.home.html), but PGBD5 appears only once in the v2.0 draft genome (downloadable from the UCSC browser at hgdownload.cse.ucsc.edu/gbdb/braFlo2/) in a sequence context most closely resembling scaffold 84 (Additional file 1).
|
|
Figure 4. PGBD5 is primarily if not exclusively expressed in brain and CNS. Affymetrix expression profiles for 79 human tissues and cell lines as described [37] on the BioGPS Gene portal (biogps.org/) indicate that PGBD5 is strongly expressed in brain, and very weakly expressed in spinal cord as well as 4 different leukemia and lymphoma cell lines but not in normal B- and T-cells. Vertical gray dashed lines, levels of expression; vertical magenta solid lines, mean, 4× mean, and 30× mean expression.
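The three magenta reference lines can be reproduced from any table of per-tissue expression values. The sketch below assumes the mean is taken across all samples shown for the probe; the function names are hypothetical, and the values in the usage note are invented for illustration, not taken from BioGPS.

```python
from statistics import mean


def reference_lines(expression: dict[str, float]) -> dict[str, float]:
    """Return the mean, 4x mean, and 30x mean of the expression values.

    Assumption: the mean is computed over every tissue/cell line supplied.
    """
    m = mean(expression.values())
    return {"mean": m, "4x mean": 4 * m, "30x mean": 30 * m}


def tissues_above(expression: dict[str, float], threshold: float) -> list[str]:
    """List the tissues whose expression exceeds a given reference line."""
    return sorted(t for t, v in expression.items() if v > threshold)
```

With made-up values such as `{"brain": 100.0, "spinal cord": 12.0, "liver": 4.0, "B-cell": 4.0}`, the mean is 30.0 and only brain lies above it.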
|
|
Figure 5. PGBD5 expression in the adult and developing mouse brain. Panels A-D illustrate expression of PGBD5 in sagittal brain sections (right) and a corresponding view from the Allen Institute Brain Atlas (left). The age of the mouse and the sagittal position relative to midline (0.00 mm) are indicated within each panel. The in situ hybridization stains are from public expression databases including The Allen Institute for Brain Science [39] for the adult mouse and Max Planck’s Genepaint [38] for embryos. The atlas images are color coded: olfactory bulb (green); cerebellum (orange); medial pallium (red); hypothalamus (brown); prepontine hindbrain (purple); and medullary hindbrain (blue). PGBD5 expression is restricted to a subset of cells within each nucleus denoted by a more saturated color. For example, PGBD5 expression in the cerebellar granule cell layer is indicated by a darker orange than the surrounding cerebellar cortex. The Max Planck and Allen Institute in situ hybridization probes both span exons 2–7, and are nearly identical; the absence of exon 1 is unlikely to affect the in situ patterns because there is no evidence in either the ENCODE (Additional file 5) or Chromatin State Segmentation data (Figure 6) of a functional promoter between exons 1 and 2.
|
|
Figure 6. PGBD5 chromatin structure confirms location of promoter and transcription start site. (Upper panel) Structure of the human PGBD5 gene (top), chromatin state segmentation data from 9 non-neuronal cell lines (middle), and CpG islands (bottom). (Lower panel) An expanded view of the PGBD5 promoter and presumptive enhancer region. Chromatin state segmentation data are from the same 9 human cell types: GM12878 (EBV-transformed B-lymphocytes), H1-hESC (embryonic stem cells), K562 (chronic myelogenous leukemia), HepG2 (hepatocellular carcinoma), HMEC (mammary epithelial cells), HSMM (skeletal muscle myoblasts), HUVEC (umbilical vein endothelial cells), NHEK (epidermal keratinocytes), and NHLF (lung fibroblasts). The DNase I hypersensitive sites are from H1-hESC and GM12878. The TF ChIP-seq data are a composite of 24 neural and non-neural cell lines including H1-hESC, A549 (adenocarcinomic alveolar basal epithelial cells), BE2_C (brain neuroblastoma), HEK293 (embryonic kidney), HeLa (cervical carcinoma), HUVEC, Jurkat (T-lymphocyte), K562, HepG2, NB4 (acute promyelocytic leukemia), PANC-1 (pancreatic carcinoma), PFSK-1, SK-N-MC (neuronal epithelioma), SK-N-SH_RA (neuroblastoma differentiated with retinoic acid), and U87 (primary glioblastoma), and 10 EBV-transformed B-lymphocytes from various ethnic backgrounds. For a color key to the chromatin state segmentation data, see inset (lower panel). For CpG islands, darker green indicates higher CpG density. For DNase I hypersensitive sites, stronger signals indicate higher sensitivity. For TF ChIP-seq data, darker gray reflects higher occupancy. 
All data are publicly available as UCSC Genome Browser tracks: Open Chromatin by DNase I HS (ENCODE/Duke University); TF ChIP-seq (ENCODE/Broad Institute/HudsonAlpha Institute/Stanford/Duke/University of Washington); and chromatin state segmentation using a Hidden Markov Model to identify genome-wide patterns in ChIP-seq data for histone methylations and acetylations, histone variant H2AZ, RNAP II, and CTCF (ChromHMM from ENCODE/Broad Institute). Only 8 of the 15 distinguishable chromatin states [48,49] are seen in this particular genomic interval of the 9 cell lines. Figure adapted from visualizations of current ENCODE databases available on the UCSC Genome Browser for hg19.
|
|
Figure 7. PGBD5 partitions between nucleus and cytoplasm in the Daudi Burkitt’s lymphoma B-cell line. Nuclear and cytoplasmic fractions were prepared as described in Methods. (Upper panel) Whole cell, cytoplasmic, and nuclear fractions of 2×10⁵ cells/lane were resolved by 7% SDS-PAGE, blotted, and probed with anti-rPGBD5 antibody. Mouse whole brain extract and human rPGBD5 served as controls. PGBD5 is indicated by an asterisk. Immunoreactive proteins larger than PGBD5 do not immunoprecipitate with anti-rPGBD5 antibody (compare with Figure 8). These proteins may cross-react weakly with antibody after denaturation and immobilization on the membrane, just as the antibody cross-reacts weakly with the hexahistidine tag but not the HA tag on western blotting (data not shown). (Lower panels) Daudi whole cell, cytoplasmic, and nuclear fractions were also probed for lamin A and β-actin to confirm successful partition.
|
|
Figure 8. Immunoprecipitation of endogenous PGBD5 from mouse brain. Mouse whole brain homogenate (5 or 25 μL) was immunoprecipitated with anti-rPGBD5 antibody. The proteins were resolved by SDS-PAGE, blotted, and probed with the same anti-rPGBD5 antibody as in Figure 7. rPGBD5, recombinant PGBD5; E1 and E2, first and second SDS elutions from the beads. Note that proteins larger than PGBD5 are present in whole brain homogenate (rightmost lane) but do not immunoprecipitate (lanes E1 and E2).
|
|
Figure 9. PGBD5 partition in mouse brain. Whole frozen mouse brains were thawed, Dounced with a tight pestle, and partitioned as described in Methods. Crude nuclei were pelleted through a sucrose cushion to remove debris and residual cytoplasm. Aliquots corresponding to equal cell equivalents were assayed by western blotting for PGBD5, glutathione-S-transferase (cytoplasmic marker), and histone H3 (nuclear marker). Nuclear integrity was determined visually by DAPI staining. As in Figure 7, PGBD5 partitioned about equally between nucleus and cytoplasm.
|
|
Figure 10. Salt extraction of sonicated crude nuclei from mouse brain. (A) Crude nuclei from mouse whole brain were lysed and chromatin sheared by sonication. Nuclear lysis was assayed by DAPI staining before and after sonication (top left panel). Sonication was monitored by agarose gel electrophoresis with and without DNase I digestion (top right panel). (B) Sonicated nuclei were extracted with NaCl at the indicated concentrations (mM) with or without prior DNase I digestion. Supernatant and pellet fractions were separated by centrifugation, resolved by SDS-PAGE, and assayed by western blotting for PGBD5 (top panels) or histone H3 (bottom panels).
|