A Caenorhabditis elegans protein with a PRDM9-like SET domain localizes to chromatin-associated foci and promotes spermatocyte gene expression, sperm production and fertility

To better understand the tissue-specific regulation of chromatin state in cell-fate determination and animal development, we defined the tissue-specific expression of all 36 C. elegans presumptive lysine methyltransferase (KMT) genes using single-molecule fluorescence in situ hybridization (smFISH). Most KMTs were expressed in only one or two tissues. The germline was the tissue with the broadest KMT expression. We found that the germline-expressed C. elegans protein SET-17, which has a SET domain similar to that of the PRDM9 and PRDM7 SET-domain proteins, promotes fertility by regulating gene expression in primary spermatocytes. SET-17 drives the transcription of spermatocyte-specific genes from four genomic clusters to promote spermatid development. SET-17 is concentrated in stable chromatin-associated nuclear foci at actively transcribed msp (major sperm protein) gene clusters, which we term msp locus bodies. Our results reveal the function of a PRDM9/7-family SET-domain protein in spermatocyte transcription. We propose that the spatial intranuclear organization of chromatin factors might be a conserved mechanism in tissue-specific control of transcription.


KMT identification
Using BLAST searches, we confirmed the existence of 36 putative SET-domain-containing KMTs encoded by the C. elegans genome, as reported by Andersen and Horvitz (2007). We scored SET domain sequence alignments with the BLOSUM62 scoring matrix and used simple bootstrapping to assess statistical robustness (Geneious software package, Biomatters Ltd.).
SET domain orthology relationships were defined as mutual best matches in BlastP searches using only the core SET domain of each protein. We then compared these results with mutual best matches using the entire protein sequence. We refer to genes as orthologs only if they were mutual best matches in both comparisons.
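The mutual-best-match criterion described above can be sketched as follows. This is a minimal illustration, not the scripts used in the study: the function names, the score dictionaries, and the use of a single numeric score per query/subject pair (for example a BlastP bit score) are all assumptions made for the example.

```python
def best_hits(scores):
    """Map each query to its top-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score;
    higher is better. The input layout is a hypothetical one.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_matches(a_vs_b, b_vs_a):
    """Return the set of (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


def orthologs(domain_fwd, domain_rev, full_fwd, full_rev):
    """Keep only pairs that are mutual best matches in BOTH comparisons:
    the core SET domain alone and the entire protein sequence."""
    by_domain = mutual_best_matches(domain_fwd, domain_rev)
    by_full_length = mutual_best_matches(full_fwd, full_rev)
    return by_domain & by_full_length
```

Requiring agreement between the two comparisons is stricter than either search alone: a pair that matches only through its SET domain, or only through flanking sequence, is not called an ortholog.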

KMT probe sets for smFISH
smFISH probe sets were designed using standard parameters (Raj et al., 2008; 2010): 20 nt oligos at least 2 nt apart from each other with a target GC content of 45%. We computationally checked all oligos for off-target matches in the C. elegans genome and eliminated those with perfect off-target matches. We ordered 48 oligos of 20 nt for each KMT mRNA. If an mRNA was too short to accommodate 48 probes, we obtained the highest number of high-quality probes possible. Probes and their sequences are available upon request. We ordered the probes as individual 20 nt oligos, each with a 3' amine residue. Fluorescent dyes were coupled to the oligos using reactive N-hydroxysuccinimide (NHS) esters in aqueous solution. We coupled all KMT probe sets to Cy5 NHS-mono ester (GE Healthcare, PA15101) and Alexa Fluor 594 NHS-mono ester (Thermo Fisher Scientific, A20004).

smFISH
smFISH studies were performed essentially as described (Raj et al., 2008; 2010; Ji and van Oudenaarden, 2012). Animals were collected in M9, washed 3 times with water and fixed in 4% formaldehyde, 1x PBS for 30-45 min at room temperature. Samples were stored in 70% EtOH at 4˚C for at least 24 hrs before hybridization. For hybridization, animals were prepared in Wash Buffer (10% formamide, 2x SSC), and hybridization was done in Hybridization Buffer (10% formamide) at 30˚C overnight. Samples were washed in Wash Buffer for 2x 30 min, including 5 ng/µl DAPI for DNA staining in the second wash step. Formamide was removed by washing 3 times in 2x SSC, and samples were resuspended in Imaging Buffer (glucose oxidase buffer with glucose oxidase and catalase, Sigma). For imaging, two coverslips were used to compact samples for improved imaging quality. Imaging was done using a Nikon Ti microscope with a mercury light source (Prior), and images were acquired on a Pixis 1024 (Princeton Instruments) CCD camera.
Custom filters for Cy5 (628 nm-40 exciter, 685 nm-40 emitter, 655 nm dichroic) and Alexa594 (590 nm-20, 628 nm-25, 602 nm) were used (Semrock). Exposure times were 2 sec for both filter sets.
We collected Z-stacks of 0.3 µm across the sample, 15-25 slices per sample. smFISH images were processed using custom Matlab (Mathworks) scripts. In essence, images were filtered twice with Laplacian-of-Gaussian edge-detection filters to amplify smFISH signal, and a threshold was defined from the signal intensity histogram as the first plateau that separates background from signal. Only images that could be processed at high quality and with clear thresholding were used for tissue-specificity analysis. We used manual inspection of high-quality smFISH images of each KMT probe set to determine the tissues of expression. We scored a gene as being expressed in a tissue if we detected expression in most cells in that tissue in at least 10 animals. Tissues were identified by position in the animal, and individual cells by nuclear positions and nuclear morphology.
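The plateau-based thresholding step can be illustrated with a small sketch. It is written in Python rather than Matlab and is not the original script: it assumes the filtered image has already been reduced to a list of candidate-spot intensities, and the `window` and `tolerance` parameters are hypothetical choices introduced only for this example.

```python
def spot_counts(intensities, thresholds):
    """Number of candidate spots at or above each threshold."""
    return [sum(1 for value in intensities if value >= t) for t in thresholds]


def first_plateau_threshold(intensities, thresholds, window=3, tolerance=0):
    """Return the first threshold that starts a plateau in the count curve.

    A plateau is `window` consecutive thresholds over which the spot
    count changes by no more than `tolerance` while at least one spot
    is still detected. Both parameters are illustrative assumptions.
    Returns None if no plateau is found.
    """
    counts = spot_counts(intensities, thresholds)
    for i in range(len(counts) - window + 1):
        segment = counts[i:i + window]
        if segment[-1] > 0 and max(segment) - min(segment) <= tolerance:
            return thresholds[i]
    return None
```

On a plateau the spot count is insensitive to the exact threshold, which is why such a region is taken to separate background from true signal.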
msp mRNA fluorescence was quantified from raw images. Background signal was subtracted and was <1% of the msp signal for all genotypes. msp expression was quantified in late-stage mature spermatocytes by quantifying the average signal across several cells.

Broodsize assays
For broodsize determination, young L4 animals were individually placed on plates with food and transferred to fresh plates every 24 hr for 3 days. The number of progeny was determined by counting the number of adult animals produced over 6 days. All experiments were performed at 25˚C. Strains were always maintained at 20˚C and only raised to the experimental temperature as young larvae for broodsize analysis.

Single-copy transgene strain construction
Single-copy transgenes were created essentially as described (Frøkjær-Jensen et al., 2008; 2012  , for details about the plasmids and procedure). For the construction of a germline-specific set-17 rescue strain, we replaced the set-17 5' upstream region with the mex-5 minimal promoter and followed the same injection and selection protocol.

Mating assays
Mating efficiency was determined at 25˚C. Individual L4 females (hermaphrodites feminized by fog-2 mutation) were moved to standard plates for 24 hrs to ensure they had not previously mated. L4 males were also separated from females for 24 hrs prior to mating assays.
For mating assays, a single male was placed onto a 6 cm plate with one female. Bacterial lawns on mating plates were kept small, ~1 cm diameter. Mating was allowed to proceed for 24 hrs, after which the male was removed and the female was moved to a new plate every 24 hrs for 4 days, until after all progeny had been generated. We determined separately that set-17 animals generate 1:1 males:females from crosses (data not shown).

Spermatid counts
To determine the number of spermatids in hermaphrodite spermathecas, we maintained animals at 25˚C and collected bleach-synchronized early young adults. We selected only animals that had completed sperm production and with oocytes that either had loaded into the spermatheca or were about to be loaded. We stained animals with DAPI and imaged them using an inverted Nikon Ti Epi-fluorescence microscope with a 100x 1.4 NA oil objective.
We collected Z-stacks of 0.5 µm thickness through the entire spermathecal region (10-20 slices). Images were collected using a Pixis 1024 CCD (Princeton Instruments) or an Orca Flash 4.0 sCMOS (Hamamatsu) camera. To quantify spermatids, we counted DNA-stained bodies manually in the Z-stacks. Often only one spermatheca could be imaged per animal because the intestine obstructed reliable quantification of spermatids.

Electron microscopy
Males were maintained at 25˚C and collected 24 hrs post-L4 stage as adults. Adult males were fixed in 0.7% glutaraldehyde, 0.7% OsO4, 0.1 M cacodylate buffer for 1 hr on ice. During this step the head region of individual worms was excised. Worms were washed in 0.1 M cacodylate buffer and post-fixed in 2% OsO4 in 0.1 M cacodylate buffer overnight at 4˚C. Three to five worms were mounted into agar blocks, dehydrated in a series of alcohols and embedded in Epon resin. Thin sections of 50-70 nm of longitudinal and cross sections were obtained using an Ultracut microtome E. Samples were observed using a JEOL JEM 1200 EX II electron microscope at 80 kV and imaged with a side-mounted AMT XR-41 CCD camera.
Images of wild-type and set-17 adult male germlines were inspected manually for abnormalities of germ cells, spermatocytes or spermatids. The set-17 defect in fibrous-body membranous organelle production was first detected at this level. Quantification of cytoplasmic cross-sectional area was done manually using ImageJ and Excel. Cytoplasmic cross-sectional area was defined as the area of the total cell minus that of the nucleus. The size of individual FB-MOs was determined manually by tracing their circumferences. We computed the percent cross-sectional cytoplasmic area of FB-MOs by dividing the sum of the individual FB-MO cross-sectional areas by the cytoplasmic cross-sectional area. FB-MOs were identified by homogeneous grey staining and the double membrane around the grey structures. We analyzed spermatocytes from at least three different adult males of wild-type, set-17 and set-17; Pset-17::set-17(+).
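As a worked illustration of the area calculation, the following sketch computes the percent of cytoplasmic cross-sectional area occupied by FB-MOs for one cell. The function and argument names are hypothetical; the original quantification was done manually with ImageJ and Excel, and all areas are assumed to share the same units.

```python
def fbmo_percent_area(cell_area, nucleus_area, fbmo_areas):
    """Percent of cytoplasmic cross-sectional area occupied by FB-MOs.

    Cytoplasmic area is the total cell area minus the nuclear area,
    as defined in the text.
    """
    cytoplasm_area = cell_area - nucleus_area
    if cytoplasm_area <= 0:
        raise ValueError("cell area must exceed nucleus area")
    return 100.0 * sum(fbmo_areas) / cytoplasm_area
```

For example, a cell of area 50 with a nucleus of area 10 and FB-MO profiles of areas 2, 3 and 5 gives 10 / 40, i.e. 25% of the cytoplasmic cross-section.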

To calculate the number of FB-MOs per cell from the EM cross sections, the diameter of individual FB-MOs has to be taken into account. Since the cross sections randomly sampled the cells, the probability of observing an FB-MO is proportional to its diameter. We computed the radii of FB-MOs assuming that they behave like spheres, i.e. that the radius is proportional to the square root of the area. Using FB-MO areas already determined, we computed the average FB-MO radius for wild-type, set-17 and set-17; Pset-17::set-17(+). We then normalized the observed number of FB-MOs by the observed radii from wild-type, set-17 and set-17; Pset-17::set-17(+) adult males.
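The size correction can be sketched as below. This is an illustration under stated assumptions rather than the original calculation: each traced profile is treated as a circle, so that r = sqrt(A / pi), and the function names are hypothetical. Because the chance of sectioning an organelle scales with its diameter, dividing the observed count by the mean radius gives a relative number per cell that can be compared between genotypes, not an absolute count.

```python
import math


def equivalent_radius(area):
    """Radius of a circle with the given cross-sectional area."""
    return math.sqrt(area / math.pi)


def mean_radius(areas):
    """Average equivalent radius over all traced FB-MO profiles."""
    radii = [equivalent_radius(a) for a in areas]
    return sum(radii) / len(radii)


def size_corrected_count(observed_per_section, areas):
    """Observed FB-MO count per section divided by the mean FB-MO radius.

    The result is in relative units (count per unit length).
    """
    return observed_per_section / mean_radius(areas)
```

With this correction, a genotype with smaller FB-MOs is not undercounted simply because small organelles are less likely to be hit by a random section.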
Special membrane structures in spermatids were quantified by manual inspection of EM cross-sections of spermatids.

Immunofluorescence of male spermatocyte nuclei
We maintained males at 25˚C and extruded their germlines by microsurgery with syringe needles. Fixation and staining were conducted as described (Libuda et al., 2014).

Transcript analysis using RNA sequencing
We collected total RNA from two biological replicates of bleach-synchronized L4s from wild-type, set-17 (n5017) Kolmogorov-Smirnov (KS). Such tests were appropriate given the asymmetric negative skew of the fold-change distributions.

Identification of clusters of spermatocyte-enriched genes
Using the chromosome and base-pair start position of all previously identified spermatogenic and oogenic genes in WormMine (wormbase.org, genome assembly WS210), we generated histograms with a bin size of 50 kb to identify gene clusters of spermatogenic and/or oogenic genes. To test for significant enrichment of genes in a particular 50 kb interval we used the hypergeometric test as described (Miller et al., 2004), with a multiple-hypothesis-testing correction for the number of spermatogenic genes per chromosome. We chose a cut-off of FDR < 0.05 for cluster identification for all chromosomes.
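The binning and enrichment test can be sketched as follows. This is a minimal illustration, not the procedure of Miller et al. (2004): the input layout, the function names, and the choice of per-chromosome gene totals as the background population are assumptions, and the multiple-testing correction is omitted.

```python
from collections import Counter
from math import comb

BIN_SIZE = 50_000  # 50 kb bins, as in the text


def bin_counts(genes):
    """Count genes per (chromosome, bin index).

    `genes` is an iterable of (chromosome, start_position) tuples.
    """
    return Counter((chrom, start // BIN_SIZE) for chrom, start in genes)


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for a hypergeometric draw.

    N: all genes on the chromosome (assumed background)
    K: spermatogenic genes on the chromosome
    n: all genes in the 50 kb bin
    k: spermatogenic genes in the bin
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)
```

Each bin's p-value would then be adjusted for multiple testing before applying the FDR < 0.05 cut-off stated above.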

Transcription site (TS) analysis
msp transcription sites (TSs) were identified in the nuclei of primary spermatocytes that also exhibited msp mRNA expression in the cytoplasm. Foci were visually separated from background using a finite Fourier transform bandpass filter to suppress high frequencies and counted manually. Nuclei were identified based on DNA staining with DAPI.
To determine the probability of transcriptional activity of individual msp gene clusters, we tested whether the TS data followed the binomial distribution. The TS distribution should be governed by a simple binomial model if TSs are independent random events that occur with equal probability at each cluster. The probability of cluster activation is then given by the empirical mean number of active clusters (TSs) per nucleus divided by the number of possible states. The number of states, i.e. the maximal number of msp gene clusters that were transcriptionally active simultaneously, was equal to four in our experiments.
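The binomial model above can be written out as a short sketch. The function names and the input format (one TS count per nucleus) are assumptions for illustration. The activation probability is the mean TS count divided by four, and the expected fraction of nuclei with k active clusters follows the binomial probability mass function.

```python
from math import comb

N_CLUSTERS = 4  # maximal number of simultaneously active msp clusters (from the text)


def activation_probability(ts_counts, n_states=N_CLUSTERS):
    """Mean number of TSs per nucleus divided by the number of states."""
    mean_ts = sum(ts_counts) / len(ts_counts)
    return mean_ts / n_states


def binomial_pmf(k, n, p):
    """Probability of exactly k active clusters out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_frequencies(ts_counts, n_states=N_CLUSTERS):
    """Expected fraction of nuclei with 0..n_states TSs under the model."""
    p = activation_probability(ts_counts, n_states)
    return [binomial_pmf(k, n_states, p) for k in range(n_states + 1)]
```

Comparing these expected fractions with the observed histogram of TS counts per nucleus indicates whether the clusters behave as independent, equally probable events.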
To visualize SET-17::GFP and msp TSs, we prepared samples as described above for smFISH imaging, except we used EM-grade MeOH-free formaldehyde and fixed samples in the dark to detect GFP fluorescence from the transgene directly without additional labeling of the protein. To preserve GFP fluorescence we identified msp transcription sites in the nuclei of msp-expressing spermatocytes and imaged SET-17::GFP only at those sites. We computed the fraction of msp transcription sites that also were SET-17::GFP foci.

Fluorescence recovery after photo-bleaching (FRAP)
We immobilized 1-day-old adult males with 10 mM levamisole (an acetylcholine receptor agonist that immobilizes animals by inducing muscle contraction) on soft 2% agarose pads for confocal imaging. Animals can be revived from this preparation for up to 1 hr after immobilization. We imaged spermatocytes that were proximal to the cover slip, i.e. only the hypodermis and cover slip separated the objective and the spermatocytes. We identified imaging regions of interest of different sizes, including at least two spermatocyte nuclei for each FRAP experiment (one for FRAP measurements, one for bleaching and movement control). We imaged several Z-slices of 0.5-1 µm separation, because the animals frequently shifted, altering the focal plane of the foci. A region of interest for FRAP was defined manually for a given SET-17::GFP focus. To achieve complete bleaching of GFP in the foci, bleaching was conducted with 25 rounds of 100% laser induction with a 488 nm laser (Zeiss LSM 700). Spermatocytes were then tracked for at least 6 min, up to 20 min.
Data were analyzed manually, tracking the bleached ROI in the movies and quantifying average fluorescence for the bleached and control foci.

(E) Levels of msp expression of all 28 msp genes correlate with set-17 transcript levels as measured by RNAseq in wild-type, set-17, set-17; Pset-17::set-17(+) (orange) and set-17; P mex-