Zebrafish were recently used as an in vivo model system to study DNA replication timing during development. The protocols for using zebrafish embryos to profile replication timing are detailed here. These protocols can be easily adapted to study replication timing in mutants, individual cell types, disease models, and other species.
DNA replication timing is an important cellular characteristic, exhibiting significant relationships with chromatin structure, transcription, and DNA mutation rates. Changes in replication timing occur during development and in cancer, but the role replication timing plays in development and disease is not known. Zebrafish were recently established as an in vivo model system to study replication timing. The protocols for using zebrafish to determine DNA replication timing are detailed here. After sorting cells from embryos and adult zebrafish, high-resolution genome-wide DNA replication timing patterns can be constructed by determining changes in DNA copy number through analysis of next generation sequencing data. The zebrafish model system allows for evaluation of the replication timing changes that occur in vivo throughout development, and can also be used to assess changes in individual cell types, disease models, or mutant lines. These methods will enable studies investigating the mechanisms and determinants of replication timing establishment and maintenance during development, the role replication timing plays in mutations and tumorigenesis, and the effects of perturbing replication timing on development and disease.
For cells to successfully divide, they must first accurately and faithfully replicate their entire genome. Genome duplication occurs in a reproducible pattern, known as the DNA replication timing program1. DNA replication timing is correlated with chromatin organization, epigenetic marks, and gene expression2,3. Changes in replication timing occur throughout development, and are significantly related to transcriptional programs and alterations to chromatin marks and organization4,5. Furthermore, replication timing is correlated with mutational frequencies, and changes in timing are observed in various types of cancer6,7,8. Despite these observations, the mechanisms and determinants of replication timing establishment and regulation are still largely unknown, and the role it plays in development and disease is undetermined. In addition, until recently the genome-wide replication timing changes that occur throughout vertebrate development had only been examined in cell culture models.
Zebrafish, Danio rerio, are well suited to study replication timing in vivo during development, as a single mating pair can yield hundreds of embryos that develop rapidly with many similarities to mammalian development9,10. Furthermore, throughout zebrafish development, there are changes to the cell cycle, chromatin organization, and transcriptional programs that share relationships with DNA replication timing11. Zebrafish are also an excellent genetic model, as they are particularly amenable to manipulation by transgenesis, mutagenesis, and targeted mutations, and genetic screens have identified many genes required for vertebrate development12. Therefore, zebrafish can be used to identify genes involved in replication timing establishment and maintenance and to observe the effects of deregulating replication timing on vertebrate development. Transgenic lines can also be used to assess replication timing from individual cell types isolated at different developmental timepoints or in disease conditions. Importantly, there are various zebrafish models of human disease that can be used to investigate the role of replication timing in disease formation and progression9,13,14.
Recently, the first replication timing profiles were generated from zebrafish, establishing it as a model system to study replication timing in vivo15. To accomplish this, cells were collected from zebrafish embryos at multiple stages of development and from a cell type isolated from adult zebrafish. Cells were then sorted by FACS (fluorescence-activated cell sorting) based on DNA content to isolate G1 and S phase populations. Using the G1 sample as a copy number control, copy number variations in S phase populations were determined and used to infer relative replication timing16. Replication timing can then be compared directly between different developmental samples and cell types; this approach was used to determine the changes in replication timing that occur in vivo throughout vertebrate development. This method offers several advantages over other genomic methods, chiefly that it does not require labeling with thymidine analogs or immunoprecipitation of DNA4,6.
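The copy-number logic described above can be illustrated with a minimal sketch. It assumes that read counts per genomic window are already available for an S phase sample and its matched G1 reference; the function name, the depth normalization, and the z-score scaling are illustrative choices and are not taken from the published analysis. Regions that replicate early are present at higher copy number in S phase cells, so they yield a higher S/G1 ratio.

```python
from statistics import mean, pstdev

def relative_replication_timing(s_counts, g1_counts):
    """Return z-scored S/G1 read-depth ratios, one per window.

    Windows with no G1 reads are returned as None (hypothetical convention).
    Higher values indicate earlier replication.
    """
    s_total = sum(s_counts)
    g1_total = sum(g1_counts)
    ratios = []
    for s, g1 in zip(s_counts, g1_counts):
        if g1 == 0:
            ratios.append(None)  # no reference coverage, ratio undefined
        else:
            # Normalize each sample by its total depth before taking the ratio
            ratios.append((s / s_total) / (g1 / g1_total))
    usable = [r for r in ratios if r is not None]
    mu, sigma = mean(usable), pstdev(usable)
    if sigma == 0:
        raise ValueError("All windows have the same ratio; cannot scale")
    return [None if r is None else (r - mu) / sigma for r in ratios]

# Toy example: four windows with uniform G1 coverage
s_phase = [400, 300, 200, 100]
g1_ref = [250, 250, 250, 250]
timing = relative_replication_timing(s_phase, g1_ref)
```

In this toy example the first window has the highest S/G1 ratio and would be interpreted as the earliest replicating of the four.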
The protocols to profile genome-wide DNA replication timing at high resolution in zebrafish are detailed here. These protocols have been used to determine relationships with genomic and epigenetic features in the zebrafish genome, as well as to profile changes in these relationships that occur throughout development. These protocols are also easily adapted to study changes in replication timing in mutant strains of zebrafish and in disease models. Additionally, these methods provide a foundation that can be expanded upon to study replication timing in specific cell types, by first sorting out the individual cell types from the zebrafish. The zebrafish can serve as an excellent in vivo model system to study replication timing and to ultimately reveal the biological functions of this important epigenetic trait.
All animals were handled in strict accordance with protocols approved by the Oklahoma Medical Research Foundation Institutional Animal Care and Use Committee.
1. Setting up adult zebrafish for breeding
2. Timed matings – collecting, sorting, and housing zebrafish embryos for experiments
3. Dechorionate, deyolk, and fix zebrafish embryos
NOTE: This section of the protocol is designed for embryos prior to 48 h post fertilization (hpf). There is no need to remove the chorions of embryos at later stages of development (after 48 hpf), as they often naturally fall off. There is no need to deyolk or remove the chorions of fish older than 5 days post fertilization (dpf).
4. Staining DNA and FACS sorting embryos
NOTE: This section of the protocol is designed for embryos at 1 dpf.
5. DNA isolation, RNase treatment, and DNA purification
6. Preparation of DNA libraries and next generation sequencing
NOTE: A G1 copy number reference sample for each biological source (e.g., WT, mutant, transgenic, cell line, etc.) is required for each sequencing run. Compare all S phase samples from the same biological source in the same sequencing run to the same G1 reference. Run at least two biological replicates of each sample to ensure consistency between samples.
7. Analysis of sequencing data
NOTE: The instructions in this section are intended as a guideline for analysis. Other methods for sequence alignment, filtering, processing, etc. may also be used. This section of the protocol deals with the preferred method of analysis in this work. If other methods are used, adjust the parameters and functions to suit those purposes. The commands below are entered in an Ubuntu or Mac terminal, with the appropriate packages installed.
Using published replication timing data, representative replication timing profiles and quality control measures are provided15. The initial steps of processing involve aligning the sequencing data to the genome, calculating read length and genome coverage statistics, and filtering low quality, unpaired, and PCR duplicate reads. Read statistics for a typical zebrafish sequencing sample are shown in Figure 2. After filtering, read counts are determined in variable-sized windows and the data are smoothed and normalized. Typical smoothed/normalized replication timing profiles of a representative zebrafish chromosome for biological replicates are shown in Figure 3A. The profiles for biological and experimental replicates should be visually very similar (see Figure 3A) and also display high correlation along the length of the chromosome (Figure 3B). They should also display high correlation between timing values genome-wide (Figure 3C). Autocorrelation, a method to assess pattern continuity, should be used to evaluate the degree of structure in the data set (Figure 3D). The autocorrelation for structured mature timing programs should be high at short distances and gradually decrease as the distance increases. Samples that show high reproducibility between biological and experimental replicates and a strong autocorrelation signal should be considered as high quality samples and can be combined to obtain a higher coverage sample.
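The autocorrelation check described above can be sketched as follows. This is a minimal illustration, assuming the smoothed timing values for one chromosome are held in a list ordered by genomic position at even spacing; the function name and the synthetic profile are hypothetical and are not part of the published analysis, which was performed in Matlab.

```python
import math
from statistics import mean

def autocorrelation(values, max_lag):
    """Autocorrelation of a 1-D profile for lags 1..max_lag (in windows)."""
    mu = mean(values)
    centered = [v - mu for v in values]
    denom = sum(c * c for c in centered)  # lag-0 sum of squares
    result = []
    for lag in range(1, max_lag + 1):
        # Sum of products between each window and the window `lag` steps away
        num = sum(centered[i] * centered[i + lag]
                  for i in range(len(values) - lag))
        result.append(num / denom)
    return result

# Synthetic, smoothly varying profile standing in for real timing data
profile = [math.sin(2 * math.pi * i / 100) for i in range(200)]
acf = autocorrelation(profile, max_lag=25)
```

For a structured profile such as this synthetic one, the values in `acf` start close to 1 at the shortest lag and fall off as the lag grows, which is the behavior expected of a high-quality sample. A profile dominated by noise would instead give values near zero even at short lags.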
Figure 1: Sorting zebrafish embryos for replication timing analysis. (A) Gated population of all cells sorted for forward scatter area (FSC-A) by side scatter area (SSC-A). The colors represent the density of bins in the dot plot with high-to-low density represented by colors ranging from red-to-green-to-blue. (B) Gated population of the FSC x SSC gated cells sorted for SSC-A by propidium iodide area (PI-A). (C) Gated population of the SSC x PI gated cells sorted for SSC-A by side scatter width (SSC-W). (D) Histogram of all gated populations displaying a stereotypical cell cycle profile. Sort gates for cells in G1 and S phase are displayed by the grey shaded boxes with black lines. (E) Backgated G1 and S phase populations show that the initial gating captured the majority of cells.
Figure 2: Sequencing read statistics for a typical zebrafish sample. (A) Read mapping quality from statistics of mapQ values. (B) Histogram of the distribution of insert sizes for all sequencing reads. (C) Read statistics for total mapped reads, reads containing low quality flag, reads with pairs mapping to a different chromosome (pair diff chr), reads with pair beyond distance threshold (distant pair), and PCR duplicates. (D) Representative numbers of total mapped, unmapped, and unpaired reads, coverage, and read resolution for a typical sequencing run of a zebrafish sample.
Figure 3: Representative zebrafish replication timing results and quality control measures. (A) Replication timing along the length of a representative chromosome for two biological replicates of 28 hpf embryos (Siefert, 2017). (B) Correlation between replication timing values in 2 Mb windows along the length of a representative chromosome for the biological replicates of 28 hpf embryos shown in Figure 3A. (C) Genome-wide correlation between replication timing values for biological (rep1 and rep2) and experimental (rep3) replicates of 28 hpf embryos. The color map represents Pearson's correlation coefficient. (D) Autocorrelation for a structured mature timing program displays high autocorrelation at close distances that decreases gradually over increasingly longer distances.
Zebrafish provide a new and unique in vivo model system to study DNA replication timing. When timed matings are performed as detailed in this experimental protocol, thousands of embryos can be collected in a single day for experiments. These embryos develop synchronously through precisely timed and distinctly characterized stages of development. Zebrafish can be easily and accurately staged by morphology using a stereomicroscope, as zebrafish embryos develop externally and are optically clear. This protocol details the use of zebrafish embryos to study replication timing throughout vertebrate development.
Areas of the protocol that may present problems include ensuring synchronous development of zebrafish embryos, loss of cells during preparation, FACS sorting, and library preparation. The timing of zebrafish development is temperature dependent; therefore, use zebrafish adapted to light/dark cycles and housed at 28.5 °C for experiments. To ensure synchronous development, perform precisely timed matings and collect embryos in very small time windows (10 min or less). It is critical to maintain embryos at 28.5 °C and ensure they spend minimal time at ambient temperature. The timing of zebrafish development is also highly dependent on the density of embryos in a given area. It is critical not to house embryos at a density higher than 100 embryos per 10 cm plate, or they will not develop properly. The optimal time to remove dead and unfertilized embryos is when they have progressed to the 4-16 cell stage and can easily be identified as fertilized (use "Stages of Embryonic Development of the Zebrafish"17 as a guide). Dead embryos typically appear as black balls and unfertilized embryos appear as 1-cell. Work as quickly as possible to reduce the time at ambient temperature.
When dechorionating embryos, it is important to ensure all chorions are removed from the embryos. Keep embryos in pronase solution until most chorions have come off. Discard any remaining embryos with chorions intact or remove their chorions with forceps. Pipette embryos using a slow and smooth motion. Do not pipette rapidly, as this subjects the cells to undue stress and results in loss of cells. It is also important not to use a P200, as this can cause increased shear stress that results in breakage of cells. Conversely, a plastic transfer pipette may not have a small enough bore or provide enough shear stress to adequately disrupt the yolk sac and disaggregate the cells. Alternative pipettes could be used provided the bore and shear are equivalent to those of the optimal P1000.
When FACS sorting 28 hpf embryos, if a stereotypical cell cycle profile is not obtained, adjust the voltage on the PI laser so that the G1 peak is centered around 50K. The gains for the FSC and SSC may also need to be adjusted such that the gating and cell populations appear similar to Figure 1. If there is still difficulty, the ND filter should be switched, and the FSC and SSC voltages adjusted to ensure the majority of cells are within the gate. It should be obvious whether there is PI staining, as there will be a range over which the PI-A signal approximately doubles. The histogram should show two clear peaks: a large peak at 2N DNA for the G1 cells, and a smaller peak at double the intensity for the 4N DNA content of G2/M cells. If this is still not obtained, there may be low numbers of intact cells, problems with the PI staining, or problems with fixation, and the protocol will need to be optimized accordingly.
Cells from early embryos (prior to 10 hpf) will not give typical cell cycle profiles, as a high percentage of these cells are in S phase. These samples can be treated as S phase samples, and compared to a G1 reference from 24 hpf embryos. Embryos between 10 hpf and 24 hpf may give intermediate cell cycle profiles and can be sorted if desired. The S phase population can also be identified and sorted by incorporating thymidine analogs such as EdU (5-ethynyl-2'-deoxyuridine) or BrdU (bromodeoxyuridine) in brief pulses prior to fixation; however, this is not necessary for accurate determination of replication timing.
Ideally, the library preparation should be started with 1 µg of DNA. Theoretically, ~300,000 sorted cells would yield about 1 µg of DNA (estimating ~3.3 pg/nucleus); however, there is always loss throughout the protocol. Sorting 500,000 to 1 million cells should give more than 1 µg of DNA. A typical 24 hpf embryo will have approximately 18,000 cells, 10-15% of which will be in S phase. Embryos at earlier stages of development will have considerably fewer cells, but will have higher percentages of cells in S phase.
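The arithmetic behind these estimates can be written out as a short planning sketch. The DNA content per nucleus, the cell number per 24 hpf embryo, and the S phase fractions come from the paragraph above; the function names and the `recovery` parameter (the fraction of cells surviving preparation and sorting) are hypothetical additions for illustration, and the true recovery must be determined empirically.

```python
import math

PG_PER_NUCLEUS = 3.3             # approximate DNA content per nucleus (from text)
CELLS_PER_EMBRYO_24HPF = 18_000  # approximate cells per 24 hpf embryo (from text)

def cells_for_dna(target_ug, pg_per_nucleus=PG_PER_NUCLEUS):
    """Minimum number of nuclei that contain target_ug of DNA (no losses)."""
    return math.ceil(target_ug * 1e6 / pg_per_nucleus)

def embryos_for_s_phase_cells(target_cells, s_fraction,
                              cells_per_embryo=CELLS_PER_EMBRYO_24HPF,
                              recovery=1.0):
    """Embryos needed to sort target_cells S phase cells.

    recovery is a hypothetical fraction of cells retained through
    dissociation, fixation, and sorting (1.0 means no loss).
    """
    per_embryo = cells_per_embryo * s_fraction * recovery
    return math.ceil(target_cells / per_embryo)

nuclei_for_1ug = cells_for_dna(1.0)
embryos_low = embryos_for_s_phase_cells(500_000, s_fraction=0.10)
embryos_high = embryos_for_s_phase_cells(500_000, s_fraction=0.15)
```

With these inputs, 1 µg of DNA corresponds to roughly 303,000 nuclei, in line with the ~300,000 estimate above. Because cell loss is unavoidable, setting `recovery` below 1.0 gives a more realistic, larger embryo count.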
Performing accurate purification and size selection using magnetic beads is critical to eliminating unwanted elements from the library preparation. Follow the manufacturer's instructions carefully; additional optimization may be required. A small amount of PCR dimers is typically not a major problem, but adapter dimers will sequence very efficiently, so they should be minimized. This can be achieved by adjusting the initial adapter-to-DNA ratio and performing accurate size selection during library preparation.
Single-end sequencing may also be appropriate for different experimental needs. Single-end sequencing is cheaper and provides most of the required information, since this approach relies on counting reads; paired-end sequencing provides a greater ability to remove PCR and optical duplicates and to resolve areas of repetitive sequences and other structural variations (which are common in the zebrafish genome). The analysis portion of this protocol deals only with paired-end sequencing reads. Shorter sequencing reads may also be appropriate. These provide a higher number of reads at a lower cost, but the reads may be more difficult to map. The analysis portion of this protocol is optimized for 100 bp reads, and adjustments may be necessary if a shorter read length is used.
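One reason paired-end data help with duplicate removal can be shown with a simplified sketch. It assumes each read is reduced to a tuple of chromosome, start position, strand, and mate start position; the tuple layout and function name are hypothetical, and dedicated duplicate-marking tools apply more elaborate rules than this. With single-end data, only the first three fields are known, so independent fragments that happen to share a start site cannot be told apart from true PCR duplicates.

```python
def count_unique_fragments(reads, paired=True):
    """Count distinct fragments after collapsing reads with identical keys."""
    seen = set()
    for chrom, start, strand, mate_start in reads:
        if paired:
            # Both ends of the fragment define its identity
            key = (chrom, start, strand, mate_start)
        else:
            # Single-end data: only one end is available
            key = (chrom, start, strand)
        seen.add(key)
    return len(seen)

# Toy reads (hypothetical coordinates)
reads = [
    ("chr1", 1000, "+", 1300),  # fragment A
    ("chr1", 1000, "+", 1300),  # PCR duplicate of fragment A
    ("chr1", 1000, "+", 1420),  # distinct fragment B sharing A's start
    ("chr1", 2500, "-", 2200),  # fragment C
]
unique_paired = count_unique_fragments(reads, paired=True)
unique_single = count_unique_fragments(reads, paired=False)
```

In this toy set, the paired-end key retains fragments A, B, and C, whereas the single-end key also collapses fragment B into A, discarding a genuine read from the count.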
This protocol can also easily be adapted for additional uses of the zebrafish model. It has already been used for analysis of zebrafish tailfin fibroblasts (ZTF cells), a primary cell culture line isolated from adult zebrafish tailfin. To study isolated cell types from zebrafish, transgenic markers can be used to identify and isolate individual cell types before staining and sorting for DNA content. One potential strategy would be to use a transgenic line expressing a fluorescent marker driven by a tissue-specific promoter, and to first isolate cells by FACS sorting, followed by fixation and sorting for DNA content. Additionally, the use of cell-permeable DNA dyes may allow for simultaneous isolation of individual cell types as well as G1 and S phase populations. These adaptations to this protocol could also enable replication timing studies in other in vivo models.
An exciting possibility for the future use of this protocol is to study the role of replication timing in disease. There are several cancer models in zebrafish, some spontaneous and others inducible, that can be used to determine when changes in replication timing occur during oncogenesis. This will allow assessment of the role replication timing plays in tumorigenesis and disease progression. Furthermore, zebrafish are an excellent model for drug screening and can be used to identify drugs that affect replication timing regulation, which have the potential for use in cancer treatment.
The zebrafish is a promising new model to study replication timing. This protocol will afford many others the opportunity to utilize this model in studying the role of replication timing in development and disease. This protocol can be adapted for use with other in vivo developmental model systems, such as Drosophila melanogaster and Xenopus laevis, and we look forward to future findings from these and other organisms.
The authors have nothing to disclose.
This work was supported by National Institute of General Medical Sciences of the National Institutes of Health through grants 5P20GM103636-02 (including Flow Cytometry core support) and 1R01GM121703, as well as awards from the Oklahoma Center for Adult Stem Cell Research.
Name | Company | Catalog Number | Comments
NaCl | Fisher Scientific | BP358-10 | |
KCl | Fisher Scientific | P217-500 | |
CaCl2 | Fisher Scientific | C79-500 | |
MgSO4 | EMD Millipore | MMX00701 | |
NaHCO3 | Fisher Scientific | BP328-500 | |
Pronase | Sigma | 10165921001 | protease solution |
Phosphate buffered saline (PBS) | Sigma | D1408 | |
Ethanol (EtOH) | KOPTEC | V1016 | |
Bovine serum albumin (BSA) | Sigma | A9647-100G | |
Propidium Iodide (PI) | Invitrogen | P3566 | |
Tris-HCl | Fisher Scientific | BP153-500 | |
EDTA | Sigma | E9844 | |
SDS | Santa Cruz | sc-24950 | |
Proteinase K | NEB | P8107S | |
Phenol:Chloroform | Sigma | P3803-100ML | |
Sodium acetate | J.T.Baker | 3470 | |
Glycogen | Ambion | AM9510 | |
RNase A | Thermo Scientific | EN0531 | |
Quant-iT | Invitrogen | Q33130 | Reagents for fluorescence-based DNA quantification |
Covaris AFA microTUBE | Covaris | 520045 | specialized tube for sonication |
Covaris E220 Sonicator | Covaris | E220 | focused ultrasonicator |
Agilent 4200 Tapestation | Agilent | G2991AA | automated electrophoresis machine |
D1000 ScreenTape | Agilent | 5067-5582 | Reagents for automated electrophoresis machine |
NEBNext Ultra DNA Library Prep Kit for Illumina | NEB | Cat#E7370L | DNA library preparation kit |
NEBNext Multiplex Oligos Kit for Illumina (Index Primers Set 1) | NEB | Cat#E7335S | multiplex oligos for DNA library preparation kit |
NEBNext Multiplex Oligos Kit for Illumina (Index Primers Set 2) | NEB | Cat#E7500S | additional multiplex oligos for DNA library preparation kit |
NEBNext Library Quant Kit for Illumina | NEB | E7630L | quantification kit for library preparation |
Agencourt AMPure XP beads | Beckman Coulter | A63882 | magnetic beads |
Illumina HiSeq 2500 | Illumina | SY-401-2501 | next generation DNA sequencing platform |
40 µm Falcon Nylon Cell Strainer | Fisher Scientific | 08-771-1 | |
VWR Disposable Petri Dish 100 x 25 mm | VWR | 89107-632 | |
6.0 mL Syringe for Nichiryo Model 8100 | VWR | 89078-446 | |
Posi-Click Tubes, 1.7 mL, Natural Color | Denville Scientific | C2170 (1001002) | DNase/RNase free |
Vortex Genie 2 | Scientific Industries | SI-0236 | |
Wash Bottles | VWR | 16650-022 | Low-Density Polyethylene, Wide Mouth |
Strainer | VWR | 470092-440 | 6.9 cm, fine mesh |
Crossing tank | Aquaneering | ZHCT100 | individual breeding tank |
iSpawn | Tecniplast | N/A | large breeding tank |
FACSAria II | BD Biosciences | N/A | cell sorting machine |
Wild M5a stereomicroscope | Wild Heerbrugg | N/A | dissecting microscope |
Qubit 3 Fluorometer | Thermo Scientific | Q33216 | quantitative fluorescence-based method for determining DNA concentration |
Matlab | Mathworks | version 2017a | |
Matlab Statistics Toolbox | Mathworks | version 11.1 | |
Matlab Curve Fitting Toolbox | Mathworks | version 3.5.5 |