Contamination during the genomic sequencing of microscopic organisms remains a large problem. Here, we show a method to sequence the genome of a tardigrade from a single specimen with as little as 50 pg of genomic DNA without whole genome amplification to minimize the risk of contamination.
Tardigrades are microscopic animals that enter an ametabolic state called anhydrobiosis when facing desiccation and can return to their original state when water is supplied. The genomic sequencing of microscopic animals such as tardigrades risks bacterial contamination that sometimes leads to erroneous interpretations, for example, regarding the extent of horizontal gene transfer in these animals. Here, we provide an ultralow input method to sequence the genome of the tardigrade Hypsibius dujardini from a single specimen. By employing rigorous washing and contaminant exclusion along with an efficient extraction of the 50 – 200 pg of genomic DNA from a single individual, we constructed a library and sequenced it on a DNA sequencing instrument. These libraries were highly reproducible and unbiased, and an informatics comparison of the sequenced reads with other H. dujardini genomes showed a minimal amount of contamination. This method can be applied to unculturable tardigrades that could not be sequenced using previous methods.
Tardigrades are microscopic animals that can enter an ametabolic state called anhydrobiosis when facing desiccation. They recover by the absorption of water1,2. In the ametabolic state, tardigrades are capable of tolerating various extreme environments, which include extreme temperatures3 and pressures4,5, a high dosage of ultraviolet light6, X-rays, and gamma rays7,8, and cosmic space9. Genomic data is an indispensable foundation for the study of molecular mechanisms of anhydrobiosis.
Previous attempts to sequence the genome of tardigrades have shown signs of bacterial contamination10,11,12,13,14. Genomic sequencing of such small organisms requires a large number of animals and is prone to bacterial contamination; therefore, we have previously established a sequencing protocol using an ultralow input method that starts from a single tardigrade specimen, to minimize the risk of contamination15. Using these data, we have further conducted a high-quality resequencing and reassembly of the genome of H. dujardini16,17. Here we describe in detail this method for genomic sequencing from a single tardigrade individual (Figure 1). The validation of this sequencing method is beyond the focus of this paper and has already been thoroughly discussed in our previous report16.
This method comprises two parts: the isolation of a single tardigrade with the lowest contamination possible, and the high-quality extraction of picogram levels of DNA. The tardigrade is starved, rinsed thoroughly with water and antibiotics, and observed under a microscope at 500X magnification to ensure the removal of any bacterial contamination. Previous estimates and measurements show that a single tardigrade individual contains approximately 50 – 200 pg of genomic DNA16, which is extracted by cracking the chitin exoskeleton by freeze-thaw cycles or by manual homogenization. This genomic DNA is submitted to library construction and sequenced on a DNA sequencing instrument. An additional informatics analysis shows high-quality sequencing, as well as low levels of contamination in comparison to previous tardigrade sequencing projects.
1. Preparation
2. Sample Preparation and Contaminant Exclusion
3. Homogenization and DNA Extraction
4. Library Construction Sequence
Temperature | Time | Cycles |
72 °C | 3 minutes | |
85 °C | 2 minutes | |
98 °C | 2 minutes | |
98 °C | 20 seconds | 4 Cycles |
67 °C | 20 seconds | |
72 °C | 40 seconds | |
98 °C | 20 seconds | 16 Cycles |
72 °C | 50 seconds | |
4 °C | Hold | |
Table 1: PCR conditions.
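As a cross-check when programming a thermal cycler, the conditions in Table 1 can be written down as data and summed. The sketch below is illustrative only: it assumes that the "4 Cycles" label covers the 98/67/72 °C rows and that the "16 Cycles" label covers the following 98/72 °C rows, which is one reading of the flattened table. The names `Step`, `Stage`, and `PROGRAM` are hypothetical, and ramp times and the final 4 °C hold are not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # hold time at that temperature


@dataclass(frozen=True)
class Stage:
    steps: tuple     # Step objects run in order within one cycle
    cycles: int = 1  # how many times the stage is repeated


# Transcribed from Table 1. The grouping of rows into the 4-cycle and
# 16-cycle stages is an assumption about how the table should be read.
PROGRAM = (
    Stage((Step(72, 180),)),
    Stage((Step(85, 120),)),
    Stage((Step(98, 120),)),
    Stage((Step(98, 20), Step(67, 20), Step(72, 40)), cycles=4),
    Stage((Step(98, 20), Step(72, 50)), cycles=16),
)


def total_seconds(program):
    """Sum of programmed hold times, excluding ramping and the final hold."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


def amplification_cycles(program):
    """Number of cycles in the repeated (cycles > 1) stages."""
    return sum(stage.cycles for stage in program if stage.cycles > 1)


if __name__ == "__main__":
    secs = total_seconds(PROGRAM)
    print(f"programmed hold time: {secs // 60} min {secs % 60} s")
    print(f"amplification cycles: {amplification_cycles(PROGRAM)}")
```

Under this reading the program contains 20 amplification cycles and 31 minutes of programmed hold time; the actual run takes longer because of ramping.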
5. Quality Check, Quantification, and Sequence of the DNA
NOTE: A quality check is not conducted prior to this step due to the low amount of DNA.
6. Computational Analysis
Contaminant Exclusion:
This protocol involves a thorough washing of the tardigrade and sterilization by antibiotic treatment to minimize contamination. It also involves a visual check to confirm that these steps were effective. A microscope image made during the validation (step 2.4 of the protocol) is shown in Figure 2. When observed at a 500X magnification, bacterial cells can be seen as small particles that move around the tardigrade individual.
Validation of the DNA Library Quality:
The total amount of the constructed DNA-Seq library is approximately 109.5 ng (7.3 ng/µL × 15 µL)16. The fragment length distribution can be validated by electrophoresis; the pattern should resemble Figure 3. As we set the fragmentation size to 550 bp with a DNA shearing system, the library should be 550 – 600 bp, including the sequencing adaptors. The majority of the sequence library falls between 200 and 1000 bp and is consistent between replicates (N1 – N4).
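The yield figure above is a simple product of concentration and elution volume. The following sketch reproduces that arithmetic and compares it with the 50 – 200 pg input stated elsewhere in this article. The function names are hypothetical, and the fold increase is only a nominal figure: it assumes that all input DNA entered the library, which will not hold exactly in practice.

```python
def library_yield_ng(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Total library mass in ng from concentration and elution volume."""
    return conc_ng_per_ul * volume_ul


def fold_over_input(yield_ng: float, input_pg: float) -> float:
    """Nominal mass increase relative to the input DNA (pg converted to ng)."""
    return yield_ng / (input_pg / 1000.0)


if __name__ == "__main__":
    total = library_yield_ng(7.3, 15.0)  # values quoted in the text
    print(f"library yield: {total:.1f} ng")
    for input_pg in (50.0, 200.0):       # input range quoted in the text
        fold = fold_over_input(total, input_pg)
        print(f"input {input_pg:.0f} pg -> about {fold:.0f}-fold more mass")
```

With these numbers the library mass is roughly 550 to 2,200 times the input mass, which illustrates why quantification is only practical after amplification.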
Sequence Data Analysis:
The DNA sequencing generated roughly 20 to 25 million read pairs per run. The quality was validated using FastQC (Figure 4). The distribution of the quality along the sequenced read is typical of a 300-bp paired-end run.
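To put the read counts in context, the raw sequencing depth can be estimated from the number of read pairs, the read length, and the genome size. The sketch below uses the 132 Mb assembly size reported later in this article as the genome size. It is an upper bound under stated assumptions: every read is taken at its full 300 bp, and no trimming, duplicate removal, or contaminant filtering is applied. The function name `raw_coverage` is hypothetical.

```python
def raw_coverage(read_pairs: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Raw depth = total sequenced bases / genome size (two reads per pair)."""
    return read_pairs * 2 * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 132_000_000  # assembly size reported in this article
READ_LENGTH_BP = 300          # 300-bp paired-end run

if __name__ == "__main__":
    for pairs in (20_000_000, 25_000_000):
        depth = raw_coverage(pairs, READ_LENGTH_BP, GENOME_SIZE_BP)
        print(f"{pairs / 1e6:.0f} M pairs -> about {depth:.0f}x raw coverage")
```

Under these assumptions a single run corresponds to roughly 90x to 115x raw coverage; the usable depth after quality trimming will be lower.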
Figure 1: Workflow of this protocol. This figure shows a summary of this protocol.
Figure 2: Representative photo of a bacteria-free tardigrade. This figure shows images of a contaminated (left) tardigrade and cleaned (right) tardigrade (Hypsibius dujardini), along with further magnified images (bottom). Rod-shaped cells around the tardigrade are contaminants and are indicated with an arrow. The scale bar indicates 100 µm.
Figure 3: Validation of the fragment length distribution of the constructed DNA library. This panel shows the distribution of the sequencing library size. The purple and green lines indicate the upper and lower markers at 1,500 and 25 bp, respectively. L = Ladder, S = 1 sample/run, N1 – N4 = 4 replicates/run.
Figure 4: Example of the validation of the DNA-Seq quality with FastQC. DNA-Seq data were submitted to FastQC to validate the sequence performance. A representative per base sequence quality result for DRR055040 is shown (DDBJ Sequence Read Archive DRA004455; ref. 16). (A) This panel shows the forward reads (R1). (B) This panel shows the reverse reads (R2).
Bacterial contamination poses a threat to the genomic sequencing of microscopic organisms. While previous studies on tardigrade genome sequencing have filtered out contamination using extensive informatics methods12,20, we sequenced the genome from a single individual to minimize the risk of contamination. Since an individual tardigrade contains approximately 50 – 200 pg of genomic DNA16 and is encased in a thick chitin exoskeleton, the exclusion of contaminants and high-quality DNA extraction are the critical points in this protocol. Existing tardigrade cultures are not aseptic, and individuals collected from the wild carry many contaminants on their surface, as well as the remains of food in their intestines. Previous genome sequencing projects of tardigrades have sequenced 10,000 – 100,000 individuals collectively as one sample12,14, which means the results are very likely to be influenced by bacterial contaminants. In their report, Boothby et al. collected H. dujardini individuals by using their negative phototaxis behavior14, and the group did not employ any anti-bacterial methods.
To check visually for contaminants, we incubated the tardigrade in antibiotics (penicillin/streptomycin) and examined the individual under a microscope at 500X magnification. By isolating a single individual and carefully inspecting it for any contaminants, we minimized the risk of contamination. Low levels of contamination were confirmed from the sequencing data as well16. As for DNA extraction, we employed manual homogenization, as well as thermal homogenization18. Subjecting the tardigrade individual to liquid nitrogen and then to 37 °C induced cracks in the chitin exoskeleton, which allowed the lysis buffer to penetrate the body and lyse the cells. When the DNA yield remains lower than anticipated, both thermal and manual homogenization may be conducted to maximize the yield.
The method described in this article has several limitations. First, homogenization by freeze-thaw cycles was adapted from a study on nematodes; thus, the method may only be effective for Ecdysozoa. Second, because DNA fragments are amplified during the construction of the sequencing library, the possibility of PCR errors cannot be ignored. Thus, the sequence data are not recommended for analyses that require high-accuracy reads (e.g., SNP analysis). Furthermore, as we have stated in the protocol, the use of the specified DNA-Seq kit shown in the Table of Materials is absolutely critical, due to the low amount of input DNA. This DNA library construction kit ligates Illumina adaptor sequences prior to the amplification; therefore, this library cannot be used for long-read sequencing with PacBio or Nanopore technology. Finally, a quality check of the constructed DNA library occurs only once in this protocol, after the sequencing library construction. This is because most DNA quantification and electrophoresis methods cannot detect 50 – 200 pg of DNA. Therefore, we conducted quality checks, such as the electrophoresis (Figure 3) and fluorescence-based quantification, only after the PCR amplification.
A full discussion of the bioinformatics analyses of these data is beyond the scope of this article; however, we briefly describe several analyses we have conducted. A quality check of the sequencing data with FastQC19 reports per-base qualities, sequence duplication levels, and other metrics. Sequence data that have been validated can then be used for genome assembly. We have assembled a 132 Mb genome with MaSuRCA v3.1.321 and have compared the mapping statistics of this DNA sequencing library, calculated with BWA22 and QualiMap23, with those of other H. dujardini genome assemblies16. Furthermore, we have also used these DNA sequencing data for the exclusion of contaminants in our study17, and have observed that the sequenced reads are distributed evenly throughout the genome.
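The quality check and mapping steps named above could be chained roughly as follows. This is a minimal sketch, not the exact commands used in this study: the file names, output layout, thread count, the choice of the `bwa mem` subcommand, and the use of samtools for sorting (samtools is not mentioned in this article) are all assumptions. The helper only builds the command lines so they can be inspected before anything is run. The MaSuRCA assembly step is omitted because it is driven by a separate configuration file.

```python
import shlex
from pathlib import Path


def qc_and_mapping_commands(r1, r2, assembly, outdir, threads=4):
    """Return argv lists for read QC, mapping, sorting, and mapping QC.

    All paths are hypothetical placeholders supplied by the caller.
    """
    out = Path(outdir)
    sam = out / "mapped.sam"
    bam = out / "mapped.sorted.bam"
    return [
        # Per-base quality, duplication levels, etc. for both read files
        ["fastqc", "--outdir", str(out), r1, r2],
        # Index the assembly, then align the read pairs to it
        ["bwa", "index", assembly],
        ["bwa", "mem", "-t", str(threads), "-o", str(sam), assembly, r1, r2],
        # Coordinate-sort the alignments (assumed helper step)
        ["samtools", "sort", "-@", str(threads), "-o", str(bam), str(sam)],
        # Mapping statistics on the sorted BAM
        ["qualimap", "bamqc", "-bam", str(bam), "-outdir", str(out / "qualimap")],
    ]


if __name__ == "__main__":
    cmds = qc_and_mapping_commands(
        "sample_R1.fastq.gz", "sample_R2.fastq.gz", "assembly.fa", "results"
    )
    for cmd in cmds:
        print(shlex.join(cmd))  # print only; pass to subprocess.run to execute
```

Keeping the commands as data makes it easy to log exactly what was run for each library, which helps when mapping statistics are compared across several assemblies.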
Most projects on non-model organisms start from culturing enough sample material, as was the case with tardigrades24. Advances in culture techniques have enabled large-scale tardigrade culture, but current culture methods are not yet aseptic, and since most tardigrades are still unculturable in the laboratory, it has been nearly impossible to conduct genome or transcriptome sequencing for them. This DNA sequencing method from a single individual makes it possible to analyze rare tardigrade species, including marine species that have been studied less. By conducting comparative genomics across a wider phylogenetic range, a further understanding of anhydrobiosis mechanisms in tardigrades may be achieved.
The authors have nothing to disclose.
The authors thank Nozomi Abe, Yuki Takai, and Nahoko Ishii for their technical support in genomic sequencing. This work was supported by Grant-in-Aid for the Japan Society for the Promotion of Science (JSPS) Research Fellow, KAKENHI Grant-in-Aid for Young Scientists (No.22681029), and KAKENHI Grant-in-Aid for Scientific Research (B), No. 17H03620 from the JSPS, by a Grant for Basic Science Research Projects from The Sumitomo Foundation (No.140340), and partly by research funds from the Yamagata Prefectural Government and Tsuruoka City, Japan. Chlorella vulgaris used to feed the tardigrades was provided courtesy of Chlorella Industry Co. LTD.
SZ61 microscope | OLYMPUS | ||
BactoAgar | Difco Laboratories | 214010 | |
Penicillin Streptomycin (10,000 U/mL) | Gibco by life technologies | 15140-148 | |
VHX-5000 System | Keyence | ||
0.2mL Silicone coating tube | Bio Medical Science | BC-bmb20200 | |
Quick-DNA Microprep Kit | ZYMO Research | D3021 | Use of this kit is absolutely critical; see step 3.1 |
1.5 mL microtube | greiner bio-one | 616-201 | See 4.1.1 |
High speed refrigerated micro centrifuge | TOMY | MX-307 | |
Covaris M220 | Covaris Inc. | 4482277 | |
ThruPLEX DNA-Seq kit | Rubicon Genomics | CAT. NO. R400406 | Use of this kit is absolutely critical; see step 4.2 |
Thermal Cycler | Bioer Technology | TC-96GHbC | |
AMPure XP reagent | BECKMAN COULTER Life Science | A63881 | |
Ethanol | Wako | 054-027335 | |
EB buffer | QIAGEN | 19086 | |
2200 TapeStation | Agilent | G2965AA | |
D1000 Reagents | Agilent | 5067-5583 | |
D1000 ScreenTape | Agilent | 5067-5582 | |
Qubit dsDNA BR Buffer/Reagent | ThermoFisher Scientific | Q32850 | |
Cubee Mini-Centrifuge | RecenttecGenereach | R5-AQBD01aqbd | |
MiSeq 600 cycle v3 | Illumina Inc. | MS-102-3003 | |
MiSeq Sequencer | Illumina Inc. | SY-410-1003 |