Here, we describe a protocol to obtain amplicon sequence data of soil, rhizosphere, and root endosphere microbiomes. This information can be used to investigate the composition and diversity of plant-associated microbial communities, and is suitable for use with a wide range of plant species.
The intimate interaction between plant host and associated microorganisms is crucial in determining plant fitness, and can foster improved tolerance to abiotic stresses and diseases. As the plant microbiome can be highly complex, low-cost, high-throughput methods such as amplicon-based sequencing of the 16S rRNA gene are often preferred for characterizing its microbial composition and diversity. However, the selection of appropriate methodology when conducting such experiments is critical for reducing biases that can make analysis and comparisons between samples and studies difficult. This protocol describes in detail a standardized methodology for the collection and extraction of DNA from soil, rhizosphere, and root samples. Additionally, we highlight a well-established 16S rRNA amplicon sequencing pipeline that allows for the exploration of the composition of bacterial communities in these samples, and can easily be adapted for other marker genes. This pipeline has been validated for a variety of plant species, including sorghum, maize, wheat, strawberry, and agave, and can help overcome issues associated with the contamination from plant organelles.
Plant-associated microbiomes consist of dynamic and complex microbial communities comprised of bacteria, archaea, viruses, fungi, and other eukaryotic microorganisms. In addition to their well-studied role in causing plant disease, plant-associated microbes can also positively influence plant health by improving tolerance to biotic and abiotic stresses, promoting nutrient availability, and enhancing plant growth through the production of phytohormones. For this reason, particular interest exists in characterizing the taxa that associate with plant root endospheres, rhizospheres, and the surrounding soil. While some microbes can be cultured in isolation on laboratory generated media, many cannot, in part because they may rely on symbiotic relationships with other microbes, grow very slowly, or require conditions that cannot be replicated in a lab environment. Because it circumvents the need for cultivation and is relatively inexpensive and high-throughput, sequence-based phylogenetic profiling of environmental and host-associated microbial samples has become a preferred method for assaying microbial community composition.
The selection of appropriate sequencing technologies provided by various next generation sequencing (NGS) platforms1 is dependent on the users' needs, with important factors including: desired coverage, amplicon length, expected community diversity, as well as sequencing error-rate, read-length, and the cost-per-run/megabase. Another variable that needs to be considered in amplicon-based sequencing experiments is what gene will be amplified and what primers will be used. When designing or choosing primers, researchers are often forced to make tradeoffs between the universality of amplification and the taxonomic resolution achievable from the resulting amplicons. For this reason, these types of studies often choose primers and markers that selectively target specific subsets of the microbiome. Evaluating the composition of bacterial communities is commonly accomplished by sequencing one or more of the hypervariable regions of the bacterial 16S rRNA gene2,3. In this study, we describe an amplicon-based sequencing protocol developed for an NGS platform that targets the 500 bp V3-V4 region of the bacterial 16S rRNA gene, which allows for broad amplification of bacterial taxa while also providing sufficient variability to distinguish between different taxa. Additionally, this protocol can easily be adapted for use with other primer sets, such as those targeting the ITS2 marker of fungi or the 18S rRNA subunit of eukaryotes.
While other approaches such as shotgun metagenomics, metatranscriptomics, and single-cell sequencing, offer other advantages including resolved microbial genomes and more direct measurement of community function, these techniques are typically more expensive and computationally intensive than the phylogenetic profiling described here4. Additionally, performing shotgun metagenomics and metatranscriptomics on root samples yields a large percentage of reads belonging to the host plant genome, and methods to overcome this limitation are still being developed5,6.
As with any experimental platform, amplicon-based profiling can introduce a number of potential biases which should be considered during the experimental design and data analysis. These include the methods of sample collection, DNA extraction, selection of PCR primers, and how library preparation is performed. Different methods can significantly impact the amount of usable data generated, and can also hinder the efforts to compare results between studies. For example, the method of removing rhizosphere bacteria7 and the use of different extraction techniques or choice of DNA extraction kits8,9 have been shown to significantly impact downstream analysis, which leads to different conclusions regarding which microbes are present and their relative abundances. Since amplicon-based profiling can be customized, making comparisons across studies can be challenging. The Earth Microbiome Project has suggested that researchers studying complex systems such as the plant-associated microbiome would benefit from the development of standardized protocols as a means of minimizing the variability caused by the application of different methods between studies10,11. Here, we discuss many of the above topics and offer suggestions as to best practices where appropriate.
The protocol demonstrates the process of collecting soil, rhizosphere, and root samples from Sorghum bicolor and extracting DNA using a well-established DNA isolation kit11. Additionally, our protocol includes a detailed amplicon sequencing workflow, using a commonly utilized NGS platform, to determine the structure of the bacterial communities12,13,14. This protocol has been validated for use in a wide range of plant hosts in a recently published study of the roots, rhizosphere, and associated soils of 18 monocot species including Sorghum bicolor, Zea mays, and Triticum aestivum15. This method has also been validated for use with other marker genes, as demonstrated by its successful application to studying the fungal ITS2 marker gene in studies of the agave microbiome16,17 and strawberry microbiome18.
1. Collection and Separation of Root Endosphere, Rhizosphere, and Soil Samples
2. DNA Extraction
NOTE: Throughout steps 2 and 3, clean gloves sterilized with ethanol should be worn at all times and all work should be performed on a surface sterilized with ethanol.
3. Amplicon Library Preparation and Submission
Performing the recommended protocol should result in a dataset of indexed paired-end reads that can be matched back to each sample and assigned to either a bacterial operational taxonomic unit (OTU) or exact sequence variant (ESV, also referred to as amplicon sequence variant (ASV) and sub-operational taxonomic unit (sOTU)), depending on downstream analysis. In order to obtain high-quality sequence data, care must be taken at each step to maintain consistency between samples and minimize the introduction of any potential bias during the sample processing or library preparation. After collecting, processing, and extracting DNA from samples (steps 1 and 2), the resulting eluate should appear clear and free of organics that would inhibit amplification. While purity can be verified by measuring each DNA sample via a microvolume spectrophotometer, we have found that the soil DNA extraction kit reliably removes all contaminants. As a result of the predictable DNA quality, quantification methods that rely on fluorescence-based dyes that specifically bind DNA are more appropriate than those based on UV absorbance19,20,21. Prior to PCR amplification, soil and rhizosphere samples average around 10 ng/µL DNA, while root samples typically have a mean concentration of approximately 30 ng/µL (Table 2).
Following the amplification of the environmental DNA (step 3), success or failure can be determined by measuring the concentration of the PCR product via benchtop fluorometer reagents on a plate reader, if available, or manually (Table 2). In our experience, successful amplifications that result in high-quality amplicon data yield greater than 15 ng/µL PCR products. If there are multiple failures on a plate, the positional arrangement within the plate and the sample type of failed samples may help determine the problem. For instance, if they are all adjacent on the plate, it may indicate pipette error, whereas if they are all in the same row or column, it could suggest issues with a specific primer. If they all belong to the same sample type, it might suggest problems with sample processing or DNA extraction.
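The troubleshooting logic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the well names, concentrations, and sample-type labels are hypothetical, and the 15 ng/µL cut-off is taken from the text as a rule of thumb rather than a fixed standard.

```python
from collections import Counter

# Rule-of-thumb threshold from the text: successful amplifications
# yield more than 15 ng/uL of PCR product.
MIN_CONC_NG_PER_UL = 15.0


def failed_wells(concentrations, threshold=MIN_CONC_NG_PER_UL):
    """Return the sorted wells whose product concentration is at or below the threshold."""
    return sorted(w for w, c in concentrations.items() if c <= threshold)


def summarize_failures(failed, sample_types):
    """Count failed wells per plate row, plate column, and sample type."""
    rows = Counter(w[0] for w in failed)          # row letter, e.g. "A"
    cols = Counter(int(w[1:]) for w in failed)    # column number, e.g. 2
    types = Counter(sample_types[w] for w in failed)
    return rows, cols, types


# Hypothetical plate-reader values (ng/uL) for a handful of wells.
concentrations = {"A1": 22.4, "A2": 3.1, "B2": 1.8, "C2": 2.5, "C3": 19.0, "D2": 18.7}
sample_types = {"A1": "soil", "A2": "root", "B2": "soil",
                "C2": "rhizosphere", "C3": "root", "D2": "root"}

failed = failed_wells(concentrations)
rows, cols, types = summarize_failures(failed, sample_types)
print("Failed wells:", failed)
print("Failures by column:", dict(cols))
print("Failures by sample type:", dict(types))
```

In this made-up example the failures share a column but span all three sample types, which, following the reasoning above, would point toward a primer rather than toward sample processing.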
It is important to check the compatibility of the universal PNAs with your specific plant system bioinformatically during experimental design in order to verify that they will block amplification of chloroplast and mitochondrial 16S genes. Following the amplification step, it is not clear whether the PNAs successfully bound to mitochondrial and chloroplast templates; this is only revealed after sequencing (Figure 3). To help ensure that the PNAs will effectively block contaminant amplification, an alignment of the PNA sequence to each chloroplast and mitochondrial 16S rRNA gene (there may be multiple copies) for the plant host being investigated should not reveal any mismatches. Even a single mismatch to the 13 bp PNA sequence, especially in the middle of the PNA clamp, can drastically reduce the effectiveness, as in the case of the provided chloroplast PNA sequence and the chloroplast 16S rRNA gene of Lactuca sativa (lettuce) (Figure 3).
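A minimal sketch of such a bioinformatic check is shown below. It assumes the PNA and the organelle 16S sequence are written in the same orientation (reverse complements are not handled), and the flanking bases around the lettuce site are hypothetical padding added only to show the sliding search; the two 17-nt sequences themselves are those given in Figure 3.

```python
def find_pna_site(pna, target):
    """Slide the PNA along the target and return (offset, mismatches) for the best window.

    `offset` is the 0-based start of the window in the target; `mismatches`
    lists 1-based positions within the PNA that differ from the target.
    Returns None if the target is shorter than the PNA.
    """
    pna, target = pna.upper(), target.upper()
    best = None
    for offset in range(len(target) - len(pna) + 1):
        window = target[offset:offset + len(pna)]
        mismatches = [i + 1 for i, (a, b) in enumerate(zip(pna, window)) if a != b]
        if best is None or len(mismatches) < len(best[1]):
            best = (offset, mismatches)
    return best


# Sequences from Figure 3.
chloroplast_pna = "GGCTCAACCCTGGACAG"
lettuce_site = "GGCTCAACTCTGGACAG"

# Hypothetical flanking bases, for illustration only.
lettuce_fragment = "ACGT" + lettuce_site + "TTGA"

offset, mismatches = find_pna_site(chloroplast_pna, lettuce_fragment)
print(f"Best window starts at offset {offset}; mismatched PNA positions: {mismatches}")
if mismatches:
    print("PNA is not a perfect match; an alternate PNA oligo may be needed.")
```

For a real plant host, the same comparison would be run against every chloroplast and mitochondrial 16S copy retrieved for that species.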
Since an equal amount of amplified DNA is pooled per sample, there should be an approximately even number of reads obtained per sample after sequencing and sorting reads based on their barcoded index (Figure 4). The majority of these reads should match to bacterial taxa. Any eukaryotic, mitochondrial, or chloroplast matches should be discarded. Depending on the analysis pipeline and taxonomic database chosen, chloroplast and mitochondrial reads can mistakenly be assigned to bacterial lineages, often Cyanobacteria and Rickettsia, respectively (Figure 3). A degree of manual curation is often prudent to check for these common mis-assignments. Specific details will depend on the choice of analysis, but relative abundance profiles should generally be similar (no significant difference) among biological replicates and significantly different between soil, rhizosphere, and root samples (Figure 5). It is important to note, however, that while there may be no significant difference between biological replicates, it is important to collect at least three replicates per.
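As a rough illustration of the filtering step described above, the following sketch drops features whose lineage string mentions an organelle and then computes relative abundances from what remains. The feature names, counts, and lineage strings are hypothetical, and matching on the words "chloroplast" and "mitochondria" is an assumption; how organelle reads are labeled depends on the taxonomic database, so manual curation is still advisable.

```python
def drop_organelle_reads(counts, taxonomy, flags=("chloroplast", "mitochondria")):
    """Return counts with features removed if their lineage contains any flag (case-insensitive)."""
    kept = {}
    for feature, n in counts.items():
        lineage = taxonomy[feature].lower()
        if any(flag in lineage for flag in flags):
            continue  # discard organelle-derived feature
        kept[feature] = n
    return kept


def relative_abundance(counts):
    """Convert raw read counts to fractions that sum to 1 (all zeros if there are no reads)."""
    total = sum(counts.values())
    if total == 0:
        return {feature: 0.0 for feature in counts}
    return {feature: n / total for feature, n in counts.items()}


# Hypothetical read counts and lineages for one root sample.
counts = {"asv1": 600, "asv2": 300, "asv3": 100}
taxonomy = {
    "asv1": "Bacteria;Proteobacteria;Gammaproteobacteria",
    "asv2": "Bacteria;Cyanobacteria;Chloroplast",
    "asv3": "Bacteria;Actinobacteria;Actinobacteria",
}

kept = drop_organelle_reads(counts, taxonomy)
profile = relative_abundance(kept)
for feature, fraction in profile.items():
    print(f"{feature}: {fraction:.3f}")
```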
Methods for interpreting the data obtained in these experiments are hotly debated amongst microbial ecologists. Until recently, amplicon sequence analysis has been dependent upon grouping reads into OTUs. However, these are problematic because: 1) they are based on a somewhat arbitrary threshold of 97% similarity, 2) diversity is often underestimated, and 3) there can be low taxonomic resolution. Recently developed tools22,23,24 such as DADA2, Deblur, and UNOISE2 are able to sort reads into ESVs, which solves some problems presented when using OTUs. Caveats to using ESVs include: 1) artificial increases in diversity due to the differences in rRNA copies within a species, and 2) increased sensitivity to PCR and sequencing errors25,26.
Figure 1: Separation of root and rhizosphere fractions. Flowchart displaying the steps for separating the rhizosphere from the root samples, followed by washing the roots with sterile water to remove any remaining rhizoplane organisms.
Figure 2: Example of stock primer layout for amplification and distribution within plates. Stock primers (Table 1) can be prepared in strip tubes for optimal distribution within 96-well plates (each strip of primers is represented by a different color: purple for forward primers 1 – 8, orange for forward primers 9 – 16, blue for reverse primers 1 – 12, and green for reverse primers 13 – 16). In this case, 16 forward and 16 reverse primers can be distributed efficiently with a multi-channel pipette such that each well has a unique barcode combination.
Figure 3: Results that suggest chloroplast PNA is ineffective. Representative result from rhizosphere ("Rhizo"), root, and soil samples from lettuce that were non-treated (NT) or treated (VT) with a biological soil amendment. The PNA sequence27 used to block chloroplast contamination of most plants is GGCTCAACCCTGGACAG. However, lettuce contains a mismatch in the chloroplast 16S ribosomal RNA gene (GGCTCAACTCTGGACAG). This renders the PNA ineffective, resulting in a high relative abundance of reads that match to Cyanobacteria in rhizosphere and root samples.
Figure 4: Distribution of read counts among samples in a library. Bar chart showing number of read counts (y-axis) from different samples (bars, x-axis), matched by the barcode combination in the read. The number of reads per sample can vary based on how many samples are in the library; this subset was sequenced in a library of 192 samples.
Figure 5: Relative abundance of the top 12 classes in root, rhizosphere, and soil communities. Stacked bar chart showing relative abundance of classes present in a representative 16S dataset containing 6 replicates for each sample type (bulk soil, rhizosphere, and root endosphere).
Table 1: Primers for amplifying the V3-V4 region of the 16S rRNA gene. Primers are composed of, sequentially: an adapter for a common NGS platform, a unique barcode, the primer for NGS, a spacer region of variable length to shift the frame for sequencing, and a universal PCR primer (either 341F or 785R) that targets the 16S rRNA gene. The number of primers needed depends on how many samples are sequenced per library; a combination of 16 forward and 16 reverse primers is sufficient for 244 samples (256 primer combinations, with 12 used for blank wells during PCR; Figure 2).
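The sample capacity quoted for Table 1 follows from simple combinatorics, sketched here for planning libraries of other sizes. The primer labels (F1, R1, and so on) are hypothetical placeholders, not the names used in Table 1.

```python
from itertools import product


def barcode_capacity(n_forward, n_reverse, n_blanks):
    """Number of samples that fit in a library after reserving barcode pairs for blank wells."""
    capacity = n_forward * n_reverse - n_blanks
    if capacity < 0:
        raise ValueError("more blanks requested than barcode combinations available")
    return capacity


# Hypothetical primer labels; each forward/reverse pair is one unique barcode combination.
forward = [f"F{i}" for i in range(1, 17)]
reverse = [f"R{i}" for i in range(1, 17)]
combinations = list(product(forward, reverse))

print("Unique barcode combinations:", len(combinations))
print("Samples per library with 12 blanks:", barcode_capacity(16, 16, 12))
```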
Table 2: Normalization of randomized DNA samples prior to and following amplification. Example worksheet listing samples in a randomized order and indicating their location on a single 96-well plate, which also determines the primer combination assigned to it. Formulas in the bottom row describe calculations for adding 100 ng of each sample to the normalized plate, plus the volume of water to reach 20 µL. Following amplification, the volume of 100 ng of each successful product is calculated and added to a final pool. The volume of "blank" PCR product to add to the final pool is the average of the other samples.
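The worksheet calculations described for Table 2 can be approximated as follows. This is a sketch, not the worksheet itself: the function names are hypothetical, the example concentrations are made up, and capping the DNA volume at the 20 µL final volume for very dilute samples is an assumption not stated in the source.

```python
def normalization_volumes(conc_ng_per_ul, target_ng=100.0, final_volume_ul=20.0):
    """Return (dna_ul, water_ul) needed to place target_ng of DNA in final_volume_ul.

    Assumption: if the sample is too dilute to reach target_ng within the
    final volume, the whole well is filled with DNA and no water is added.
    """
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    dna_ul = min(target_ng / conc_ng_per_ul, final_volume_ul)
    water_ul = final_volume_ul - dna_ul
    return dna_ul, water_ul


def pooling_volumes(product_conc, blank_wells, target_ng=100.0):
    """Volume of each PCR product to pool; blanks get the mean volume of the samples."""
    volumes = {well: target_ng / conc for well, conc in product_conc.items()}
    mean_volume = sum(volumes.values()) / len(volumes)
    for well in blank_wells:
        volumes[well] = mean_volume
    return volumes


# Typical pre-PCR concentrations mentioned in the text: ~10 ng/uL (soil) and ~30 ng/uL (root).
for label, conc in [("soil", 10.0), ("root", 30.0)]:
    dna, water = normalization_volumes(conc)
    print(f"{label}: {dna:.2f} uL DNA + {water:.2f} uL water")

# Hypothetical post-PCR concentrations (ng/uL) and one blank well.
pool = pooling_volumes({"A1": 20.0, "A2": 25.0}, blank_wells=["H12"])
print(pool)
```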
Supplemental Figure 1: Approximation of minimum root biomass during sample collection. When collecting roots, try to collect at least 500 mg of tissue. Here, roots collected from a young sorghum plant (left, in both A and B) and a young rice plant (right) are shown next to (A) and inside (B) 50 mL conical tubes. Both samples weigh approximately 1 g; however, note that this weight includes both rhizosphere and root, and in this case the rhizosphere accounts for approximately half of the total weight.
This protocol demonstrates an established pipeline for exploring root endosphere, rhizosphere, and soil microbial community compositions, from field sampling to sample processing and downstream sequencing. Studying root-associated microbiomes presents unique challenges, due in part to the inherent difficulties in sampling from soil. Soils are highly variable in terms of physical and chemical properties, and different soil conditions can be separated by as little as a few millimeters28,29. As a result, samples collected from adjacent sampling sites can have considerably different microbial community compositions and activities30,31. Thus, using soil core collectors and shovels to maintain consistent sampling depths, and homogenizing samples prior to DNA extraction, are essential to reproducibility within root microbiome studies. It is also essential to efficiently separate the rhizosphere and root fractions; using a harsh method of root surface sterilization can potentially lyse endophytes within roots prior to DNA extraction, while a more conservative wash may not remove all microbes from the root surface7. Another key factor that can negatively impact or disrupt sequencing results is bacterial contamination, which can come from many sources and is sometimes impossible to distinguish from the sampled environmental bacteria32,33. For this reason, careful sterilization of sampling tools, experimental materials, and working environments is vital in order to avoid contamination.
After sampling, obtaining high-quality DNA is a high priority for successful downstream analyses. In our experience, DNA extracted from field-grown root samples by alternative methods, such as CTAB-based extraction, often contains substantially greater quantities of humic acids and other compounds compared to rhizosphere and soil samples. These compounds can prevent the enzymatic activity of the DNA polymerase during PCR amplification, even at low concentrations34,35. Using DNA extraction kits designed for soils on root samples, as opposed to a CTAB extraction followed by a phenol chloroform clean-up, can effectively rid samples of humic acids and will result in high-quality DNA36,37,38,39. Accordingly, we recommend using a commercially available DNA extraction kit for root samples as well. It should be noted that the goal is to obtain microbial genomic DNA from plant roots. Thus, thorough and consistent root grinding is important to break down the plant tissue and lyse the microbial cells to release microbial DNA without introducing bias between samples due to variation in grinding pressure and time.
Following careful extraction of DNA from samples, there are two main sources for problems during amplification: 1) contamination of plant tissues with plant endosymbionts (chloroplast and mitochondria) and 2) selection of 16S rRNA region to amplify. The amplification from chloroplast or mitochondrial 16S rRNA sequences can generate >80% of the sequences in root samples40, and more in leaf tissues, though the amount of contamination is dependent on the choice of primers. Thus, PNA clamps are necessary during the PCR step to suppress plant host chloroplast and mitochondrial 16S contamination27,41. However, different plant species can have variation in the chloroplast and mitochondrial 16S sequence27; therefore, it is essential to confirm the sequence of the chloroplast and mitochondrial 16S genes of the plant being studied prior to library sequencing, in order to determine if alternate PNA oligos are needed (Figure 3). Additionally, the 16S rRNA gene consists of nine hypervariable regions flanked by conserved regions; different results can be obtained from the same community depending upon which hypervariable region is amplified42. Previous studies have found the V4 region to be one of the most reliable for assigning taxonomy43 and it has been used for other extensive microbiome surveys11. Lengthening the target to the V3-V4 region is suggested here to increase variability and improve taxonomic resolution.
In this protocol, we demonstrated a pipeline to perform 16S rRNA amplicon sequencing via next generation sequencing (NGS) for studying microbial community compositions of environmental samples12. We recommend using amplicon sequencing as a tool for phylogenetic profiling, because it is relatively inexpensive, high-throughput, and does not require extensive computational expertise or resources to analyze. While our method focuses on analyzing the bacterial fraction of the microbiome, it can easily be adapted to investigate fungi. The protocol is identical through step 2, and the only difference in step 3 is which primers are used during the amplification. However, it is worth noting that amplicon-based profiling is not without limitations. By sequencing a single marker gene, no information is obtained regarding the functional capacity of the community. Additionally, the taxonomic resolution can be quite low, especially when sequencing from environments with a high percentage of uncharacterized microbes. However, sequencing technologies are rapidly evolving, and we anticipate the potential to deal with some of these shortcomings by adapting this protocol for use with other sequencing platforms. Finally, as mentioned in the introduction, shotgun metagenomics and metatranscriptomics can easily be performed on soil and rhizosphere samples, and methods to eliminate plant contamination from plant tissues are currently being explored. Experimental designs which pair amplicon-based approaches and other metagenomic techniques can be particularly effective in complex communities where high species diversity and uneven representation of taxa can prevent shotgun data from accurately characterizing the less dominant members.
The authors have nothing to disclose.
This work was funded by the USDA-ARS (CRIS 2030-21430-008-00D). TS is supported by the NSF Graduate Research Fellowship Program.
Name | Company | Catalog Number | Comments |
0.1-10/20 µL filtered micropipette tips | USA Scientific | 1120-3810 | Can substitute with equivalent from other suppliers. |
1.5 mL microcentrifuge tubes | USA Scientific | 1615-5510 | Can substitute with equivalent from other suppliers. |
10 µL multi-channel pipette | Eppendorf | 3122000027 | Can substitute with equivalent from other suppliers. |
10 µL, 100 µL, and 1000 µL micropipettes | Eppendorf | 3120000909 | Can substitute with equivalent from other suppliers. |
100 µL multi-channel pipette | Eppendorf | 3122000043 | Can substitute with equivalent from other suppliers. |
1000 µL filtered micropipette tips | USA Scientific | 1122-1830 | Can substitute with equivalent from other suppliers. |
2 mL microcentrifuge tubes | USA Scientific | 1620-2700 | Can substitute with equivalent from other suppliers. |
2 mm soil sieve | Forestry Suppliers | 60141009 | Can substitute with equivalent from other suppliers. |
200 µL filtered micropipette tips | USA Scientific | 1120-8810 | Can substitute with equivalent from other suppliers. |
25 mL reservoirs | VWR International LLC | 89094-664 | Can substitute with equivalent from other suppliers. |
50 mL conical vials | Thermo Fisher Scientific | 352098 | Can substitute with equivalent from other suppliers. |
500 mL vacuum filters (0.2 µm pore size) | VWR International LLC | 156-4020 | |
96-well microplates | USA Scientific | 655900 | |
96-well PCR plates | BioRad | HSP9631 | |
Agencourt AMPure XP beads | Thermo Fisher Scientific | NC9933872 | Instructions for use: https://www.beckmancoulter.com/wsrportal/ajax/downloadDocument/B37419AA.pdf?autonomyId=TP_DOC_150180&documentName=B37419AA.pdf |
Aluminum foil | Boardwalk | 7124 | Can substitute with equivalent from other suppliers. |
Analytical scale with 0.001 g resolution | Ohaus Pioneer | PA323 | Can substitute with equivalent from other suppliers. |
Bioruptor Plus ultrasonicator | Diagenode | B01020001 | |
Bovine Serum Albumin (BSA) 20 mg/mL | New England Biolabs | B9000S | |
Centrifuge | Eppendorf | 5811000908 | Including 50 mL and 96-well plate bucket adapters |
Cryogenic gloves | Millipore Sigma | Z183490 | Can substitute with equivalent from other suppliers. |
DNeasy PowerClean kit (optional) | Qiagen Inc. | 12877-50 | Previously MoBio |
DNeasy PowerSoil kit | Qiagen Inc. | 12888-100 | Previously MoBio |
Dry ice | Any | NA | |
DynaMag-2 magnet | Thermo Fisher Scientific | 12321D | Do not substitute |
Ethanol | VWR International LLC | 89125-188 | Can substitute with equivalent from other suppliers. |
Gallon size freezer bags | Ziploc | NA | Can substitute with equivalent from other suppliers. |
Gemini EM Microplate Reader | Molecular Devices | EM | Can use another fluorometer that reads 96-well plates from the top. |
K2HPO4 | Sigma-Aldrich | P3786 | |
KH2PO4 | Sigma-Aldrich | P5655 | |
Lab coat | Workrite | J1367 | Can substitute with equivalent from other suppliers. |
Liquid N2 | Any | NA | Can substitute with equivalent from other suppliers. |
Liquid N2 dewar | Thermo Fisher Scientific | 4150-9000 | Can substitute with equivalent from other suppliers. |
Milli-Q ultrapure water purification system | Millipore Sigma | SYNS0R0WW | |
Mini-centrifuge | Eppendorf | 5404000014 | |
Molecular grade water | Thermo Fisher Scientific | 4387937 | Can substitute with equivalent from other suppliers. |
Mortars | VWR International LLC | 89038-150 | Can substitute with equivalent from other suppliers. |
Nitrile gloves | Thermo Fisher Scientific | 19167032B | Can substitute with equivalent from other suppliers. |
Paper towels | VWR International LLC | BWK6212 | Can substitute with equivalent from other suppliers. |
PCR plate sealing film | Thermo Fisher Scientific | NC9684493 | |
PCR strip tubes | USA Scientific | 1402-2700 | |
Pestles | VWR International LLC | 89038-166 | Can substitute with equivalent from other suppliers. |
Plastic spatulas | LevGo, Inc. | 17211 | |
Platinum Hot Start PCR Master Mix (2x) | Thermo Fisher Scientific | 13000014 | |
PNAs – chloroplast and mitochondrial | PNA Bio | NA | Make sure to verify sequence bioinformatically |
Protective eyewear | Millipore Sigma | Z759015 | Can substitute with equivalent from other suppliers. |
Qubit 3.0 Fluorometer | Thermo Fisher Scientific | Q33216 | |
Qubit dsDNA HS assay kit | Thermo Fisher Scientific | Q32854 | |
Rubber mallet (optional) | Ace Hardware | 2258622 | Can substitute with equivalent from other suppliers. |
Shears or scissors | VWR International LLC | 89259-936 | Can substitute with equivalent from other suppliers. |
Shovel | Home Depot | 2597400 | Can substitute with equivalent from other suppliers. |
Soil core collector (small diameter: <1 inch) | Ben Meadows | 221700 | Can substitute with equivalent from other suppliers. |
Spray bottles | Santa Cruz Biotechnology | sc-395278 | Can substitute with equivalent from other suppliers. |
Standard desalted barcoded primers (10 µM) (Table 1) | IDT | NA | 4 nmole Ultramer DNA Oligo with standard desalting. NGS adapter and sequencing primer (Table 1) are designed for use with Illumina MiSeq using v3 chemistry. |
Thermocycler | Thermo Fisher Scientific | E950040015 | Can substitute with equivalent from other suppliers. |
Triton X-100 | Sigma-Aldrich | X100 | Can substitute with equivalent from other suppliers. |
Weigh boats | Spectrum Chemicals | B6001W | Can substitute with equivalent from other suppliers. |