Here we present a detailed protocol for (A) the identification of a natural product with antibiotic activity, (B) the purification of the compound, (C) a first model of its biosynthesis, (D) genome sequencing and genome mining, and (E) the verification of the biosynthetic gene cluster.
Streptomyces strains are known for their capability to produce a wide range of compounds with various bioactivities. Cultivation under different conditions often leads to the production of new compounds. Therefore, production cultures of the strains are extracted with ethyl acetate and the crude extracts are analyzed by HPLC. Furthermore, the extracts are tested for their bioactivity in different assays. For structure elucidation, the compound of interest is purified by a combination of different chromatography methods.
Genome sequencing coupled with genome mining allows the identification of a natural product biosynthetic gene cluster using different computer programs. To confirm that the correct gene cluster has been identified, gene inactivation experiments have to be performed. The resulting mutants are analyzed for the production of the particular natural product. Once the correct gene cluster has been inactivated, the strain should fail to produce the compound.
The workflow is shown for the antibacterial compound polyketomycin produced by Streptomyces diastatochromogenes Tü6028. Around ten years ago, when genome sequencing was still very expensive, the cloning and identification of a gene cluster was a very time-consuming process. Fast genome sequencing combined with genome mining accelerates cluster identification and opens up new ways to explore biosynthesis and to generate novel natural products by genetic methods. The protocol described in this paper can be applied to any other compound derived from a Streptomyces strain or another microorganism.
Natural products from plants and microorganisms have always been an important source for clinical drug development and research. The first antibiotic, penicillin, was discovered in 1928 in a fungus by Alexander Fleming1. Nowadays, many more natural products are used in clinical treatment.
One genus known for its capability of producing various kinds of secondary metabolites with different bioactivities is Streptomyces. Streptomyces are Gram-positive bacteria and belong to the class Actinobacteria and the order Actinomycetales. Almost two-thirds of the clinically used antibiotics are derived from Actinomycetales, mainly from Streptomyces, like amphotericin2, daptomycin3 or tetracycline4. Two Nobel Prizes have been awarded in the field of Streptomyces antibiotic research. The first one went to Selman Waksman for the discovery of streptomycin, the first antibiotic effective against tuberculosis.5 In 2015, the discovery of avermectin from S. avermitilis was honored as part of the Nobel Prize in Physiology or Medicine. Avermectin is used for the treatment of parasitic diseases6,7.
The traditional approach for the discovery of natural products in microorganisms such as Streptomyces generally involves cultivation of the strain under different growth conditions, as well as extraction and analysis of secondary metabolites. Bioactivity assays (e.g. assays for antibacterial and anticancer activity) are performed to detect the activity of the compound. Finally, the compound of interest is isolated and the chemical structure is elucidated.
The structures of natural products are often composed of individual moieties that are combined into complex molecules. Only a limited number of major biosynthetic pathways lead to the building blocks used for the biosynthesis of natural products. The major biosynthetic pathways are the polyketide pathways, pathways leading to terpenoids and alkaloids, pathways using amino acids, and pathways leading to sugar moieties. Each pathway is characterized by a set of specific enzymes8. Based on the structure of the compound, these biosynthetic enzymes can be predicted.
Nowadays, the detailed structural analysis of a compound in combination with next generation sequencing and bioinformatic analysis can help to identify the responsible biosynthetic gene cluster. The cluster information opens up new ways for further natural product research. This includes heterologous expression to increase the yield of the natural product, targeted compound modification by gene deletion or alteration and combinatorial biosynthesis with genes from other pathways.
Polyketomycin was isolated independently from the culture broth of two strains, Streptomyces sp. MK277-AF19 and Streptomyces diastatochromogenes Tü602810. The structure was elucidated by NMR and X-ray analysis. Polyketomycin is composed of a tetracyclic decaketide and a dimethyl salicylic acid, linked by the two deoxysugar moieties β-D-amicetose and α-L-axenose. It displays cytotoxic and antibiotic activity, even against Gram-positive multidrug-resistant strains such as MRSA11.
A genomic cosmid library of S. diastatochromogenes Tü6028 was generated and screened many years ago. Using specific gene probes the polyketomycin gene cluster with a size of 52.2 kb, containing 41 genes, was identified after several months of intense work12. Recently, a draft genome sequence of S. diastatochromogenes was obtained leading to the fast identification of the polyketomycin biosynthetic gene cluster. In this overview, a method helping to identify a natural product and elucidate its biosynthetic gene cluster will be described, using polyketomycin as an example.
Here we explain the individual steps that lead from a natural product to its biosynthetic gene cluster, shown for polyketomycin produced by Streptomyces diastatochromogenes Tü6028. The protocol comprises the identification and purification of a natural product with antibiotic properties. Further structural analysis and comparison with results from genome mining lead to the identification of the biosynthetic gene cluster. This procedure can be applied to any other compound derived from a Streptomyces strain or any other microorganism.
1. Identification of a Natural Product with Antibiotic Property
Figure 1: LC/MS Analysis of Polyketomycin. (A) HPLC chromatogram (λ = 430 nm) of the crude extract after cultivation of Streptomyces diastatochromogenes Tü6028 for 6 days. Polyketomycin has a retention time of 25.9 min. (B) UV/vis spectrum of polyketomycin. (C) Mass spectrum of polyketomycin in negative mode. Main peak with m/z 863.2 [M-H]–.
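The [M-H]– signal in panel C can be converted back to a neutral mass by adding the mass of the removed proton. The following is a minimal Python sketch of that arithmetic, assuming a singly charged ion; the function name and the proton mass constant are illustrative additions and not part of the original protocol.

```python
# Mass of a proton in Da (rounded); a constant added for this sketch.
PROTON_MASS = 1.007276


def neutral_mass_from_deprotonated(mz: float, charge: int = 1) -> float:
    """Return the neutral mass M for an [M-nH]n- ion observed at m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # m/z = (M - n * m_proton) / n  =>  M = n * m/z + n * m_proton
    return charge * mz + charge * PROTON_MASS


observed_mz = 863.2  # main peak in negative mode (Figure 1C)
print(f"neutral mass ~ {neutral_mass_from_deprotonated(observed_mz):.1f} Da")
```

With a low-resolution quadrupole detector the result is only good to about one decimal place, so it is useful for matching against a nominal molecular weight rather than for deriving a sum formula.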
2. Large Scale Extraction, Purification and Structure Elucidation of the Compound
Figure 2: Workflow for Structure Elucidation. The process comprises (1) cultivation of the strain, (2) extraction, (3) purification by solid phase extraction (SPE), thin layer chromatography (TLC), preparative high-performance liquid chromatography (HPLC), size exclusion chromatography (SEC) and (4) structure elucidation by mass analysis (MS), nuclear magnetic resonance (NMR) and X-ray measurements.
3. Propose Biosynthetic Model of the New Isolated Compound
Figure 3: Structure of Polyketomycin Divided into Single Building Blocks. Polyketomycin is composed of a tetracyclic decaketide (PKS type II) and a dimethyl salicylic acid (iterative PKS type I), linked by the two deoxysugar moieties β-D-amicetose and α-L-axenose (NDP-glucose-4,6-dehydratase and two glycosyltransferases required).
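The building blocks in Figure 3 translate into a list of enzyme classes that the gene cluster should encode. The following Python sketch ranks candidate clusters by how many of these classes they contain. The cluster names and annotation sets are invented placeholders rather than real genome mining output, and the simple counting rule is an assumption made for illustration.

```python
# Enzyme classes expected from the structure of polyketomycin (Figure 3).
EXPECTED = {
    "type II PKS",
    "iterative type I PKS",
    "NDP-glucose-4,6-dehydratase",
    "glycosyltransferase",
}


def rank_clusters(clusters: dict[str, set[str]]) -> list[tuple[str, int]]:
    """Sort clusters by the number of expected enzyme classes they encode."""
    scores = {name: len(EXPECTED & found) for name, found in clusters.items()}
    # Highest score first; ties are broken alphabetically by cluster name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical annotations for three candidate clusters.
candidates = {
    "cluster_A": {"type II PKS", "glycosyltransferase"},
    "cluster_B": {
        "type II PKS",
        "iterative type I PKS",
        "NDP-glucose-4,6-dehydratase",
        "glycosyltransferase",
    },
    "cluster_C": {"NRPS"},
}

for name, score in rank_clusters(candidates):
    print(f"{name}: {score}/{len(EXPECTED)} expected enzyme classes")
```

A presence/absence count like this ignores gene copy number (for example, the two glycosyltransferases), so it can only narrow down candidates before manual inspection.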
4. Genome Sequencing/Mining
Figure 4: antiSMASH Output of Polyketomycin Biosynthetic Gene Cluster and Overview of Other Clusters in S. diastatochromogenes Tü6028. (A) Overview of predicted biosynthetic gene clusters in the genome of S. diastatochromogenes Tü6028; (B) Cluster 2 Polyketomycin biosynthetic gene cluster with targeted genes.
5. Verification of the Biosynthetic Gene Cluster
Figure 5: Verification of a Gene Cluster by Single Crossover. (A) Native gene leads to the translation of a functional protein; (B) Cloning of the gene with an internal deletion into a suicide vector leads to a single crossover resulting in a frame shift in the targeted gene and subsequent translation of a non-functional protein; (C) Cloning of an internal fragment of the gene into a suicide vector leads to a truncation of the gene and subsequent translation of a non-functional protein. oriT: origin of transfer; antibioticR: antibiotic resistance.
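For the strategy in panel B, the internal deletion only causes a frame shift if its length is not a multiple of three. The check below is a small Python sketch of that rule; the deletion sizes are hypothetical examples and not values from this protocol.

```python
def causes_frameshift(deletion_length: int) -> bool:
    """Return True if removing this many base pairs shifts the reading frame."""
    if deletion_length <= 0:
        raise ValueError("deletion length must be positive")
    # Codons are 3 bp long, so only deletions whose length is not a
    # multiple of 3 change the reading frame downstream of the deletion.
    return deletion_length % 3 != 0


for length in (298, 299, 300):  # hypothetical deletion sizes in bp
    outcome = "frame shift" if causes_frameshift(length) else "in frame"
    print(f"{length} bp deletion: {outcome}")
```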
Figure 6: HPLC Analysis of S. diastatochromogenes with Inactivated pokPI Gene. HPLC chromatogram (λ = 430 nm) of crude extract of S. diastatochromogenes WT (top) and the mutant strain with interrupted pokPI gene (bottom). The mutant strain no longer produces polyketomycin.
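The comparison in Figure 6 comes down to asking whether a peak at the retention time of polyketomycin is present in each chromatogram. A minimal Python sketch of that check follows. It assumes that both runs use the same HPLC method as Figure 1, and the peak lists and the tolerance of 0.2 min are hypothetical values chosen for illustration.

```python
POLYKETOMYCIN_RT_MIN = 25.9  # retention time reported in Figure 1A


def has_peak(retention_times: list[float],
             target: float = POLYKETOMYCIN_RT_MIN,
             tolerance: float = 0.2) -> bool:
    """Return True if any peak lies within the tolerance of the target time."""
    return any(abs(rt - target) <= tolerance for rt in retention_times)


# Hypothetical peak lists (minutes) exported from the HPLC software.
wild_type = [4.1, 12.7, 25.9, 31.0]
mutant = [4.1, 12.8, 31.1]

print("WT produces polyketomycin:", has_peak(wild_type))
print("Mutant produces polyketomycin:", has_peak(mutant))
```

A matching retention time alone does not prove identity, so the UV/vis and mass spectra of the peak should still be compared.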
In this overview we describe the individual steps leading from the identification of an antibiotic to its biosynthetic gene cluster. Many years ago we constructed a cosmid library, packaged the cosmids into phages, transduced E. coli host cells, and had to screen thousands of colonies to identify the clones carrying overlapping regions of the polyketomycin cluster. Sequencing of the cosmids was also a difficult and expensive process12.
In order to conduct further studies on the strain we sequenced the whole genome of Streptomyces diastatochromogenes Tü6028. With the draft genome sequence we easily identified the biosynthetic gene cluster of polyketomycin and other clusters encoding promising compounds. Figure 7 compares, on a rough time scale, the "old" method of identifying the biosynthetic cluster by cloning of a cosmid library and elaborate screening with the "new" method of whole genome sequencing and subsequent genome mining. The new sequencing technologies and new genome mining programs speed up the whole process.
Figure 7: Comparison of the "Old" and "New" Method of Assigning a Biosynthetic Gene Cluster. The "old" method comprises cloning of a cosmid library with selection of positive clones and sequencing of the respective cosmid(s) (above); the "new" method includes whole genome sequencing and genome mining to identify all secondary metabolite gene clusters located on the genome of the strain (below). The durations of the individual steps are shown on a rough time scale.
In this lab a genomic cosmid library of Streptomyces diastatochromogenes was generated and screened many years ago, resulting in the identification of the polyketomycin gene cluster through an extremely time-consuming process. Characterization of single genes was possible using targeted gene deletions and analysis of the resulting mutants12. Recently, a draft genome sequence of S. diastatochromogenes was obtained, allowing the fast identification of the polyketomycin biosynthetic gene cluster. We could easily detect the biosynthetic genes, although the draft genome sequence still contains many contigs. The described process can be completed within months. However, the procedure comprises many steps. Individual steps may fail several times, preventing progression to subsequent steps:
The genus Streptomyces is known for its capacity to produce bioactive compounds. Although strains often carry more than 20 biosynthetic gene clusters, usually only one or two compounds are produced under laboratory conditions. The OSMAC approach (cultivation of one strain under different conditions) may sometimes not be enough to wake up silent gene clusters. Genetic manipulation of regulatory genes, such as the pleiotropic regulator genes adpA44 and bldA45,46, is also an effective method to activate the production of other secondary metabolites.
For the elucidation of the compound's structure, e.g. by NMR analysis, usually more than 2 mg of purified compound is necessary. Therefore, fermentation of more than 10 L culture is often required. Without a fermenter that is able to maintain oxygen, pH and temperature conditions, it might be challenging in a small lab to handle this amount of culture and subsequent extraction. During purification, the compound may be changed due to oxidation, radiation or temperature. Also, the more purification steps are used, the higher the chance of degradation.
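The relation between the amount of compound needed and the culture volume can be estimated with simple arithmetic. The Python sketch below shows one way to do this. The titer of 0.5 mg/L and the overall recovery of 40 % are assumed values for illustration only and were not measured in this work.

```python
def required_culture_volume_l(target_mg: float,
                              titer_mg_per_l: float,
                              recovery: float) -> float:
    """Estimate the culture volume needed to obtain target_mg of pure compound.

    recovery is the fraction (0-1] of produced compound that survives
    extraction and all purification steps.
    """
    if titer_mg_per_l <= 0:
        raise ValueError("titer must be positive")
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in the interval (0, 1]")
    return target_mg / (titer_mg_per_l * recovery)


# Assumed values: 0.5 mg/L titer and 40 % overall recovery.
volume = required_culture_volume_l(target_mg=2.0, titer_mg_per_l=0.5, recovery=0.4)
print(f"culture volume needed: about {volume:.0f} L")
```

Because every additional purification step lowers the recovery, the required volume rises quickly when the compound is unstable.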
When comparing the structure of the natural product with the clusters in the genome, it is sometimes not easy to identify the appropriate cluster. Firstly, if only a draft genome sequence is available, part of the cluster may be missing. Secondly, not all genes required for the biosynthesis are necessarily located in the cluster. Thirdly, a cluster is sometimes split into two parts separated from each other by many kilobases. Fourthly, it may be difficult to decide which cluster is the appropriate one. In the case of large PKS type I or NRPS systems, where the number of extender units can be calculated from the number of modules, or the individual extender units can even be identified by analysis of the selecting domains, the assignment is straightforward. However, in the case of iteratively working enzymes the prediction of the synthesized compounds is often not possible, especially if the strain has more than 40 gene clusters. Fifthly, nature is highly complex and full of yet unknown compounds. Often the biosynthesis is a mixture of different pathways. If the new compound has not been described before, or is not related to a known compound, it may be difficult to identify the cluster, to propose a biosynthesis model and to prove it.
Once the cluster is identified, the single crossover technique is a good and quick method to verify the hypothesis. PCR, cloning into a suicide vector, conjugation, selection of positive clones, and a production assay are the only steps required. One disadvantage of this technique is that the integration of the vector into the chromosome is not stable due to further recombination events. Therefore, in order to further analyze single genes, precise gene deletions are required. In addition, genetic manipulation of Streptomyces strains can be difficult.
The described procedure can be applied to any other compound produced by a Streptomyces strain or another microorganism. The knowledge about a biosynthetic gene cluster and its synthesized compound gives us further opportunities to modify already existing molecules with the aim of improving them for the fight against multidrug-resistant pathogens.
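For modular PKS type I or NRPS systems, the module count mentioned above gives a quick consistency check against the structure. The following Python sketch assumes a strictly colinear assembly line with one starter unit and one extender unit per extension module (no skipped or iteratively used modules); the function names and the example numbers are hypothetical. The rule does not apply to iterative systems such as the type II PKS of polyketomycin.

```python
def building_blocks_from_modules(extension_modules: int) -> int:
    """Starter unit plus one extender unit per extension module.

    Assumes a strictly colinear assembly line (no skipped or iterated modules).
    """
    if extension_modules < 0:
        raise ValueError("number of modules cannot be negative")
    return 1 + extension_modules


def cluster_fits_structure(extension_modules: int, units_in_structure: int) -> bool:
    """Compare the module-based prediction with the units seen in the structure."""
    return building_blocks_from_modules(extension_modules) == units_in_structure


# Hypothetical example: a compound whose backbone contains 7 building blocks.
for modules in (5, 6, 7):
    verdict = "consistent" if cluster_fits_structure(modules, 7) else "not consistent"
    print(f"{modules} extension modules: {verdict}")
```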
The authors have nothing to disclose.
S. Zhang is funded by the China Scholarship Council. The authors are very grateful to the former lab members who worked on the polyketomycin project and to Prof. Dr. Hans-Peter Fiedler, University of Tübingen, for providing the polyketomycin producer. The research was supported by the DFG (RTG 1976).
Name | Company / Source | Catalog Number | Comments |
agar | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 5210.4 | |
agarose | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 6352.4 | |
D-mannitol | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 4175.1 | |
glucose | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 6780.1 | |
LB | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | X964.3 | |
malt extract | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | X976.2 | |
MgCl2 | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 2189.1 | |
Peptone | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | ||
soy flour | W.Schoenemberger GmbH, Magstadt, Germany | Hensec-Vollsoja | |
tryptic soy broth | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | X938.3 | Caso-Bouillon |
yeast extract | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 2363.2 | |
Solvents | |||
Acetic acid | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 3738.5 | |
Acetone | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 9372.2 | |
Acetonitrile | Avantor Performance Materials B.V., Deventer, The Netherlands | JT-9012-03 | |
Dichlorofrom | Fisher Scientific GmbH, Schwerte, Germany | 1530754 | |
DMSO (Dimethyl sulfoxide) | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 4720.1 | |
DMSO-d6 (Dimethyl sulfoxide-d6) 99.9atom%D | ARMAR Chemicals, Döttingen, Switzerland | 15200.204 | |
Ethyl acetate | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 6784.4 | |
Hydrochloric acid | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 6331.4 | |
Methanol | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | 7342.1 | |
Enzymes | |||
restriction enzymes | New England Biolabs GmbH, Frankfurt am Main, Germany | ||
polymerase | New England Biolabs GmbH, Frankfurt am Main, Germany | ||
DNA Polymerase I Large (Klenow) Fragment | Promega GmbH, Mannheim, Germany | ||
Antibiotics | |||
apramycin | AppliChem GmbH, Darmstadt, Germany | A7682.0005 | |
fosfomycin | Sigma-Aldrich Chemie GmbH, Taufkirchen, Germany | P5396-50G | |
kanamycin | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | T832.2 | |
Plasmid/Vectors | |||
pKC1132 | Bierman et al. 1992 | ||
Primer | |||
pokPI_for | TGATGGTGCCGCTGGCCATGG | Primer to amplify fragment containing pokPI gene | |
pokPI_rev | AGCGTTCACTGTTCCGCCCGAC | ||
Bacterial strains | |||
Bacillus subtilis COHN ATCC6051 | Gram-positive test strain | ||
Escherichia coli ET12567 pUZ8002 | MacNeil et al., 1992 | strain for conjugation | |
Escherichia coli XL1 Blue | Agilent Technologies, Santa Clara, USA | Gram-negative test strain + cloning host | |
Streptomyces diastatochromogenes Tü6028 | Paululat et al., 1999 | Polyketomycin producer | |
Online services | |||
antiSMASH (Antibiotics and Secondary Metabolite Analysis Shell) | http://antismash.secondarymetabolites.org/ | Detection of secondary metabolite gene clusters | |
BLAST (Basic Local Alignment Search Tool) | http://blast.ncbi.nlm.nih.gov/Blast.cgi | finds regions of similarity between biological sequences | |
GenDB | https://www.uni-giessen.de/fbz/fb08/Inst/bioinformatik/software/gendb | Annotation of ORFs | |
MIBiG (Minimum Information about a Biosynthetic Gene cluster) | http://mibig.secondarymetabolites.org/ | Database of biosynthetic gene clusters | |
NaPDoS | http://napdos.ucsd.edu/ | Detection of secondary metabolite gene clusters | |
NCBI Prokaryotic Genome Annotation Pipeline | http://www.ncbi.nlm.nih.gov/genome/annotation_prok/ | Annotation of ORFs | |
NRPSpredictor | http://nrps.informatik.uni-tuebingen.de/ | Detection of NRPS domains | |
Prokka (rapid prokaryotic genome annotation) | http://www.vicbioinformatics.com/software.prokka.shtml | Annotation of ORFs | |
RAST (rapid annotation using subsystems technology) | http://rast.nmpdr.org/ | Annotation of ORFs | |
other programs | |||
Chem Station Rev. A.09.03 | Agilent Technologies, Waldbronn, Germany | Handling program for HPLC | |
Clone Manager Suite 7 | Scientific and Educational Software, Cary, USA | Designing Cloning Experiment | |
Newbler v2.8 | Roche Diagnostics | Alignment of sequencing reads | |
Machines | |||
Centrifuge Avanti J-6000, Rotor JA-10 | Beckman Coulter GmbH, Krefeld, Germany | ||
HPLC/MS | Agilent Technologies, Waldbronn, Germany | ||
_Autosampler: G1313A | Agilent Technologies, Waldbronn, Germany | ||
_Pre-column: XBridge C18 (20 mm x 4.6 mm; Particle size: 3.5 µm) | Agilent Technologies, Waldbronn, Germany | ||
_Column: XBridge C18 (100 mm × 4.6 mm; Particle size: 3.5 μm) | Agilent Technologies, Waldbronn, Germany | ||
_semi-prep Pre-Column: Zorbax B-C18 (9.4 x 150 mm; Particle size: 5 µm) | Agilent Technologies, Waldbronn, Germany | ||
_semi-prep Column: Zorbax B-C18 (9.4 x 20 mm; Particle size: 5 µm) | Agilent Technologies, Waldbronn, Germany | ||
_Degasser: G1322A | Agilent Technologies, Waldbronn, Germany | ||
_Quaternary pump: G1311A | Agilent Technologies, Waldbronn, Germany | ||
_Diode array detector (DAD) G1315B (λ = 254 nm and 400 nm) | Agilent Technologies, Waldbronn, Germany | ||
_Quadrupole mass detector (MSD) G1946D (2-3000 m/z) | Agilent Technologies, Waldbronn, Germany | ||
rotary evaporator | |||
_heating bath Hei-VAP Value/G3 | Heidolph Instruments GmbH & Co.KG, Schwabach, Germany | ||
_vacuum pump system SC 920 G | KNF Global Strategies AG, Sursee, Switzerland | ||
other material | |||
Sephadex LH20 | GE Healthcare, | ||
Chromafil PVDF-45/15MS (pore size 0.45 µm; filter Ø15 mm) | MACHEREY-NAGEL GmbH & Co. KG, Düren, Germany | ||
SPE column Oasis HLB 20 35 cc (6g) | Waters GmbH, Eschborn, Germany | ||
Organism | Supplier | Medium | Composition | Amount per 1 L H2O |
E. coli | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | LB | LB | 20 g |
Bacillus | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | LB | LB | 20 g |
Streptomyces sp. | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | TSB | CASO Bouillon | 30 g |
fungus | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | YPD | Yeast extract | 10 g |
fungus | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | YPD | Peptone | 20 g |
fungus | Carl Roth GmbH + Co. KG, Karlsruhe, Germany | YPD | Glucose | 20 g |
for agar plates add 2 % agar |
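The amounts in the media table are given per litre of water, so they scale linearly with the volume to be prepared. The Python sketch below is one possible helper for this calculation. It assumes that "2 % agar" means 2 % (w/v), i.e., 20 g per litre; the function and dictionary names are illustrative.

```python
# Grams per litre of water, taken from the media table above.
MEDIA_G_PER_L = {
    "LB": {"LB": 20},
    "TSB": {"CASO Bouillon": 30},
    "YPD": {"yeast extract": 10, "peptone": 20, "glucose": 20},
}
AGAR_PERCENT_W_V = 2  # assumed to be weight per volume


def recipe(medium: str, volume_l: float, plates: bool = False) -> dict[str, float]:
    """Return the grams of each component needed for volume_l litres."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    amounts = {name: grams * volume_l for name, grams in MEDIA_G_PER_L[medium].items()}
    if plates:
        # 2 % (w/v) corresponds to 2 g per 100 mL, i.e. 20 g per litre.
        amounts["agar"] = AGAR_PERCENT_W_V * 10 * volume_l
    return amounts


for component, grams in recipe("YPD", 0.5, plates=True).items():
    print(f"{component}: {grams:g} g")
```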