This protocol describes a method for gene mining and sequence analysis of purine nucleosidase (PN, EC 3.2.2.1) based on RNA-Seq. Bioinformatic analysis (ProtParam, Predict Protein, and SWISS-MODEL) was applied to characterize the physicochemical properties and the secondary and tertiary structures of PN. Furthermore, the PN gene was cloned from the transcriptome to verify the reliability of the RNA-Seq results.
Caterpillar fungus (Ophiocordyceps sinensis) is one of the most valued fungal Traditional Chinese medicines (TCM), and it contains many active ingredients such as adenosine. Adenosine is considered a biologically effective ingredient with a variety of anti-tumor and immunomodulatory activities. To further elucidate the role of purine nucleosidase (PN) in adenosine biosynthesis, a gene encoding PN was mined from the RNA-Seq database of caterpillar fungus and analyzed. The full-length cDNA of PN was 855 bp and encoded 284 amino acids. BLAST analysis showed that the highest homology, 85.06%, was with a nucleoside hydrolase in NCBI. ProtParam analysis showed that the relative molecular weight was 30.69 kDa and the isoelectric point was 11.55. The secondary structure of PN was predicted by Predict Protein; alpha helix accounted for 28.17%, strand for 11.97%, and loop for 59.86% of the structure. Moreover, the PN gene was cloned from the transcriptome and detected by agarose gel electrophoresis for verification. This study provides a more solid scientific basis and new ideas for the genetic regulation of adenosine biosynthesis in fungal TCM.
Fungal Traditional Chinese medicine (TCM) has abundant species resources1,2. Caterpillar fungus (Ophiocordyceps sinensis) is a well-known fungal TCM and is regarded as a source of innovative drugs3,4. It is a complex of insect larva and fungus found on the Tibetan plateau in southwestern China, in which Hirsutella sinensis parasitizes the caterpillar body5. Currently, H. sinensis is reported to be the only anamorph of caterpillar fungus according to molecular and morphological evidence6,7, and it has lower associated toxicity than, and similar clinical efficacy to, wild caterpillar fungus8. H. sinensis contains a variety of biologically active ingredients, such as nucleosides, polysaccharides, and ergosterols, with extensive pharmacological effects such as the repair of liver injury9,10,11. Adenosine is a typical active ingredient isolated from caterpillar fungus, and it is a kind of purine alkaloid12. Adenosine has a variety of biological activities, including anti-tumor, antibacterial, and immunomodulatory activities13,14. Unfortunately, the biosynthetic mechanism of adenosine and the key genes involved remain unclear15,16.
Adenosine mainly exerts its anti-tumor effect through immunosuppressive actions in the tumor microenvironment17. Adenosine has been reported to have immunosuppressive functions, which are critical for initiating tissue repair after injury and for protecting tissues against excessive inflammation18,19. Moreover, adenosine-mediated repression of immunity has been shown to severely impair cancer immunosurveillance and to promote tumor growth20. Thus, the mechanism of adenosine biosynthesis needs to be studied to support its wide application in anti-tumor therapy.
Next-generation sequencing of the transcriptome can provide a complete, systematic view of expressed genes and their expression levels21. Furthermore, transcriptome sequencing and analysis have been applied to predict the genes involved in the biosynthetic pathways of active ingredients and to investigate the interaction of different biosynthetic pathways22. Purine nucleosidase (PN, EC 3.2.2.1) is a class of nucleosidase with substrate specificity for purine nucleosides; it hydrolyzes the glycosidic bonds of purine nucleosides to yield sugars and bases23. It typically plays important roles in adenosine biosynthesis. In a previous study, the biosynthetic pathway of adenosine in fungal TCM was predicted; qPCR and gene expression analysis showed that increased adenosine accumulation resulted from down-regulation of the PN gene, indicating that the PN gene may play an important role in adenosine biosynthesis15. Therefore, the mechanism of PN in adenosine biosynthesis needs to be clarified. However, the sequence information and protein structure of PN, as well as those of other key genes involved in adenosine biosynthesis in fungal TCM, have not been studied further.
In this study, a novel PN gene sequence was mined from RNA-Seq data of caterpillar fungus and verified by gene cloning. Furthermore, the molecular characteristics and protein structure of PN were comprehensively analyzed, which could provide new directions and ideas for the gene regulation of adenosine biosynthesis.
NOTE: A strain of the anamorph of caterpillar fungus (H. sinensis) was deposited in our laboratory. Escherichia coli DH5α was preserved by Shenzhen Hospital, Beijing University of Chinese Medicine.
1. Preparing for RNA-Seq
2. Gene mining of purine nucleosidase
3. Bioinformatic analysis
4. Gene cloning and construction of recombinant plasmid
The ORF of the PN gene was 855 bp in length and encoded 284 amino acids, with a calculated molecular mass of 30.69 kDa and a predicted isoelectric point of 11.55, indicating that PN is a basic protein. The SignalP 4.0 Server was used to identify signal peptides, and the results indicated that PN has no signal peptide. A BLASTP search indicated that PN from caterpillar fungus shared the highest identity (85.06%, E value = 1e-88) with a nucleoside hydrolase from Purpureocillium lilacinum (OAQ81830.1). The ClustalX program was used to perform multiple sequence alignment of PN (Figure 1), which revealed that amino acids 11–166 form the conserved sequence of the inosine/uridine hydrolase domain. The phylogenetic tree showed that PN from caterpillar fungus was most closely related to nucleoside hydrolases from entomogenous fungi such as Purpureocillium lilacinum (OAA82129.1, XP_018708456.1), based on amino acid sequence similarity (Figure 2). In addition, InterProScan analysis revealed that PN has the catalytic domain of inosine/uridine-preferring nucleoside hydrolase (IPR023186).
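As a check on the lengths reported above, the following Python sketch (not part of the original protocol) shows how an ORF length maps to a protein length and how the longest forward-strand ORF can be translated with the standard genetic code. The function names and the toy cDNA are hypothetical; the actual PN sequence is not reproduced here, and the NCBI ORF program used in the study may apply different rules (for example, both strands or alternative start codons).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def residues_from_orf_length(orf_bp):
    """Number of amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # subtract the stop codon


def longest_orf(cdna):
    """Return (orf_nt, protein) for the longest ATG...stop ORF on the forward strand."""
    seq = cdna.upper()
    best = ("", "")
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        protein = []
        for pos in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X marks ambiguous codons
            if aa == "*":
                orf = seq[start:pos + 3]
                if len(orf) > len(best[0]):
                    best = (orf, "".join(protein))
                break
            protein.append(aa)
    return best


# Hypothetical toy cDNA, used only to illustrate the function.
toy_orf, toy_protein = longest_orf("CCATGGCTAAATGACC")
pn_residues = residues_from_orf_length(855)
```

For the 855 bp ORF, 855 / 3 gives 285 codons, and removing the stop codon leaves 284 residues, which matches the reported protein length.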
Subsequently, the secondary structure of the PN protein was predicted by Predict Protein (Figure 3): alpha helix accounted for 28.17%, strand for 11.97%, and loop for 59.86% of the structure. The tertiary structure of the PN protein was constructed by SWISS-MODEL simulation (Figure 4), and the results were similar to those predicted by Predict Protein. According to CDS online analysis software, PN belongs to the nucleoside hydrolase family and catalyzes the hydrolysis of all of the commonly occurring purine and pyrimidine nucleosides into ribose and the associated base, but has a preference for inosine and uridine as substrates.
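The secondary structure percentages can be derived from a per-residue state string, as in the Python sketch below. The one-letter state codes (H, E, L) and the function name are assumptions made for illustration, and the residue counts 80, 34, and 170 are back-calculated here from the reported percentages and the 284-residue length; they are not stated in the source.

```python
from collections import Counter

# Assumed one-letter state codes for this sketch: H = helix, E = strand, L = loop.
STATE_NAMES = {"H": "helix", "E": "strand", "L": "loop"}


def secondary_structure_composition(states):
    """Return the percentage of each state, rounded to two decimals."""
    if not states:
        raise ValueError("empty state string")
    unknown = set(states) - set(STATE_NAMES)
    if unknown:
        raise ValueError(f"unexpected state codes: {sorted(unknown)}")
    counts = Counter(states)
    return {
        STATE_NAMES[code]: round(100 * counts[code] / len(states), 2)
        for code in STATE_NAMES
    }


# Hypothetical state string: only the counts matter here, not the residue order.
toy_states = "H" * 80 + "E" * 34 + "L" * 170
composition = secondary_structure_composition(toy_states)
```

With these counts, the three fractions come out as 28.17%, 11.97%, and 59.86%, consistent with the values reported for a 284-residue protein.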
The ORF of PN gene was amplified by PCR; the PCR products were detected by agarose gel electrophoresis (Figure 5). The results indicated that PCR products with the correct sizes were successfully amplified.
Figure 1: Multiple alignment of amino acid sequences for PN from fungal TCM and other nucleoside hydrolases. The sequences were those from Trichoderma guizhouense (OPB46800.1), Purpureocillium lilacinum (OAQ81830.1), and Purpureocillium lilacinum (XP_018180602.1).
Figure 2: Phylogenetic tree of PN showing its relationship with nucleoside hydrolases from other species based on amino acid sequences. The phylogenetic tree was constructed in MEGA 4.0 using the Neighbor-Joining method. The inferred phylogeny was tested by bootstrap analysis with 1,000 replications.
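The bootstrap test named in the Figure 2 caption resamples alignment columns with replacement and rebuilds the tree for each pseudo-replicate. The Python sketch below illustrates only the resampling step and a simple p-distance between aligned sequences; it does not build a Neighbor-Joining tree, and it is not MEGA's implementation. The toy alignment, sequence names, and function names are hypothetical, and the distance model actually used in MEGA is not given in the source.

```python
import random


def p_distance(a, b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment; real input would be the ClustalX alignment.
toy_alignment = {
    "PN": "MKLV-AGT",
    "hydrolase_A": "MKLVQAGS",
    "hydrolase_B": "MRLIQAGS",
}

rng = random.Random(0)  # fixed seed so the resampling is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
```

In a full analysis, a distance matrix and a tree would be computed for each of the 1,000 replicates, and the support for a branch is the fraction of replicate trees that contain it.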
Figure 3: Prediction of secondary structure for PN. Blue stands for strand, and dark red stands for helix.
Figure 4: The tertiary structure of PN protein predicted by Swiss-model. The family type of PN belongs to inosine/uridine-preferring nucleoside hydrolase, which has a preference for inosine and uridine as substrates.
Figure 5: Agarose gel electrophoresis of PN gene cloned from the transcriptome of caterpillar fungus. Lane M: Trans2K Plus II DNA Marker; lane 1: PCR products of PN gene.
Human health faces a series of major medical problems, such as tumors and cardiovascular and cerebrovascular diseases26,27. TCM has been regarded as a source for the research and development of innovative medicine because of its rich species resources and the diverse structures and functions of its active ingredients28,29. Caterpillar fungus is a fungal parasite on the larvae of Lepidoptera, and in Chinese tradition it is considered one of the best invigorants, together with Panax and pilose antler30. A variety of active ingredients such as adenosine, sterols, nucleosides, terpenes, and peptides can be extracted from TCM29,31. These active ingredients have a variety of physiological activities and structural types, and can be used as a source for the research and application of innovative drugs32.
So far, there have been many reports on the pharmacological effects of adenosine. However, few studies have addressed adenosine biosynthesis or the genes involved16,33. In one study, KEGG annotation of functional genes in Cordyceps militaris was carried out and a biosynthetic pathway of adenosine was proposed; 5'-nucleotidase was suggested to be a key gene in adenosine biosynthesis33. Other studies proposed biosynthetic pathways of adenosine and indicated that adenosine kinase and 5'-nucleotidase genes are involved in the phosphorylation and dephosphorylation steps of adenosine metabolism34,35. In addition, the biosynthetic pathway of adenosine in fungal TCM was predicted, and the PN gene was suggested to play an important role in adenosine biosynthesis because its down-regulation was consistent with adenosine accumulation15. Unfortunately, the key genes involved in adenosine biosynthesis have not been mined and analyzed in depth. Therefore, gene mining and sequence analysis of the key genes involved in adenosine biosynthesis are needed.
Generally, the development of biotechnology requires more and more genetic resources36. Traditional methods of gene mining include microbial screening for genetic resources by molecular biological methods37, metagenomic techniques for mining new genetic resources38, and cloning based on natural protein sequences after purification39. Compared with these methods, the gene mining protocol applied in this study is more efficient and accurate. The focus of this paper is on how to perform gene mining and sequence analysis of a functional enzyme involved in the biosynthesis of active ingredients based on RNA-Seq. This protocol could be helpful for studying the biosynthetic mechanisms of other active ingredients of TCM. Other researchers could also refer to this protocol to mine functional proteins of research value and study them in depth. However, this protocol also has some limitations. First, gene mining relies on annotated RNA-Seq data, and RNA-Seq is relatively costly. Second, the results of sequence analysis based on bioinformatics are predictive and need to be verified by experiments.
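Because gene mining here depends on annotated RNA-Seq data, the core step can be sketched as a filter over an annotation table, as in the Python example below. The tab-separated layout, the column names (unigene_id, ec_number, description), and the unigene identifiers are all hypothetical; the actual annotation format used in the study is not given in the source.

```python
import csv
import io


def mine_by_ec(annotation_tsv, ec_number):
    """Return annotation rows whose EC number matches the target enzyme."""
    reader = csv.DictReader(io.StringIO(annotation_tsv), delimiter="\t")
    return [row for row in reader if row["ec_number"].strip() == ec_number]


# Hypothetical annotation table; identifiers are invented for illustration only.
toy_annotation = (
    "unigene_id\tec_number\tdescription\n"
    "Unigene0001\t2.7.1.20\tadenosine kinase\n"
    "Unigene0002\t3.2.2.1\tpurine nucleosidase\n"
    "Unigene0003\t3.1.3.5\t5'-nucleotidase\n"
)

pn_candidates = mine_by_ec(toy_annotation, "3.2.2.1")
candidate_ids = [row["unigene_id"] for row in pn_candidates]
```

Candidate unigenes found this way would then go through ORF prediction, BLAST comparison, and cloning for verification, as described in this protocol.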
In conclusion, this protocol for gene mining and sequence analysis provides an important theoretical basis for studying the mechanism of adenosine biosynthesis and the key role of PN in it. Taken together, this study also provides a more adequate scientific basis for the gene regulation of adenosine biosynthesis and a new idea for promoting the modern industrial development of active ingredients in TCM.
The authors have nothing to disclose.
This study was supported by National Natural Science Foundation of China (31871244, 81973733, 81803652), Natural Science Foundation of Guangdong Province (2019A1515011555, 2018A0303100007), Shenzhen Foundation of Health and Family Planning Commission (SZBC2018016), Special Fund for Economic and Technological Development of Longgang District of Shenzhen City (LGKCYLWS2020064, LGKCYLWS2019000361).
RNase-free DNase I | TaKaRa | 2270B | |
PolyATtract mRNA Isolation Systems | Promega | III | |
Random hexamer-primers | Thermo Scientific | SO142 | |
NEBNext Ultra RNA Library Prep Kit | NEB | E7530S | |
PCR extraction kit | QiaQuick | ||
Agarose | TransGen Biotech | GS201-01 | |
High-throughput sequencer | Illumina | HiSeq™ 4000 | |
LTF Viewer | LTF | V5.2 | |
ORF program | NCBI | ||
ProtParam tool | SIB Swiss Institute of Bioinformatics | ||
SignalP Server | DTU Health Tech | 5.0 | |
BLAST | NCBI | ||
Clustal X program | UCD Dublin | ||
MEGA | Center for Evolutionary Medicine and Informatics | 4.0 | |
InterProScan | European Molecular Biology Laboratory | ||
Predict Protein | Technical University of Munich | ||
SWISS-MODEL | Swiss Institute of Bioinformatics | ||
Primer Express | Applied Biosystems | 3.0 | |
EcoRI | NEB | R0101V | |
NotI | NEB | ER0591 | |
pMD18-T Vector | TaKaRa | 6011 | |
agarose | Sigma-Aldrich | GS201-01 | |
Trans2K® Plus II DNA Marker | Sigma-Aldrich | BM121-01 | |
6×DNA Loading Buffer | Sigma-Aldrich | GH101-01 | |
GelStain | Sigma-Aldrich | GS101-02 | |
50 x TAE | Sigma-Aldrich | T1060 | |
Gel imaging analysis system | Syngene | G:BOX F3 | |
E. coli JM109 | Promega | ||
T4 DNA ligase | EarthOx | BE004A-02 | |
pPIC9K | Genloci | GP0983 | |