IR-TEx explores insecticide resistance-related transcriptional profiles in the species Anopheles gambiae. Provided here are full instructions for using the application, modifying it to explore multiple transcriptomic datasets, and using the framework to build an interactive database for collections of transcriptomic data from any organism, generated on any platform.
IR-TEx is an application written in Shiny (an R package) that allows exploration of the expression of (as well as assigning functions to) transcripts whose expression is associated with insecticide resistance phenotypes in Anopheles gambiae mosquitoes. The application can be used online or downloaded and used locally by anyone. The local application can be modified to add new insecticide resistance datasets generated from multiple -omics platforms. This guide demonstrates how to add new datasets and handle missing data. Furthermore, IR-TEx can be completely and easily recoded to use -omics datasets from any experimental data, making it a valuable resource to many researchers. The protocol illustrates the utility of IR-TEx in identifying new insecticide resistance candidates using the microsomal glutathione transferase, GSTMS1, as an example. This transcript is upregulated in multiple pyrethroid-resistant populations from Côte D'Ivoire and Burkina Faso. The identification of co-correlated transcripts provides further insight into the putative roles of this gene.
The ability to measure the expression of large numbers of transcripts simultaneously through microarray platforms and RNAseq technology has resulted in the generation of vast datasets associating transcript expression with a particular phenotype in both model and non-model organisms. These datasets are an extremely rich resource for researchers, the power of which can be increased by combining relevant sets in a big data integration approach. However, this methodology is limited to those with particular bioinformatics skills. Described here is a program, IR-TEx (previously published by Ingham et al.1) that is written in an R package called Shiny2 and allows users with little bioinformatics training to integrate and interrogate these datasets with relative ease.
IR-TEx, found at http://www.lstmed.ac.uk/projects/IR-TEx, was written to explore transcripts associated with insecticide resistance in Anopheles gambiae, the major African malaria vector1. Malaria is a parasitic disease caused by Plasmodium species, transmitted between humans through the bites of female Anopheles mosquitoes. Targeting the mosquito vector with insecticides has proven to be the most effective means of preventing malaria-related morbidity and mortality in Africa. The scaling up of tools (i.e., long-lasting insecticidal nets) has also been pivotal in the dramatic reductions in malaria cases3 seen since 2000. With a very limited number of insecticides available, there is strong evolutionary pressure on the mosquitoes, and resistance is now widespread in African malaria vectors4.
Target site mutations5 and metabolic clearance of insecticides6,7 remain the most studied mechanisms of resistance, but other potent resistance mechanisms are now emerging1. Many of these new mechanisms have not previously been associated with insecticide resistance but have been detected by searching for common patterns of gene expression across multiple resistant populations using the IR-TEx app and subsequently functionally validated by genomics approaches1.
Described here is a step-by-step approach to using IR-TEx, both on the web and when installed locally. The protocol describes how new insecticide resistance datasets can be integrated into the existing package and explains how to operate with missing data. Finally, it describes how to use this software with other -omics datasets that are unrelated to insecticide resistance, thus combining data from varying -omics approaches while also operating with missing values and normalization so that data are comparable.
1. Using the IR-TEx web application
2. Downloading and implementing IR-TEx locally
3. Modifying IR-TEx for use with different datasets
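As a rough illustration of what combining two platforms involves, the sketch below (written in Python with the standard library only, not in R) places a limma-style table and an RNAseq-style table on a common fold-change scale and marks transcripts absent from a dataset as NA. The column names follow Additional Files 1 and 2. The values, the function names, and the layout of the merged table are hypothetical and are not taken from the IR-TEx code or from Fold_Changes.txt. The conversion relies on limma reporting logFC on a log2 scale.

```python
import csv
import io

# Hypothetical excerpts: only the column names come from Additional Files 1 and 2.
LIMMA_TSV = (
    "SystematicName\tlogFC\tadj.P.Val\n"
    "AGAP000165-RA\t1.0\t0.01\n"
    "AGAP004904-RA\t-0.5\t0.20\n"
)
RNASEQ_TSV = (
    "SystematicID\tFold_Change\tq_value\n"
    "AGAP000165-RA\t1.9\t0.03\n"
    "AGAP008358-RA\t0.4\t0.001\n"
)


def read_limma(text):
    """Map transcript -> (raw fold change, adjusted p), converting log2 FC to FC."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {
        r["SystematicName"]: (2.0 ** float(r["logFC"]), float(r["adj.P.Val"]))
        for r in rows
    }


def read_rnaseq(text):
    """Map transcript -> (raw fold change, q-value); values are already raw FC."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {
        r["SystematicID"]: (float(r["Fold_Change"]), float(r["q_value"]))
        for r in rows
    }


def merge(datasets):
    """Build one row per transcript; a transcript missing from a dataset gets 'NA'."""
    transcripts = sorted(set().union(*(d.keys() for d in datasets.values())))
    table = []
    for transcript in transcripts:
        row = {"Transcript": transcript}
        for name, data in datasets.items():
            fc, q = data.get(transcript, (None, None))
            row[name + "_FC"] = "NA" if fc is None else round(fc, 3)
            row[name + "_Q"] = "NA" if q is None else q
        table.append(row)
    return table


merged = merge({"limma": read_limma(LIMMA_TSV), "rnaseq": read_rnaseq(RNASEQ_TSV)})
for row in merged:
    print(row)
```

Keeping an explicit NA for absent transcripts, rather than dropping the row, lets a transcript measured on only one platform remain searchable.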
Using the Fold_Changes.txt file included with IR-TEx, we compared transcripts that were significantly differentially expressed in resistant Anopheles coluzzii and Anopheles gambiae datasets to susceptible controls from Côte D'Ivoire and Burkina Faso. This yielded 18 transcripts of interest (Table 1; this search can be performed using Excel, R, or other programs). Two of these, an ATPase (AGAP006879) and α-crystallin (AGAP007160), have been previously reported, with the former having a significant effect on pyrethroid resistance1. In addition to these two transcripts, two detoxification transcripts, GSTMS1 (FCµ = 1.95 and 1.85) and UGT306A2 (FCµ = 2.29 and 2.28) were present.
qPCR validation of two of these transcripts (GSTMS1, a detoxification transcript; and AGAP009110-RA, an unknown, mosquito-specific transcript containing a β-1,3-glucan binding domain) was performed as previously described1. Analysis was performed using primer sets described in Additional File 3 and showed that these transcripts were significantly upregulated in a multiresistant population from Côte D'Ivoire (Tiassalé) and another from Burkina Faso (Banfora), compared to the lab-susceptible N'Gousso (Figure 4A).
As both transcripts showed significant upregulation in each of the resistant populations, RNAi-induced knockdown was performed on mosquitoes from the LSTM laboratory Tiassalé colony. This colony originates from Côte D'Ivoire and is resistant to all major classes of insecticide used in public health, as previously described1,10. Attenuation of expression of GSTMS1 resulted in a significant increase (p = 0.021) in mortality after deltamethrin exposure compared to GFP-injected controls, demonstrating the importance of this transcript in pyrethroid resistance (Figure 4B). Conversely, AGAP009110-RA knockdown resulted in no significant (p = 0.082) change in mortality after exposure (Figure 4B).
GSTMS1 is a microsomal GST and is one of three found in A. gambiae mosquitoes11. Although members of the epsilon and delta classes of GSTs have been previously implicated in insecticide detoxification12,13,14, this is the first evidence to our knowledge for a role of microsomal GSTs in pyrethroid resistance15. To explore the putative function of this transcript in Anopheles gambiae s.l. mosquitoes, its expression and correlation network were examined in IR-TEx. GSTMS1 was significantly overexpressed in 20 out of 21 datasets available for these species, with the exception of Bioko Island. In each location, the overexpression was less than five-fold compared to the susceptible populations (Figure 5).
As microsomal GSTs have largely been ignored as potential insecticide detoxifiers, little is known about their role in insecticide resistance15. By exploring the co-correlation of other transcripts, putative functions can be elucidated through the assumption of coregulation or involvement in the same pathways. To maximize power in the correlation network, all microarray datasets present in IR-TEx were selected, and a correlation threshold of |r| > 0.75 was applied. Table 2 shows the output from IR-TEx.
These transcripts are enriched in oxidoreductase activity and glucose/carbohydrate metabolism in DAVID's functional annotation tool8. Both glucose-6-phosphate dehydrogenase and cystathionine gamma-lyase maintain the level of glutathione in mammalian cells16,17 and thus link directly with GSTMS1, a glutathione-S-transferase. Catalase is a fast-acting oxidative stress responder that protects cells from reactive oxygen species damage, a byproduct of pyrethroid exposure. Valacyclovir hydrolase is a hydrolase that may play a role in detoxification in mammalian cells18. CYP4H17 is also present in the correlation network. Cytochrome P450s are direct metabolizers of pyrethroid insecticides, and these breakdown products can be further metabolized by GSTs. Finally, CYP4H17 has been implicated in pyrethroid resistance in A. funestus19. Taken together, these data strongly support a role for GSTMS1 in xenobiotic detoxification.
Figure 1: Log2 fold change of AGAP002865-RA in all datasets. The x-axis details the different datasets, information for which can be found in Supplementary Table 1 in a previous publication1, and the y-axis shows the log2 fold change in the transcript of interest. The light-grey dotted lines indicate approximate thresholds for significance, taken here to be a fold change of <0.8 or fold change of >1.2. The dotted black line indicates a fold change of 1 (i.e., no difference in expression between the resistant and susceptible populations).
Figure 2: Distribution of microarrays showing significant differential expression of AGAP002865-RA in resistant populations. Fold changes are represented in a traffic light system: green fold change of <1, orange fold change of >1, and red fold change of >5. Only datasets with significant (p ≤ 0.05) differential expression are shown.
Figure 3: Correlation networks of AGAP001076-RA (CYP4G16). Pairwise correlations are calculated across all transcripts across the 31 microarray datasets, with a user-defined cut-off applied. Shown here is (A) |r| > 0.9 and (B) |r| > 0.8. All transcripts displayed on the graph meet this criterion and follow the expression changes of AGAP001076-RA.
Figure 4: mRNA expression and phenotype upon attenuation of GSTMS1 and AGAP009110-RA. (A) mRNA expression of GSTMS1 and AGAP009110-RA in two multi-resistant An. coluzzii populations from Côte D'Ivoire and Burkina Faso, respectively. Levels were compared to the lab-susceptible An. coluzzii N'Gousso. Significance levels calculated by ANOVA with a post-hoc Dunnett's test. (B) RNAi-induced attenuation of both transcripts compared to GFP-injected controls. GSTMS1 attenuation shows significant increase in mortality after deltamethrin exposure (calculated by ANOVA with a post-hoc Tukey test; *p ≤ 0.05, **p ≤ 0.01).
Figure 5: Expression of GSTMS1 in Anopheles gambiae and Anopheles coluzzii populations. Map showing the significantly differential expression of GSTMS1 in available microarray datasets. GSTMS1 was found to be significantly differential in 20 out of 21 microarray datasets.
Transcript ID | Description | Burkina Faso | Côte D'Ivoire |
AGAP006879-RA | ATPase | 27.94 | 43.05 |
AGAP007160-RB | α-crystallin | 11.49 | 10.58 |
AGAP007160-RC | α-crystallin | 11.14 | 10.38 |
AGAP007160-RA | α-crystallin | 9.78 | 9.84 |
AGAP009110-RA | Unknown | 9.26 | 5.96 |
AGAP007780-RA | NADH dehydrogenase | 10.49 | 3.77 |
AGAP006383-RA | oligosaccharyltransferase complex subunit beta | 3.69 | 5.57 |
AGAP007249-RB | Flightin | 4.61 | 3.86 |
AGAP003357-RA | RAG1-activating protein 1-like protein | 4.31 | 4.05 |
AGAP007249-RA | Flightin | 4.48 | 3.46 |
AGAP001998-RA | mRpS10 | 3.46 | 2.85 |
AGAP007589-RA | UGT306A2 | 2.29 | 2.28 |
AGAP000165-RA | GSTMS1 | 1.95 | 1.85 |
AGAP002101-RA | isoleucyl-tRNA synthetase | 0.57 | 0.59 |
AGAP002969-RA | asparaginyl-tRNA synthetase | 0.45 | 0.45 |
AGAP004199-RA | solute carrier family 5 (sodium-coupled monocarboxylate transporter), member 8 | 0.35 | 0.48 |
AGAP004684-RA | rRNA-processing protein CGR1 | 0.36 | 0.22 |
AGAP006414-RA | Cht8 | 0.024 | 0.36 |
Table 1: Transcripts significantly differential in the same fold change direction across Burkina Faso and Côte D'Ivoire populations. Transcript ID, gene description, and average fold change for each dataset from the two countries representing An. coluzzii and An. gambiae populations.
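The kind of search that produces a list like Table 1 can be sketched as follows, here in Python with the standard library only. The sketch keeps transcripts that are significant in every dataset and change in the same direction throughout, then reports a mean fold change per country. The dataset names, values, the p ≤ 0.05 cut-off applied to each dataset, and the use of an arithmetic mean are all assumptions made for illustration; they are not read from Fold_Changes.txt and may differ from how the published values were derived.

```python
from statistics import mean

# Hypothetical input: transcript -> {dataset: (fold change, adjusted p-value)}.
# "BF" and "CI" prefixes stand for Burkina Faso and Côte D'Ivoire datasets.
DATA = {
    "T1": {"BF_1": (2.1, 0.01), "BF_2": (1.8, 0.03), "CI_1": (1.9, 0.02)},
    "T2": {"BF_1": (0.5, 0.01), "BF_2": (0.6, 0.04), "CI_1": (0.4, 0.01)},
    "T3": {"BF_1": (2.0, 0.01), "BF_2": (0.7, 0.02), "CI_1": (1.5, 0.01)},  # mixed direction
    "T4": {"BF_1": (3.0, 0.01), "BF_2": (2.5, 0.30), "CI_1": (2.2, 0.01)},  # one dataset not significant
}


def consistent_direction(values, alpha=0.05):
    """Return 'up' or 'down' if all datasets are significant and agree, else None."""
    if any(p > alpha for _, p in values):
        return None
    if all(fc > 1 for fc, _ in values):
        return "up"
    if all(fc < 1 for fc, _ in values):
        return "down"
    return None


def summarize(data, countries=("BF", "CI"), alpha=0.05):
    """Collect direction and per-country mean fold change for consistent transcripts."""
    hits = {}
    for transcript, per_dataset in data.items():
        direction = consistent_direction(list(per_dataset.values()), alpha)
        if direction is None:
            continue
        means = {
            country: round(
                mean(fc for name, (fc, _) in per_dataset.items() if name.startswith(country)),
                2,
            )
            for country in countries
        }
        hits[transcript] = (direction, means)
    return hits


hits = summarize(DATA)
for transcript, (direction, means) in hits.items():
    print(transcript, direction, means)
```

Requiring agreement in every dataset is deliberately strict; a looser rule (for example, agreement in most datasets) would return a longer candidate list.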
Correlation | Systematic Name | Transcript Type |
1 | AGAP000165-RA | GSTMS1 |
0.82 | AGAP004904-RA | Catalase |
0.76 | AGAP007243-RA | 26S protease regulatory subunit 8 |
0.79 | AGAP008358-RA | CYP4H17 |
0.76 | AGAP009436-RA | Valacyclovir hydrolase |
0.75 | AGAP010739-RA | Glucose-6-phosphate 1-dehydrogenase |
0.85 | AGAP011172-RA | cystathionine gamma-lyase |
0.76 | AGAP012678-RA | Glucose-6-phosphate 1-dehydrogenase |
Table 2: Transcripts co-correlated with GSTMS1. The table shows output of the correlation network for GSTMS1 on IR-TEx with |r| of >0.75. The table shows the Spearman's correlation, transcript ID, and gene description for each co-correlated transcript.
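The thresholded correlation search behind a table of this kind can be sketched in a few lines. The example below, in Python with the standard library only, computes Spearman's correlation as the Pearson correlation of average ranks and keeps every transcript whose |r| with the target exceeds the cut-off. The profiles are invented, and skipping datasets in which either transcript is missing (pairwise deletion) is an assumption; the IR-TEx code itself may treat missing values differently.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns nan when either series has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho over the datasets where both transcripts have a value."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 3:
        return float("nan")
    xs, ys = zip(*pairs)
    return pearson(ranks(xs), ranks(ys))


def correlated(profiles, target, cutoff=0.75):
    """Return transcripts with |rho| > cutoff relative to the target profile."""
    found = {}
    for name, values in profiles.items():
        r = spearman(profiles[target], values)
        if abs(r) > cutoff:  # comparisons with nan are False, so nan is skipped
            found[name] = round(r, 2)
    return found


# Hypothetical fold-change profiles across five datasets (None = not measured).
PROFILES = {
    "target": [1.0, 2.0, 3.0, 4.0, 5.0],
    "follows": [1.1, 1.9, 3.5, 3.9, 6.0],
    "inverse": [5.0, 4.0, None, 2.0, 1.0],
    "unrelated": [2.0, 5.0, 1.0, 4.0, 3.0],
}

network = correlated(PROFILES, "target")
print(network)
```

As in Table 2, the target transcript itself appears in the output with a correlation of 1, and negatively correlated transcripts are retained because the filter uses the absolute value.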
Additional File 1: Output file from A-MEXP-2196 array analyzed with limma. The file originates from a Met knockdown compared to a GFP control array, described in more detail in ArrayExpress (E-MTAB-4043) and another previous publication1. Columns represent AGAP identifier (SystematicName), log fold change (logFC), log expression values (AveExpr), t-statistic (t), uncorrected p-value (P.Value), adjusted p-value (adj.P.Val), and B statistic (B)20. For the purposes of this file, the mosquitoes are Anopheles coluzzii from Côte D'Ivoire and are unexposed to insecticides, with a collection latitude and longitude of -5.4 and 6.0, respectively.
Additional File 2: Output file from RNAseq experiment. RNAseq analysis taken from Uyhelji et al.9 describing changes in the transcriptome of Anopheles mosquitoes when exposed to 50% salinity. This file is adapted from Table S2 of the publication and includes AGAP identifier (SystematicID), raw fold change (Fold_Change), and adjusted p-value (q_value).
Additional File 3: Primer list for representative results. AGAP identifier, gene name, dsRNA forward, dsRNA reverse, qPCR forward, and qPCR reverse primer sets for each transcript.
Supplemental coding File 1.
Supplemental coding File 2.
Supplemental coding File 3.
Big data transcriptomics produces lists of thousands of transcripts that are differentially expressed for each experimental condition. Many of these experiments are performed on related organisms and phenotypes and are almost exclusively analyzed as independent experiments. Utilizing these rich data sources by examining the data holistically and without theoretical assumptions will 1) lead to the identification of new candidate transcripts and 2) prevent the discarding of valuable data simply because there is too much information to validate in vivo1.
IR-TEx gives users with a limited bioinformatics background the ability to easily examine multiple datasets, visualize changes in the datasets, and download the associated information1. Although IR-TEx does not support searching for more than one transcript at a time, users can examine the associated Fold_Changes.txt files using Excel, R, or other appropriate programs. Further utility of IR-TEx stems from the use of correlation networks to predict transcript function, the input of hypothetical proteins or transcripts with unknown functions, and the use of downstream software to search for enrichments1.
In the example demonstrated in this protocol, IR-TEx is used according to its original function. Here, it allows exploration of transcripts associated with insecticide resistance and visualization of the distribution of over- and under-expression through mapping graphics. Transcripts of interest are validated in vivo to determine whether the over- or under-expression of given transcripts contributes to an observed phenotype1 (e.g., insecticide resistance). It was demonstrated here, as previously reported1, that a dataset can be used in a hypothesis-driven approach to identify transcripts of interest on a country-specific basis. IR-TEx can then be used to 1) explore expression of the transcript and 2) contextualize the transcript's function by applying a pairwise correlation network across all transcripts contained in each -omics dataset. Here, GSTMS1 was shown to be co-correlated with a number of other transcripts implicated in detoxification. These data (along with knockdown of the transcript, which resulted in a significant increase in mortality after insecticide exposure) demonstrate the importance of this transcript in xenobiotic clearance.
IR-TEx represents a valuable resource for exploring insecticide resistance-related transcripts on the web or using local applications. This protocol demonstrates how to modify IR-TEx for different -omics platforms as well as completely new data. The guide illustrates how to use IR-TEx to integrate data from multiple -omics platforms and datasets with missing data as well as how to recode IR-TEx simply so it is useful for anyone researching transcriptomic datasets.
The authors have nothing to disclose.
This work was funded by an MRC Skills Development Fellowship to V.I. (MR/R024839/1) and a Royal Society Challenge Grant (CH160059) to H.R.
Laptop with browser | Any | – | – |
R Program | The R Project for Statistical Computing | – | https://www.r-project.org/ |
R Studio | R Studio | – | https://www.rstudio.com/ |