This article focuses on the identification of high-confidence interaction datasets between host and pathogen proteins using a combination of two orthogonal methods: yeast two-hybrid screening followed by a high-throughput interaction assay in mammalian cells called HT-GPCA.
Significant efforts have been made to generate large-scale, comprehensive protein-protein interaction network maps. Such maps are instrumental for understanding pathogen-host relationships and have essentially been obtained by genetic screening in yeast two-hybrid systems. The recent improvement of protein-protein interaction detection by a Gaussia luciferase-based fragment complementation assay now makes it possible to develop the integrative comparative interactomic approaches needed to rigorously compare the interaction profiles of proteins from different pathogen strain variants against a common set of cellular factors.
This paper specifically focuses on the utility of combining two orthogonal methods to generate protein-protein interaction datasets: yeast two-hybrid (Y2H) and a new assay, high-throughput Gaussia princeps protein complementation assay (HT-GPCA) performed in mammalian cells.
Cellular partners of a pathogen protein are first identified on a large scale by mating-based yeast two-hybrid screening of cDNA libraries using multiple pathogen strain variants. A subset of interacting partners, selected on the basis of high-confidence statistical scoring, is then validated in mammalian cells by testing pair-wise interactions with the whole set of pathogen variant proteins using HT-GPCA. This combination of two complementary methods improves the robustness of the interaction dataset and allows a stringent comparative interaction analysis to be performed. Such comparative interactomics constitutes a reliable and powerful strategy to decipher pathogen-host interplay.
The increasing amount of data collected to generate protein-protein interaction maps opens perspectives for further understanding pathogen infections. As a global understanding of pathogen infections begins to emerge, it reveals the range of perturbations induced by pathogen proteins when they connect to the human proteome 1. It thereby offers a way to comprehend how pathogens manipulate the host cell machinery. In particular, the mapping of several virus-host interaction networks revealed that viral proteins preferentially target host proteins that are highly connected in the cellular network (hubs), or that are central to many paths in a network (bottleneck proteins) 2-4. These interactions allow viruses to manipulate important cellular processes, which is instrumental for replicating and producing infectious progeny. Comparative interaction mapping was recently conducted on related viruses with the goal of extracting information relevant to pathogenesis 4. Furthermore, studies of the host-pathogen interaction landscape have been extended to many pathogens 5. Identifying the targeted cell factors provides insights into the strategies used by pathogens to infect cells and allows the detection of potential pathogenic markers.
The yeast two-hybrid system is the most popular method to identify binary interactions, since genetic screening is an efficient and sensitive tool for high-throughput mapping of protein-protein interactions. The approach proposed here further improves individual Y2H screenings by assessing a given protein from multiple pathogen strain variants, thus providing a comparative overview of pathogen-host protein-protein interactions. However, a major limitation of yeast two-hybrid screening lies in its high false-negative rate, since it recovers only about 20% of total interactions 6. This implies that interactions detected with only a subset of pathogen variants might have escaped detection with the others. Therefore, individual partners emerging from all the two-hybrid screenings are further challenged for interaction with the complete range of strains studied, providing comparative interaction datasets. Since it was demonstrated that combining different methodologies strongly increases the robustness of protein-protein interaction datasets 7, this validation is performed in mammalian cells by a newly developed protein-fragment complementation assay called HT-GPCA 8. This cell-based system detects protein interactions by complementation of the Gaussia princeps split luciferase and has been designed to be compatible with a high-throughput format. Thanks to its luminescence-based technology and its low background noise compared with fluorescence-based assays, HT-GPCA shows a strikingly high sensitivity.
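To give a feel for why a roughly 20% recovery rate matters when several strain variants are screened, the short Python sketch below estimates the chance that an interaction shared by all variants is picked up in at least one screen. It assumes, purely for illustration, that each screen recovers such an interaction independently with the same probability; neither the independence assumption nor the function name comes from the protocol.

```python
def detection_probability(per_screen_recovery, n_screens):
    """Chance that a conserved interaction is seen in at least one screen.

    Assumption (illustrative only): every screen recovers the interaction
    independently, with the same probability `per_screen_recovery`.
    """
    if not 0.0 <= per_screen_recovery <= 1.0:
        raise ValueError("per_screen_recovery must lie between 0 and 1")
    if n_screens < 0:
        raise ValueError("n_screens must be non-negative")
    missed_everywhere = (1.0 - per_screen_recovery) ** n_screens
    return 1.0 - missed_everywhere


# Recovery of about 20% per screen, for 1, 3 and 5 variants screened.
for n in (1, 3, 5):
    print(n, round(detection_probability(0.2, n), 3))
```

Under these assumptions a sizeable fraction of conserved interactions is still missed by any single screen, which is consistent with the rationale for retesting every partner against the full set of variants.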
Overall, this approach constitutes an efficient tool to generate a comprehensive mapping of host-pathogen interplays, which represents the first step toward a global understanding of the host cell hijacking.
1. Yeast Two-hybrid Screenings
2. Matrix building for HT-GPCA
HT-GPCA is a mammalian cell-based assay in which both pathogen and host proteins are transiently expressed as fusions to complementary fragments of the split Gaussia princeps luciferase. Upon interaction of the partners, the enzyme is reconstituted and its enzymatic activity can be measured. The interaction intensity is thus deduced from a Normalized Luminescence Ratio (see below and Figure 1).
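As a rough illustration of how a Normalized Luminescence Ratio could be computed from raw luminometer readings, the Python sketch below assumes that the NLR is the luminescence measured with both fusion proteins divided by the sum of two control readings, each obtained with one fusion protein and the empty complementary vector. That definition, the function name and the example readings are assumptions made for illustration; the normalization shown in Figure 1 takes precedence.

```python
def normalized_luminescence_ratio(pair_signal, control_a, control_b):
    """Return pair_signal / (control_a + control_b).

    pair_signal: luminescence with both fusion proteins co-expressed.
    control_a:   luminescence with fusion A plus the empty partner vector.
    control_b:   luminescence with fusion B plus the empty partner vector.
    This definition is an assumption made for illustration.
    """
    background = control_a + control_b
    if background <= 0:
        raise ValueError("control luminescence must be positive")
    return pair_signal / background


# Hypothetical raw readings in arbitrary luminescence units.
example_nlr = normalized_luminescence_ratio(50000.0, 1000.0, 1500.0)
print(example_nlr)
```

Dividing by the controls measured on the same plate is what lets such a ratio absorb plate-to-plate differences in transfection efficiency and cell number.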
Note that the plasmid constructs can be swapped, i.e. pathogen proteins in pSPICA-N1 and host proteins in pSPICA-N2.
3. High-Throughput Gaussia princeps Luciferase-Based Protein Complementation Assay (HT-GPCA)
A major strength of HT-GPCA lies in its high sensitivity, as illustrated by the assessment of false-positive and false-negative rates for the HPV E2 protein in Figure 2 (adapted from reference 13). To determine the false-negative rate, known interactions of E2 from HPV16 were assessed by HT-GPCA (Figure 2A). Four out of 18 interactions were not recovered, corresponding to a 22% false-negative rate. The false-positive rate was measured at 5.8% using 12 HPV E2 proteins against a random set of cellular proteins (Figure 2B). Furthermore, the strong specificity of the method is highlighted in Figure 2C, which shows that a single point mutation known to interfere with E2 binding to its cell partner BRD4 strongly reduces the NLR.
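The rates quoted above are simple proportions, and the sketch below reproduces them. The false-negative figure uses the counts given in the text (4 of 18). For the false-positive figure, the count of 7 positive pairs among 120 tested pairs (12 E2 proteins against the 10 random cellular proteins mentioned in the Figure 2 legend) is an assumption back-calculated from the reported 5.8%, not a number stated in the text.

```python
def rate_percent(events, total):
    """Express events / total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * events / total


# Known HPV16 E2 interactions missed by HT-GPCA: 4 of 18 (from the text).
false_negative = rate_percent(4, 18)

# ASSUMPTION: 7 positives among 12 x 10 = 120 a priori negative pairs,
# inferred from the reported 5.8% rather than stated in the text.
false_positive = rate_percent(7, 12 * 10)

print(f"false negatives: {false_negative:.1f}%")
print(f"false positives: {false_positive:.1f}%")
```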
This large-scale comparative interactomic approach has recently been applied successfully to three early proteins of the Human Papillomaviruses (HPV): E2, E6 and E7, originating from different genotypes representative of the natural diversity of HPV tropisms and pathologies 13,14. For each of the early proteins, hierarchical clustering of the interaction profiles mostly recapitulates HPV phylogeny, supporting the robustness of the approach and the pertinence of the interaction datasets (Figure 3 adapted from references 13 and 14). The approach can be used to correlate specific virus-host interaction profiles with pathological traits, thereby providing clues to the involvement of viral proteins in pathogenesis. Genotype-specific interactions can be extracted, which potentially correspond to tropism or pathogenic biomarkers, as shown in Figure 4 for the E6 proteins of HPV (Figure 4 adapted from reference 14).
Furthermore, functional insights can emerge from such large-scale analyses. For example, in the interactomic study of the HPV E2 proteins, the functional enrichment analysis led to the identification of a prominent family consisting of cellular transcription factors, in line with the primary role of E2 as a viral transcriptional regulator. It also revealed an unexpected functional targeting of the intracellular trafficking machinery, opening new perspectives for the role of E2 in HPV pathogenesis 13.
Overall, the comparative interactomic studies performed with the HPV early proteins greatly improve the comprehension of HPV pathogenesis by correlating specific interaction profiles with phenotypic differences.
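To make the idea of clustering interaction profiles concrete, the Python sketch below groups variants by the similarity of their NLR vectors using a small agglomerative procedure. The distance metric and linkage used in the cited studies are not specified here, so Euclidean distance and average linkage are assumptions, and the variant names and NLR values are hypothetical placeholders rather than experimental data.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two equally long NLR profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(profiles):
    """Agglomerative clustering; returns the merge tree as nested tuples."""
    # Each cluster is stored as (subtree, list of member names).
    clusters = [(name, [name]) for name in profiles]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            members_i, members_j = clusters[i][1], clusters[j][1]
            # Average pairwise distance between the two clusters.
            total = sum(euclidean(profiles[a], profiles[b])
                        for a in members_i for b in members_j)
            dist = total / (len(members_i) * len(members_j))
            if best is None or dist < best[0]:
                best = (dist, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical NLR profiles: one value per host protein tested.
profiles = {
    "variant_A": [20.0, 1.0, 15.0],
    "variant_B": [18.0, 2.0, 14.0],
    "variant_C": [1.0, 25.0, 2.0],
    "variant_D": [2.0, 22.0, 1.0],
}
tree = average_linkage(profiles)
print(tree)
```

The resulting nested tuples play the role of an interaction-based dendrogram that can then be compared with a sequence-based phylogeny; for real datasets an established clustering library would be a more practical choice than this minimal implementation.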
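Extracting group-specific interactions of the kind described for the E6 proteins can be sketched as a simple filter over a table of NLR values. In the Python example below, a host target is reported when it scores at or above a cutoff with every variant of a chosen group and with no variant outside it. The cutoff value, the function name and all variant and host names are hypothetical and serve only to illustrate the principle.

```python
def group_specific_targets(nlr, group, threshold):
    """List host targets that are positive with exactly the variants in `group`.

    nlr:       {variant name: {host target name: NLR value}}
    group:     variant names expected to share the interaction
    threshold: NLR cutoff for calling an interaction positive (assumed value)
    """
    targets = sorted({t for scores in nlr.values() for t in scores})
    wanted = set(group)
    specific = []
    for target in targets:
        positive = {variant for variant, scores in nlr.items()
                    if scores.get(target, 0.0) >= threshold}
        if positive and positive == wanted:
            specific.append(target)
    return specific


# Hypothetical NLR table for three variants and three host proteins.
nlr_table = {
    "variant_A": {"host_1": 12.0, "host_2": 8.0, "host_3": 0.9},
    "variant_B": {"host_1": 9.5, "host_2": 1.1, "host_3": 0.8},
    "variant_C": {"host_1": 1.2, "host_2": 7.4, "host_3": 1.0},
}
hits = group_specific_targets(nlr_table, {"variant_A", "variant_B"}, 3.5)
print(hits)
```

Requiring the set of positive variants to match the group exactly is a deliberately strict choice; a looser rule, such as tolerating one missing variant, may be more appropriate given the false-negative rate of the assay.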
Figure 1. Identification of protein-protein interactions using HT-GPCA. Both pathogen and host proteins are fused to an inactive half of the Gaussia princeps luciferase (GL2 and GL1 parts, respectively). When the proteins are interacting, a luciferase activity is induced by the proximity of both halves of the enzyme. The restored Gaussia luciferase activity is calculated from a Normalized Luminescence Ratio (NLR) as shown on the right. Figure adapted from reference 8.
Figure 2. HT-GPCA characteristics. (A) 18 interactions extracted from the literature involving the E2 protein from HPV16 and cellular proteins were tested by HT-GPCA in order to estimate the false-negative rate associated with this technique. The NLR values are represented as a color gradient (heat map) with a scale from black (no interaction) to light blue (strong interaction). Four known interactions were not recovered (red arrows), giving a rough estimate of the false-negative rate of around 22%. (B) 12 HPV E2 proteins were tested with 10 randomly picked cellular proteins, corresponding to a priori negative interactions, in order to determine the false-positive rate of HT-GPCA, which was around 5.8%. (C) The interaction between the BRD4 cellular protein and the HPV16 or HPV18 E2 protein is known to depend on amino acids I73 and I77, respectively. When these amino acids are mutated, the HT-GPCA NLR drops 5-fold, demonstrating the high specificity of interaction detection. Figure adapted from reference 13.
Figure 3. Comparison between experimental interaction-based dendrograms and HPV phylogenetic trees. The HPV phylogenies based on the sequences of the early proteins E2 (A), E6 (B) or E7 (C) are represented on the left of each graph. Interaction dendrograms were obtained by hierarchical clustering of the interaction profiles of each protein and are represented on the right. A significant congruence was observed between both tree types, with similar clustering of the pathological types (colored rectangles), supporting the pertinence of the interaction datasets. Figure adapted from references 13 and 14.
Figure 4. Schematic representation of the E6 main cellular targets. Cellular targets were sorted based on the interaction intensity and grouped according to interacting HPV type. This representation allows the identification of specific biomarkers such as FADD, which interacts only with the E6 proteins of oncogenic HPV. Such targeting must be crucial for the carcinogenic trait of all oncogenic HPV and thus constitutes a good candidate to be used either as a surrogate marker for HPV infection or as a therapeutic target. Figure adapted from reference 14.
Taken independently, yeast two-hybrid and mammalian interaction assays such as GST pull-down, LUMIER or MAPPIT have proven to be effective tools for detecting protein-protein interactions, but they are limited by the high rates of false-positive and false-negative interactions associated with these techniques 15. Moreover, evidence is growing that combining orthogonal methods increases the reliability of the resulting interaction dataset 7. The recent development of the HT-GPCA technique described here has not only improved the sensitivity of detecting binary protein-protein interactions in mammalian cells, but has also made it possible to handle a large number of interactions at once.
The strategy described here improves the coverage of individual Y2H screenings by probing interactions of multiple strain variants. Moreover, retesting interactions with an orthogonal detection method provides stringent comparative interaction datasets, by overcoming limitations and escaping artifacts inherent to the two-hybrid methodology. Indeed, since HT-GPCA takes place in mammalian cells, the post-translational modifications and natural subcellular localization of the proteins are unaffected. Our approach can thus produce in a single step interaction maps that are improved when compared to those primarily based on yeast two-hybrid for interaction sensing 4.
At this point, however, we recommend caution when studying trans-membrane proteins, since the detection of interactions may depend on the localization of the Gaussia fragments on either side of the membrane. In these cases, it could be necessary to fuse the Gaussia fragments to the C-terminal end of one or both partners. Additionally, the luciferase expression could be influenced by numerous parameters such as transfection efficiency and cell number. Therefore, we strongly recommend following the normalization procedure described in the protocol section in order to take experimental variations into account.
By using ordered arrays of ORFs taken from the growing collection of the Human ORFeome, we foresee a point in the near future where HT-GPCA could be used directly as a screening method. This should first drastically improve the screening coverage, since each protein pair would then be tested, and second, it should facilitate automation. However, a more drastic high-throughput expansion of this assay should come from the development of an in vitro HT-GPCA assay using in vitro expression systems based on human cell extracts. This should also make it possible to monitor protein-protein interactions in a controlled biochemical environment.
Lastly, upcoming developments are expected to improve the visualization of complex-mediated luminescence in living cells. In that case, the HT-GPCA could be used to detect dynamic interactions in the context of the addition of drugs or complex inhibitors thus allowing live monitoring of the disruption of an interaction.
In all, this approach constitutes an efficient outline for any study of pathogen-host interplay, and we feel that comparative interactome analyses could provide a solid framework for understanding microbe hijacking of host cells.
The authors have nothing to disclose.
This work was supported in part by funding from the Institut Pasteur and by grants from the Ligue nationale contre le Cancer (grants R05/75-129 and RS07/75-75), the Association pour la Recherche sur le Cancer (grants ARC A09/1/5031 and 4867), and the Agence Nationale de la Recherche (ANR07 MIME 009 02 and ANR09 MIEN 026 01). M.M. was the recipient of an M.E.N.R.T. fellowship.
Name of Reagent/Material | Company | Catalog Number | Comments |
Yeast strains | Clontech | | |
Minimal SD base | US Biological | D9500 | |
Amino acids | Sigma | | |
3-amino-1,2,4-Triazole (3-AT) | Acros Organics | 264571000 | |
Zymolyase | Seikagaku | 120491 | |
DMEM | Gibco-Life Technologies | 31966 | |
Fetal bovine serum | BioWest | S1810 | |
Phosphate-buffered saline (PBS) | Gibco-Life Technologies | 14190 | |
Penicillin-Streptomycin | Gibco-Life Technologies | 15140 | |
Trypsin-EDTA | Gibco-Life Technologies | 25300 | |
Renilla luciferase assay | Promega | E2820 | |
White culture plate | Greiner Bio-One | 655083 | |
96-well PCR plates | 4titude | 4t-i0730/C | |
Incubator (30 °C) | Memmert | | |
Incubator (37 °C) | Heraeus | | |
Luminometer | Berthold | Centro XS-LB 960 | |