Biochemical and structural analyses of glycosylated proteins require relatively large amounts of homogeneous samples. Here, we present an efficient chemical method for site-specific glycosylation of recombinant proteins purified from bacteria by targeting reactive Cys thiols.
Stromal interaction molecule-1 (STIM1) is a type-I transmembrane protein located on the endoplasmic reticulum (ER) and plasma membranes (PM). ER-resident STIM1 regulates the activity of PM Orai1 channels in a process known as store operated calcium (Ca2+) entry, which is the principal Ca2+ signaling process that drives the immune response. STIM1 undergoes post-translational N-glycosylation at two luminal Asn sites within the Ca2+ sensing domain of the molecule. However, the biochemical, biophysical, and structural biology effects of N-glycosylated STIM1 were poorly understood until recently due to an inability to readily obtain high levels of homogeneous N-glycosylated protein. Here, we describe the implementation of an in vitro chemical approach which attaches glucose moieties to specific protein sites and is applicable to understanding the underlying effects of N-glycosylation on protein structure and mechanism. Using solution nuclear magnetic resonance spectroscopy, we assess both the efficiency of the modification and the structural consequences of the glucose attachment with a single sample. This approach can readily be adapted to study the myriad glycosylated proteins found in nature.
Store operated calcium (Ca2+) entry (SOCE) is the major pathway by which immune cells take up Ca2+ from the extracellular space into the cytosol. In T lymphocytes, T cell receptors located on the plasma membrane (PM) bind antigens which activate protein tyrosine kinases (reviewed in 1,2,3). A phosphorylation cascade leads to the activation of phospholipase Cγ (PLCγ), which subsequently mediates the hydrolysis of membrane phosphatidylinositol 4,5-bisphosphate (PIP2) into diacylglycerol and inositol 1,4,5-trisphosphate (IP3). IP3 is a small diffusible messenger which binds to IP3 receptors (IP3R) on the endoplasmic reticulum (ER), thereby opening this receptor channel and permitting Ca2+ to flow down the concentration gradient from the ER lumen to the cytosol (reviewed in 4). Receptor signaling from G protein coupled and tyrosine kinase receptors in a variety of other excitable and non-excitable cell types leads to the same production of IP3 and activation of IP3Rs.
Due to the finite Ca2+ storage capacity of the ER, the IP3-mediated release and resultant increase in cytosolic Ca2+ is only transient; however, this depletion of the ER luminal Ca2+ profoundly affects stromal interaction molecule-1 (STIM1), a type-I transmembrane (TM) protein mostly found on the ER membrane 5,6,7. STIM1 contains a lumen-oriented Ca2+ sensing domain made up of an EF-hand pair and sterile α-motif (EFSAM). Three cytosolic-oriented coiled-coil domains are separated from EFSAM by the single TM domain (reviewed in 8). Upon ER luminal Ca2+ depletion, EFSAM undergoes a destabilization-coupled oligomerization 7,9 which causes structural rearrangements of the TM and coiled-coil domains 10. These structural changes culminate in a trapping of STIM1 at ER-PM junctions 11,12,13,14 through interactions with PM phosphoinositides 15,16 and Orai1 subunits 17,18. Orai1 proteins are the PM subunits which assemble to form Ca2+ channels 19,20,21,22. The STIM1-Orai1 interactions at ER-PM junctions facilitate an open Ca2+ release activated Ca2+ (CRAC) channel conformation which enables the movement of Ca2+ into the cytosol from the high concentrations of the extracellular space. In immune cells, the sustained cytosolic Ca2+ elevations via CRAC channels induce the Ca2+-calmodulin/calcineurin dependent dephosphorylation of the nuclear factor of activated T-cells, which subsequently enters the nucleus and begins transcriptional regulation of genes promoting T-cell activation 1,3. The process of CRAC channel activation by STIM1 23,24 via agonist-induced ER luminal Ca2+ depletion and the resulting sustained cytosolic Ca2+ elevation is collectively termed SOCE 25. The vital role of SOCE in T-cells is evident from studies demonstrating that heritable mutations in both STIM1 and Orai1 can cause severe combined immunodeficiency syndromes 3,19,26,27.
EFSAM initiates SOCE after sensing ER-luminal Ca2+ depletion via the loss of Ca2+ coordination at the canonical EF-hand, ultimately leading to the destabilization-coupled self-association 7,28,29.
Glycosylation is the covalent attachment and processing of oligosaccharide structures, also known as glycans, through various biosynthetic steps in the ER and Golgi (reviewed in 30,32,33). There are two predominant types of glycosylation in eukaryotes: N-linked and O-linked, depending on the specific amino acid and atom bridging the linkage. In N-glycosylation, glycans are attached to the side chain amide of Asn, and in most cases, the initiation step occurs in the ER as the polypeptide chain moves into the lumen 34. The first step of N-glycosylation is the transfer of a fourteen-sugar core structure made up of glucose (Glc), mannose (Man), and N-acetylglucosamine (GlcNAc) (i.e. Glc3Man9GlcNAc2) from an ER membrane lipid by an oligosaccharyltransferase 35,36. Further steps, such as cleavage or transfer of glucose residues, are catalyzed in the ER by specific glycosidases and glycosyltransferases. Some proteins that leave the ER and move into the Golgi can be further processed 37. O-glycosylation refers to the addition of glycans, usually to the side chain hydroxyl group of Ser or Thr residues, and this modification occurs entirely in the Golgi complex 33,34. There are several O-glycan structures which can be made up of N-acetylglucosamine, fucose, galactose, and sialic acid with each monosaccharide added sequentially 33.
While no specific sequence has been identified as prerequisite for many types of O-glycosylation, a common consensus sequence has been associated with the N-linked modification: Asn-X-Ser/Thr/Cys, where X can be any amino acid except Pro 33. STIM1 EFSAM contains two of these consensus N-glycosylation sites: Asn131-Trp132-Thr133 and Asn171-Thr172-Thr173. Indeed, previous studies have shown that EFSAM can be N-glycosylated in mammalian cells at Asn131 and Asn171 38,39,40,41. However, previous studies of the consequences of N-glycosylation on SOCE have been incongruent, suggesting suppressed, potentiated or no effect by this post-translational modification on SOCE activation 38,39,40,41. Thus, research on the underlying biophysical, biochemical, and structural consequences of EFSAM N-glycosylation is vital to comprehending the regulatory effects of this modification. Due to the requirement for high levels of homogeneous proteins in these in vitro experiments, a site-selective approach to covalently attach glucose moieties to EFSAM was applied. Interestingly, Asn131 and Asn171 glycosylation caused structural changes that converge within the EFSAM core and enhance the biophysical properties which promote STIM1-mediated SOCE 42.
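The consensus rule above (Asn-X-Ser/Thr/Cys, with X being any residue except Pro) can be checked programmatically when screening a protein of interest for candidate sites. The following is a minimal sketch, not part of the published protocol: the function name `find_sequons`, the `first_residue_number` parameter, and the demonstration peptide are all hypothetical, and the peptide is not the STIM1 sequence.

```python
import re

# N-X-S/T/C, where X is any residue except Pro.
# The lookahead lets overlapping sequons (e.g. "NNTS") both be reported.
SEQUON = re.compile(r"(?=(N[^P][STC]))")

def find_sequons(sequence, first_residue_number=1):
    """Return (residue_number, tripeptide) for each consensus N-glycosylation site.

    first_residue_number is the numbering of the first residue in `sequence`,
    so that fragments can be reported in full-length protein numbering.
    """
    hits = []
    for match in SEQUON.finditer(sequence.upper()):
        hits.append((match.start() + first_residue_number, match.group(1)))
    return hits

# Hypothetical peptide for illustration only (NOT the STIM1 sequence).
demo_peptide = "AANWTGGNPSLLNTTK"
print(find_sequons(demo_peptide, first_residue_number=129))
```

In this toy peptide the Asn-Pro-Ser stretch is skipped because Pro occupies the X position. A match only indicates a candidate site; whether it is actually glycosylated in cells must be established experimentally.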
The chemical attachment of glycosyl groups to Cys thiols has been well-established by a seminal work which first demonstrated the utility of this enzyme-free approach to understanding the site-specific effects of glycosylation on protein function 43,44. More recently and with respect to STIM1, the Asn131 and Asn171 residues were mutated to Cys and glucose-5-(methanethiosulfonate) [glucose-5-(MTS)] was used to covalently link glucose to the free thiols 42. Here, we describe this approach which not only uses mutagenesis to incorporate site specific Cys residues for modification, but also applies solution nuclear magnetic resonance (NMR) spectroscopy to rapidly assess both modification efficiency and structural perturbations as a result of the glycosylation. Notably, this general methodology is easily adaptable to study the effects of either O- or N-glycosylation of any recombinantly produced protein.
1. Polymerase chain reaction (PCR)-mediated site-directed mutagenesis for the incorporation of Cys into a bacterial pET-28a expression vector.
2. Uniform 15N-labeled protein expression in BL21 (DE3) Escherichia coli.
NOTE: Different recombinant proteins require different expression conditions. The following is the optimized procedure for expression of the human STIM1 EFSAM protein.
3. Purification of recombinant protein from E. coli.
NOTE: Different recombinant proteins require distinct purification procedures. The following is the protocol for 6×His-tagged EFSAM purification from inclusion bodies expressed from the pET-28a construct.
4. Chemical attachment of glucose-5-MTS to protein by dialysis.
5. Solution NMR assessment of modification efficiency and structural perturbations.
The first step of this approach requires the mutagenesis of the candidate glycosylation residues to Cys residues which can be modified using the glucose-5-MTS. EFSAM has no endogenous Cys residues, so no special considerations need to be made prior to the mutagenesis. However, for proteins that do contain native Cys residues, these must be mutated to non-modifiable residues prior to performing the described chemistry. To minimally affect the native structure, we suggest performing a global sequence alignment of the protein of interest and determining which other residues are found most frequently at the endogenous Cys position(s). Cys mutation to these other residues which occur naturally in other organisms may have the least impact on protein structure. If the endogenous Cys residue is strictly conserved, we suggest mutating to Ser, which is the most structurally similar to Cys. Figure 1 shows a typical PCR mutagenesis gel evaluating the success of the PCR reaction, with the amplified DNA sample demonstrating several-fold higher intensity than a control amount of unamplified template pET-28a DNA which was used for PCR. The next steps include template DNA digestion and transformation into E. coli for plasmid repair. After plasmid propagation in liquid culture, plasmid isolation and confirmation of the mutation(s) by sequencing, the mutated vector may be used for protein expression. Figure 2A shows a typical elution profile of EFSAM from the anion exchange column relative to increasing NaCl concentrations. Figure 2B shows the purity of EFSAM on Coomassie blue-stained SDS-PAGE gels.
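The suggestion for choosing a replacement for a native Cys can be illustrated with a short sketch. It assumes the homologous sequences have already been aligned with an external tool and are supplied as equal-length strings with `-` for gaps; the function name `suggest_cys_replacement` and the toy alignment are hypothetical and serve only to show the logic (most frequent non-Cys residue at the position, falling back to Ser when Cys is strictly conserved).

```python
from collections import Counter

def suggest_cys_replacement(aligned_sequences, column):
    """Suggest a substitute for a native Cys at a 0-based alignment column.

    Returns the most frequent residue at that column other than Cys
    (gaps ignored), or "S" (Ser) if no alternative residue is observed.
    """
    residues = [seq[column].upper() for seq in aligned_sequences]
    alternatives = Counter(r for r in residues if r not in ("C", "-"))
    if not alternatives:
        # Cys is strictly conserved: fall back to the structurally similar Ser.
        return "S"
    return alternatives.most_common(1)[0][0]

# Toy alignment for illustration only (hypothetical sequences).
toy_alignment = ["MKCLA", "MKALA", "MKALA", "MKSLA"]
print(suggest_cys_replacement(toy_alignment, column=2))
```

The output is only a starting point; the effect of any Cys substitution on folding and stability should still be verified against the wild-type protein.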
After acquiring pure protein, a series of dialysis steps is used to attach the glucose moiety via the MTS reaction with the free thiol. Figure 3A shows an image of the typical setup of a small protein volume sealed in the dialysis membrane by membrane clips and contained in a large 1 L beaker containing the buffer of interest. An initial check of the success of the modification may be performed by mass spectrometry. Figure 3B shows a representative electrospray mass spectrum of EFSAM modified at a single Cys thiol. Following establishment of the protocol for a specific protein, modification efficiency and structural perturbations can be assessed from a single uniformly 15N-labeled sample. The 1H-15N-HSQC spectrum is acquired before and after the addition of the reducing agent DTT (Figure 4A). Calculations of the modification efficiency can be made via a comparison of the amide peak intensities in the modified and reduced spectra as detailed in protocol step 5.8 (Figure 4B). Finally, when chemical shift assignments are known for a protein, the CSPs which correlate with the structural changes can be calculated as detailed in step 5.9 (Figure 4C).
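The intensity-based efficiency estimate can be sketched as follows. The exact expression from the protocol step is not reproduced in this section, so the form used here is an assumption: both intensities are taken at the peak position of the unmodified (reduced) amide, and the fraction of signal that has disappeared from that position in the modified spectrum is treated as the modified fraction, i.e. efficiency = (1 - IM/IR) × 100%. The function names are hypothetical.

```python
from statistics import mean, stdev

def modification_efficiency(i_modified, i_reduced):
    """Percent modification from one amide peak.

    ASSUMPTION: both intensities are measured at the chemical shift of the
    unmodified (reduced) peak, so residual intensity in the modified
    spectrum reflects unmodified protein.
    """
    if i_reduced <= 0:
        raise ValueError("reduced-spectrum intensity must be positive")
    return (1.0 - i_modified / i_reduced) * 100.0

def mean_efficiency(intensity_pairs):
    """Mean and sample standard deviation over several affected residues.

    intensity_pairs: iterable of (I_M, I_R) tuples, one per residue.
    """
    values = [modification_efficiency(i_m, i_r) for i_m, i_r in intensity_pairs]
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread
```

Averaging over several residues near the modification site, as done for residues 129-133 in Figure 4B, gives an error estimate alongside the mean. Intensities from the two spectra are only comparable if acquisition and processing parameters are identical and any dilution from the DTT addition is accounted for.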
Figure 1: DNA agarose gel showing amplification check of template vector with mutagenic primers.
The image shows a 1.0% (w/v) agarose gel with DNA marker (M), vector control (VC) and PCR-amplified template (PCR). The DNA was separated by electrophoresis at 120 V for 45 min in 0.5x TAE buffer. A total of 0.5 ng of VC was loaded, equivalent to the amount of template loaded into the PCR lane. The gel was stained using ethidium bromide (~0.5 μg/mL) for 20 min prior to visualization under UV light (302 nm). The gel shows a high level of amplified DNA close to the expected size of the vector (black arrowhead). The second band in the PCR lane running between the 1,000 and 1,500 bp marker bands likely represents a non-specifically amplified PCR product. The intensity level of amplified DNA must be higher than the VC intensity level to be deemed successful. Several other DNA dyes can be used as less mutagenic, safer alternatives to ethidium bromide staining (see for example 57).
Figure 2: Typical chromatographic purification and purity check for STIM1 EF-SAM.
(A) Anion exchange chromatography elution profile of STIM1 EF-SAM. After manually binding EF-SAM to the anion exchange column (Q FF) at basic pH and low NaCl concentration with a syringe, an AKTA FPLC (GE Healthcare) is used to elute the protein with a NaCl gradient. The elution is monitored by the AKTA using the UV 280 nm signal over a 0-60% (v/v) gradient of 1 M NaCl. (B) Coomassie blue-stained SDS-PAGE gel of elution fractions from (A). The denaturing protein gel reveals that EF-SAM elutes in two major peaks at ~250 mM and ~450 mM NaCl. The purification protocol yields > 95% pure EF-SAM as evidenced by the lack of any contaminant band in the Coomassie blue-stained gel.
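The relationship between the gradient percentage and the NaCl concentration reaching the column can be made explicit with a small calculation. This sketch assumes a simple two-buffer linear mix in which the high-salt buffer contains 1 M NaCl (as stated above) and the low-salt buffer contains no NaCl; the latter is an assumption, since only "low NaCl concentration" is specified here, and the function names are hypothetical.

```python
def nacl_at_percent_b(percent_b, buffer_a_mm=0.0, buffer_b_mm=1000.0):
    """NaCl concentration (mM) delivered at a given % (v/v) of high-salt buffer.

    ASSUMPTION: buffer_a_mm = 0 mM; set it to the actual low-salt value.
    """
    fraction = percent_b / 100.0
    return buffer_a_mm + fraction * (buffer_b_mm - buffer_a_mm)

def percent_b_for_nacl(target_mm, buffer_a_mm=0.0, buffer_b_mm=1000.0):
    """% (v/v) of high-salt buffer needed to reach a target NaCl concentration."""
    return 100.0 * (target_mm - buffer_a_mm) / (buffer_b_mm - buffer_a_mm)

# Gradient end point and the two elution peaks described in the caption.
print(nacl_at_percent_b(60))      # top of the 0-60% gradient
print(percent_b_for_nacl(250))    # first EF-SAM peak
print(percent_b_for_nacl(450))    # second EF-SAM peak
```

Under these assumptions the 0-60% gradient spans 0 to about 600 mM NaCl, which brackets both elution peaks.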
Figure 3: Dialysis setup and confirmation of in vitro protein glycosylation.
(A) Typical dialysis setup used for in vitro attachment of glucose to the Cys thiol via the MTS reactivity. The image shows ~1.5 mL of protein contained within dialysis tubing buffered against ~1 L of experimental buffer. It is important that the buffer is constantly stirred to ensure complete exchange. The image shows a microcentrifuge tube clipped to the excess dialysis tubing to prevent sinking of the dialysis bag and damage by the rotating stir bar. (B) Electrospray ionization mass spectrum of the modified Asn171Cys EF-SAM protein. Mass spectrometry is a convenient and accurate approach to assess whether the modification procedure was successful. A typical mass spectrum is shown with the theoretical and measured masses of unmodified and modified Asn171Cys EF-SAM indicated. The majority of the sample mass corresponds to a macromolecule which is within ~1.3 Da of the expected theoretical mass of glucose-conjugated Asn171Cys EF-SAM. The data in (B) are replotted and modified from 42.
Figure 4: Solution NMR assessment of modification efficiency and structural perturbations from a single NMR sample.
(A) 1H-15N-HSQC spectral overlay of glucose-conjugated Asn131Cys EFSAM before (red crosspeaks) and after (black crosspeaks) the addition of 15 mM DTT. The overlay clearly shows several residue-specific amide chemical shift changes indicative of both modification of the protein and structural perturbations. The red box shows the location of the Asn131Cys amide. (B) Zoomed view of the 1H-15N-HSQC region containing the Asn131Cys amide. The intensity of the Asn131Cys amide peak in the modified spectrum (IM) is divided by the intensity in the reduced (unmodified) spectrum (IR) for the calculation of efficiency (shown). A calculation of the mean efficiency of several affected residues provides a better estimate of efficiency, including an error estimate. The mean efficiency is shown for Asn131Cys EFSAM based on 5 residues (i.e. 129-133). (C) Normalized chemical shift perturbations caused by glucose conjugation to the Asn131Cys EFSAM protein. The set of HSQC experiments collected on a single sample before and after supplementing with reducing agent not only provides a convenient estimate of modification efficiency by peak intensity analysis [shown in (B)], but also provides data for evaluation of the structural changes associated with the modification. Glucose conjugation causes the largest perturbations localized near position 131; however, this analysis also reveals perturbations which would not be expected based solely on sequence proximity, demonstrating the value of this analysis. The data in (C) are replotted and modified from 42.
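A combined 1H/15N chemical shift perturbation (CSP) of the kind plotted in panel (C) is often computed as a weighted Euclidean distance between the two peak positions. The sketch below uses that common form; the 15N weighting factor of 0.14 is an assumption for illustration and may differ from the normalization used in the protocol, and the function names and data layout are hypothetical.

```python
import math

def normalized_csp(h_mod, n_mod, h_red, n_red, n_weight=0.14):
    """Weighted 1H/15N chemical shift perturbation in ppm.

    ASSUMPTION: n_weight scales the larger 15N shift range onto the 1H
    scale; 0.14 is a commonly used value, not one taken from this protocol.
    """
    d_h = h_mod - h_red
    d_n = n_mod - n_red
    return math.sqrt(d_h ** 2 + (n_weight * d_n) ** 2)

def csp_per_residue(modified, reduced, n_weight=0.14):
    """CSP for every residue assigned in both spectra.

    modified / reduced: dicts mapping residue number -> (1H ppm, 15N ppm).
    """
    shared = sorted(set(modified) & set(reduced))
    return {
        res: normalized_csp(*modified[res], *reduced[res], n_weight=n_weight)
        for res in shared
    }
```

Residues whose CSP stands well above the typical value across the protein are candidates for structural perturbation, including residues distant in sequence from the modified Cys.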
STIM1 mutation (a) | Direction (b) | DNA sequence (c) |
Asn131Cys | forward | 5’-GTCATCAGAAGTATACTGTTGGACCGTGGATGAGG-3’ |
Asn131Cys | reverse | 5’-CCTCATCCACGGTCCAACAGTATACTTCTGATGAC-3’ |
Asn171Cys | forward | 5’-CCAAGGCTGGCTGTCACCTGCACCACCATGACAGGG-3’ |
Asn171Cys | reverse | 5’-CCCTGTCATGGTGGTGCAGGTGACAGCCAGCCTTGG-3’ |
(a) STIM1 amino acid numbering based on NCBI accession AFZ76986.1.
(b) The ‘reverse’ primer corresponds to the reverse complement sequence of the ‘forward’ primer.
(c) The underlined codon triplet corresponds to the Cys mutation.
Table 1. Example oligonucleotide (primer) sequences for Asn to Cys mutagenesis within the pET-28a STIM1 EFSAM construct.
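Because each reverse primer is the reverse complement of its forward primer, a primer pair can be sanity-checked before ordering. The following minimal sketch checks the two pairs listed in Table 1; the function name `reverse_complement` and the dictionary layout are illustrative choices, not part of the protocol.

```python
# Watson-Crick pairing for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]

# Primer pairs copied from Table 1 (forward, reverse), both written 5'->3'.
primers = {
    "Asn131Cys": ("GTCATCAGAAGTATACTGTTGGACCGTGGATGAGG",
                  "CCTCATCCACGGTCCAACAGTATACTTCTGATGAC"),
    "Asn171Cys": ("CCAAGGCTGGCTGTCACCTGCACCACCATGACAGGG",
                  "CCCTGTCATGGTGGTGCAGGTGACAGCCAGCCTTGG"),
}

for name, (forward, reverse) in primers.items():
    status = "OK" if reverse_complement(forward) == reverse else "MISMATCH"
    print(name, status)
```

The same check can be applied to primers designed for other Cys substitutions.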
Protein glycosylation is a post-translational modification where sugars are covalently attached to polypeptides primarily through linkages to amino acid side chains. As many as 50% of mammalian proteins are glycosylated 54, and glycosylation can have a diverse range of effects, including altering biomolecular binding affinity, influencing protein folding, altering channel activity, and targeting molecules for degradation and cellular trafficking (reviewed in 33). The important role of glycosylation in mammalian physiology is evident from the several hundred proteins that have evolved to build the full diversity of mammalian glycan structures 33. Altered N- and O-glycosylation patterns have been associated with numerous disease states including prostate (increased and decreased), breast (increased and decreased), liver (increased), ovarian (increased), pancreatic (increased) and gastric cancers (increased) 55. Furthermore, glycosylation of Tau, huntingtin, and α-synuclein has been found to regulate toxicity of these proteins associated with Alzheimer's, Huntington's and Parkinson's diseases 56, and a group of congenital disorders of glycosylation have been identified resulting from heritable defects in enzymes which mediate glycosylation 54. Thus, understanding the precise biophysical, biochemical and structural effects of glycosylation has the potential to tremendously impact our understanding of protein regulation and function in health and disease.
The ten sugar building blocks which lead to the diversity of glycan structures found in the mammalian glycome include fucose, galactose, glucose, N-acetylgalactosamine, N-acetylglucosamine, glucuronic acid, iduronic acid, mannose, sialic acid and xylose 33. While N-glycosylation invariably links an N-acetylglucosamine sugar directly to the protein, O-glycosylation can result from any of N-acetylgalactosamine, N-acetylglucosamine, xylose, fucose, glucose or galactose covalently linked to the polypeptide. To begin to understand how these sugars immediately adjacent to the protein surface affect the biophysical and structural properties, we describe herein an approach to site-selectively attach sugars to Cys residues via the thiols engineered into the protein sequence. Here, the residues that are endogenously glycosylated are replaced by Cys and modified in vitro via a simple chemical approach. In this manner, single and multiple glycosylation sites may be assessed to tease out the contribution of each specific site, as well as of the cumulative modifications, to the folding, stability, overall structure and function of the protein.
Recently, this approach was successfully used with EFSAM to individually and cumulatively assess the role of the Asn131 and Asn171 N-glycosylation sites 42. Mutation to Cys and covalent attachment of glucose to the Asn131 or Asn171 sites revealed a decreased Ca2+ binding affinity and suppressed stability. When the two sites were simultaneously modified with the glucose attachment, the decreases in binding affinity and stability were potentiated leading to enhanced oligomerization propensity in vitro. Structurally, the approach described herein showed that the Asn131 or Asn171 modifications mutually perturb the core α8 helix located on the SAM domain, immediately adjacent to the EF-hand pair. This structural analysis expounds how glucose modifications on the surface of the protein lead to a converging and potentiated structural change within the EF-hand:SAM interface which ultimately destabilizes the protein and enhances SOCE 42.
While the application of this site-selective approach helped shed light on how a monosaccharide close to the surface of EFSAM affects folding, stability and structure, this procedure can easily be modified to attach longer carbohydrates specific to ER, Golgi and PM localization (i.e. glycosylation states of different maturity), provided there is a reliable source for these carbohydrates containing functional groups which can link to thiols, such as MTS. MTS is preferable since the thiol modification is reversible using a reducing agent and a reference spectrum can be easily acquired. This approach can also be adapted to link other post-translational moieties to the protein such as lipids. At the same time, there are several limitations to this approach which should be considered. First, the method relies on mutation of glycosylation sites to Cys, which may affect structure, stability and folding even in the absence of any glucose modification. Similarly, native Cys residues in the protein must also be mutated to prevent glucose attachment at non-glycosylated sites. In addition, introducing Cys residues often promotes inclusion body formation in bacteria due to Cys crosslinking and misfolding, making purification more challenging. Nevertheless, the site-selective Cys-crosslinking approach described herein provides a controlled means to tease out the structural, biochemical and biophysical effects of specific glycosylation sites in experiments which require high levels of homogeneous protein. The effects of non-native Cys residues on the structure, stability and folding can be simply ascertained in the absence of any modifications by comparison to wild-type protein attributes 42. Taken together with functional data obtained in eukaryotic cells which express modification-blocking mutant versions of the protein (e.g. Asn-to-Ala), the presently described approach will yield new insights into the structural mechanisms of protein regulation by post-translational modifications.
This research was supported by the Natural Sciences and Engineering Research Council of Canada (05239 to P.B.S.), Canadian Foundation for Innovation/Ontario Research Fund (to P.B.S.), Prostate Cancer Fight Foundation – Telus Ride for Dad (to P.B.S.) and Ontario Graduate Scholarship (to Y.J.C. and N.S.).
Name | Company | Catalog Number | Comments |
Phusion DNA Polymerase | Thermo Fisher Scientific | F530S | Use in step 1.3. |
Generuler 1kb DNA Ladder | Thermo Fisher Scientific | FERSM1163 | Use in step 1.6. |
DpnI Restriction Enzyme | New England Biolabs, Inc. | R0176 | Use in step 1.8. |
Presto Mini Plasmid Kit | GeneAid, Inc. | PDH300 | Use in step 1.16. |
BL21 DE3 codon (+) E. coli | Agilent Technologies, Inc. | 230280 | Use in step 2.1. |
DH5α E. coli | Invitrogen, Inc. | 18265017 | Use in step 1.9. |
0.22 μm Syringe Filter | Millipore, Inc. | SLGV033RS | Use in step 2.3. |
HisPur Ni2+-NTA Agarose Resin | Thermo Fisher Scientific | 88221 | Use in step 3.3. |
3,500 Da MWCO Dialysis Tubing | BioDesign, Inc. | D306 | Use in step 3.8, 3.16, 4.2, 4.5 and 4.6. |
Bovine Thrombin | BioPharm Laboratories, Inc. | SKU91-055 | Use in step 3.9. |
5 mL HiTrap Q FF Anion Exchange Column | GE Healthcare, Inc. | 17-5156-01 | Use in step 3.11. |
Glucose-5-MTS | Toronto Research Chemicals, Inc. | G441000 | Use in step 4.1. |
Vivaspin 20 Ultrafiltration Centrifugal Concentrators | Sartorius, Inc. | VS2001 | Use in step 3.11, 4.2, 4.5 and 4.6. |
PageRuler Unstained Broad Protein Ladder | Thermo Fisher Scientific | 26630 | Use in step 3.7, 3.10 and 3.15. |
HiTrap Q FF Anion Exchange Column | GE Healthcare, Inc. | 17-5053-01 | Use in step 3.12. |
AKTA Pure Fast Protein Liquid Chromatography System | GE Healthcare, Inc. | 29018224 | Use in step 3.14. |
600 MHz Varian Inova NMR Spectrometer | Agilent Technologies, Inc. | | Use in step 5.2 and 5.5. |