We present a method for the purification, detection, and identification of diGly peptides that originate from ubiquitinated proteins from complex biological samples. The presented method is reproducible, robust, and outperforms published methods with respect to the level of depth of the ubiquitinome analysis.
The posttranslational modification of proteins by the small protein ubiquitin is involved in many cellular events. After tryptic digestion of ubiquitinated proteins, peptides with a diglycine remnant conjugated to the epsilon amino group of lysine ('K-ε-diglycine' or simply 'diGly') can be used to track back the original modification site. Efficient immunopurification of diGly peptides combined with sensitive detection by mass spectrometry has resulted in a huge increase in the number of ubiquitination sites identified to date. We have made several improvements to this workflow, including offline high pH reverse-phase fractionation of peptides prior to the enrichment procedure, and the inclusion of more advanced peptide fragmentation settings in the ion routing multipole. Also, more efficient cleanup of the sample using a filter-based plug in order to retain the antibody beads results in a greater specificity for diGly peptides. These improvements result in the routine detection of more than 23,000 diGly peptides from human cervical cancer (HeLa) cell lysates upon proteasome inhibition in the cell. We show the efficacy of this strategy for in-depth analysis of the ubiquitinome profiles of several different cell types and of in vivo samples, such as brain tissue. This study presents an original addition to the toolbox for protein ubiquitination analysis to uncover the deep cellular ubiquitinome.
The conjugation of ubiquitin to proteins marks them for degradation by the proteasome and is a crucial process in proteostasis. The C-terminal carboxyl group of ubiquitin forms an isopeptide bond with the lysine ε-amino group of the target protein1,2. In addition, ubiquitin can be attached to other ubiquitin modules, resulting in the formation of homogeneous (e.g., K48 or K11) or branched (i.e., heterogeneous or mixed) polyubiquitin structures1,3. The most well-known function of ubiquitin is its role in proteasomal degradation, mediated by K48-linked polyubiquitin. However, it has become clear that both mono- and polyubiquitination also play roles in many processes that are independent of degradation by the proteasome. For instance, K63-linked chains have nondegradative roles in intracellular trafficking, lysosomal degradation, kinase signaling, and the DNA damage response4,5. The other six linkage types are less abundant and their roles are still largely enigmatic, although first indications about their functions in the cell are emerging, largely because of the development of novel tools to enable linkage-specific detection6,7.
Mass spectrometry has become an indispensable tool for proteome analyses and nowadays thousands of different proteins from virtually any biological source can be identified in a single experiment. An additional layer of complexity is presented by posttranslational modifications (PTMs) of proteins (e.g., phosphorylation, methylation, acetylation, and ubiquitination) which can modulate protein activity. Large-scale identification of PTM-bearing proteins has also been made possible by developments in the mass spectrometry field. The relatively low stoichiometry of peptides bearing PTMs compared to their unmodified counterparts presents a technical challenge and biochemical enrichment steps are generally necessary prior to the mass spectrometry analysis. Over the past two decades, several different specific enrichment methods have been developed for the analysis of PTMs.
Because of the multifaceted roles of protein ubiquitination in the cell, there is a great demand for the development of analytical methods for the detection of ubiquitination sites on proteins8. The application of mass spectrometric methods has led to an explosion of the number of identified ubiquitination sites in fruit fly, mouse, human, and yeast proteins9,10,11,12,13,14. A major step was presented by the development of immunoprecipitation based enrichment strategies at the peptide level using antibodies directed against the K-ε-GG remnant motif (also referred to as 'diglycine' or 'diGly'). These diGly peptides are produced upon digestion of ubiquitinated proteins using trypsin as the protease15,16.
Here, we present an optimized workflow to enrich for diGly peptides using immunopurification and subsequent detection by Orbitrap mass spectrometry. Using a combination of several modifications of existing workflows, especially in the sample preparation and mass spectrometry stages, we can now routinely identify more than 23,000 diGly peptides from a single sample of HeLa cells treated with a proteasome inhibitor and ~10,000 from untreated HeLa cells. We have applied this protocol to lysates from both unlabeled and stable isotope labeling with amino acids in cell culture (SILAC) labeled HeLa cells as well as to endogenous samples such as brain tissue.
This workflow presents a valuable addition to the repertoire of tools for the analysis of ubiquitination sites in order to uncover the deep ubiquitinome. The following protocol describes all steps of the workflow in detail.
All methods described here have been approved by the Institutional Animal Care and Use Committee (EDC) of Erasmus MC.
1. Sample preparation
2. Offline peptide fractionation
3. Nanoflow LC-MS/MS
4. Data analysis
Ubiquitinated proteins leave a 114.04 Da diglycine remnant on the target lysine residue when the proteins are digested with trypsin. The mass difference caused by this motif was used to unambiguously recognize the site of ubiquitination in a mass spectrometry experiment. The strategy that we describe here is a state-of-the-art method for the enrichment and subsequent identification of diGly peptides by nanoflow LC-MS/MS (Figure 1A). In this study, both cultured cells and in vivo material were used as the biological source of proteins, but this protocol is compatible with any source of proteins. Following the steps in the protocol it should be straightforward to identify 10,000-25,000 diGly peptides from 2-20 mg of protein input. To increase the extent of protein ubiquitination in cells, a proteasome inhibitor such as bortezomib or MG132 can be added a few hours prior to harvesting the cells. If no proteasome inhibitor was used, the numbers of identified diGly peptides tended to be significantly lower (30-40%).
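As an illustration of how a diGly-bearing lysine maps back to a site in the peptide, the following minimal Python sketch parses an annotated peptide string in which the modified lysine is written as K(GG), the notation used in this article for the ubiquitin K48 peptide LIFAGK(GG)QLEDGR. The function name and the string format are assumptions made for this example only; search engines use their own annotation formats.

```python
def diglycine_sites(annotated: str, tag: str = "(GG)"):
    """Return the unmodified sequence and the 1-based positions of tagged lysines.

    Hypothetical helper: assumes the diGly remnant is written as "(GG)"
    directly after the modified lysine, as in "LIFAGK(GG)QLEDGR".
    """
    residues = []   # plain amino acid letters collected so far
    positions = []  # 1-based positions of lysines carrying the tag
    i = 0
    while i < len(annotated):
        if annotated.startswith(tag, i):
            # The tag refers to the residue immediately before it.
            if not residues or residues[-1] != "K":
                raise ValueError("diGly tag must follow a lysine (K)")
            positions.append(len(residues))
            i += len(tag)
        else:
            residues.append(annotated[i])
            i += 1
    return "".join(residues), positions


sequence, sites = diglycine_sites("LIFAGK(GG)QLEDGR")
```

Here `sites` holds the position of the modified lysine within the peptide; mapping it to a protein position additionally requires the start position of the peptide in the protein sequence.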
We made several improvements to existing protocols. First, a crude fractionation into three fractions based on reversed-phase chromatography and subsequent elution at high pH is performed to reduce the complexity of the peptide mixture. These fractions show a very low overlap in peptide identifications, and comparable numbers of diGly peptides should be identified per fraction (Figure 2). This results in high numbers of unique diGly peptides identified in each of those fractions. Importantly, one of the fractions (typically the second) should contain ubiquitin's own K48 modified tryptic diGly peptide LIFAGK(GG)QLEDGR (m/z 730.39). This is by far the most abundant peptide in the immunoprecipitated fraction and is characterized by the intense and broad peak in the LC chromatogram (Figure 1B). This is a benchmark chromatographic peak and if it is absent from the chromatogram the IP was most likely unsuccessful.
Another improvement is the adaptation of the DDA analysis procedure in the ion routing multipole of the mass spectrometer. In conventional top N data-dependent acquisition (DDA), N peaks from the MS1 spectrum are selected for fragmentation. This fragmentation scheme starts with the highest intensity peak first, followed by the peak of second highest intensity, and so on. In an alternative fragmentation scheme, the least intense peak is selected first, followed by the second least intense peak, etc. The rationale behind this order of selection is that there is sufficient time to fragment very low abundant peptides as well. In fact, we found that the number of peptide identifications increased when the "highest first" and "lowest first" DDA runs were combined compared to a duplicate LC-MS analysis with standard DDA settings (i.e., highest first). For more comprehensive ubiquitinome profiling, it is therefore recommended to combine the LC-MS runs with "highest first" and "lowest first" fragmentation regimes in the data analysis procedure. This "lowest first" strategy can produce more than 4,000 additional unique diGly peptides, which were not detected when only the conventional DDA regime was used (Figure 2).
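The difference between the two selection orders can be illustrated with a small Python sketch. It is a deliberate simplification and not the instrument's actual scheduling logic: it assumes that all MS1 peaks above an intensity threshold are candidates and that only a fixed number of MS2 scans fits into one cycle. The function name, the threshold, and the scan budget are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def schedule_precursors(peaks: List[Peak], min_intensity: float,
                        max_ms2: int, order: str = "highest_first") -> List[Peak]:
    """Pick the precursors fragmented in one cycle (simplified model).

    Candidates are all peaks at or above min_intensity. They are ranked by
    intensity, descending for "highest_first" or ascending for
    "lowest_first", and the first max_ms2 of them are fragmented.
    """
    if order not in ("highest_first", "lowest_first"):
        raise ValueError("order must be 'highest_first' or 'lowest_first'")
    candidates = [p for p in peaks if p[1] >= min_intensity]
    ranked = sorted(candidates, key=lambda p: p[1],
                    reverse=(order == "highest_first"))
    return ranked[:max_ms2]


# Made-up MS1 peaks for illustration only.
ms1 = [(400.2, 1e6), (500.3, 5e4), (600.4, 2e5), (700.5, 8e3)]
high = schedule_precursors(ms1, min_intensity=1e4, max_ms2=2, order="highest_first")
low = schedule_precursors(ms1, min_intensity=1e4, max_ms2=2, order="lowest_first")
combined = set(high) | set(low)  # precursors covered when both runs are merged
```

In this toy example the low-intensity precursor at m/z 500.3 is fragmented only in the "lowest first" run, so merging the two runs covers more precursors than either run alone, which mirrors the reasoning given above.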
Finally, additional IPs of the flowthrough after the first IP can produce another ~2,500 unique diGly peptides (Figure 2).
Articles in the literature on ubiquitination profiling typically report around 10,000 identified diGly peptides12,21. Here, of all diGly peptides identified over three biological replicate screens, >9,000 were present in all three, while >17,000 were present in at least two out of three replicates (Figure 3). Typically, following the protocol described here one should identify >21,000 unique diGly peptides from one 15-20 mg protein sample using one standard batch of CST antibody beads. In terms of purity and selectivity the ratio between identified diGly peptides and unmodified peptides should always be >0.5. The number of diGly peptide identifications was highly dependent on the amount of protein input material. An IP performed with only 1 mg of input material produced roughly 2,500 diGly peptide identifications, while with 10 mg of protein input material >15,000 diGly peptide identifications were produced. Table 1 lists the expected number of identified diGly peptides for each condition. It should be noted that these numbers are only estimations and depend on the type of mass spectrometer used. Figure 4 shows the overlap between diGly peptide identifications with low, medium, and high amounts of input material.
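Reproducibility figures of the kind quoted above (peptides present in all three, or in at least two out of three, replicates) can be computed from per-replicate identification lists. The sketch below assumes each replicate is available as a collection of peptide identifiers; the function name and the toy identifiers are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def replicate_overlap(replicates: List[Iterable[str]]) -> Dict[str, int]:
    """Count peptides by the number of replicates in which they were identified."""
    counts = Counter()
    for peptides in replicates:
        # set() ensures a peptide is counted once per replicate.
        counts.update(set(peptides))
    n = len(replicates)
    return {
        "total": len(counts),
        "in_all": sum(1 for c in counts.values() if c == n),
        "in_at_least_two": sum(1 for c in counts.values() if c >= 2),
    }


# Toy identifiers standing in for diGly peptide sequences.
summary = replicate_overlap([
    ["pepA", "pepB", "pepC"],
    ["pepA", "pepB", "pepD"],
    ["pepA", "pepE"],
])
```

The same counting applies to the overlap between peptide fractions or between input amounts, as long as identical peptides carry identical identifiers across runs.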
In order to illustrate the added value of the improvements for ubiquitination site analysis described above, we also performed a quantitative ubiquitinomic analysis of SILAC labeled HeLa cells that were treated with the proteasome inhibitor bortezomib compared to untreated control cells in a duplicate label swap assay. More than half (>55%) of all identified peptides in the eluate upon IP were diGly peptides. Over 19,000 unique diGly peptides were identified, which is only slightly less than in a non-SILAC labeled sample. The reason for this may be the higher complexity of MS1 spectra in a SILAC assay because of the presence of peptide peak pairs. In the SILAC analysis relatively large differences were observed between the numbers of diGly peptides that were exclusively identified in the forward condition (i.e., bortezomib treated cells in the heavy channel, control cells in the light channel) and those exclusively identified in the reverse condition (i.e., bortezomib treated cells in the light channel, control cells in the heavy channel), in this case 1,752 versus 6,356 (Supplementary Table 1). When operating the MaxQuant software in the "multiplicity = 2" (i.e., two-channel SILAC) mode, 7,555 diGly peptides were identified with zero intensity in the heavy channel (virtually all coming from the reverse experiment) and a non-zero intensity value in the light channel. In contrast, not a single diGly peptide was identified with a non-zero intensity value in the heavy channel accompanied by a zero intensity value in the light channel. When a MaxQuant analysis on the same data set was performed in the "multiplicity = 1" mode with the diGly moiety and the labeled amino acid combined into one single variable modification, many heavy diGly peptide variants were identified, even when no light counterpart of that peptide could be detected.
The most likely explanation for this is the inability of the software to cope with the identification of diGly peptides that are exclusively present in the heavy channel. This phenomenon is likely to occur extensively, because the inhibition of the proteasome will trigger the formation of novel ubiquitination sites. Ticking the "requantify" option check box in MaxQuant, which was developed to deal with these issues and should correct for this, seems to be insufficient to avoid this issue completely. The vast majority of diGly peptides are upregulated or formed de novo upon proteasome inhibition, because over two thirds of the peptide pool have H:L ratios of at least 1.5 (Figure 5B).
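In a label swap design, the heavy-to-light ratio means treated/control in the forward experiment and control/treated in the reverse experiment, so the reverse ratio has to be inverted before the two can be compared. The following sketch shows one way to do this, assuming H/L ratios are available for peptides quantified in both experiments. Combining the two values by a geometric mean is an assumption of this example, not a step taken from the analysis described here, and the function names and ratio values are hypothetical.

```python
from math import sqrt
from typing import List, Tuple


def treated_over_control(hl_forward: float, hl_reverse: float) -> float:
    """Combine a label swap pair into one treated/control ratio.

    Forward: treated cells are heavy, so H/L is already treated/control.
    Reverse: treated cells are light, so H/L must be inverted.
    The two estimates are combined by their geometric mean (an assumption).
    """
    if hl_forward <= 0 or hl_reverse <= 0:
        raise ValueError("ratios must be positive")
    return sqrt(hl_forward * (1.0 / hl_reverse))


def fraction_upregulated(pairs: List[Tuple[float, float]],
                         threshold: float = 1.5) -> float:
    """Fraction of peptides whose combined treated/control ratio is >= threshold."""
    ratios = [treated_over_control(fwd, rev) for fwd, rev in pairs]
    return sum(r >= threshold for r in ratios) / len(ratios)


# Made-up (forward H/L, reverse H/L) pairs for three peptides.
example_pairs = [(4.0, 0.25), (1.0, 1.0), (2.0, 0.5)]
fraction = fraction_upregulated(example_pairs)
```

Peptides with a zero intensity in one channel have no finite ratio and cannot be handled this way, which is why the quantification issue described above matters for peptides that appear de novo upon proteasome inhibition.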
Finally, we applied this ubiquitination site analysis method to an in vivo tissue sample. To assess the effectiveness, we extracted approximately 32 mg of protein from fresh mouse brain (~10% of the wet tissue weight is protein). The brain material was not treated with proteasome inhibitors or any other reagent that could boost overall protein ubiquitination. From this sample, 10,871 unique diGly peptides were identified (Supplementary Table 2). All diGly peptides identified in this tissue therefore originated from endogenous sites of ubiquitination in a steady-state situation. We hypothesize that these ubiquitination sites at least partially arise from (poly)ubiquitination involved in non-proteasome mediated cellular signaling events, which is in agreement with ideas proposed in the literature5,16,22.
In conclusion, the method described here allows for the in-depth exploration of the ubiquitinome in a reproducible manner. For an overview of typical results obtained with this procedure see Van Der Wal et al.23.
Figure 1: Experimental overview. (A) Overview of the experimental approach. Samples were prepared, trypsinized, and fractionated into three fractions using reverse-phase chromatography with high pH elution. One batch of commercial α-diGly peptide antibody beads was split into six equal fractions and the three peptide fractions were then loaded on three of the bead fractions. The diGly peptides were immunopurified, eluted, and collected, and the flowthrough was subsequently transferred to the three remaining fresh bead fractions. The collected diGly peptides were analyzed by mass spectrometry on a Lumos Orbitrap mass spectrometer according to a two-tier scheme combining one cycle in which the most intense peaks were first selected for peptide fragmentation and the next cycle in which the least intense peaks were selected first. The complete set of nLC-MS/MS runs was then analyzed using MaxQuant. (B) One of the fractions should contain ubiquitin's own K48 modified tryptic diGly peptide LIFAGK(GG)QLEDGR (m/z 730.39). This is by far the most abundant peptide in the immunoprecipitated fraction and was characterized by the intense and broad peak in the LC chromatogram between 50-55 min on a 120 min gradient. If this benchmark peak is absent from the chromatogram the IP was most likely unsuccessful. This figure has been modified from Van Der Wal et al.23.
Figure 2: Numbers of diGly peptides detected for each of the three improvement steps. (A) Effect of crude fractionation prior to immunoprecipitation. The overlap between diGly peptide populations identified in the three separate fractions is shown. (B) Effect of the first and second incubation steps. (C) Results of the adjusted peptide fragmentation regime. This figure has been modified from Van Der Wal et al.23.
Figure 3: DiGly peptides detected in three biological replicates of bortezomib treated cells showing the amount of overlap between the runs. This figure has been modified from Van Der Wal et al.23.
Figure 4: Overlap of identified diGly peptides from analyses with low (4 mg), medium (10 mg), and high (40 mg) total protein input amounts.
Figure 5: Detection of diGly peptides in SILAC labeled cells. (A) Numbers of peptides detected in the forward and reverse conditions of the SILAC labeled HeLa cells for multiplicity settings 1 and 2. (B) Scatterplot of diGly peptide SILAC ratios in Bortezomib (Btz) treated HeLa cells. Only peptides that were identified and quantified in both forward and reverse experiments are shown. This figure has been modified from Van Der Wal et al.23.
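Condition | Amount of input material (mg) | Expected number of identified diGly peptides |
Untreated HeLa cells | 10 | 7,500 |
Proteasome inhibitor treated HeLa cells | 1 | 2,500 |
Proteasome inhibitor treated HeLa cells | 2 | 5,000 |
Proteasome inhibitor treated HeLa cells | 10 | 15,000 |
Proteasome inhibitor treated HeLa cells | 20 | 20,000 |
Proteasome inhibitor treated HeLa cells | 40 | >25,000 |
Tissue (mouse brain) | 30 | >10,000 |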
Table 1: Expected numbers of diGly peptide identifications for different conditions. These numbers are only estimations and depend on the experimental settings used.
Supplementary Table 1.
Supplementary Table 2.
The protocol described here was applied to samples from various biological sources, such as cultured cells and in vivo tissue. In all cases we identified thousands of diGly peptides, provided that the total protein input amount was at least 1 mg. The enrichment using specific antibodies is highly efficient, given that at most 100-150 very low abundant diGly peptides were identified from whole cell lysates if no enrichment procedures for ubiquitinated proteins or diGly peptides were applied. Sensitive mass spectrometry is a prerequisite for obtaining high numbers of diGly identifications. Although we have successfully used several different mass spectrometers, we found the Orbitrap Tribrid Lumos to be the most sensitive and to give the highest yields.
The offline RP chromatography with high pH elution should be tested before the IPs are carried out. The overlap between the fractions in terms of peptide identifications should be as low as possible for an optimal experiment. After the IP, one of the fractions should contain ubiquitin's own K48 modified tryptic diGly peptide LIFAGK(GG)QLEDGR (m/z 730.39). This is by far the most abundant peptide in the immunoprecipitated fraction and is characterized by the intense and broad peak in the LC chromatogram between 50-55 min on a 120 min gradient (Figure 1B). If this benchmark peak is absent from the chromatogram, the IP was most likely unsuccessful.
It is important to analyze the immunoprecipitated diGly tryptic peptides immediately after the IP procedure, so the time between the IP and analysis should be kept to a minimum. During that time, peptides should preferably be stored in a glass vial instead of plastic tubes. Leaving peptides in plastic tubes for too long, either at RT or at -20 °C, may result in precipitation of peptides and/or sticking to the plastic tube wall. This will ultimately affect the sensitivity of the analysis.
Although there have been reports in the literature about the potential misinterpretation of iodoacetamide adducts as ubiquitination sites because of their identical 114.04 Da mass shifts24, we have not found any indication of this with our preparations of immunoprecipitated tryptic peptides. First, the side effects of using iodoacetamide (IAM) are minimal in our hands using the alkylation-reduction protocol described above. Second, the antibody is specific for peptides with a diglycine remnant. Peptides with two iodoacetamide moieties covalently added to lysine residues should not be enriched in this protocol. Third, the majority of the peptides in the immunoprecipitated fraction are upregulated upon proteasome inhibition, as exemplified by the SILAC experiment described above (Figure 5). Because the population of ubiquitinated proteins is affected as a result of this treatment, it is highly likely that these affected peptides are indeed diGly peptides resulting from ubiquitinated proteins.
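The reason a double iodoacetamide adduct can mimic a diGly remnant by mass alone is that both add the same elemental composition. A glycine residue and a carbamidomethyl group are each C2H3NO, so two of either amount to C4H6N2O2. The short Python sketch below checks this with standard monoisotopic atomic masses; the helper names are introduced for this example only.

```python
from typing import Dict

# Standard monoisotopic atomic masses (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

GLYCINE_RESIDUE = {"C": 2, "H": 3, "N": 1, "O": 1}  # one Gly residue, C2H3NO
CARBAMIDOMETHYL = {"C": 2, "H": 3, "N": 1, "O": 1}  # one iodoacetamide adduct, C2H3NO


def scale(formula: Dict[str, int], k: int) -> Dict[str, int]:
    """Multiply every element count in a formula by k."""
    return {element: count * k for element, count in formula.items()}


def mono_mass(formula: Dict[str, int]) -> float:
    """Monoisotopic mass of an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


diglycine_shift = mono_mass(scale(GLYCINE_RESIDUE, 2))             # K-ε-GG remnant
double_adduct_shift = mono_mass(scale(CARBAMIDOMETHYL, 2))         # 2x carbamidomethyl
print(f"diGly: {diglycine_shift:.5f} Da, 2x carbamidomethyl: {double_adduct_shift:.5f} Da")
```

Because the two shifts are identical in composition, mass accuracy alone cannot separate them, which is why the arguments above rest on antibody specificity and on the response to proteasome inhibition rather than on mass.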
Finally, this protocol could be used in combination with a recently published multiplexed quantitative strategy using TMT19. Although the published diGly peptide numbers are somewhat lower than those obtained using the present protocol, the ability to perform relative quantification of up to 16 samples simultaneously is a major advantage. Combining these methods will allow researchers to perform large-scale quantitative ubiquitinome studies in great depth.
The authors have nothing to disclose.
This work is part of the project "Proteins at Work", a program of the Netherlands Proteomics Centre financed by The Netherlands Organization for Scientific Research (NWO) as part of the National Roadmap Large-Scale Research Facilities (project number 184.032.201).
1,4-Dithioerythritol | Sigma-Aldrich | D8255 | |
3M Empore C18 Octadecyl disks | Supelco | 66883-U | product discontinued at Supelco; CDS Analytical is the new manufacturer (https://www.cdsanalytical.com/empore) |
Ammonium formate | Sigma-Aldrich | 70221 | |
Bortezomib | UBPbio | ||
CSH130 resin, 3.5 μm, 130 Å | Waters | ||
Dimethylsulfoxide (DMSO) | Sigma-Aldrich | 34869 | |
DMEM | ThermoFisher | ||
EASY-nanoLC 1200 | ThermoFisher | ||
FBS | Gibco | ||
GF/F filter plug | Whatman | 1825-021 | |
Iodoacetamide | Sigma-Aldrich | I6125 | |
Lysine, Arginine | Sigma-Aldrich | ||
Lysine-8 (13C6;15N2), Arginine-10 (13C6;15N4) | Cambridge Isotope Laboratories | ||
Lysyl Endopeptidase (LysC) | Wako Pure Chemicals | 129-02541 | |
NanoLC oven | MPI design, MS Wil GmbH | ||
N-Lauroylsarcosine sodium salt | Sigma-Aldrich | L-5125 | |
Orbitrap Fusion Lumos mass spectrometer | ThermoFisher | ||
Pierce BCA Protein Assay Kit | ThermoFisher / Pierce | 23225 | |
PLRP-S (300 Å, 50 µm) polymeric reversed phase particles | Agilent Technologies | PL1412-2K01 | |
PTMScan Ubiquitin Remnant Motif (K-ε-GG) Kit | Cell Signaling Technologies | 5562 | |
Sep-Pak tC18 6 cc Vac Cartridge | Waters | WAT036790 | Remove the tC18 material from the cartridge before filling the cartridge with PLRP-S |
Sodium deoxycholate | Sigma-Aldrich | 30970 | |
Tris-base | Sigma-Aldrich | T6066 | |
Tris-HCl | Sigma-Aldrich | T5941 | |
Trypsin, TPCK Treated | ThermoFisher | 20233 |