Protein Arginine (R)-methylation is a widespread post-translational modification regulating multiple biological pathways. Mass spectrometry is the best technology to globally profile the R-methyl-proteome when coupled to biochemical approaches for modified peptide enrichment. The workflow designed for the high-confidence identification of global R-methylation in human cells is described here.
Protein Arginine (R)-methylation is a widespread protein post-translational modification (PTM) involved in the regulation of several cellular pathways, including RNA processing, signal transduction, DNA damage response, miRNA biogenesis, and translation.
In recent years, thanks to biochemical and analytical developments, mass spectrometry (MS)-based proteomics has emerged as the most effective strategy to characterize the cellular methyl-proteome with single-site resolution. However, identifying and profiling in vivo protein R-methylation by MS remains challenging and error-prone, mainly due to the substoichiometric nature of this modification and the presence of various amino acid substitutions and chemical methyl-esterification of acidic residues that are isobaric to methylation. Thus, enrichment methods to enhance the identification of R-methyl-peptides and orthogonal validation strategies to reduce False Discovery Rates (FDR) in methyl-proteomics studies are required.
Here, a protocol specifically designed for high-confidence R-methyl-peptide identification and quantitation from cellular samples is described, which couples metabolic labeling of cells with heavy isotope-encoded Methionine (hmSILAC) and dual protease in-solution digestion of whole cell extract, followed by off-line High-pH Reversed Phase (HpH-RP) chromatography fractionation and affinity enrichment of R-methyl-peptides using anti-pan-R-methyl antibodies. Upon high-resolution MS analysis, raw data are first processed with the MaxQuant software package and the results are then analyzed by hmSEEKER, a software designed for the in-depth search of MS peak pairs corresponding to light and heavy methyl-peptides within the MaxQuant output files.
Arginine (R)-methylation is a post-translational modification (PTM) that decorates around 1% of the mammalian proteome1. Protein Arginine Methyltransferases (PRMTs) are the enzymes that catalyze the R-methylation reaction by depositing one or two methyl groups on the nitrogen (N) atoms of the guanidino group of the R side chain, in a symmetric or asymmetric manner. In mammals, PRMTs can be grouped into three classes (type I, type II, and type III) depending on their capability to deposit both mono-methylation (MMA) and asymmetric di-methylation (ADMA), MMA and symmetric di-methylation (SDMA), or only MMA, respectively2,3. PRMTs mainly target R residues located within glycine- and arginine-rich regions, known as GAR motifs, but some PRMTs, such as PRMT5 and CARM1, can methylate proline-glycine-methionine-rich (PGM) motifs4. R-methylation has emerged as a modulator of several biological processes, such as RNA splicing5, DNA repair6, miRNA biogenesis7, and translation2, fostering research on this PTM.
Mass Spectrometry (MS) is recognized as the most effective technology to systematically study global R-methylation at protein-, peptide-, and site-resolution. However, this PTM requires some particular precautions for its high-confidence identification by MS. First, R-methylation is substoichiometric, with the unmodified form of the peptides being much more abundant than the modified ones, so that mass spectrometers operating in the Data Dependent Acquisition (DDA) mode will fragment high-intensity unmodified peptides more often than their lower-intensity methylated counterparts8. Moreover, most MS-based workflows for R-methylated site identification suffer from limitations at the bioinformatic analysis level. Indeed, the computational identification of methyl-peptides is prone to high False Discovery Rates (FDR), because this PTM is isobaric to various amino acid substitutions (e.g., glycine into alanine) and chemical modification, such as methyl-esterification of aspartate and glutamate9. Hence, methods based on the isotope labeling of methyl groups, such as Heavy Methyl Stable Isotope Labeling with Amino Acids in Cell culture (hmSILAC), have been implemented as orthogonal strategies for confident MS-identification of in vivo methylations, significantly reducing the rate of false positive annotations10.
Recently, various proteome-wide protocols to study R-methylated proteins have been optimized. The development of antibody-based strategies for the immuno-affinity enrichment of R-methyl-peptides has led to the annotation of several hundreds of R-methylated sites in human cells11,12. Furthermore, many studies3,13 reported that coupling antibody-based enrichment with peptide separation techniques such as HpH-RP chromatography fractionation can boost the overall number of methyl-peptides identified.
This article describes an experimental strategy designed for the systematic and high-confidence identification of R-methylated sites in human cells, based on various biochemical and analytical steps: protein extraction from hmSILAC-labeled cells, parallel double enzymatic digestion with Trypsin and LysargiNase proteases, followed by HpH-RP chromatographic fractionation of digested peptides, coupled with antibody-based immuno-affinity enrichment of MMA-, SDMA-, and ADMA-containing peptides. All affinity-enriched peptides are then analyzed by high-resolution Liquid Chromatography (LC)-MS/MS in DDA mode, and raw MS data are processed by the MaxQuant algorithm for the identification of R-methyl-peptides. Finally, the MaxQuant output results are processed with hmSEEKER, a bioinformatics tool developed in-house to search for pairs of heavy and light methyl-peptides. Briefly, hmSEEKER reads and filters methyl-peptide identifications from the msms file, then matches each methyl-peptide to its corresponding MS1 peak in the allPeptides file, and, finally, searches for the peak of the heavy/light peptide counterpart. For each putative heavy-light pair, the Log2 H/L ratio (LogRatio), Retention Time difference (dRT), and Mass Error (ME) parameters are calculated, and doublets that fall within user-defined cut-offs are labeled as true positives. The workflow of the biochemical protocol is described in Figure 1.
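The peak-pairing step described above can be illustrated with a minimal Python sketch. It is not the hmSEEKER implementation: the `Peak` record, the function names, and the exact ME definition (delta-mass deviation relative to the light m/z) are assumptions made for illustration only. The sketch computes the expected heavy-light m/z spacing from the isotopic composition of the methyl group and reports ME, dRT, and LogRatio for the closest same-charge candidate.

```python
from dataclasses import dataclass
from math import log2

# Monoisotopic mass shift of one heavy (13CD3) versus one light (CH3) methyl
# group: (13C - 12C) + 3 * (2H - 1H), about 4.0222 Da.
HEAVY_METHYL_SHIFT = (13.0033548 - 12.0) + 3 * (2.0141018 - 1.0078250)


@dataclass
class Peak:
    """Hypothetical MS1 peak record (not the MaxQuant column layout)."""
    mz: float         # mass-to-charge ratio, Th
    charge: int
    rt: float         # retention time, min
    intensity: float


def doublet_parameters(light, heavy, n_methyl=1):
    """Return ME (ppm), dRT (min) and LogRatio for a putative light/heavy pair."""
    expected_delta = n_methyl * HEAVY_METHYL_SHIFT / light.charge
    observed_delta = heavy.mz - light.mz
    return {
        "peak": heavy,
        # Assumed ME definition: delta-mass deviation relative to the light m/z.
        "ME": (observed_delta - expected_delta) / light.mz * 1e6,
        "dRT": heavy.rt - light.rt,
        "LogRatio": log2(heavy.intensity / light.intensity),
    }


def closest_heavy_candidate(light, peaks, n_methyl=1):
    """Pick the same-charge peak whose m/z is closest to the expected heavy m/z."""
    candidates = [
        doublet_parameters(light, p, n_methyl)
        for p in peaks
        if p is not light and p.charge == light.charge
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda d: abs(d["ME"]))
```

For a doubly charged mono-methylated peptide, the expected spacing is `HEAVY_METHYL_SHIFT / 2`, roughly 2.011 Th; `n_methyl=2` would be used for di-methylated peptides. The returned parameters would then be compared against the user-defined cut-offs.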
1. Cell culturing and protein extraction (indicative time required: 3–4 weeks)
2. Lysate digestion (indicative time required: 2 hours)
3. Peptide purification (indicative time required: 1 hour)
4. Coomassie-stained SDS-PAGE gel (indicative time required: 2 hours)
5. Peptide lyophilization (indicative time required: 2 days)
6. Off-line HpH-RP chromatographic fractionation of peptides (indicative time required: 4 days)
7. R-methylated peptide immuno-affinity enrichment (indicative time required: 2 days)
8. Desalting and concentration of affinity-enriched methyl-peptides by C18 microcolumns (indicative time required: 30 minutes)
9. Second enzymatic digestion (indicative time required: 3 hours)
10. Desalting peptides (indicative time required: 30 minutes)
11. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis (indicative time required: 5 days)
12. Running MaxQuant and hmSEEKER data analysis
The article describes a workflow for the high-confidence identification of global protein R-methylation, which is based on the combination of the enzymatic digestion of the protein extract with two distinct proteases in parallel, followed by HpH-RP liquid chromatography fractionation of proteolytic peptides and immuno-affinity enrichment of R-methyl-peptides with anti-pan-R-methyl antibodies (Figure 1).
The cells were grown in the presence of Methionine, either natural (Light, L, Met-0) or isotopically labeled (Heavy, H, Met-4). Upon full isotopic labeling, which was verified by MS analysis of a small aliquot of the Met-4-only extract, the heavy and light cells were harvested and mixed in a 1:1 L/H proportion, as illustrated in Figure 1A. Upon 13CD3-methionine metabolic labeling, the methyl groups are transferred to proteins from the methyl donor S-adenosyl-methionine (SAM) and will be present in either the light or the heavy-isotope form21. Figure 1B describes the arginine methylation reaction carried out by the Protein Arginine Methyltransferase (PRMT) family, which catalyzes the transfer of a methyl group from SAM to the guanidino nitrogen of arginine. If a single methyl group is placed on one of the terminal nitrogen atoms of arginine, mono-methylated arginine (MMA) is obtained. If two methyl groups are added on the same nitrogen atom of the guanidino group, asymmetric di-methylated arginine (ADMA) is generated, while if two methyl groups are placed on two different nitrogen atoms, symmetric di-methylated arginine (SDMA) is produced.
After mixing light- and heavy-labeled cells in a 1:1 ratio, proteins were extracted and subjected to digestion by Trypsin and LysargiNase, in parallel. As displayed in Figure 2, the SDS-PAGE Coomassie-stained gel was used to verify the efficient enzymatic digestion of total proteins into peptides (compare lanes I and II). Moreover, the efficiency of the purification step performed with the C18 Sep-Pak column was evaluated, confirming the absence of peptides in the flow-through of the C18 column (Figure 2, lane III) and in the first and second wash (Figure 2, lanes IV and V, respectively), and their expected presence in the eluate (Figure 2, lane VI). Proper Met-4 incorporation in the heavy channel (Figure 2B) and correct 1:1 H/L mixing (Figure 2C) were also evaluated.
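The two labeling controls can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the incorporation rate is taken as the heavy fraction H / (H + L) derived from per-peptide H/L ratios (as obtained from the incorporation-test search), and the mixing check is summarized as the mean Log2 H/L ratio with a ±2σ interval. The in-house script referenced in Figure 2 may compute these quantities differently, and the function names are hypothetical.

```python
from math import log2
from statistics import mean, median, stdev


def incorporation_rate(hl_ratios):
    """Median heavy fraction H / (H + L) over per-peptide H/L ratios.

    A value close to 1 indicates (near-)complete Met-4 incorporation.
    """
    return median(r / (1.0 + r) for r in hl_ratios)


def mixing_summary(hl_ratios):
    """Mean Log2 H/L ratio and the mean +/- 2 sigma interval.

    For a correct 1:1 mix, the mean is expected to be close to 0.
    """
    logs = [log2(r) for r in hl_ratios]
    mu, sigma = mean(logs), stdev(logs)
    return mu, (mu - 2 * sigma, mu + 2 * sigma)
```

The median is used for the incorporation rate so that a few outlier peptides do not dominate the estimate; this is a design choice of the sketch, not a documented feature of the original script.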
Figure 3 displays the chromatogram from the off-line HpH-RP liquid chromatography fractionation of peptides and the subsequent non-contiguous concatenation of fractions. Peptides were detected by UV absorbance at 215 nm, while any remaining undigested proteins were monitored at 280 nm. Below the chromatogram, the fraction concatenation strategy used to reduce the 70 starting fractions to a final 16, including the PRE and POST gradient fractions, is schematized.
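Non-contiguous concatenation can be illustrated with a short Python sketch. The exact pooling scheme is the one shown in Figure 3; the layout below is only one possible reading, assuming that the 70 gradient fractions are interleaved into 14 pools and that the PRE and POST gradient fractions are kept as two additional samples, for 16 in total. The function name and default values are hypothetical.

```python
def concatenate_fractions(n_fractions=70, n_pools=14):
    """Assign fractions 1..n_fractions to pools in a non-contiguous way.

    Fraction i goes to pool (i - 1) % n_pools, so each pool combines
    early-, mid-, and late-eluting fractions.
    """
    pools = [[] for _ in range(n_pools)]
    for fraction in range(1, n_fractions + 1):
        pools[(fraction - 1) % n_pools].append(fraction)
    return pools


if __name__ == "__main__":
    for index, members in enumerate(concatenate_fractions(), start=1):
        print(f"pool {index:2d}: {members}")
```

Under these assumptions, the first pool combines fractions 1, 15, 29, 43, and 57, so every pooled sample spans the whole elution profile.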
Anti-pan-R-methyl antibodies were used for the enrichment of R-methyl-peptides. These antibodies recognize the three types of R-methylation (MMA, SDMA, and ADMA) and they are commercially available as directly conjugated to agarose beads (see Table Material and Reagents for details). Table 1 lists all buffers and solutions used in this protocol.
After acquisition, each MS raw data file was analyzed twice with MaxQuant, to identify light and heavy methylations in different search groups, with the rationale that methyl-peptides (heavy and light) will only be identified in a specific group. Searching heavy and light methylations separately improves the analysis by reducing the number of variable modifications introduced and by reducing the risk of false positive mixed-labeled peptides. Once MaxQuant has assigned methyl-sites, hmSEEKER parses its output tables to reconstruct possible pairs of heavy-light peaks13.
Figures 4 and 5 illustrate full MS spectra of peptides FELTGIPPAPR(me) (Figure 4) and NPPGFAFVEFEDPR(me) (Figure 5), which represent a True Positive and a False Positive methyl-peptide annotation, respectively. In Figure 4, the m/z differences observed between the three peaks are consistent with the presence of an enzymatically methylated residue (7.0082 Th between the unmodified and light-methylated forms; 2.0102 Th between the light and heavy forms of the methyl-peptide). The resulting hmSILAC doublet has a ME of 0.40 ppm, a dRT of 0.00 min, and a LogRatio of -0.41; these values are within the default thresholds employed by hmSEEKER to distinguish true from false doublets, which were previously estimated to be as follows: |ME| < 2 ppm, |dRT| < 0.5 min, and |LogRatio| < 1. In the second case, illustrated in Figure 5, the m/z difference observed between the light-methylated peptide and its putative heavy counterpart deviates from the expected value by 0.0312 Th (ME = -37.28 ppm). Moreover, this doublet has a LogRatio of 2.50, which is outside the default LogRatio prediction interval (these cut-off values have been defined and discussed previously13). In fact, in the MS/MS spectrum, the sequence of the peptide NPPGFAFVEFEDPR(me) was not fully covered, and the assigned R-methylation could also be interpreted as a methyl-esterification of the glutamate or aspartate close to R.
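The cut-off rule applied to the two examples above can be written as a minimal Python sketch. It only reproduces the simple threshold comparison stated in the text (|ME| < 2 ppm, |dRT| < 0.5 min, |LogRatio| < 1); the function names are hypothetical, and the logistic-regression probability that hmSEEKER also reports (Table 4) is not modeled here. Since the dRT of the Figure 5 doublet is not quoted in the text, the sketch treats dRT as optional rather than assuming a value.

```python
# Default cut-offs quoted in the text (absolute values).
DEFAULT_CUTOFFS = {"ME": 2.0, "dRT": 0.5, "LogRatio": 1.0}


def failed_criteria(me_ppm, logratio, drt_min=None, cutoffs=None):
    """Return the names of the parameters that violate their cut-off.

    A parameter passed as None is not evaluated.
    """
    cutoffs = DEFAULT_CUTOFFS if cutoffs is None else cutoffs
    observed = {"ME": me_ppm, "dRT": drt_min, "LogRatio": logratio}
    return [
        name
        for name, value in observed.items()
        if value is not None and abs(value) >= cutoffs[name]
    ]


def is_true_doublet(me_ppm, logratio, drt_min=None, cutoffs=None):
    """A doublet is accepted only if no evaluated parameter fails."""
    return not failed_criteria(me_ppm, logratio, drt_min, cutoffs)


if __name__ == "__main__":
    # FELTGIPPAPR(me) doublet (Figure 4)
    print(is_true_doublet(me_ppm=0.40, logratio=-0.41, drt_min=0.00))
    # NPPGFAFVEFEDPR(me) doublet (Figure 5); dRT not quoted in the text
    print(failed_criteria(me_ppm=-37.28, logratio=2.50))
```

With the values quoted above, the FELTGIPPAPR(me) doublet passes all three criteria, whereas the NPPGFAFVEFEDPR(me) doublet fails on both ME and LogRatio.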
The hmSEEKER workflow is schematized in Figure 6, whereas Table 4 provides a description of the output table produced by this tool, to help the interpretation of the results: peptides that carry multiple modifications appear multiple times, each entry corresponding to a different methylation event on a given peptide; finally, the peak doublets are divided into three Classes (Matched, Mismatched, and Rescued; see Table 4): Matched doublets are the most confident, as the peptide was fragmented and identified in both the heavy and the light form.
Figure 1: Scheme of the experimental workflow and of enzymatic protein-R-methylation reactions. (A) Workflow diagram of the biochemical protocol. Cells are grown in light (Met-0) and heavy (Met-4) Methionine-containing medium for at least 8 doublings, and light and heavy channels are mixed in a 1:1 proportion. Proteins are extracted and subjected to digestion with Trypsin or LysargiNase in parallel and fractionated by off-line HpH-RP liquid chromatography by collecting 70 fractions, finally combined into 16 fractions. R-methyl-peptides are enriched by anti-pan-R-methyl antibodies conjugated to agarose beads, subjected to a second enzymatic digestion (Trypsin or LysargiNase, respectively), and analyzed by LC-MS/MS. Raw MS data are processed by the MaxQuant algorithm for peptide and PTM identification. MaxQuant output data are then submitted for analysis by the hmSEEKER bioinformatic tool, developed in-house for heavy and light methyl-peptide association. (B) Scheme of the R-methylation reaction. The guanidino group of arginine can be modified by the addition of one methyl group, producing mono-methylated arginine (MMA), or by the addition of two methyl groups, producing either symmetric (SDMA) or asymmetric (ADMA) di-methylated arginine. The reaction is catalyzed by enzymes of the Protein Arginine Methyltransferase (PRMT) family, which transfer these methyl groups from S-Adenosyl-Methionine (SAM). After the methyl group transfer, SAM is converted to S-adenosylhomocysteine (SAH).
Figure 2: Controls of protocol critical steps. (A) SDS-PAGE Coomassie-stained gel for evaluation of proteolytic digestion efficiency. MW: molecular weight markers. I) 20 µg of total H/L protein extract prior to digestion quantified by BCA; II) digested peptides loaded in the same proportion as in I; III) Flow-through of C18 cartridge loaded in the same proportion as lane I; IV–V) first and second wash of the C18 cartridge with buffer A, loaded in the same proportion as I; VI) eluates from the C18 cartridge, loaded in the same proportion as I. (B) Met-4 incorporation rate analysis. The Met-4 incorporation in the heavy channel is evaluated by in-house developed script (available at https://bitbucket.org/EMassi/hmseeker/src/master/); rate = 1 indicates full incorporation. (C) Gaussian distribution of H/L ratios for 1:1 mixing assessment. A normal distribution of Log2 H/L ratio is plotted considering ±2σ.
Figure 3: HpH fraction concatenation scheme and representative R-methylated peptides enrichment assessment. (A) High pH-Reversed Phase fractionation chromatogram and scheme of the non-contiguous fraction concatenation. The chromatogram represents the HpH-RP separation profile of peptides detected at 215 nm UV (blue line), while the presence of undigested proteins was tracked in the 280 nm UV channel (red line). The light green line represents the concentration of Buffer B along the chromatographic run. The fraction pooling scheme is reported, depicting the strategy of non-contiguous concatenation of early-, mid-, and late-eluting fractions, from 70 to 16, including PRE and POST gradient fractions. (B) Representative Table summarizing the enrichment of R-methylated peptides. The table recapitulates the total number of peptides and the relative percentage of R-methylation enrichment comparing each IP on its Input.
Figure 4: Example of true hmSILAC doublet. Mass spectrum of a true positive doublet. The peaks displayed correspond to peptide FELTGIPPAPR in the unmodified, light mono-methylated (CH3) and heavy mono-methylated (13CD3) forms, with charge 2+. The m/z differences observed between the three peaks are consistent with the presence of an enzymatically methylated residue. The table under the mass spectrum represents hmSEEKER output and contains the LogRatio, ME, and dRT parameters of the doublet.
Figure 5: Example of false hmSILAC doublet. Mass spectrum of a negative in vivo methyl-peptide assignment. The peaks at 811.3849 m/z and 818.3927 m/z correspond to the unmodified and light mono-methylated forms of peptide NPPGFAFVEFEDPR, with charge 2+. The third peak could be assigned as the heavy-methyl-counterpart of the light methylated peptide, but the observed m/z shift differs from the expected shift by 0.0312 Th, which rules out this possibility. The table under the mass spectrum represents hmSEEKER output and contains the LogRatio, ME, and dRT parameters of the doublet.
Figure 6: Schematic representation of data analysis workflow. (A) MaxQuant detects MS1 peaks in the raw data. (B) Peaks with an associated MS2 spectrum are processed by the database search engine Andromeda to obtain a peptide identification. (C) hmSEEKER reads MaxQuant peptide identifications and extracts methyl-peptides with Andromeda Score > 25, Delta Score > 12, and modifications with a Localization Probability > 0.75. (D) For each methyl-peptide that passes the quality filtering, hmSEEKER finds its corresponding MS1 peak in MaxQuant allPeptides table and then searches for its counterpart in the same table. (E) A doublet of peaks is defined by the difference in their retention time (RT), their intensity ratio (LogRatio), and the deviation between expected and observed delta mass (ME); these three parameters are used by hmSEEKER to distinguish true positives from false positives, as discussed13. (F) Finally, hmSEEKER produces lists of redundant and non-redundant doublets; the first includes predictions for all methyl-peptides, while the second is filtered so that when a peptide is identified multiple times, only the best scoring doublet is reported.
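The quality filter of panel C and the non-redundant selection of panel F in Figure 6 can be sketched in Python as follows. This is an illustration, not the hmSEEKER code: the dictionary keys (`score`, `delta_score`, `loc_prob`, `peptide`, `position`, `modification`) are hypothetical and do not correspond to MaxQuant column names, and it is assumed here that "best scoring" refers to the Andromeda score and that redundancy is defined per peptide, site, and modification.

```python
def passes_quality(psm, min_score=25.0, min_delta=12.0, min_loc_prob=0.75):
    """Apply the identification cut-offs listed in Figure 6C."""
    return (
        psm["score"] > min_score
        and psm["delta_score"] > min_delta
        and psm["loc_prob"] > min_loc_prob
    )


def non_redundant(doublets):
    """Keep only the best-scoring doublet per (peptide, position, modification)."""
    best = {}
    for doublet in doublets:
        key = (doublet["peptide"], doublet["position"], doublet["modification"])
        if key not in best or doublet["score"] > best[key]["score"]:
            best[key] = doublet
    return list(best.values())
```

In this sketch the redundant list is simply the input to `non_redundant`, so both outputs described in panel F can be derived from the same set of doublets.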
Buffer | Volume | Composition |
L-methionine (L) solution | 10 mL | 30 mg/mL Light-Methionine in ultrapure water |
L-methionine (H) solution | 10 mL | 30 mg/mL Heavy-Methionine in ultrapure water |
Medium for cell culture | 500 mL | DMEM with stable glutamine and without methionine, 10% (v/v) dialyzed FBS, 1% (v/v) P/S, 1:1000 (v/v) L-methionine solution |
Lysis Buffer | 50 mL | 9 M Urea, 20 mM HEPES pH 8.0, 1% (v/v) Protease Inhibitor, 1% (v/v) Phosphatase Inhibitor in ultrapure water |
Ammonium Bicarbonate (AMBIC) solution | 50 mL | 1 M NH4HCO3 in ultrapure water |
DTT solution | 10 mL | 1.25 M DTT in ultrapure water |
IAA solution | 5 mL | 109 mM IAA in ultrapure water |
Solvent A for Sep-Pak C18 | 50 mL | 0.1% TFA in ultrapure water |
Solvent B for Sep-Pak C18 | 50 mL | 0.1% TFA + 40% ACN in ultrapure water |
Wash solution for Sep-Pak C18 | 50 mL | 0.1% TFA + 5% ACN in ultrapure water |
Buffer A for HpH fractionation | 500 mL | 25 mM NH4OH in ultrapure water |
Buffer B for HpH fractionation | 500 mL | 25 mM NH4OH + 90% ACN in ultrapure water |
IP binding buffer 1x | 5 mL | dilute 1:10 (v/v) in ultrapure water from the commercially available 10x stock solution |
IP elution buffer | 50 mL | 0.15% TFA in ultrapure water |
Buffer A for Stage-Tips | 50 mL | 0.1% TFA in ultrapure water |
Buffer B for Stage-Tips | 50 mL | 0.1% TFA + 40% ACN in ultrapure water |
Buffer C for Stage-Tips | 50 mL | 0.1% TFA + 50% ACN in ultrapure water |
MS Solvent A | 250 mL | 0.1% FA in ultrapure water |
MS Solvent B | 250 mL | 0.1% FA + 80% ACN in ultrapure water |
Protease inhibitors cocktail | 5 mL | cOmplete, EDTA-free Protease Inhibitor Tablets (ROCHE) dissolved in ultrapure water according to the manufacturer's instructions |
Phosphatase inhibitors cocktail | 5 mL | PhosSTOP Tablets (ROCHE) dissolved in ultrapure water according to the manufacturer's instructions |
Table 1: Buffers and solutions composition. Lists of the buffers and solutions used in this protocol.
Parameters | Value |
Sample Loading (µL) | 2 |
Loading Flow Rate (µL/min) | 10 |
Gradient Flow Rate (nL/min) | 300 |
Linear Gradient | 3-30% B for 89 min, 30-60% B for 5 min, 60-95% B for 1 min, 95% B for 5 min |
Full Scan Resolution | 70,000 |
Number of most intense ions selected | 15 |
Relative Collision energy (%) (CID) | 28 |
Dynamic Exclusion (s) | 20.0 |
Table 2: LC-MS/MS setting. Parameters applied for the LC-MS/MS analysis of R-methyl-peptides on a high-resolution Quadrupole-Orbitrap Mass Spectrometer, coupled to a nano-flow ultra-high-performance liquid chromatography (UHPLC) system.
MQ Parameters Settings (ver 1.6.2.10) | |||
Setting | Action | ||
Configuration | |||
Modifications | Met4 | Add new modification. Set Composition to H(-3) Hx(3) Cx C(-1) and choose M as the specificity. | |
Methyl4 (KR) | Duplicate "Methyl (KR)", rename it and change composition to Cx H(-1) Hx(3) | ||
Dimethyl4 (KR) | Duplicate "Dimethyl (KR)", rename it and change composition to H(-2) Hx(6) Cx(2) | ||
Trimethyl4 (K) | Duplicate "Trimethyl (K)", rename it and change composition to Cx(3) H(-3) Hx(9) | ||
OxMet4 | Duplicate "Oxidation (M)" and rename it. | ||
Proteases | Lysarginase | Add new protease. Select the 'R' and 'K' columns. | |
When creating a new PTM or protease, click "Modify Table" to change the MaxQuant settings and then "Save Changes" to confirm the changes. Restart MaxQuant and the new options will be visible. | |||
Raw Files tab | |||
Parameters group | Separate the raw files into 2 groups (0 and 1) | ||
Group Specific Parameters | |||
Type | Type | Standard | |
Multiplicity | 1 | ||
Digestion | Enzyme | Trypsin or Lysarginase | |
Max. Missed Cleavages | Set to 3 | ||
Modifications | Variable modifications | Group 0 | Oxidation (M), Methyl (KR), Dimethyl (KR), Trimethyl (K) |
Group 1 | OxMet4, Methyl4 (KR), Dimethyl4 (KR), Trimethyl4 (K) | ||
Fixed Modifications | Group 0 | Carbamidomethylation | |
Group 1 | Carbamidomethylation and Met4 | ||
Global parameters | |||
Sequences | Fasta files | Load FASTA file | |
Identification | PSM FDR | Set to 0.01 | |
Min. Score for modified peptides | Set to 1 | ||
Min. Delta score for modified peptides | Set to 1 | ||
Advanced Identification | Second peptide search | Check off | |
Tables | Write allPeptides table | Check | |
Advanced | Calculate peak properties | Check | |
If not specified, leave the default parameter. | |||
MQ Parameters Settings for Incorporation Test | |||
Group Specific Parameters | |||
Type | Type | Standard | |
Multiplicity | 2 | ||
Max Labeled | 5 | ||
Heavy Label | Select Met4 | ||
Digestion | Enzyme | Trypsin or Lysarginase | |
Max. Missed Cleavages | Set to 3 | ||
Modifications | Variable modifications | Oxidation (M) | |
Fixed Modifications | Carbamidomethylation | ||
If not specified, leave the default parameter. |
Table 3: MaxQuant processing parameters. The group-specific and global parameters adjusted for the specific experiment described are listed. All other parameters were left at their default values, which depend on the program version used.
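Before running the search, the custom compositions entered in Table 3 can be sanity-checked with a short Python sketch. It assumes that, in the MaxQuant composition notation, `Cx` denotes 13C and `Hx` denotes 2H, and it uses standard monoisotopic masses; the parser and its function name are hypothetical and only cover the four element symbols used here. Each heavy modification should differ from its light counterpart by a multiple of about 4.0222 Da.

```python
import re

# Monoisotopic masses (Da); "Cx" and "Hx" are assumed to denote 13C and 2H.
MONO = {"C": 12.0, "Cx": 13.0033548, "H": 1.0078250, "Hx": 2.0141018}

# One token is an element symbol with an optional signed count, e.g. "Hx(3)".
TOKEN = re.compile(r"([A-Z][a-z]?)(?:\((-?\d+)\))?")


def composition_mass(composition):
    """Sum the monoisotopic mass of a composition such as 'Cx H(-1) Hx(3)'."""
    total = 0.0
    for token in composition.split():
        match = TOKEN.fullmatch(token)
        if match is None or match.group(1) not in MONO:
            raise ValueError(f"unsupported token: {token!r}")
        total += MONO[match.group(1)] * int(match.group(2) or 1)
    return total


if __name__ == "__main__":
    light_methyl = composition_mass("C H(2)")
    for name, comp, n in [
        ("Met4", "H(-3) Hx(3) Cx C(-1)", 0),
        ("Methyl4 (KR)", "Cx H(-1) Hx(3)", 1),
        ("Dimethyl4 (KR)", "H(-2) Hx(6) Cx(2)", 2),
        ("Trimethyl4 (K)", "Cx(3) H(-3) Hx(9)", 3),
    ]:
        shift = composition_mass(comp) - n * light_methyl
        print(f"{name}: heavy-light shift = {shift:.4f} Da")
```

A shift that is not a multiple of about 4.0222 Da would point to a typing error in the composition field.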
Column name | Description |
Rawfile | Raw data file in which the doublet was identified |
H-Scan | Scan number of the Heavy counterpart |
L-Scan | Scan number of the light counterpart |
CLASS | Can have 3 values: |
Matched = Heavy and Light peptides are identified with the same sequence | |
Mismatched = Heavy and light peptides have the same aa sequence but there is a mismatch in the localization of the methylated site | |
Rescued = Only one peptide in the doublet is identified; its counterpart is an unidentified peak. | |
PEPTIDE | Peptide sequence |
SCORE | Peptide Andromeda Score |
RES | Modified residue |
POS | Position of the modified residue |
MOD | Modification |
LEAD PROTEIN | Protein the peptide belongs to |
GENE | Gene name corresponding to the protein |
PROBABILITY_TRUE | Probability of the doublet being a true hmSILAC doublet, calculated by the logistic regression model |
PREDICTION | 1 if the doublet is putative true, 0 if it's false |
H/L LOGRATIO | Log2 of the Heavy/Light Intensity ratio |
ME | Deviation between expected and observed mass difference |
DRT | Difference in retention time |
Table 4: hmSEEKER output results description. List of the column entries in the hmSEEKER output table, with a brief description of their content.
The high confidence identification of in vivo protein/peptide methylation by global MS-based proteomics is challenging, due to the risk of high FDR, with several amino acid substitutions and methyl-esterification occurring during sample preparation that are isobaric to methylation and can cause wrong assignments in the absence of orthogonal MS validation strategies. The substoichiometric nature of this PTM further complicates the task of global methyl-proteomics, but can be overcome with the selective enrichment of modified peptides10.
Here, a biochemical and analytical workflow is presented, which is designed to increase the efficiency and reliability of the global MS-analysis of R-methyl-peptides through the application of the hmSILAC strategy coupled to HpH-RP chromatography peptide fractionation and affinity enrichment with anti-pan-R-methyl-peptide antibody kits. The former strategy allows orthogonal validation of methyl-peptides and strongly reduces the FDR of identification, while the latter protocol increases their detectability against the background of unmodified peptides22. However, a limitation of this protocol is the requirement for a very large amount of starting protein extract (in the range of 20-40 mg) as input for the subsequent peptide fractionation and affinity enrichment, which limits the application of the method to immortalized, fast-growing cell lines that can be expanded extensively. The current setup is therefore not applicable to patient-derived primary cells or tissues. Future investigations should be directed at improving the protocol in this direction: additional strategies for the biochemical enrichment of methylated peptides over unmodified ones could allow circumventing the use of antibodies, enabling the scaling down of the experiments. Another interesting development could be the combination of the current methods with the chemical modification of proteolytic peptides with isobaric or tandem mass tags, with two potential advantages: on the one hand, the possibility of combining multiple conditions in one single experiment, thus multiplexing the relative quantification of methyl-proteomic changes upon different perturbations; on the other hand, pooling different samples into one prior to chromatographic fractionation and affinity enrichment may allow the scale of individual experiments to be reduced.
This protocol relies on two separate digestions of the whole cell extract in parallel with Trypsin and LysargiNase. Trypsin cleaves the peptide bond at the C-terminal side of K and R residues, generating peptides that present a positively charged residue at the C-terminus, in addition to the N-terminal positive charge from the α-amine23. The LysargiNase enzyme selectively hydrolyzes peptidyl-K and -R bonds, generating peptides that bear a K or R at the N-terminal site, which can include K-methylated forms. The use of both proteases increases the overall proteome coverage in large-scale MS-analysis, leading to the identification of peptides that would otherwise be missed with tryptic digestion alone18. The double enzymatic digestion, instead, is carried out to reduce the number of possible missed enzymatic cleavages. In fact, methylation of K and R strongly reduces the efficiency of protein cleavage by trypsin. In spite of this precaution, it is still common for methylated peptides to be longer and contain missed cleavages, which leads to poor CID fragmentation.
The use of another type of fragmentation, such as Electron Transfer Dissociation (ETD), could solve this issue. Indeed, ETD usually does not fragment doubly charged peptide ions as efficiently as CID does, but it provides fairly uniform cleavage of peptide precursors of higher charge states (≥3). This could be an advantage in the case of R-methylation, since it frequently occurs in Arginine-rich domains that contain multiple and neighboring R residues. However, ETD has a lower scan rate than CID, so the total number of peptide identifications is reduced24,25,26.
Recently, several protocols that involve the enrichment of post-translationally modified peptides have been coupled with different chromatography separation strategies that help reduce the complexity of the peptide mixture, thus increasing the overall efficiency of modified peptide detection by MS. Here, HpH-RP chromatographic fractionation coupled with non-contiguous concatenation of the fractions is applied. Off-line peptide fractionation based on high pH reversed phase chromatography provides a high-resolution separation that is orthogonal to the on-line low pH RP-separation carried out downstream during the LC-MS/MS run27. Moreover, the non-contiguous concatenation strategy has two main advantages: first, it increases the protein coverage by pooling early-, middle-, and late-eluting fractions into individual concatenated fractions, preserving the heterogeneity of the peptide mixture. Second, the concatenation reduces the subsequent MS run time, because a lower number of sample fractions has to be acquired28.
Due to the substoichiometric nature of R-methylation, an enrichment step is necessary in order to facilitate the detection of methyl-peptides in global MS-analysis of modification proteomes. In this protocol, the methyl-peptides are enriched by immuno-affinity precipitation (IAP) using anti-SDMA and anti-ADMA antibodies in parallel, while the immuno-precipitation of mono-methyl-peptides using the anti-MMA antibody is carried out on the flow-throughs (FTs) from the previous IAP experiments. This order reflects the different efficiency of these antibodies: anti-SDMA and anti-ADMA antibodies have lower binding efficiency compared to the anti-MMA antibody. Notably, this different efficiency may also cause biases in the representation of the different degrees of R-methylation in experimentally annotated modification-proteomes29.
Before the commercial availability of anti-pan-R-methylation antibodies, other separation strategies were applied to boost R-methylated peptide detection by MS, such as strong cation exchange (SCX) and hydrophilic interaction (HILIC) chromatography. Although these techniques could reduce the complexity of the peptide mixture analyzed by MS, they did not significantly improve the identification of methyl-peptides30,31,32,33.
In spite of all these technical and analytical solutions aiming at increasing methyl-peptide separation, detection, fragmentation, and sequence annotation, the methyl-proteome coverage is still limited and biased toward the more abundant methylated proteins, such as ribonucleoproteins and RNA-binding helicases, while several known low-abundance modified proteins (e.g., TP53BP1, CHTF8, MCM2) are only detected serendipitously and not reliably over multiple global experiments34. Subcellular fractionation applied prior to the current workflow could improve the detection of such proteins; however, the experimental scale currently required does not make this a viable alternative.
Upon MS, the raw data are analyzed with the MaxQuant algorithm for peptide and PTM identification. The analysis of data from hmSILAC experiments is, however, not straightforward with standard search algorithms. For instance, while MaxQuant can efficiently analyze standard SILAC experiments based on metabolic labeling with isotopically encoded K and R, it does not work efficiently when the isotope label is encoded in a variable PTM, as in the case of heavy-methyl labeling, which leads to heavy methylation. Therefore, the strategy adopted here consists of first analyzing the hmSILAC data with MaxQuant without its built-in doublet-searching functionality, so that the light and heavy peptides are identified independently, and then matching them with post-processing software. This bioinformatic workflow has its own pitfalls: methylations must be specified in both heavy and light forms in the Variable Modifications panel of MaxQuant, for a total of eight variable modifications when Methionine oxidation (heavy and light) is also included. Searching too many PTMs with a database search engine such as MaxQuant/Andromeda is impractical, because it leads to an exponential increase in the number of theoretical peptides the algorithm has to test. Our solution was to analyze each MS raw data file twice, with different sets of variable PTMs (through the parameter groups function of MaxQuant). After the peptide search, the in-house developed tool hmSEEKER is employed to support the assignment of heavy-light peptide pairs from the output tables produced by MaxQuant. The first release of the hmSEEKER algorithm has been recently published13, where it was shown that hmSEEKER can identify hmSILAC doublets with FDR < 1%.
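To give a feeling for why splitting the variable modifications over two searches helps, the following toy calculation counts how many modified forms of a single peptide a search engine would have to consider. It is a deliberately simplified model, not a description of how Andromeda enumerates candidates: it assumes that every candidate residue can carry any of the listed modifications (in reality, each modification is restricted to specific residues), and the function name, the number of sites, and the cap on modifications per peptide are hypothetical.

```python
from math import comb


def count_modified_forms(n_sites: int, n_mods: int, max_mods: int) -> int:
    """Count modified forms of one peptide in a simplified model.

    n_sites:  residues that may be modified (assumption: any site accepts any mod)
    n_mods:   number of alternative variable modifications
    max_mods: maximum number of modified residues allowed per peptide
    """
    upper = min(n_sites, max_mods)
    # Choose k of the n_sites residues, then one of n_mods modifications for each.
    return sum(comb(n_sites, k) * n_mods**k for k in range(upper + 1))


if __name__ == "__main__":
    # Hypothetical peptide with 4 modifiable residues, up to 5 modifications allowed.
    single_search = count_modified_forms(n_sites=4, n_mods=8, max_mods=5)
    split_search = 2 * count_modified_forms(n_sites=4, n_mods=4, max_mods=5)
    print(f"one search with 8 variable PTMs:  {single_search} forms")
    print(f"two searches with 4 variable PTMs: {split_search} forms in total")
```

In this model, one search with eight modifications has to consider 6561 forms of the example peptide, whereas two searches with four modifications each consider 1250 forms in total, which illustrates the benefit of splitting the modifications across parameter groups.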
False positives can still arise from pairs of peaks that by chance have a mass difference that is a multiple of 4.02 Da, but this is very unlikely for doublets classified as Matched or Mismatched: for such a doublet to be false, Andromeda has to incorrectly determine the sequence of both the heavy and the light counterpart. Assuming that the search engine has been run with its default parameters, each identification has a 1% probability of being incorrect. Thus, if the two identifications are treated as independent, the probability that both are incorrect is 1% × 1% = 0.01%.
One pitfall of hmSILAC is that peptides containing Methionine in their backbone also generate doublets that are indistinguishable from those generated by methyl-peptides. Nevertheless, in our experience, this does not represent a major issue, for three reasons. First, peptides without methylations can simply be discarded from the MaxQuant output. Second, hmSEEKER automatically takes into account any Methionine residue in a methyl-peptide when calculating the expected mass difference. Third, the heavy and light modifications are searched in separate parameter groups, so that the search engine cannot split a heavy mono-methylation (+18.03 Da) into a light mono-methylation plus a heavy Methionine (14.01 + 4.02 Da).
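The expected mass difference mentioned above can be sketched as follows. This is not the hmSEEKER source code: the function names, the way the ppm error is normalized, and the 10 ppm tolerance are assumptions made for illustration. The only physical constant used is the mass shift between a 13CD3 and a 12CH3 group (about 4.02 Da), which applies once for each methyl group and once for each Methionine residue in the peptide.

```python
# Mass shift between heavy (13CD3) and light (12CH3) methyl groups, in Da:
# (13C - 12C) + 3 * (2H - 1H) = 1.003355 + 3 * 1.006277
HEAVY_METHYL_SHIFT = 4.022185


def expected_delta_mass(n_methyl: int, n_met: int) -> float:
    """Expected heavy-light mass difference (Da) for a peptide.

    Every methyl group on a modified residue and every backbone Methionine
    carries one labeled methyl group, so both contribute one shift each.
    """
    return HEAVY_METHYL_SHIFT * (n_methyl + n_met)


def is_candidate_doublet(light_mz: float, heavy_mz: float, charge: int,
                         n_methyl: int, n_met: int,
                         tol_ppm: float = 10.0) -> bool:
    """Check whether two peaks are separated by the expected m/z difference.

    tol_ppm is an arbitrary illustrative tolerance; the error is expressed
    relative to the light m/z (an assumption of this sketch).
    """
    expected_dmz = expected_delta_mass(n_methyl, n_met) / charge
    observed_dmz = heavy_mz - light_mz
    error_ppm = abs(observed_dmz - expected_dmz) / light_mz * 1e6
    return error_ppm <= tol_ppm


if __name__ == "__main__":
    # Di-methylated peptide containing one Methionine: 3 labeled methyl groups.
    print(round(expected_delta_mass(n_methyl=2, n_met=1), 4))
    # Doubly charged mono-methyl-peptide without Methionine.
    print(is_candidate_doublet(500.0, 500.0 + HEAVY_METHYL_SHIFT / 2, 2, 1, 0))
```

A real implementation would also compare retention times and intensity ratios of the two peaks before accepting a pair; these checks are omitted here for brevity.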
A more formal and experimental solution to this problem was proposed by Oreste Acuto and his collaborators, who developed a variant of hmSILAC, named isomethionine methyl-SILAC (iMethyl-SILAC)22. In this alternative metabolic labeling protocol, natural light Methionine is replaced by [13C4]-Methionine, which has the same mass as [13CD3]-Methionine (Met-4), yet it does not produce stable isotopically-encoded methyl-groups, due to the different distribution of the heavy isotopes within the molecular tag. Thus, in iMethyl-SILAC experiments, unmodified Methionine-containing peptides do not generate doublets. However, it should be noted that when Acuto and co-workers compared the performance of iMethyl-SILAC and traditional hmSILAC, the two methods still displayed very similar FDRs.
A possible limitation of hmSEEKER is that it is designed to work directly on MaxQuant output tables and is therefore not compatible with other search engines, whose output files are structured differently. In this respect, MethylQuant35 provides a good alternative bioinformatic tool that is tailored for the direct analysis of MS raw data from hmSILAC-type experiments and is more flexible in terms of the input files it accepts. A machine learning model is under development to distinguish true from false methyl-peptide H/L doublets without relying on user-defined thresholds.
The authors have nothing to disclose.
MM and EM are PhD students within the European School of Molecular Medicine (SEMM). EM is the recipient of a 3-year FIRC-AIRC bursary (Project Code: 22506). Global analyses of R-methyl-proteomes in the TB group are supported by the AIRC IG Grant (Project Code: 21834).
Name | Company | Catalog Number | Comments |
Ammonium Bicarbonate (AMBIC) | Sigma-Aldrich | 09830 | |
Ammonium Persulfate (APS) | Sigma-Aldrich | 497363 | |
C18 Sep-Pak columns vacc 6cc (1g) | Waters | WAT036905 | |
Colloidal Coomassie staining Instant | Sigma-Aldrich | ISB1L-1L | |
cOmplete Mini, EDTA-free | Roche-Sigma Aldrich | 11836170001 | Protease Inhibitor |
Dialyzed Fetal Bovine Serum (FBS) | GIBCO ThermoFisher | 26400-044 | |
DL-Dithiothreitol (DTT) | Sigma-Aldrich | 3483-12-3 | |
DMEM Medium | GIBCO ThermoFisher | requested | with stable glutamine and without methionine |
EASY-nano LC 1200 chromatography system | ThermoFisher | | |
EASY-Spray HPLC Columns | ThermoFisher | ES907 | |
Glycerol | Sigma-Aldrich | G5516 | |
HeLa cells | ATCC | ATCC CCL-2 | |
HEPES | Sigma-Aldrich | H3375 | |
Iodoacetamide (IAA) | Sigma-Aldrich | 144-48-9 | |
Jupiter C12-RP column | Phenomenex | 00G-4396-E0 | |
L-Methionine | Sigma-Aldrich | M5308 | Light (L) Methionine |
L-Methionine-(methyl-13C,d3) | Sigma-Aldrich | 299154 | Heavy (H) Methionine |
LysargiNase | Merck Millipore | EMS0008 | |
Microtip Cell Disruptor Sonifier 250 | Branson | | |
N,N,N′,N′-Tetramethylethylenediamine (TEMED) | Sigma-Aldrich | T9281 | |
Penicillin-Streptomycin | GIBCO ThermoFisher | 15140122 | |
PhosSTOP | Roche-Sigma Aldrich | 4906837001 | Phosphatase Inhibitor |
Pierce C18 Tips | ThermoFisher | 87782 | |
Pierce 0.1% Formic Acid (v/v) in Acetonitrile, LC-MS Grade | ThermoFisher | 85175 | LC-MS Solvent B |
Pierce 0.1% Formic Acid (v/v) in Water, LC-MS Grade | ThermoFisher | 85170 | LC-MS Solvent A |
Pierce Acetonitrile (ACN), LC-MS Grade | ThermoFisher | 51101 | |
Pierce Water, LC-MS Grade | ThermoFisher | 51140 | |
Polyacrylamide | Sigma-Aldrich | 92560 | |
Precision Plus Protein All Blue Prestained Protein Standards | Bio-Rad | 1610373 | |
PTMScan antibodies α-ADMA | Cell Signaling Technology | 13474 | |
PTMScan antibodies α-MMA | Cell Signaling Technology | 12235 | |
PTMScan antibodies α-SDMA | Cell Signaling Technology | 13563 | |
Q Exactive HF Hybrid Quadrupole-Orbitrap Mass Spectrometer | ThermoFisher | | |
Sequencing Grade Modified Trypsin | Promega | V5113 | |
Trifluoroacetic acid | Sigma-Aldrich | T6508 | |
Ultimate 3000 HPLC | Dionex | | |
Urea | Sigma-Aldrich | U5378 | |
Vacuum Concentrator 5301 | Eppendorf | | Speed vac |