We present a protocol to accurately quantitate proteins with isobaric labeling, extensive fractionation, bioinformatics tools, and quality control steps in combination with liquid chromatography interfaced to a high-resolution mass spectrometer.
Many exceptional advances have been made in mass spectrometry (MS)-based proteomics, with particular technical progress in liquid chromatography (LC) coupled to tandem mass spectrometry (LC-MS/MS) and isobaric labeling multiplexing capacity. Here, we introduce a deep-proteomics profiling protocol that combines 10-plex tandem mass tag (TMT) labeling with an extensive LC/LC-MS/MS platform and post-MS computational interference correction to accurately quantitate whole proteomes. This protocol includes the following main steps: protein extraction and digestion, TMT labeling, 2-dimensional (2D) LC, high-resolution mass spectrometry, and computational data processing. Quality control steps are included for troubleshooting and evaluating experimental variation. More than 10,000 proteins in mammalian samples can be confidently quantitated with this protocol. This protocol can also be applied to the quantitation of posttranslational modifications with minor changes. This multiplexed, robust method provides a powerful tool for proteomic analysis in a variety of complex samples, including cell culture, animal tissues, and human clinical specimens.
Advances in next-generation sequencing technology have led to a new landscape for studying biological systems and human disease. This has made it feasible to measure the genome, transcriptome, proteome, metabolome, and other molecular systems on a large scale. Mass spectrometry (MS) is one of the most sensitive methods in analytical chemistry, and its application in proteomics has rapidly expanded after the sequencing of the human genome. In the proteomics field, the past few years have yielded major technical advances in MS-based quantitative analyses, including isobaric labeling and multiplexing capability combined with extensive liquid chromatography, in addition to instrumentation advances, allowing for faster, more accurate measurements with less sample material required. Quantitative proteomics has become a mainstream approach for profiling tens of thousands of proteins and posttranslational modifications in highly complex biological samples1,2,3,4,5,6.
Multiplexed isobaric labeling methods such as isobaric tag for relative and absolute quantitation (i.e., iTRAQ) and tandem mass tag (TMT) MS have greatly improved sample throughput and increased the number of samples that can be analyzed in a single experiment1,6,7,8. Along with other MS-based quantitation methods, such as label-free quantitation and stable isotope labeling with amino acids in cell culture (i.e., SILAC), the potential of these techniques in the proteomics field is considerable9,10,11. For example, the TMT method permits 10 protein samples to be analyzed together in 1 experiment by using 10-plex reagents. These structurally identical TMT tags have the same overall mass, but heavy isotopes are differentially distributed on carbon or nitrogen atoms, resulting in a unique reporter ion during MS/MS fragmentation of each tag, thereby enabling relative quantitation between the 10 samples. The TMT strategy is routinely applied to study biological pathways, disease progression, and cellular processes12,13,14.
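To make the reporter-ion idea concrete, the following is a minimal Python sketch of how the 10 reporter intensities could be read out of a single MS/MS peak list. It is not part of the JUMP software. The function name `extract_reporters`, the peak-list format (pairs of m/z and intensity), and the 10 ppm tolerance are assumptions made for illustration, and the reporter m/z values are the commonly cited nominal masses, which should be checked against the reagent documentation.

```python
from typing import Dict, List, Tuple

# Commonly cited m/z values of the TMT 10-plex reporter ions.
# Assumption: verify against the reagent documentation before real use.
TMT10_REPORTERS: Dict[str, float] = {
    "126": 126.127726,
    "127N": 127.124761,
    "127C": 127.131081,
    "128N": 128.128116,
    "128C": 128.134436,
    "129N": 129.131471,
    "129C": 129.137790,
    "130N": 130.134825,
    "130C": 130.141145,
    "131": 131.138180,
}


def extract_reporters(
    peaks: List[Tuple[float, float]], tol_ppm: float = 10.0
) -> Dict[str, float]:
    """Return the most intense peak within tol_ppm of each reporter m/z.

    peaks is a list of (m/z, intensity) pairs from one MS/MS scan
    (hypothetical input format). Channels with no matching peak get 0.0.
    """
    intensities: Dict[str, float] = {}
    for channel, target in TMT10_REPORTERS.items():
        tol = target * tol_ppm * 1e-6  # ppm tolerance converted to m/z units
        matches = [inten for mz, inten in peaks if abs(mz - target) <= tol]
        intensities[channel] = max(matches, default=0.0)
    return intensities


if __name__ == "__main__":
    scan = [(126.1277, 100.0), (127.1248, 50.0), (127.1311, 75.0), (300.0, 999.0)]
    for channel, intensity in extract_reporters(scan).items():
        print(channel, intensity)
```

The N and C variants of a channel differ by only about 6 mDa, which is why a tight tolerance and a high-resolution MS/MS scan are needed to keep the 10 channels apart.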
Substantial technical improvements have enhanced liquid chromatography (LC)-MS/MS systems, both in terms of LC separations and MS parameters, to maximize protein identification without sacrificing quantitation accuracy. First-dimension separation of peptides by a technique with high orthogonality to the second dimension is critical in this type of shotgun proteomics method to achieve maximum results20. High-pH reversed-phase liquid chromatography (RPLC) provides better performance than does conventional strong cation-exchange chromatography20. When high-pH RPLC is combined with a second dimension of low-pH RPLC, both analytical dynamic range and protein coverage are improved, resulting in the ability to identify the bulk of expressed proteins when performing whole-proteome analyses15,16,17,18. Other technical advances include small C18 particles (1.9 µm) and extended long columns (~1 m)19. Furthermore, other notable improvements include new versions of mass spectrometers with rapid scan rates, improved sensitivity and resolution20, and sophisticated bioinformatics pipelines for MS data mining21.
Here, we describe a detailed protocol that incorporates the most recent methodologies with modifications to improve both sensitivity and throughput, while focusing on quality control mechanisms throughout the experiment. The protocol includes protein extraction and digestion, TMT 10-plex labeling, basic pH and acidic pH RPLC fractionation, high-resolution MS detection, and MS data processing (Figure 1). Moreover, we implement several quality control steps for troubleshooting and evaluating experimental variation. This detailed protocol is intended to help researchers new to the field routinely identify and accurately quantitate thousands of proteins from a lysate or tissue.
CAUTION: Please consult all relevant safety data sheets (i.e., MSDS) before use. Please use all appropriate safety practices when performing this protocol.
NOTE: A TMT 10-plex isobaric label reagent set is used in this protocol for the proteome quantitation of 10 samples.
1. Preparation of Cells/Tissues
NOTE: It is critical to collect samples in minimal time at low temperature to keep proteins in their original biological state.
2. Protein Extraction, Quality Control Western Blotting, In-solution Digestion, and Peptide Desalting
NOTE: Handling each of the 10 samples the same way during any step before the pooling of TMT-labeled samples is essential to reduce variation.
3. TMT Labeling of Peptides
NOTE: It is critical to ensure that all samples are fully labeled by TMT reagents. Several factors (e.g., amount of TMT reagents used, pH value, and accuracy of protein quantitation) can affect TMT labeling efficiency, which will negatively alter all downstream results.
4. Extensive High-resolution, Basic pH LC Prefractionation
5. LC-MS/MS Preparation and Parameters
NOTE: In TMT-based quantitation, peptide ions are isobaric and appear as 1 mass in an MS1 scan. However, they are quantitated according to the intensity of reporter ions (10 unique reporter ions) in the MS/MS scan after the peptide ion has been fragmented with higher energy collision dissociation (HCD). The TMT reporter ion ratios may be suppressed by co-eluting TMT-labeled ions24. Narrowing the ion isolation window25, gas-phase purification26, the MultiNotch MS3 method27, or extensive fractionation with multidimensional LC and long gradients (4-8 h)28 are alternative approaches to reduce this ratio suppression.
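The ratio suppression mentioned in this note can be illustrated with a small numerical sketch in Python. It rests on a simplifying assumption that is not stated in the protocol: the co-isolated background contributes the same reporter signal to every channel. The function name `observed_ratios` and its parameters are hypothetical.

```python
from typing import List


def observed_ratios(
    true_abundances: List[float], interference_fraction: float
) -> List[float]:
    """Simulate reporter ratios after mixing in co-isolated background.

    interference_fraction is the fraction of the total reporter signal that
    comes from contaminating peptides (0 <= fraction < 1). The background is
    assumed to be spread equally over all channels (illustrative assumption).
    Ratios are reported relative to the first channel.
    """
    if not 0.0 <= interference_fraction < 1.0:
        raise ValueError("interference_fraction must be in [0, 1)")
    total = sum(true_abundances)
    # Background total B chosen so that B / (total + B) == interference_fraction.
    background_total = total * interference_fraction / (1.0 - interference_fraction)
    per_channel = background_total / len(true_abundances)
    observed = [a + per_channel for a in true_abundances]
    return [x / observed[0] for x in observed]


if __name__ == "__main__":
    for fraction in (0.0, 0.05, 0.2):
        ratios = observed_ratios([1.0, 3.0, 10.0], fraction)
        print(fraction, [round(r, 2) for r in ratios])
```

Under this assumption, a true 1:3:10 mix moves toward 1:1:1 as the interference fraction grows, which is the compression that the approaches listed above try to limit.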
6. MS Data Analysis
NOTE: We describe data analysis with the JUMP software program. However, data analysis can be performed with other commercially available or free programs.
7. MS Data Validation
NOTE: To evaluate the quality of MS data, at least 1 method of validation should be performed before proceeding with time-consuming biological experiments.
We used a previously described cross-species peptide mix to systematically analyze the effect of ratio compression in 3 major protocol steps, including pre-MS fractionation, MS settings, and post-MS correction23. The pre-MS fractionation was evaluated and optimized by using a combination of basic pH RPLC and acidic pH RPLC. For post-MS analysis, only species-specific peptides were considered. We used this interference model to examine a number of parameters in LC/LC-MS/MS, including the MS2 isolation window, online LC resolution, loading amount of the online acidic pH RPLC, and resolution of the offline high-pH RPLC (see Figure 3 in reference29).
To alter the resolution during the offline LC, we separated the isobaric labeled peptides into 320 fractions, and then combined selected fractions together to adjust the separation power. Offline LC resolution was tested by using a number of collected fraction subsets (1, 5, 10, 20, 40, 80, and 320), while monitoring the interference levels. We found that the interference levels decreased gradually from 16.4% to 2.8%, indicating that extensive fractionation during offline LC separation somewhat alleviated the problem of co-eluting peptides but was not capable of completely removing the interference.
Next, we evaluated the effect of the MS2 isolation window on interference. We determined experimentally that the interference was approximately proportional to the size of the isolation window, in agreement with previous studies29,30. For example, a 4-fold difference of window size (1.6 to 0.4 Da) resulted in a ~4-fold difference of interference level (14.4% to 3.7%). We also determined the optimal loading amount on the column for MS analysis and used this information to further reduce interference. We reduced the interference from 9.4% to 3.3% by loading the proper amount of sample. Loading too much sample may lead to peak broadening31 and therefore raise the interference level, resulting in reduced protein quantitation accuracy. Lastly, we optimized the online RPLC resolution by changing the gradient lengths (1, 2, and 4 h) and using a long column (~45 cm). We found that a 4-h gradient nearly eliminated the interference (down to 0.4%) but also added instrument time. The optimized parameters were a narrow isolation window (0.4 Da), ~100 ng of sample loaded on column, and midlevel fractionation (~40 x 2 h, 3.3 days). We also showed that by using a computer-based post-MS correction strategy we improved quantitative precision when determining protein ratios (Figure 3B).
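The arithmetic behind these figures can be checked with a short Python sketch. The helper names `instrument_days` and `fold_change` are hypothetical, and the time estimate counts gradient time only; sample loading and column equilibration would add to it.

```python
def instrument_days(n_fractions: int, gradient_hours: float) -> float:
    """Total gradient time in days for n_fractions runs (overhead not included)."""
    return n_fractions * gradient_hours / 24.0


def fold_change(before: float, after: float) -> float:
    """Ratio of two values, e.g. window widths or interference levels."""
    return before / after


if __name__ == "__main__":
    # 40 fractions x 2-h gradients, as in the optimized settings.
    print(round(instrument_days(40, 2.0), 1), "days")
    # Isolation window narrowed from 1.6 to 0.4 Da.
    print(round(fold_change(1.6, 0.4), 2), "fold narrower window")
    # Interference reported as dropping from 14.4% to 3.7%.
    print(round(fold_change(14.4, 3.7), 2), "fold lower interference")
```

The window and interference fold changes come out close to each other, consistent with the roughly proportional relationship described above.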
Our results indicate that the use of optimized LC-MS parameters and post-MS correction can provide a thorough proteomic profile and virtually eliminate the interference that is often produced by isobaric labeling techniques for protein quantitation.
Figure 1: Scheme of whole-proteome profiling analysis by TMT-LC/LC-MS/MS. Ten biological samples were lysed, digested, and labeled with 10 different TMT tags, pooled equally, and fractionated into 80 fractions by offline basic pH reverse-phase liquid chromatography (LC). Every other fraction was further separated by acidic RPLC and analyzed online with a high-resolution mass spectrometer. The MS/MS raw files were processed and searched against a Uniprot database for peptide and protein identification. Proteins were quantitated according to the relative intensities of the TMT tags. Finally, the identified and quantitated proteins were submitted for integrative data analysis.
Figure 2: TMT labeling efficiency examination. TMT-labeled samples and their corresponding unlabeled samples were analyzed separately by LC-MS/MS to examine TMT labeling efficiency. For full TMT labeling, (A) the unlabeled peptide peak was completely absent in the labeled sample, whereas (B) the TMT-labeled peptide was present only in the labeled sample.
Figure 3: Computational approach for interference removal after MS data collection. (A) All MS2 scans were divided into clean (left panel) and noisy scans (right panel). Noisy scans exhibit both y1 ions of K and R (1 from a target peptide and the others from contaminating peptides). (B) Escherichia coli peptides were individually labeled with 3 different TMT reagents and pooled at 1:3:10 ratios. Approximately 20-fold more rat peptides were added as background. The summed relative intensities of the E. coli peptides in the 3 groups revealed that y1 ion-based correction is more accurate for TMT-based quantitation.
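The clean versus noisy classification described for Figure 3A can be sketched in Python as follows. This is an illustration only and is not taken from the JUMP code. The names `classify_scan` and `find_peak`, the peak-list format, and the 10 ppm tolerance are assumptions, and the y1 m/z values assume singly charged ions, with the lysine value including the TMT tag on the side chain.

```python
from typing import List, Tuple

# Singly charged y1 m/z values (assumed for this sketch):
# arginine y1, and lysine y1 carrying a TMT tag on its side chain.
Y1_MZ = {"K": 376.27573, "R": 175.11895}


def find_peak(
    peaks: List[Tuple[float, float]], target_mz: float, tol_ppm: float = 10.0
) -> float:
    """Return the highest intensity within tol_ppm of target_mz, or 0.0."""
    tol = target_mz * tol_ppm * 1e-6
    return max((inten for mz, inten in peaks if abs(mz - target_mz) <= tol),
               default=0.0)


def classify_scan(
    peaks: List[Tuple[float, float]], c_term: str
) -> Tuple[str, float, float]:
    """Label an MS2 scan 'clean' or 'noisy' from its y1 ions.

    c_term is the C-terminal residue (K or R) of the identified peptide.
    A scan is called noisy when the y1 ion of the other residue is also
    present, since that ion can only come from a contaminating peptide.
    Returns (label, target y1 intensity, contaminant y1 intensity).
    """
    if c_term not in Y1_MZ:
        raise ValueError("c_term must be 'K' or 'R'")
    other = "R" if c_term == "K" else "K"
    target_y1 = find_peak(peaks, Y1_MZ[c_term])
    contaminant_y1 = find_peak(peaks, Y1_MZ[other])
    label = "noisy" if contaminant_y1 > 0.0 else "clean"
    return label, target_y1, contaminant_y1


if __name__ == "__main__":
    clean_scan = [(175.1190, 500.0), (300.1, 20.0)]
    noisy_scan = clean_scan + [(376.2757, 80.0)]
    print(classify_scan(clean_scan, "R"))
    print(classify_scan(noisy_scan, "R"))
```

The two returned y1 intensities are the raw material for an interference estimate; how that estimate is turned into a correction is left out here because the sketch does not attempt to reproduce the published procedure.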
We describe a high-throughput protocol for the quantitation of proteins with a 10-plex isobaric labeling strategy, which has been implemented successfully in several publications12,13,14,32. In this protocol, we can analyze up to 10 different biological protein samples in 1 experiment. We can routinely identify and quantitate well over 10,000 proteins with high confidence. Although isobaric labeling is an effective technology to quantitate proteins, it is limited by ratio compression that can lead to quantitative inaccuracy. Our protocol uses various strategies, such as the optimization of LC/LC and MS/MS settings in combination with y1 ion-based post-MS correction, to effectively remove most interference effects and therefore considerably enhance the precision of protein quantitation.
For protein quantitation, the JUMP suite of programs functions in a number of ways. We measured the isotopic impurity for every batch of purchased reagents, which differs slightly from the reported values in vendor-provided certificates. The measured isotopic impurity is used for correcting the intensities of reporter ions in the JUMP software. A protein is typically quantitated by many peptide-spectrum matches (PSMs) with various absolute intensities that are influenced by multiple factors, such as ionization efficiency. Because the influence of the factors is unknown, mass spectra reveal only the relative abundances of reporter ions in a TMT assay. The relative abundances of reporter ions are calculated by dividing the intensity of each reporter ion by the average intensity of all 10 reporters. To provide the best representative "absolute signal" of the protein, we averaged the relative intensities of the 3 PSMs with the highest absolute intensities from the protein and used the average of these strongest absolute intensities to estimate the absolute signal of the protein. We defined these steps as rescaling. The y1 correction is applicable to any TMT assay; however, in some cases we could not correct for peptides whose MS/MS spectra lacked a detectable y1 ion. Nevertheless, we found that ~90% of spectra have the y1 ions from both lysine and arginine. To compensate for the interference caused by co-eluting peptides that have the same C-terminal residues as the identified peptide, the estimated interference level was essentially doubled and used for correction. This assumption was based on empirical evidence that we have collected over numerous TMT experiments, in which we observed the same intensity level for y1 in K- and R-containing peptides.
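The relative-abundance and rescaling steps described above can be sketched in Python as follows. This is one possible reading of the description and not the JUMP implementation. The function names are hypothetical, a PSM is represented simply as a list of reporter intensities, and the "absolute intensity" of a PSM is taken to be its mean reporter intensity, which is an assumption.

```python
from statistics import mean
from typing import List


def relative_intensities(reporters: List[float]) -> List[float]:
    """Divide each reporter intensity by the average of all reporters."""
    avg = mean(reporters)
    if avg == 0:
        raise ValueError("all reporter intensities are zero")
    return [x / avg for x in reporters]


def rescale_protein(psms: List[List[float]], top_n: int = 3) -> List[float]:
    """Estimate per-channel protein signal from its strongest PSMs.

    Assumed reading of the rescaling step: take the top_n PSMs by mean
    reporter intensity, average their relative profiles channel by channel,
    and multiply that profile by the average absolute intensity of the
    same PSMs.
    """
    ranked = sorted(psms, key=mean, reverse=True)[:top_n]
    n_channels = len(ranked[0])
    profiles = [relative_intensities(p) for p in ranked]
    profile = [mean(p[i] for p in profiles) for i in range(n_channels)]
    scale = mean(mean(p) for p in ranked)
    return [value * scale for value in profile]


if __name__ == "__main__":
    # Four hypothetical PSMs with two channels each, all at a 1:3 ratio.
    psms = [[10.0, 30.0], [100.0, 300.0], [1.0, 3.0], [20.0, 60.0]]
    print(relative_intensities(psms[0]))
    print(rescale_protein(psms))
```

Working with relative profiles keeps weak and strong PSMs on the same footing, while the scale factor taken from the strongest PSMs restores a signal level that reflects the protein's abundance.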
Interference can be reduced by many methods other than what we have described in this protocol, such as the MS3 multinotch approach. All of these methods are valid and have their individual merits. Ultimately, the methodology used depends on a number of factors, including but not limited to sample amount, instrument availability (i.e., instrument type and time required for analyzing samples), desired results (i.e., number of proteins identified), and quantitation accuracy needed.
To ensure the most accurate results, it is important to heed the quality control steps throughout the protocol. These include using accurate standards for sample quantitation (e.g., BSA that has undergone amino acid analysis), testing the labeling efficiency of the TMT reagent, determining the efficiency of trypsin digestion, performing a premix ratio test to ensure equal mixing of all 10 samples, and using software to correct any loading bias. In cases in which a known protein or multiple proteins are expected to exhibit expression changes between individual samples, it is important to perform western blot analysis after the initial cell lysis to confirm that these changes are present and can be detected. This ensures that the samples represent the biology to be tested and saves time, money, and effort before carrying out the entire TMT protocol. If a particular sample is expected to not express certain protein(s), it is also important to use western blot analysis to confirm their absence after lysis before continuing with the protocol. The premix ratio test is used when most proteins are expected to not change across the 10 biological samples. We also removed known contaminating proteins, such as keratins, so they would not negatively affect our correction. Our quantitation method corrected for errors that may have arisen from pipetting and similar sources. When possible, we used pipette volumes greater than 5 µL to reduce errors. We performed multiple rounds of this premix test to ensure that we obtained an accurate 1:1 mix. It is important to note that we did not perform this type of premix ratio test when using immunoprecipitation samples for the protocol, as we expected a large percentage of the proteins to change. In such cases, the premix test would skew the results. This is also true of any experiment in which at least 1 of the 10 samples is expected to vary greatly in protein expression (e.g., empty vector or proteasome inhibition).
In such cases, it is highly recommended to use replicates to facilitate quantitation statistics. We typically recommend performing 3 replicates for these types of samples. Ultimately, the number of replicates is based on the expected variability of each sample.
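The loading-bias correction mentioned among the quality control steps can be illustrated with a minimal Python sketch. It is not the correction implemented in the software used here. It assumes, as the premix ratio test does, that most proteins do not change across samples, so that the per-channel medians should be equal. The function name `correct_loading_bias` and the table layout (one row per protein, one column per channel) are hypothetical.

```python
from statistics import mean, median
from typing import List, Tuple


def correct_loading_bias(
    table: List[List[float]],
) -> Tuple[List[List[float]], List[float]]:
    """Scale each channel so that all channel medians become equal.

    table holds one row per protein and one column per TMT channel.
    Returns the corrected table and the per-channel scaling factors.
    Only meaningful when most proteins are expected to be unchanged.
    """
    n_channels = len(table[0])
    channel_medians = [median(row[c] for row in table) for c in range(n_channels)]
    if any(m == 0 for m in channel_medians):
        raise ValueError("a channel has a zero median; cannot normalize")
    target = mean(channel_medians)  # common level for all channels
    factors = [target / m for m in channel_medians]
    corrected = [[v * f for v, f in zip(row, factors)] for row in table]
    return corrected, factors


if __name__ == "__main__":
    # Hypothetical case: channel 2 was loaded with twice as much peptide.
    table = [[100.0, 200.0], [10.0, 20.0], [50.0, 100.0]]
    corrected, factors = correct_loading_bias(table)
    print(factors)
    print(corrected)
```

For experiments in which a large share of proteins is expected to change, such as the immunoprecipitation samples discussed above, this kind of normalization would distort the result and should not be applied.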
With some fine-tuning, this robust protocol can be used as a general proteomic pipeline to investigate proteins, protein pathways, disease progression, and other biological functions. The protocol presented here provides a powerful technique to obtain deep protein coverage and alleviate ratio suppression during quantification. With minor modifications, this protocol can be readily adapted to quantitate protein posttranslational modifications, such as phosphorylation, ubiquitination, and acetylation33,34. These types of comprehensive proteomics projects can be integrated with genomics, transcriptomics, and possibly metabolomics for a systems biology approach to understand biological systems and facilitate novel discoveries of molecular mechanisms, biomarkers, and therapeutic targets in disease13,28,35,36,37,38.
We have demonstrated a method for accurately quantitating over 10,000 proteins from whole-cell or tissue lysates. We expect this method to be widely applicable to many biological systems.
The authors have nothing to disclose.
The authors thank all other lab and facility members for helpful discussion. This work was partially supported by NIH grants R01GM114260, R01AG047928, R01AG053987, and ALSAC. The MS analysis was performed in the St. Jude Children's Research Hospital Proteomics Facility, partially supported by NIH Cancer Center Support grant P30CA021765. The authors thank Nisha Badders for help with editing the manuscript.
Name | Company | Catalog Number | Comments |
1220 LC system | Agilent | G4288B | |
50% Hydroxylamine | Thermo Scientific | 90115 | |
Acetonitrile | Burdick & Jackson | AH015-4 | |
Bullet Blender | Next Advance | BB24-AU | |
Butterfly Portfolio Heater | Phoenix S&T | PST-BPH-20 | |
C18 tips | Harvard Apparatus | 74-4607 | |
Dithiothreitol (DTT) | Sigma | D5545 | |
DMSO | Sigma | 41648 | |
Formic acid | Sigma | 94318 | |
Fraction Collector | Gilson | FC203B | |
Glass Beads | Next Advance | GB05 | |
HEPES | Sigma | H3375 | |
Iodoacetamide (IAA) | Sigma | I6125 | |
Lys-C | Wako | 125-05061 | |
Methanol | Burdick & Jackson | AH230-4 | |
Pierce BCA Protein Assay kit | Thermo Scientific | 23225 | |
Mass Spectrometer | Thermo Scientific | Q Exactive HF | |
nanoflow UPLC | Thermo Scientific | Ultimate 3000 | |
ReproSil-Pur C18 resin, 1.9 µm | Dr. Maisch GmbH | r119.aq.0003 | |
Self-Pack Columns | New Objective | PF360-75-15-N-5 | |
Sodium deoxycholate | Sigma | 30970 | |
SpeedVac | Thermo Scientific | SPD11V | |
TMT 10plex Isobaric label reagent | Thermo Scientific | 90110 | |
Trifluoroacetic acid (TFA) | Applied Biosystems | 400003 | |
Trypsin | Promega | V511C | |
Urea | Sigma | U5378 | |
XBridge C18 column | Waters | 186003943 | |
Ziptips C18 | Millipore | ZTC18S096 | |
SepPak 1cc 50mg | Waters | WAT054960 |