A method for mass spectrometric analysis of endogenous peptides in human cerebrospinal fluid (CSF) is presented. By employing molecular weight cut-off filtration, chromatographic pre-fractionation, mass spectrometric analysis and a subsequent combination of peptide identification strategies, it was possible to expand the known CSF peptidome nearly ten-fold compared to previous studies.
This protocol describes a method developed to identify endogenous peptides in human cerebrospinal fluid (CSF). For this purpose, a previously developed method based on molecular weight cut-off (MWCO) filtration and mass spectrometric analysis was combined with an offline high-pH reverse phase HPLC pre-fractionation step.
Secretion into CSF is the main pathway for removal of molecules shed by cells of the central nervous system. Thus, many processes in the central nervous system are reflected in the CSF, rendering it a valuable diagnostic fluid. CSF has a complex composition, containing proteins that span a concentration range of 8 – 9 orders of magnitude. Besides proteins, previous studies have also demonstrated the presence of a large number of endogenous peptides. While less extensively studied than proteins, these may also hold potential interest as biomarkers.
Endogenous peptides were separated from the CSF protein content through MWCO filtration. By removing a majority of the protein content from the sample, it is possible to increase the sample volume studied and thereby also the total amount of the endogenous peptides. The complexity of the filtered peptide mixture was addressed by including a reverse phase (RP) HPLC pre-fractionation step at alkaline pH prior to LC-MS analysis. The fractionation was combined with a simple concatenation scheme in which 60 fractions were pooled into 12; analysis time could thereby be reduced while still largely avoiding co-elution.
Automated peptide identification was performed by using three different peptide/protein identification software programs and subsequently combining the results. The different programs were complementary rather than comparable, with less than 15% of the identifications shared between all three.
Biomarkers in cerebrospinal fluid (CSF) are currently transforming research into neurodegenerative disorders. In Alzheimer's disease, the most common neurodegenerative disorder, affecting over 60 million people worldwide1,2, a biomarker triplet consisting of the peptide amyloid beta, microtubule-stabilizing protein tau, and a phosphorylated tau form, can detect the disease with high sensitivity and specificity, and has been included in the diagnostic research criteria3. In other neurodegenerative diseases, such as Parkinson's disease and Multiple Sclerosis, proteomic studies have identified numerous biomarker candidates, some of which are currently under evaluation in clinical studies4,5,6.
Alongside proteins, CSF also contains an abundance of endogenous peptides7,8,9,10,11,12. Constituting cleavage products of many brain-derived proteins, these peptides also represent a potentially important source of disease biomarkers. To increase the inventory of identified endogenous peptides in human CSF and enable CSF endopeptidomic analyses in clinical studies, a method was developed for sample preparation and LC-MS analysis (a brief protocol scheme has been included in Figure 1). The application of this method in a recent study resulted in the identification of nearly 16,400 endogenous CSF peptides in pooled CSF samples from several individuals of non-specific diagnosis, expanding the known CSF endopeptidome ten-fold13. The method can optionally be used in conjunction with an isobaric labelling approach for quantification.
Sample Preparation
The main source of protein mass in CSF is plasma constituents (e.g. albumin and immunoglobulins) passing over the blood brain barrier14,15. Their high abundance hampers the detection of low-abundant, brain-derived sample components. Endogenous peptides can be readily separated from the high-abundant proteins, allowing a significantly larger volume of CSF peptide extract to be used for LC-MS analysis and thereby enabling detection of lower-abundant peptides.
In the protocol presented here, molecular weight cut-off (MWCO) filtration was used to separate the CSF peptides from the protein fraction; a method that has been used in several previous studies8,9,10,11,12,16. The filtration step was followed by an offline RP HPLC pre-fractionation step performed over a high-pH mobile phase gradient. By performing two RP HPLC steps in tandem, with pH being the main distinction, the difference in selectivity between the two steps results mainly from altered peptide retention as a consequence of different peptide charge states. The application of high-pH peptide pre-fractionation prior to LC-MS under acidic conditions has proven efficient in increasing peptide identification17,18, and even to be superior for this purpose in complex biological samples compared to more orthogonal separation modes19, such as strong cation exchange (SCX) and RP20. To shorten the analysis time, a concatenation scheme was used, pooling every 12th fraction (e.g., fractions 1, 13, 25, 37, and 49), which due to the high resolving power of RP HPLC still largely avoided co-elution of peptides from different fractions in the LC-MS step20,21.
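The concatenation scheme described above can be sketched in a few lines of Python. This is only an illustration of the pooling logic (60 fractions into 12 pools, every 12th fraction combined); the function name and its parameters are hypothetical and are not part of any instrument or vendor software.

```python
from collections import defaultdict


def concatenate_fractions(n_fractions=60, n_pools=12):
    """Assign 1-based fraction numbers to pools by combining every n_pools-th fraction.

    Illustrative helper only; the name and defaults are assumptions taken from
    the text (60 fractions pooled into 12).
    """
    pools = defaultdict(list)
    for fraction in range(1, n_fractions + 1):
        # Fractions 1, 13, 25, ... map to pool 1; 2, 14, 26, ... to pool 2; and so on.
        pool = (fraction - 1) % n_pools + 1
        pools[pool].append(fraction)
    return dict(pools)


pools = concatenate_fractions()
print(pools[1])   # [1, 13, 25, 37, 49]
print(pools[12])  # [12, 24, 36, 48, 60]
```

Each pool combines fractions that are 12 elution steps apart in the high-pH separation, which is why peptides from different fractions within a pool are expected to remain largely separated in the subsequent low-pH LC-MS run.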
Peptide identification
Peptide identification in peptidomic studies differs from that of proteomic studies in that no enzyme cleavage can be specified in the database search, and as a consequence, identification rates are usually lower11. A recent study13 showed that the identification rates for endogenous peptides obtained with Sequest and Mascot were substantially improved when the default scoring algorithm of the respective software program was modified using the adaptive scoring algorithm Percolator, indicating that optimal scoring algorithms for endogenous peptides differ from those for tryptic peptides13. In that study, identification based on automatic peptide de novo sequencing using the software PEAKS (BSI) was found to be complementary to the two fragment ion fingerprinting-based search engines, resulting in a significantly larger set of identified peptides.
The protocol described below is a refined version of the one used in a previous study where a large number of endogenous peptides were identified in human CSF15. Updates to the original protocol involve minor alterations to the chemical pre-treatment of CSF as well as optimisation of the gradient used for offline high-pH RP HPLC pre-fractionation.
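To give a rough feeling for why a search without enzyme specificity is harder, the following Python sketch counts candidate peptides for a single protein sequence under two assumptions: no enzyme (every contiguous subsequence in a length window is a candidate) and fully tryptic cleavage without missed cleavages. The length window of 6 to 30 residues, the function names and the toy sequence are assumptions made for illustration; the sketch does not reproduce how Mascot, Sequest HT or PEAKS enumerate candidates.

```python
import re


def unspecific_candidates(sequence, min_len=6, max_len=30):
    """Count all contiguous subsequences within the length window (no enzyme)."""
    n = len(sequence)
    return sum(max(0, n - length + 1) for length in range(min_len, max_len + 1))


def tryptic_candidates(sequence, min_len=6, max_len=30):
    """Count fully tryptic peptides: cleavage after K or R, not before P, no missed cleavages."""
    # Zero-width split points; requires Python 3.7 or later.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return sum(1 for piece in pieces if min_len <= len(piece) <= max_len)


# Arbitrary toy sequence, used only to compare the two counts.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKALPDAQFEVVHSLAKWKR"
print("no enzyme:", unspecific_candidates(toy_protein))
print("tryptic:  ", tryptic_candidates(toy_protein))
```

The unspecific count grows roughly with sequence length times the width of the length window, whereas the tryptic count is bounded by the number of cleavage sites, so the candidate pool in a peptidomic search is much larger.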
Ethical considerations
All studies of Swedish patient and control materials have been approved by ethical committees: St. Göran (ref. 2005-554-31/3). CSF samples from the Amsterdam Dementia Cohort and samples collected at the National Hospital for Neurology and Neurosurgery, London were used for research with written consent from all participating patients and approval from regional ethics committees. The material here utilized mainly consisted of left-over CSF from samples taken for the purpose of diagnosis and it was de-identified before being included in our studies. There is no possibility of tracing back the sample to any individual donor or group of donors.
1. Extraction of Human Cerebrospinal Fluid (CSF):
2. Pre-treatment of CSF (1.5 mL Sample Volume, No Quantification):
3. Pre-treatment of CSF (10 x 150 µL Sample Volume, Isobaric Labelling-quantification):
4. Molecular Weight cut-off Filtration
5. De-salting and Sample Clean-up by Solid Phase Extraction
6. Offline high-pH Reverse Phase HPLC Sample Fractionation
7. LC-MS
8. Peptide Identification
The method described here has been applied and evaluated in three studies prior to the introduction of sample pre-fractionation (Table 1). The first study used offline LC for spotting CSF fractions on a MALDI target plate and resulted in 730 identified endogenous peptides11. In the two following studies, isobaric labelling was employed: first in a case/control study for simultaneous identification and characterisation of potential biomarkers in the CSF endopeptidome and proteome24, and then in a treatment study to monitor the in vivo effects of a γ-secretase inhibitor on peptide expression in CSF over 36 h16. In the case/control study, 437 endogenous peptides were identified, 64 of which were significantly altered in concentration between individuals with AD and healthy controls. The third study, the treatment study, identified 1798 endogenous peptides, and 11 of the monitored peptides could be shown to respond to the treatment.
In the fourth study, the aim was to increase the number of identified CSF peptides, particularly to identify lower-abundant peptides. Therefore, peptide pre-fractionation by HpH-RP chromatography was included and a 10-fold larger CSF sample volume was used, resulting in identification of 16,395 peptides. In this study, no isobaric labelling was performed. In addition to sample fractionation, the most recent study employed a combined peptide identification approach, whereas in the first three studies only a single database search was performed, which to some extent accounts for the larger number of peptides identified. Comparing the results obtained by the individual search engines (Mascot, Sequest HT or PEAKS) from the most recent study indicates that the algorithms used are to some extent complementary, since a relatively small number of peptides, less than 15% (2440), were identified by all three search engines (Figure 2). Further, the de novo-sequencing search engine PEAKS was the most efficient in identifying endogenous peptides, but more than 5,400 peptides would not have been identified if only PEAKS had been used (Figure 2). The application of several search engines on the same material raises potential multiple testing issues, and this was addressed with a test of identification correctness13. The acquired raw MS/MS data, as well as all results obtained in proteomic searches from the most recent trial, have been made available in the PRIDE data repository via ProteomeXchange with identifier PXD004863.
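As a minimal sketch of how identification lists from several search engines can be combined and compared, the Python example below computes the union, the consensus and the engine-specific identifications from sets of peptide identifiers. The function name and the toy identifiers are hypothetical, and the sketch assumes that a plain string is sufficient to identify a peptide (in practice, modifications would also need to be encoded). It is not the procedure used in the cited study, which additionally included a test of identification correctness.

```python
def combine_identifications(results):
    """Summarise peptide identifications from several search engines.

    `results` maps an engine name to the set of peptide identifiers it reported.
    Hypothetical helper for illustration only.
    """
    union = set().union(*results.values())
    consensus = set.intersection(*results.values()) if results else set()
    unique_to = {}
    for engine, ids in results.items():
        # Identifications reported by any of the other engines.
        others = set().union(*(o for name, o in results.items() if name != engine))
        unique_to[engine] = ids - others
    return {"union": union, "consensus": consensus, "unique_to": unique_to}


# Toy identifiers, not real data.
toy_results = {
    "Mascot": {"PEPTIDEA", "PEPTIDEB", "PEPTIDEC"},
    "Sequest HT": {"PEPTIDEB", "PEPTIDEC", "PEPTIDED"},
    "PEAKS": {"PEPTIDEC", "PEPTIDEE", "PEPTIDEF"},
}

summary = combine_identifications(toy_results)
print(len(summary["union"]))                           # 6
print(sorted(summary["consensus"]))                    # ['PEPTIDEC']
print(len(summary["union"] - toy_results["PEAKS"]))    # 3 peptides missed if only PEAKS were used
```

Working with sets keeps the summation of unique peptide IDs and the overlap analysis behind a Venn diagram in one place; any score filtering would have to happen before the sets are built.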
Figure 1: A protocol scheme visualizing the principal steps of the method. 1) Extraction of cerebrospinal fluid by lumbar puncture followed by centrifugation to remove non-soluble material, 2) addition of GdnHCl to the sample to dissociate peptide-protein aggregates, increasing the recovery of endogenous peptides; reduction and alkylation of cysteine disulphides; isobaric labelling for peptide quantification (optional), 3) molecular weight filtration to separate the endogenous peptides from the proteins, 4) solid phase extraction to remove salts and other polar contaminants, 5) RP HPLC pre-fractionation, alkaline mobile phase gradient and concatenation of every 12th fraction, 6) RP HPLC-MS/MS, acidic mobile phase gradient, each concatenated fraction run consecutively, 7) peptide identification performed by submitting MS/MS data from all 12 analysis runs as a single sample to the search engines; subsequently, peptide IDs were compared and a summation of all unique peptide IDs was performed.
Study summary | TMT labelling (y/n) | HpH-RP fractionation (y/n) | Corresponding volume of CSF per MS-analysis (µl) | Number of identified peptides | Comment | Reference |
Explorative CSF peptidome analysis | n | n | 500 | 730 | Offline LC MALDI target preparation, MALDI-MS; evaluation of MWCO filters | 4 |
Quantitative comparison of CSF peptides; samples from 8 AD + 8 Ctrl | y | n | 200 | 437 | HPLC-ESI MS; combined peptidomic and proteomic protocol | 25 |
AD gamma secretase inhibitor treatment study | y | n | 300 | 1798 | HPLC-ESI MS; CSF extracted at six time points after treatment | 17 |
Expanding the CSF peptidome | n | y | 750-1000 | 18,031 | HPLC-ESI MS; combination of peptide identification software programs | 15 |
Table 1: A compilation of recent studies performed by this group that apply molecular weight filtration and mass spectrometric analysis for identification of endogenous peptides in human CSF.
Figure 2: A Venn diagram comparing the peptide identification results obtained from each of the three search engines Mascot, Sequest HT and PEAKS. A total of 16,395 endogenous peptides were identified. The de novo-sequencing search engine PEAKS identified 10,967 endogenous peptides; fragment-ion fingerprinting-based search engines Mascot and Sequest HT identified 8118 and 7304 endogenous peptides respectively. The identification consensus between all three search engines amounted to 2440 endogenous peptides, or 14.8%. There was a relatively large identification overlap between Mascot and Sequest HT, corresponding to 70% of their combined peptide identifications. PEAKS had a comparatively small identification overlap with both Mascot and Sequest HT; 20.5% and 18.9% respectively.
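The two summary figures quoted in the text (the consensus fraction and the number of peptides not found by PEAKS) follow directly from the counts reported for Figure 2. The short Python check below uses only those reported counts; the variable names are arbitrary.

```python
# Counts as reported in the text and in the Figure 2 caption.
total_identified = 16395   # union of all three search engines
peaks_identified = 10967   # identified by PEAKS
consensus = 2440           # identified by all three search engines

# Peptides that would have been missed if only PEAKS had been used.
missed_without_others = total_identified - peaks_identified
print(missed_without_others)  # 5428

# Share of all identified peptides found by all three engines.
consensus_fraction = consensus / total_identified
print(f"{consensus_fraction:.3f}")
```

The first value matches the statement that more than 5,400 peptides would not have been identified with PEAKS alone, and the second is just below 0.15, in line with the "less than 15%" consensus.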
The introduction of a high-pH RP HPLC pre-fractionation step to a previously developed protocol for recovery of endogenous peptides by molecular weight ultrafiltration11 reduced relative sample complexity and thereby allowed for a 5-fold larger sample volume to be studied. This, in turn, increased the concentration of the subset of peptides present in each fraction and thereby improved the chances of detecting low-abundant peptides.
By performing an identification strategy for endogenous peptides which employed three proteomic software programs in parallel, it was possible to expand the known CSF endopeptidome nearly 10-fold. A total of 16,395 endogenous peptides were identified in a preliminary trial on a pooled CSF sample material. Among the identifications were a large number of endogenous peptides derived from proteins previously noted in the context of neurodegenerative disorders. Several peptides identified in the above studies are currently being evaluated as biomarkers in our laboratory. This process involves several steps, including verification of the peptides' identities by spiking CSF with synthetic analogues labelled with heavy isotopes, establishment of targeted mass spectrometric assays, assessing peptide stability during storage and freeze-thaw cycles, and analysing different clinical cohorts.
Modifications were made to the original protocol (steps 2.5 and 3.5) in order to avoid the introduction of contaminants caused by a high concentration of GdnHCl in the sample during MWCO filtration. The pre-fractionation gradient (step 6.3.1) was also updated; a prolonged, linear gradient is now used.
The primary causes of analyte losses in the protocol described here can be attributed to the two RP chromatographic steps as well as the MWCO filtration and the SPE sample clean-up step.
The peptide losses due to interaction with the MWCO filter, or with proteins retained on it, during filtration are difficult to avoid and may be a source of inter-sample variation.
Further selective losses likely arise in the RP-chromatographic steps. Since peptide hydrophobicity is pH-dependent, performing two consecutive RP chromatographic steps at high and low pH, respectively, may lead to losses of the subset of peptides that are too hydrophilic at pH ≥9 to be retained on the column and similarly a second subset too hydrophilic at pH ≤3 to be retained.
Compared to previously used methods, the employment of pre-fractionation has led to a 10-fold increase in identification of endogenous peptides. It has allowed for successful detection of a large number of previously unidentified peptides and is therefore a valuable tool in the study and exploration of the CSF peptidome, and possibly other complex biological samples as well.
Combined with multiplexed isobaric labelling, the protocol is intended to be further applied to identify biomarker candidates for various neurodegenerative disorders in CSF, blood and brain tissue.
Varying recovery in the sample preparation steps contributes to the analytical variation for the analysis of CSF proteins and peptides by LC-MS. Performing isobaric labelling of peptides at an early stage in the sample preparation decreases the influence of such variation greatly. Compared to previously reported sample preparation protocols, the implementation of high-pH reversed phase peptide pre-fractionation increased the number of identified peptides by a factor of 5. Identification of endogenous peptides from MS/MS data was improved significantly when combining different peptide identification software programs employing different search algorithms.
The authors have nothing to disclose.
Many thanks to Tanveer Batth and colleagues for advice in setting up the pre-fractionation method.
This work was supported by funding from the Swedish Research Council, the Wallström and Sjöblom Foundation, the Gun and Bertil Stohne Foundation, the Magnus Bergwall Foundation, the Åhlén Foundation, Alzheimerfonden, Demensförbundet, Stiftelsen för Gamla Tjänarinnor, the Knut and Alice Wallenberg Foundation, Frimurarestiftelsen, and FoU-Västra Götalandsregionen.
The main recipients of funding for this project were Kaj Blennow, Henrik Zetterberg and Johan Gobom.
1 M Triethylammonium bicarbonate | Fluka, Sigma-Aldrich | 17902-100ML | TEAB |
8 M Guanidinium hydrochloride | Sigma-Aldrich | G7294-100ML | GdnHCl |
Tris(2-carboxyethyl)phosphine hydrochloride | Pierce | 20490 | TCEP |
Iodoacetamide | SIGMA | I1149-5G | IAA |
Hydroxylamine 50% (w/w) | Sigma-Aldrich | 457804-50ML | |
Acetonitrile, Far UV, HPLC gradient grade | Sigma-Aldrich | 271004-2L | AcN |
Formic acid | Fluka, Sigma-Aldrich | 56302-1mL-F | FA |
Trifluoroacetic acid | Sigma-Aldrich | T6508-10AMP | TFA |
Ammonium hydroxide solution | Sigma-Aldrich | 30501-1L-1M | NH4OH |
Amicon Ultra-15 Centrifugal Filter Unit with Ultracel-30 membrane | Merck Millipore | UFC903024 | MWCO-filter |
Sep-Pak C18, 100 mg | Waters | WAT023590 | SPE-column |
Resprep 12-port SPE Manifold | Restek | 26077 | Vacuum manifold |
TMT10plex Isobaric Label Reagent Set | Thermo Fisher Scientific | 90110 | TMT10plex |
UltiMate 3000 RSLCnano LC System | Dionex | 5200.0356 | Online sample separation |
Ultimate 3000 RPLC Rapid Separation Binary System | Dionex | IQLAAAGABHFAPBMBEZ | Offline high-pH fractionation |
Orbitrap Fusion Tribrid mass spectrometer | Thermo Scientific | IQLAAEGAAPFADBMBCX | Mass spectrometer for sample analysis |
Proteome Discoverer 2.0 | Thermo Fisher Scientific | IQLAAEGABSFAKJMAUH | Proteomics search platform |
Mascot v2.4 | Matrix Science | - | Proteomics search engine |
Sequest HT | Thermo | - | Proteomics search engine |
PEAKS v7.5 | Bioinformatics Solutions Inc. | - | Proteomics search engine |
Acclaim PepMap 100, 75 µm x 2 cm, C18, 100 Å pore size, 3 µm particle size | Thermo Fisher Scientific | 164535 | Trap column (nano HPLC) |
Acclaim PepMap C18, 75 µm x 500 mm, 100Å pore size, 2 µm particle size | Thermo Fisher Scientific | 164942 | Separation Column (nano HPLC) |
Savant SpeedVac High Capacity Concentrators | Thermo Fisher Scientific | SC210A-230 | SpeedVac/Vacuum concentrator |
XBridge Peptide BEH C18 Column, 130Å, 3.5 µm, 2.1 mm X 250 mm | Waters | 186003566 | Separation Column (micro HPLC) |