This protocol was developed to extract intact histones from sorghum leaf material for profiling histone post-translational modifications, which can serve as potential epigenetic markers to aid the engineering of drought-resistant crops.
Histones belong to a family of highly conserved proteins in eukaryotes. They pack DNA into nucleosomes, the functional units of chromatin. Post-translational modifications (PTMs) of histones, which are highly dynamic and can be added or removed by enzymes, play critical roles in regulating gene expression. In plants, epigenetic factors, including histone PTMs, are related to adaptive responses to the environment. Understanding the molecular mechanisms of epigenetic control can bring unprecedented opportunities for innovative bioengineering solutions. Herein, we describe a protocol to isolate nuclei and purify histones from sorghum leaf tissue. The extracted histones can be analyzed in their intact forms by top-down mass spectrometry (MS) coupled with online reversed-phase (RP) liquid chromatography (LC). Combinations and stoichiometry of multiple PTMs on the same histone proteoform can be readily identified. In addition, histone tail clipping can be detected using the top-down LC-MS workflow, thus yielding the global PTM profile of core histones (H4, H2A, H2B, H3). We have applied this protocol previously to profile histone PTMs from sorghum leaf tissue collected in a large-scale field study aimed at identifying epigenetic markers of drought resistance. The protocol could potentially be adapted and optimized for chromatin immunoprecipitation-sequencing (ChIP-seq) or for studying histone PTMs in similar plants.
The increasing severity and frequency of drought is expected to affect productivity of cereal crops1,2. Sorghum is a cereal food and energy crop known for its exceptional ability to withstand water-limiting conditions3,4. We are pursuing mechanistic understanding of the interplay between drought stress, plant development, and epigenetics of sorghum [Sorghum bicolor (L.) Moench] plants. Our previous work has demonstrated strong connections between plant and rhizosphere microbiome in drought acclimation and responses at the molecular level5,6,7. This research will pave the way for utilizing epigenetic engineering in adapting crops to future climate scenarios. As a part of the efforts in understanding epigenetics, we aim to study protein markers that impact gene expression within the plant organism.
Histones belong to a highly conserved family of proteins in eukaryotes that pack DNA into nucleosomes as fundamental units of chromatin. Post-translational modifications (PTMs) of histones are dynamically regulated to control chromatin structure and influence gene expression. Like other epigenetic factors, including DNA methylation, histone PTMs play important roles in many biological processes8,9. Antibody-based assays such as western blots have been widely used to identify and quantify histone PTMs. In addition, the interaction of histone PTMs and DNA can be effectively probed by chromatin immunoprecipitation-sequencing (ChIP-seq)10. In ChIP-seq, chromatin carrying a specific targeted histone PTM is enriched by antibodies against that PTM. The DNA fragments are then released from the enriched chromatin and sequenced, revealing the gene regions that interact with the targeted histone PTM. However, all these experiments rely heavily on high-quality antibodies. For some histone variants/homologs or combinations of PTMs, development of robust antibodies can be extremely challenging (especially for multiple PTMs). In addition, antibodies can only be developed if the targeted histone PTM is known11. Therefore, alternative methods for untargeted, global profiling of histone PTMs are necessary.
Mass spectrometry (MS) is a complementary method to characterize histone PTMs, including unknown PTMs for which antibodies are not available11,12. The well-established “bottom-up” MS workflow uses proteases to digest proteins into small peptides prior to liquid chromatography (LC) separation and MS detection. Because histones have large numbers of basic residues (lysine and arginine), the trypsin digestion (protease specific to lysine and arginine) in the standard bottom-up workflow cuts the proteins into very short peptides. The short peptides are technically difficult to analyze by standard LC-MS, and do not preserve the information about the connectivity and stoichiometry of multiple PTMs. The use of other enzymes or chemical labeling to block lysines generates longer peptides that are more suitable for characterization of histone PTMs13,14.
Alternatively, the digestion step can be completely omitted. In this “top-down” approach, intact protein ions are introduced into the MS by electrospray ionization (ESI) after online LC separation, yielding ions of the intact histone proteoforms. In addition, ions (i.e., proteoforms) of interest can be isolated and fragmented in the mass spectrometer to yield the sequence ions for identification and PTM localization. Hence, top-down MS has the advantage of preserving the proteoform-level information and capturing the connectivity of multiple PTMs and terminal truncations on the same proteoform15,16. Top-down experiments can also provide quantitative information and offer insights into biomarkers at the intact protein level17. Herein, we describe a protocol to extract histones from sorghum leaf tissue and analyze the intact histones by top-down LC-MS.
The example data shown in Figure 1 and Figure 2 are from sorghum leaf collected at week 2 after planting. Although variation of yield is expected, this protocol is generally agnostic to specific sample conditions. The same protocol has been successfully used for sorghum plant leaf tissue collected from 2, 3, 5, 8, 9, and 10 weeks after planting.
1. Preparing sorghum leaf material
NOTE: The sorghum plants were grown in soil in the field in Parlier, CA.
2. Preparing buffers and materials (3–4 h)
NOTE: The high-concentration stock solutions can be made ahead of time and stored until use. However, all working buffers must be made fresh on the day of the extraction (by dilution from stock and mixing with other contents) and kept on ice during the process. The whole experiment should be performed at 4 °C unless recommended otherwise.
Reagents | Stock concentration | EB1 volume (mL) | EB2A volume (mL) | EB2B volume (mL) |
Sucrose | 2.5 M | 4.4 | 1.25 | 0.5 |
Tris-HCl pH 8 | 1 M | 0.25 | 0.125 | 0.05 |
DTT | 1 M | 0.125 | 0.0625 | 0.025 |
H2O | | 20.225 | 9.6875 | 4.375 |
Protease inhibitor (PI) tablet | | 0.5 pill | 0.5 pill | 0.5 pill |
Additional inhibitors (optional) | 33 mM | 0.25 | 0.125 | 0.05 |
MgCl2 | 1 M | | 0.125 | 0.05 |
Triton X-100 | 10% | | 1.25 | |
Overall volume | | 25 mL | 12.5 mL | 5 mL |
Table 1: Composition for extraction buffers (EBs).
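As a quick arithmetic check on Table 1, the following minimal sketch sums the listed volumes for each extraction buffer and derives the final concentration of each component from its stock using C1V1 = C2V2. The dictionary layout and the function names are illustrative assumptions, not part of the protocol. MgCl2 and Triton X-100 are assigned to the buffers in which the volumes add up to the stated totals, and the optional additional inhibitors and the PI tablet are left out.

```python
# Stock concentrations from Table 1 (M for solutes, % for Triton X-100).
STOCKS = {"sucrose": 2.5, "tris_hcl": 1.0, "dtt": 1.0, "mgcl2": 1.0, "triton_x100": 10.0}

# Volumes in mL from Table 1 (optional inhibitors and the PI tablet omitted).
BUFFERS = {
    "EB1": {"sucrose": 4.4, "tris_hcl": 0.25, "dtt": 0.125, "water": 20.225},
    "EB2A": {"sucrose": 1.25, "tris_hcl": 0.125, "dtt": 0.0625,
             "mgcl2": 0.125, "triton_x100": 1.25, "water": 9.6875},
    "EB2B": {"sucrose": 0.5, "tris_hcl": 0.05, "dtt": 0.025,
             "mgcl2": 0.05, "water": 4.375},
}


def total_volume(recipe):
    """Total buffer volume in mL."""
    return sum(recipe.values())


def final_concentrations(recipe):
    """Final concentration of each stock component: C_final = C_stock * V_stock / V_total."""
    total = total_volume(recipe)
    return {name: STOCKS[name] * vol / total
            for name, vol in recipe.items() if name in STOCKS}


for buffer_name, recipe in BUFFERS.items():
    print(buffer_name, f"total = {total_volume(recipe):.3f} mL")
    for component, conc in final_concentrations(recipe).items():
        unit = "%" if component == "triton_x100" else "M"
        print(f"  {component}: {conc:.4g} {unit}")
```

Under these assumptions, EB2A comes out at 1% Triton X-100, which matches the detergent concentration discussed for step 3.2.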
Reagents | Stock concentration | Volume (mL) |
NaCl | 5 M | 0.4 |
Tris-HCl pH 8 | 1 M | 0.05 |
Triton X-100 | 10% | 0.5 |
EDTA | 0.5 M | 0.2 |
H2O | | 3.85 |
PI tablet | | 0.5 pill |
Additional inhibitors (optional) | 33 mM | 0.05 |
Overall volume | | 5 mL |
Table 2: Composition for the nuclei lysis buffer (NLB).
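If a different NLB volume is needed (for example, when several samples are processed in parallel), the liquid components of Table 2 can be scaled proportionally. The sketch below is only an illustration under that assumption: `scale_recipe` is a hypothetical helper, the 12.5 mL target is an arbitrary example, and the PI tablet and optional inhibitors are not handled.

```python
# Liquid components of the 5 mL NLB recipe in Table 2 (volumes in mL).
NLB_5ML = {
    "NaCl (5 M)": 0.4,
    "Tris-HCl pH 8 (1 M)": 0.05,
    "Triton X-100 (10%)": 0.5,
    "EDTA (0.5 M)": 0.2,
    "H2O": 3.85,
}


def scale_recipe(recipe, target_ml):
    """Scale every volume by the same factor so the recipe totals target_ml."""
    factor = target_ml / sum(recipe.values())
    return {name: round(vol * factor, 4) for name, vol in recipe.items()}


# Hypothetical example: prepare 12.5 mL instead of 5 mL.
scaled = scale_recipe(NLB_5ML, 12.5)
for name, vol in scaled.items():
    print(f"{name}: {vol} mL")
```

Because every volume is multiplied by the same factor, the final concentrations (for example 0.4 M NaCl and 1% Triton X-100) are unchanged.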
3. Nuclei isolation procedure
NOTE: It is recommended to perform steps 3.1–3.3 on the first day (2–3 h), save the nuclei in NLB buffer at -80 °C, and resume the following day (or later) for protein purification (4 h). The nuclei isolation steps in this protocol were adapted from a sorghum ChIP-seq protocol used at the Joint Genome Institute. Additional washes and sucrose gradient separation may be required to ensure nuclei purity for ChIP-seq applications.
4. Mass spectrometry of purified histones
Following the protocol, histones can be extracted and identified by LC-MS analysis. The raw data and processed results are available at MassIVE (https://massive.ucsd.edu/) via accession: MSV000085770. Based on the TopPIC results from the representative sample (also available from MassIVE), we identified 303 histone proteoforms (106 H2A, 72 H2B, 103 H3, and 22 H4 proteoforms). Co-purified ribosomal proteoforms have also been detected, typically eluting early in the LC. They usually constitute ~20% of the identified proteoforms, but do not overlap with the histone proteoforms eluting in the later stage of the LC gradient. The results can be easily visualized with the latest TopPIC or Informed-Proteomics packages. For demonstration, we focus on data visualization using the Informed-Proteomics package, which can directly load raw MS files for manual examination of proteoform identifications. Please note that the two software packages use different algorithms and parameters, so the reported numbers of proteoforms will not be identical. We recommend reporting the proteoform counts from TopPIC because it is more conservative and considers unknown PTMs. The Informed-Proteomics package has integrated data processing and visualization for easy manual validation. For organisms with well-annotated PTMs, we recommend ProSightPC24 for best site localization. Combining the results from multiple tools can increase both the number and the confidence of proteoform identifications.
After processing the data with Informed-Proteomics, the LC-MS feature map can be visualized in LcMsSpectator, which displays the deconvoluted protein masses against the LC retention time. By clicking on the identified proteoforms in the software, the associated feature will be highlighted with a small green rectangle in the feature map. Major histone proteins should be seen in specific regions of the map, which indicates the success of the experiment. Figure 1a shows a representative LC-MS feature map of intact histones. Full-length histone proteoforms are highlighted in the dashed boxes. Most proteoforms detected can be confidently identified using MS2 data.
Figure 1b shows a zoomed-in view of the region with H2A and H2B proteoforms. Most of them have N-terminal modifications of 42 Da. This nominal mass corresponds to either trimethylation (42.05 Da) or acetylation (42.01 Da), both of which are commonly seen for histones. Their accurate masses differ by only 0.04 Da, which is difficult to resolve at the intact protein level (~2 ppm). In high resolution MS2 spectra, the PTMs can be easily differentiated and confirmed because of the lower mass of the fragments29. In addition, H2A and H2B histones have multiple homologs with very similar sequences, as noted by the different UniProt accession numbers in Figure 1b. Again, high resolution LC-MS analysis can readily identify and differentiate them. Two types of H2A were identified for sorghum histones. The 16 kDa H2A histones in Figure 1b have extended terminal tails in the non-conserved regions of histones. Another group of H2A histones without the extended tails (14 kDa) can be seen in Figure 1c.
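The 0.04 Da difference between trimethylation and acetylation, and its size relative to the intact protein mass, can be reproduced from elemental compositions. The following is a minimal sketch using standard monoisotopic atomic masses; the function names and the example masses of 16,000 Da (intact H2A) and 1,000 Da (a hypothetical fragment) are assumptions for illustration.

```python
# Monoisotopic atomic masses in Da.
MONO = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}

# Net elemental additions of the two modifications.
ACETYL = {"C": 2, "H": 2, "O": 1}   # acetylation adds C2H2O
TRIMETHYL = {"C": 3, "H": 6}        # three methylations add 3 x CH2


def mono_mass(composition):
    """Monoisotopic mass of an elemental composition."""
    return sum(MONO[element] * count for element, count in composition.items())


def ppm(delta_da, mass_da):
    """Express a mass difference as parts per million of a reference mass."""
    return delta_da / mass_da * 1e6


acetyl = mono_mass(ACETYL)
trimethyl = mono_mass(TRIMETHYL)
delta = trimethyl - acetyl

print(f"acetylation:    {acetyl:.5f} Da")
print(f"trimethylation: {trimethyl:.5f} Da")
print(f"difference:     {delta:.5f} Da")
print(f"at 16 kDa intact mass: {ppm(delta, 16000):.2f} ppm")
print(f"at 1 kDa fragment:     {ppm(delta, 1000):.1f} ppm")
```

The same absolute difference is a far larger relative error on a small fragment than on the intact protein, which is why the two modifications are easier to tell apart in MS2 spectra.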
For H4 histones, N-terminal acetylation was identified as the major PTM. Additional lysine acetylations and methionine oxidations can also be observed simply by examining the mass differences of the features in Figure 1d. We also observed an unknown modification of 112.9 Da in addition to the N-terminal acetylation (the feature above “3Ac” in Figure 1d). This is likely an unknown adduct from a reagent used in the preparation. We have previously detected sulfate ion adducts on H4, which may be attributed to residual salts combined with the high basicity of histone proteins. For H3, two protein sequences were identified: H3.3 and H3.2 (Figure 1e). Although these two protein sequences differ at only 4 residues (32, 42, 88, and 91), they can still be easily distinguished in LC-MS based on the separation in both dimensions (mass and retention time). H3 proteins are heavily modified by varying degrees of methylation and acetylation. The high degree of modification can be easily visualized by the dense, parallel lines in the feature map, which are 14 Da apart. However, three methyl groups (3 × 14 Da) have the same nominal mass as one acetylation (42 Da). Because these PTMs cannot be easily resolved at the intact protein level, they are referred to as “methyl equivalents” (i.e., multiples of 14 Da; one acetylation equals three methyl equivalents). In Figure 1e, H3 proteoforms are labeled in the form of methyl equivalents based on their intact mass. Due to the limited resolution of the RPLC separation, many different H3 proteoforms are likely co-eluting and fragmented in the same spectrum. The method presented here will only identify the most abundant combinations of methylation and acetylation, as illustrated in Figure 2. For more comprehensive characterization of H3, more targeted analysis is still required30,31.
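The “methyl equivalent” labeling can be illustrated with a short calculation: the mass shift of a proteoform relative to the unmodified sequence is divided by the mass of one methylation and rounded to the nearest integer. This is a minimal sketch rather than the labeling procedure used for Figure 1e; the function name, the 0.5 Da tolerance, and the 15,000 Da unmodified mass are hypothetical.

```python
METHYL = 14.01565  # monoisotopic mass added by one methylation (CH2), in Da


def methyl_equivalents(observed_mass, unmodified_mass, tolerance_da=0.5):
    """Return the number of methyl equivalents explaining the mass shift,
    or None if the shift is not close to a non-negative multiple of 14 Da."""
    shift = observed_mass - unmodified_mass
    n = round(shift / METHYL)
    if n < 0 or abs(shift - n * METHYL) > tolerance_da:
        return None
    return n


unmodified = 15000.0  # hypothetical unmodified intact mass in Da

# One trimethylation and one acetylation both map to three methyl equivalents.
print(methyl_equivalents(unmodified + 42.047, unmodified))
print(methyl_equivalents(unmodified + 42.011, unmodified))
# A shift that is not near a multiple of 14 Da is rejected.
print(methyl_equivalents(unmodified + 7.0, unmodified))
```

Because acetylation and trimethylation land on the same integer, the label conveys only the total modification load; assigning the individual PTMs requires the MS2 data.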
Figure 1: LC-MS feature map of intact histones extracted from sorghum leaves. The figure shows LC retention time (in minutes) vs. the molecular mass for all detected proteoforms. The log abundance is shown by the color scale next to the top map (log 10 abundance). (a) The major histone peaks are labeled by the dashed boxes. Most of the features outside the boxes are truncated histones and ribosomal proteins. Zoom-in views for each group of histones: (b) H2B and 16 kDa H2A, (c) 14 kDa H2A, (d) H4, and (e) H3. The UniProt accession numbers are noted alongside each feature, followed by detected PTMs. “Ac”, “me”, “+O” indicate acetylation, methylation, and oxidation, respectively. In (b), two truncated H2A C5YZA9 proteoforms are labeled, which had one or two C-terminal alanines clipped (shown as -A* and -AA*).
A representative example of proteoform identification is shown in Figure 2 using MSPathfinder and visualized in LcMsSpectator. The fragmentation spectrum in Figure 2a was generated using ETD, which yields c and z type ions along the protein backbone. HCD of the same precursor can be used to validate the identification, but HCD generally provides limited sequence coverage20. The precursor ions in the previous and next MS1 spectra are shown in Figure 2b,c, with their matched isotope peaks highlighted in purple. The sequence coverage map in Figure 2d can help localize any possible PTMs. A high-confidence identification should have most of the fragments matched, precursor ion matched, and good sequence coverage to help localize PTMs. In this example, an H3.2 proteoform was identified with two PTMs—di-methylation on K9 and methylation on K27. Following the same method, other proteoforms with different PTMs and terminal truncations can be manually validated.
Figure 2: Representative example of an identified histone H3.2 proteoform. H3.2 proteoform with its (a) ETD spectrum, (b) precursor ion in the previous MS1 spectrum, (c) precursor ion in the next MS1 spectrum, and (d) sequence coverage map. The c ions from the N-terminus are labeled in cyan, and the z ions from the C-terminus are in pink. Two PTMs were identified and highlighted in yellow in (d) with their mass shifts annotated.
Quantitative comparison of the detected histone proteoforms can reveal potential epigenetic markers. We have applied this protocol previously to 48 sorghum samples collected from the field (“additional inhibitors” were not used in this study)29. Two different genotypes of sorghum were compared in response to pre-flowering or post-flowering drought. By comparing the relative abundance of the proteoforms, we discovered changes in truncated histone proteoforms that are specific to sample conditions, as shown in Figure 3. C-terminal truncation of H4 was observed only in weeks 3 and 9 for some of the samples (Figure 3a,b). For H3.2, N-terminal truncated proteoforms were generally more abundant in week 10 (Figure 3c,d). In contrast, C-terminal truncated H3.2 proteoforms tended to be seen at earlier time points (Figure 3c). More importantly, the two genotypes did not respond in exactly the same way. The H4 C-terminal truncated proteoforms were significantly more abundant in BTx642 than in RTx430 (Figure 3b). Such data reveal potential epigenetic markers of plant development and stress tolerance that can be further tested with other techniques.
Figure 3: Quantitative comparison of histone proteoforms. (a) Heatmap of histone H4 proteoforms across different samples. For each proteoform, the abundance extracted from top-down MS data was normalized to the sum of all identified H4 proteoforms in each analysis, yielding the “relative abundance”. The values were then scaled to the maximum of each row to better show the changes in low abundance proteoforms. The scaled relative abundance is denoted in the color key at the bottom of the heatmap. Growth conditions are noted on the horizontal axis (Pre: pre-flowering drought, Post: post-flowering drought). Three replicates are grouped together and are separated by black vertical stripes from other conditions. For samples labeled with asterisks, only technical replicates were acquired. Proteoforms are represented on the vertical axis, in the format “starting residue – ending residue: mass; putative modification”. (b) Relative abundance plot of the truncated H4 proteoforms 2–99 (proteoforms highlighted in bold in (a) are summed) at different conditions. The key to the symbols is shown in the legend in the top-right corner. Filled dots in the middle of the error bars are the average values. (c) Heatmap of H3.2 proteoforms and (d) abundance plot for all identified N-terminal truncated H3.2 are shown in the same format as those for H4. Proteoforms smaller than 8 kDa in (c) were omitted for simplicity. The N-terminal and C-terminal truncated H3.2 proteoforms showed different responses across the growth conditions. Reprinted with permission from ELSEVIER from ref.29.
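The two normalization steps described in the Figure 3 caption (dividing by the per-analysis sum, then scaling each proteoform row to its maximum) can be sketched as follows. This is an illustrative reimplementation, not the code used to produce the figure; the function names and the abundance values are hypothetical.

```python
def relative_abundance(abundances):
    """abundances maps proteoform -> list of abundances, one per analysis.
    Each value is divided by the summed abundance of all proteoforms in that analysis."""
    n_runs = len(next(iter(abundances.values())))
    totals = [sum(values[i] for values in abundances.values()) for i in range(n_runs)]
    return {
        proteoform: [v / t if t else 0.0 for v, t in zip(values, totals)]
        for proteoform, values in abundances.items()
    }


def scale_rows_to_max(relative):
    """Scale each proteoform row to its own maximum so low-abundance rows remain visible."""
    scaled = {}
    for proteoform, values in relative.items():
        row_max = max(values)
        scaled[proteoform] = [v / row_max if row_max else 0.0 for v in values]
    return scaled


# Hypothetical abundances of two H4 proteoforms in three analyses.
raw = {
    "full-length": [90.0, 80.0, 50.0],
    "truncated": [10.0, 20.0, 0.0],
}

rel = relative_abundance(raw)
heatmap_values = scale_rows_to_max(rel)
print(rel)
print(heatmap_values)
```

Row scaling changes only how the values are displayed in the heatmap; comparisons between conditions, as in the abundance plots, use the relative abundance itself.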
The presented protocol describes how to extract histones from sorghum leaf (or more generally plant leaf) samples. The average histone yield is expected to be 2–20 µg per 4–5 g of sorghum leaf material. The material is sufficiently pure for downstream histone analysis by LC-MS (mostly histones with ~20% ribosomal protein contamination). Lower yield may be obtained due to sample variations or potential mishandling/failures throughout the protocol. Maintaining the integrity of the nuclei before the nuclei lysis step is critical; therefore, aggressive vortexing and pipetting should be avoided before adding NLB. In addition, loss of nuclei may occur when removing the supernatants from the pellets. Care must be taken not to disrupt the pellets when pipetting. The Triton X-100 concentration of 1% was optimized to selectively lyse the non-targeted organelles but not the nuclei (step 3.2). The optimal detergent concentration for other tissues or organisms may differ and needs to be determined experimentally. Color change of the supernatant during the filtration process could indicate potential issues such as inefficient release of chloroplasts or insufficient grinding of leaf tissue. If possible, use a microscope to check for lysis of chloroplasts and retention of intact nuclei after each step to further optimize the protocol (especially if modifying the protocol for other tissues or plants). This protocol has only been tested with sorghum leaf tissue. It does not work for sorghum root tissue, likely due to interference from soil. Application to other plant leaf tissues has not been tested, and application to different plants may need additional optimization. For adapting the nuclei isolation protocol for ChIP-seq applications, an additional sucrose density gradient separation after step 3.3.4 (before using NLB) is advised to reduce cytoplasmic contamination.
Because of the extensive clean-up steps, small amounts of residual non-nuclei materials are not expected to cause significant interference for histone analysis in LC-MS and can be left with the pellet.
Several initial trials failed when using commercial tablets of phosphatase inhibitors (e.g., PhosSTOP). The supernatant in step 3.1.6 appeared intensely green when the tablets were used in the extraction buffer, and the final extract showed a low number of identified histones. We suspect the proprietary ingredients in the tablets may have caused nuclei lysis before step 3.4, reducing the overall histone yield. Another possible reason for failure is the incompatibility of the ingredients with the ion exchange resin in the histone purification step (step 3.4). We have used this protocol to consistently extract high-purity histones for subsequent LC-MS analysis of over 150 samples. On average, we were able to obtain higher yield without using the “additional inhibitors” (unpublished data). Therefore, it is advised to cautiously test new inhibitors when modifying or adapting this protocol for other purposes. If phosphorylation is not of interest, the phosphatase inhibitors can be omitted from the extraction buffers.
The steps in 3.4 can take 3–4 h or more. It is recommended to split the protocol over 2 days: freeze the nuclei pellet from step 3.3 and perform the purification on day 2 (or later). The freeze-thaw cycle may partially help the nuclei lysis. The MWCO filter steps (3.4.7) can be very time consuming but can be easily scaled up by preparing multiple samples in parallel. Do not add the protease inhibitor tablets in step 3.4. Many commercial tablets contain polymers (e.g., polyethylene glycol) as fillers, which will interfere with LC-MS analysis. At this step, most other proteins should have been removed or denatured, so enzyme inhibitors are not critical. However, it is still necessary to keep the samples at 4 °C or frozen to minimize degradation.
Following this protocol, histones can be successfully extracted from sorghum leaves, and histone PTMs can be characterized with LC-MS. The method can potentially be applied to large-scale studies comparing histone PTMs between different biological samples (e.g., different genotypes, plants grown under different conditions, etc.), as shown by the example data in Figure 3. However, data processing still requires extensive manual analysis for confidently assigning proteoforms, especially for unexpected (or novel) PTMs. New developments in bioinformatics tools are anticipated to automate the workflow and significantly increase the throughput for large-scale studies. Another limitation is that the top-down MS method currently cannot easily differentiate many proteoforms of hyper-modified H3 (e.g., multiple sites of mono/di/tri-methylation and acetylation). The single-dimension reversed-phase LC cannot fully separate the different H3 proteoforms. Therefore, the MS2 spectra of H3 will typically contain fragments from multiple proteoforms and cannot be easily and confidently deconvoluted. Combining top-down with bottom-up or middle-down methods30,32,33 can be especially beneficial for characterization of histone H3. Alternatively, multi-dimensional separation can be considered to improve the depth of top-down MS34,35,36.
Histone PTM profiling by LC-MS enables the discovery of novel epigenetic markers for designing chromatin modifiers and improving the resilience of plants to severe environmental conditions. A pilot study using sorghum from two cultivars grown under drought conditions in the field indicated that selective histone terminal clipping in leaf may be related to drought acclimation and plant development29. The identified histone markers may serve as targets for complementary techniques such as ChIP-seq. Comprehensive understanding of epigenetic factors gained from these complementary techniques would be indispensable for engineering innovative solutions for crops in response to environmental changes.
The authors have nothing to disclose.
We thank Ronald Moore and Thomas Fillmore for helping with mass spectrometry experiments, and Matthew Monroe for data deposition. This research was funded by grants from US Department of Energy (DOE) Biological and Environmental Research through the Epigenetic Control of Drought Response in Sorghum (EPICON) project under award number DE-SC0014081, from the US Department of Agriculture (USDA; CRIS 2030-21430-008-00D), and through the Joint BioEnergy Institute (JBEI), a facility sponsored by DOE (Contract DE-AC02-05CH11231) between Lawrence Berkeley National Laboratory and DOE. The research was performed using Environmental Molecular Sciences Laboratory (EMSL) (grid.436923.9), a DOE Office of Science User Facility sponsored by the Office of Biological and Environmental Research.
Name | Company | Catalog number | Comments |
Acetonitrile | Fisher Chemical | A955-4L | |
Dithiothreitol (DTT) | Sigma | 43815-5G | |
EDTA, 500mM Solution, pH 8.0 | EMD Millipore Corp | 324504-500mL | |
Formic Acid | Thermo Scientific | 28905 | |
Guanidine Hydrochloride | Sigma | G3272-100G | |
MgCl2 | Sigma | M8266-100G | |
Potassium phosphate, dibasic | Sigma | P3786-100G | |
Protease Inhibitor Cocktail, cOmplete tablets | Roche | 5892791001 | |
Sodium butyrate | Sigma | 303410-5G | Used as a histone deacetylase inhibitor |
Sodium Chloride (NaCl) | Sigma | S1888 | |
Sodium Fluoride | Sigma | S7020-100G | Used as a phosphatase inhibitor |
Sodium Orthovanadate | Sigma | 450243-10G | Used as a phosphatase inhibitor |
Sucrose | Sigma | S7903-5KG | |
Tris-HCl | Fisher Scientific | BP153-500 g | |
Triton X-100 | Sigma | T9284-100ML | |
Weak cation exchange resin, mesh 100-200 analytical (BioRex70) | Bio-Rad | 142-5842 | |
Disposables | |||
Chromatography column (Bio-Spin) | BIO-RAD | 732-6008 | |
Mesh 100 filter cloth | Millipore Sigma | NY1H09000 | This is part of the Sigma kit (catalog # CELLYTPN1) for plant nuclei extraction. Similar filters with the same mesh size can be used. |
Micropipette tips (P20, P200, P1000) | Sigma | ||
Tube, 50mL/15mL, Centrifuge, Conical | Genesee Scientific | 28-103 | |
Tube, Microcentrifuge, 1.5/2 mL | Sigma | ||
Equipment | |||
Analytical Balance | Fisher Scientific | 01-912-401 | |
Beakers (50mL – 2L) | |||
Microcentrifuge with cooling | Fisher Scientific | 13-690-006 | |
Micropipettes | |||
Swinging-bucket centrifuge with cooling | Fisher Scientific | ||
Vortex | Fisher Scientific | 50-728-002 | |
Water bath Sonicator | Fisher Scientific | 15-336-120 |