SEC-BioSAXS measurements of biological macromolecules are a standard approach for determining the solution structure of macromolecules and their complexes. Here, we analyze SEC-BioSAXS data from two types of commonly encountered SEC traces: chromatograms with fully resolved peaks and chromatograms with partially resolved peaks. We demonstrate the analysis and deconvolution using Scatter and BioXTAS RAW.
BioSAXS is a popular technique used in molecular and structural biology to determine the solution structure, particle size and shape, surface-to-volume ratio and conformational changes of macromolecules and macromolecular complexes. A high-quality SAXS dataset for structural modeling must come from monodisperse, homogeneous samples, and this is often only achieved by combining inline chromatography with immediate SAXS measurement. Most commonly, size-exclusion chromatography is used to separate samples and exclude contaminants and aggregates from the particle of interest, allowing SAXS measurements to be made from a well-resolved chromatographic peak of a single protein species. Still, in some cases, even inline purification does not guarantee a monodisperse sample, either because multiple components are too close to each other in size or because changes in shape induced through binding alter the perceived elution time. In these cases, it may be possible to deconvolute the SAXS data of a mixture to obtain the idealized SAXS curves of the individual components. Here, we show how this is achieved and how SEC-SAXS data from ideal and difficult samples are analyzed in practice. Specifically, we show the SEC-SAXS analysis of the vaccinia E9 DNA polymerase exonuclease minus mutant.
Biological macromolecules are too small to be seen even with the best light microscopes. Current methods to determine their structures generally involve crystallizing the protein or measuring vast numbers of identical molecules at the same time. While crystallography provides information on the atomic level, it represents an artificial sample environment, given that most macromolecules are not present in a crystalline form in the cell. In recent years, cryo-electron microscopy has delivered similar high-resolution structures of large macromolecules and macromolecular complexes, but although the samples are closer to physiological conditions, they are still frozen, hence immobile and static. Bio-small angle X-ray scattering (BioSAXS) provides a structural measurement of the macromolecule in conditions that are relevant to biology. This state can be visualized as a low-resolution 3-D shape determined on the nanometer scale, and it captures the entire conformational space of the macromolecule in solution. BioSAXS experiments efficiently assess oligomeric state, domain and complex arrangements as well as flexibility between domains1,2,3. The method is accurate, mostly non-destructive and usually requires only a minimum of sample preparation and time. However, for the best interpretation of the data, the samples need to be monodisperse. This is challenging; biological molecules are often susceptible to contamination, poor purification and aggregation, for example from freeze-thawing4. The development of inline chromatography followed by immediate SAXS measurement helps mitigate these effects. Size-exclusion chromatography separates the samples by size, thus excluding most contaminants and aggregates5,6,7,8,9,10.
However, in some cases even SEC-SAXS is not sufficient to produce a monodisperse sample, because the mixture may consist of components that are too close in size or their physical properties or their fast dynamics lead to overlapping peaks in the SEC UV trace. In these cases, a software-based deconvolution step of the obtained SAXS data might lead to an idealized SAXS curve of the individual component5,11,12. As an example, in protocol section 2, we show the standard SEC-SAXS analysis of the vaccinia E9 DNA polymerase exonuclease minus mutant (E9 exominus) in complex with DNA. Vaccinia represents the model organism of the Poxviridae, a family containing several pathogens, for example the human smallpox virus. The polymerase was shown to bind tightly to DNA in biochemical approaches, with the structure of the complex recently solved by X-ray crystallography13.
Most synchrotron facilities will provide an automated data processing pipeline that performs data normalization and integration, producing a set of unsubtracted frames. The approach described in this manuscript could also be used with a lab source, provided SEC-SAXS is performed. Furthermore, additional automation may be available that will reject radiation-damaged frames and perform the buffer subtraction14. In section 2, we will show how to perform primary data analysis on pre-processed data and make the most of the available data.
In section 3, we show how to deconvolute SEC-SAXS data and analyze the curves efficiently. While there are several deconvolution methods such as the Gaussian peak deconvolution, implemented in US-SOMO15 and the Guinier optimized maximum likelihood method, implemented in the DELA software16, these generally require a model for the peak shape12. The finite size of individual peaks we are investigating allows the use of evolving factor analysis (EFA), as an enhanced form of singular value decomposition (SVD) to deconvolute overlapping peaks, without relying on the peak shape or scattering profile5,11. A SAXS-specific implementation can be found in BioXTAS RAW17. EFA was first used on chromatography data when 2D diode array data allowed matrices to be formed from absorbance against retention time and wavelength data18. Where EFA excels is that it focuses on the evolving character of singular values, how they change with the appearance of new components, with the caveat that there is an inherent order in the acquisition10. Fortunately, SEC-SAXS data provides all the necessary ordered acquisition data in organized 2D data arrays, lending itself nicely to the EFA technique.
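The evolving singular values that EFA tracks can be illustrated with a minimal NumPy sketch on noise-free synthetic data. This is not the BioXTAS RAW implementation: the function names, the toy elution profiles, the Guinier-like scattering curves and the threshold are all assumptions made for this example. The sketch computes the singular values of a growing window of frames in the forward and backward directions and, assuming that the first component to appear is also the first to disappear, reads off a frame range for each component.

```python
import numpy as np

def evolving_singular_values(data, n_keep=3):
    """Top singular values of data[:, :k] for k = 1..n_frames (forward pass).

    data: 2D array, rows are q points, columns are frames in elution order.
    """
    n_frames = data.shape[1]
    out = np.zeros((n_frames, n_keep))
    for k in range(1, n_frames + 1):
        s = np.linalg.svd(data[:, :k], compute_uv=False)
        m = min(n_keep, s.size)
        out[k - 1, :m] = s[:m]
    return out

def efa(data, n_keep=3):
    """Forward and backward evolving singular values, both indexed by frame."""
    forward = evolving_singular_values(data, n_keep)
    # Backward pass: grow the window from the last frame towards the first.
    backward = evolving_singular_values(data[:, ::-1], n_keep)[::-1]
    return forward, backward

def component_ranges(forward, backward, n_components, threshold):
    """Frame range of each component, assuming first-in, first-out elution."""
    ranges = []
    for i in range(n_components):
        # Component i appears where the i-th forward singular value rises.
        start = np.nonzero(forward[:, i] > threshold)[0][0]
        # It disappears where the (n - 1 - i)-th backward singular value rises.
        end = np.nonzero(backward[:, n_components - 1 - i] > threshold)[0][-1]
        ranges.append((int(start), int(end)))
    return ranges

def synthetic_sec_saxs(n_frames=100, n_q=50):
    """Hypothetical noise-free data: two overlapping peaks of finite width."""
    frames = np.arange(n_frames)
    q = np.linspace(0.01, 0.3, n_q)
    data = np.zeros((n_q, n_frames))
    for center, rg in ((40, 36.0), (55, 25.0)):
        conc = np.exp(-0.5 * ((frames - center) / 6.0) ** 2)
        conc[conc < 1e-3] = 0.0               # enforce a finite peak width
        profile = np.exp(-(q * rg) ** 2 / 3)  # Guinier-like scattering curve
        data += np.outer(profile, conc)
    return data

data = synthetic_sec_saxs()
forward, backward = efa(data)
threshold = 1e-6 * forward.max()  # only meaningful because the toy data are noise-free
ranges = component_ranges(forward, backward, 2, threshold)
print(ranges)
```

With real, noisy data the threshold has to sit above the noise floor of the singular values, and the ranges are normally inspected and adjusted by the user. Recovering the individual scattering profiles from these ranges requires a further rotation step that is not shown here.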
In section 4, we will demonstrate the basics of model-independent SAXS analysis from the buffer-background subtracted SAXS curve. Model-independent analysis determines the particle’s radius-of-gyration (Rg), volume-of-correlation (Vc), Porod Volume (Vp), and Porod-Debye Exponent (PE). The analysis provides a semi-quantitative assessment of the particle’s thermodynamic state in terms of compactness or flexibility via the dimensionless Kratky plot2,4,19.
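To make the Guinier and dimensionless Kratky quantities concrete, here is a minimal NumPy sketch on an idealized, noise-free curve. It is not how Scatter performs the analysis: the function names, the iterative choice of the fitting range (q·Rg ≤ 1.3, a commonly used limit for globular particles) and the toy data are assumptions made for this example. The Guinier approximation ln I(q) = ln I(0) − (Rg²/3)·q² is fitted by linear regression, and the dimensionless Kratky plot is then (q·Rg)²·I(q)/I(0) against q·Rg.

```python
import numpy as np

def guinier_fit(q, intensity, q_rg_max=1.3, n_start=10, n_iter=50):
    """Fit ln I(q) = ln I(0) - (Rg^2 / 3) q^2 over the range q * Rg <= q_rg_max."""
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = np.zeros(q.size, dtype=bool)
    mask[:n_start] = True  # initial guess: the lowest-q points
    for _ in range(n_iter):
        slope, intercept = np.polyfit(q[mask] ** 2, np.log(intensity[mask]), 1)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; check the data")
        rg = np.sqrt(-3.0 * slope)
        new_mask = q * rg <= q_rg_max
        # Stop when the fitting range no longer changes or becomes too short.
        if new_mask.sum() < 3 or np.array_equal(new_mask, mask):
            break
        mask = new_mask
    return rg, np.exp(intercept)

def dimensionless_kratky(q, intensity, rg, i_zero):
    """Return q*Rg and (q*Rg)^2 * I(q) / I(0)."""
    x = q * rg
    return x, x ** 2 * intensity / i_zero

# Hypothetical curve that follows the Guinier law exactly (q in 1/Angstrom).
q = np.linspace(0.005, 0.3, 300)
true_rg, true_i0 = 34.1, 2.0
intensity = true_i0 * np.exp(-(q * true_rg) ** 2 / 3)

rg, i_zero = guinier_fit(q, intensity)
x, y = dimensionless_kratky(q, intensity, rg, i_zero)
print(rg, i_zero, x[np.argmax(y)], y.max())
```

For a compact, globular particle the dimensionless Kratky curve peaks close to q·Rg = √3 with a height of about 1.1; a peak shifted to higher q·Rg or a curve that fails to return to the baseline points towards elongation or flexibility.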
Finally, SAXS data are measured in reciprocal-space units, and we will show how to transform the SAXS data to real space to recover the pair-distance distribution function, P(r). The P(r)-distribution is the set of all distances found within the particle and includes the particle's maximum dimension, dmax. Since this is a thermodynamic measurement, the P(r)-distribution represents the physical space occupied across the particle's conformational space. Proper analysis of a SAXS dataset can provide solution-state insights that complement high-resolution information from crystallography and cryo-EM.
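The relationship between the two spaces can be sketched numerically. The example below runs in the forward direction only, from a known P(r) to I(q) and Rg, using I(q) = 4π ∫ P(r)·sin(qr)/(qr) dr and Rg² = ∫ r²P(r) dr / (2 ∫ P(r) dr), both integrated from 0 to dmax. Recovering P(r) from measured data is an indirect, regularized inverse problem that is not attempted here. The function names and the choice of a homogeneous sphere with a hypothetical dmax of 100 Å are assumptions made for this illustration.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integration along the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

def intensity_from_pr(q, r, p_r):
    """I(q) = 4*pi * integral of P(r) * sin(qr)/(qr) dr over 0..dmax."""
    qr = np.outer(q, r)
    # np.sinc(x) is sin(pi*x)/(pi*x), so dividing by pi gives sin(qr)/(qr).
    kernel = np.sinc(qr / np.pi)
    return 4.0 * np.pi * trapezoid(p_r * kernel, r)

def rg_from_pr(r, p_r):
    """Rg^2 = integral of r^2 P(r) dr / (2 * integral of P(r) dr)."""
    return np.sqrt(trapezoid(r ** 2 * p_r, r) / (2.0 * trapezoid(p_r, r)))

def sphere_pr(r, d_max):
    """Analytical P(r) of a homogeneous sphere of diameter d_max (arbitrary scale)."""
    x = r / d_max
    return r ** 2 * (1.0 - 1.5 * x + 0.5 * x ** 3)

d_max = 100.0                       # hypothetical maximum dimension in Angstrom
r = np.linspace(0.0, d_max, 2001)
p_r = sphere_pr(r, d_max)
q = np.linspace(0.005, 0.3, 100)    # 1/Angstrom

intensity = intensity_from_pr(q, r, p_r)
rg = rg_from_pr(r, p_r)
print(rg)
```

For the sphere, the real-space Rg agrees with the analytical value √(3/5)·(dmax/2), which shows that P(r) uses the whole scattering curve to define Rg, whereas the Guinier fit relies on the lowest-angle data only.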
1. Protein expression, purification and SEC-SAXS measurement are based on the published protocol13
2. Primary data analysis
3. Data deconvolution
4. Determine SAXS properties
NOTE: An in-depth tutorial for SAXS determination is found at Bioisis.net. Here we show a basic step by step approach, highlighting the most useful buttons in Scatter.
The advantage of using deconvolution over classical frame selection13 is that it removes the influence of the species on one another, producing a monodisperse scattering signal. This is often accompanied by a better signal-to-noise ratio. When E9 exominus is bound to DNA and run using SEC-SAXS, two peaks are observed (Figure 1). The first, large peak (approximately frames 420‒475) is the E9 exominus-DNA complex; the second (approximately frames 475‒540) is the unbound state (see Supplementary Data: Figure 2). While the classical approach of selecting frames provides a stable Rg for the complex in the first peak (see Supplementary Data: Figure 3), the second peak is clearly merged with the first, and the Rg across the plot shows that this second peak of interest does not have a stable Rg because of cross-peak contamination. Only 5 frames showed a semi-stable Rg and could be used; when subtracted, they gave an Rg of 36.3 Å (Figure 2, green). When the peaks were deconvoluted using EFA, the corresponding curve for the second peak (Figure 2, blue), overlaid with the original, showed a clear improvement in signal to noise and a lower Rg of 34.1 Å. The Kratky plot (Figure 3) shows that the deconvoluted peak (blue) is more globular. This is confirmed by the P(r) curve (Figure 4), which gives a dmax of 108.5 Å for the deconvoluted curve (blue), whereas the non-deconvoluted curve (green) is more elongated with a dmax of 120 Å; this is most likely due to heterogeneity arising from the unbound E9 exominus.
Figure 1: Signal plot of E9 exominus alone and with DNA in complex.
The top panel shows a plot of the integral ratio to the background for each frame of a SEC-SAXS run (light blue). The red points show the Rg at each frame over the peak. The bottom panel shows the corresponding heat map of the residuals for each frame, colored according to the Durbin-Watson auto-correlation analysis: regions of high similarity are colored cyan, while dissimilar frames range from darker blues through pinks to red, depending on the severity of the dissimilarity.
Figure 2: Plot of intensity versus scattering vector.
An overlay of the subtracted SAXS data from the E9 exominus. In green, 5 frames (frames 517‒522) averaged and subtracted from an area of semi-stable Rg; in blue, the representative scattering curve derived from the EFA deconvolution of the SEC-SAXS peak.
Figure 3: Dimensionless Kratky curve.
Overlay of the deconvoluted (blue) and non-deconvoluted (green) Kratky curve showing E9 exominus is globular.
Figure 4: P(r) curve.
The overlay of the deconvoluted (blue) and non-deconvoluted (green) curves for the E9 exominus.
Supplementary Data.
A monodisperse sample is desirable before starting a SAXS experiment, but in reality many data collections do not satisfy this requirement and must be improved by combining the measurement with inline chromatography (SEC in most cases). However, even with the short time between purification and data acquisition, monodispersity of the sample is not guaranteed. Most commonly, this applies to experiments where components are too close in size or in their physical properties to be separated, or are prone to fast dynamics. Here, we have provided a protocol combining singular value decomposition with evolving factor analysis to remove the influence of DNA-bound E9 exominus from its unbound form, creating a monodisperse scattering profile that we were then able to analyze with the SAXS package Scatter IV.
SVD with EFA of SEC-SAXS data are very powerful methods developed to deconvolute SAXS data and improve analysis, but they do have limitations. They require that noise or drift in the buffer baseline of the SEC-SAXS is kept to a minimum. This may involve extra column equilibration (better to use more than 3 column volumes, depending on the buffer) before sample loading. However, the most critical step is the choice of the number of the singular values and the range of data used, as this will greatly affect the accuracy of the deconvolution. It is for this reason that the results should not be taken on their own but further analyzed using techniques such as analytical ultracentrifugation (AUC) or multi-angle-laser-light-scattering (MALLS) for biological interpretation.
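Because the number of singular values is such a critical choice, it can help to look at more than the magnitude of the singular values. One common heuristic is that singular vectors carrying real signal vary smoothly with q and with frame number, whereas noise vectors do not. The following minimal NumPy sketch applies that idea to synthetic data. The function names, the lag-1 autocorrelation cutoff of 0.6, the noise level and the toy data are assumptions made for this example, and the result is only a starting point for the user's own judgment.

```python
import numpy as np

def lag1_autocorrelation(v):
    """Lag-1 autocorrelation of a mean-subtracted vector."""
    a = v - v.mean()
    return float(np.sum(a[:-1] * a[1:]) / np.sum(a * a))

def count_significant_components(data, min_autocorr=0.6, max_components=10):
    """Count leading singular vectors that are smooth in both q and frame number."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    count = 0
    for i in range(min(max_components, s.size)):
        smooth_q = lag1_autocorrelation(u[:, i])       # left vector: varies with q
        smooth_frames = lag1_autocorrelation(vt[i])    # right vector: varies with frame
        if smooth_q < min_autocorr or smooth_frames < min_autocorr:
            break
        count += 1
    return count, s

# Hypothetical data: two overlapping elution peaks plus Gaussian noise.
rng = np.random.default_rng(0)
frames = np.arange(100)
q = np.linspace(0.01, 0.3, 50)
data = np.zeros((q.size, frames.size))
for center, rg in ((40, 36.0), (55, 25.0)):
    conc = np.exp(-0.5 * ((frames - center) / 6.0) ** 2)
    data += np.outer(np.exp(-(q * rg) ** 2 / 3), conc)
noisy = data + rng.normal(scale=0.01, size=data.shape)

n_components, singular_values = count_significant_components(noisy)
print(n_components, singular_values[:4])
```

A drifting buffer baseline also produces smooth singular vectors, so it would be counted as an additional component by this heuristic, which is one reason why a stable baseline matters for the deconvolution.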
Scatter IV is a new software package, free for research and industrial use, with an intuitive user interface that allows even non-experts to analyze their data. Scatter IV has several new features that help to improve the analysis of SEC-SAXS data, such as the heat map linked to the signal plot, which enables more accurate frame selection. In primary data analysis, the Guinier Peak analysis and the cross-validation plot associated with the P(r) analysis offer an integrated troubleshooting ability in the software.
It should be mentioned that many other programs can be used for primary data analysis, such as BioXTAS RAW17, the ATSAS package24 and US-SOMO15, to name a few; these contain the same basic features and are also updated regularly.
Regardless of which SAXS package is used for analysis, the major limitation is common to all: the sample preparation before collection and analysis. In the E9 exominus example shown, the improvement in the signal-to-noise ratio is clear, together with the reduction in Rg and dmax associated with a monodisperse sample. This will greatly aid further processing of the data, such as fitting or modeling with known high-resolution structures.
The authors have nothing to disclose.
We acknowledge the financial support for the project from the French grant REPLIPOX ANR-13-BSV8-0014 and by research grants from the Service de Santé des Armées and the Délégation Générale pour l'Armement. We are thankful to the ESRF for the SAXS beam time. This work used the platforms of the Grenoble Instruct-ERIC center (ISBG; UMS 3518 CNRS-CEA-UGA-EMBL) within the Grenoble Partnership for Structural Biology (PSB), supported by FRISBI (ANR-10-INBS-05-02) and GRAL, financed within the University Grenoble Alpes graduate school (Ecoles Universitaires de Recherche) CBH-EUR-GS (ANR-17-EURE-0003). IBS acknowledges integration into the Interdisciplinary Research Institute of Grenoble (IRIG, CEA). We thank Wim P. Burmeister and Frédéric Iseni for financial and scientific support and we also thank Dr. Jesse Hopkins from BioCAT at the APS for his help and for developing BioXTAS RAW.
Name of Material/Equipment | Company | Catalog Number | Comments/Description |
Beamline control software BsXCuBE | ESRF | Pernot et al. (2013), J. Synchrotron Rad. 20, 660-664 | local development |
BioXTAS RAW 1.2.3 | MacCHESS | http://bioxtas-raw.readthedocs.io/en/latest/index.html | First developed in 2008 by Soren Skou as part of the biological X-ray total analysis system (BioXTAS) project. Since then it has been extensively developed, with recent work being done by Jesse B. Hopkins |
HPLC program LabSolutions | Shimadzu | n.a. | |
ISPyB | ESRF | De Maria Antolinos et al. (2015). Acta Cryst. D71, 76-85. | local development |
NaCl | VWR Chemicals (BDH Prolabo) | 27808.297 | |
Scatter | Diamond Light Source Ltd | http://www.bioisis.net/tutorial/9 | Supported by SIBYLS beamline (ALS, Berkeley, CA) and Bruker Corporation (Karlsruhe, Germany) |
Superdex 200 Increase 5/150 GL column | GE Healthcare | 28990945 | SEC-SAXS column used |
Tris base | Euromedex | 26-128-3094-B | |