This article presents a method to generate protein crystals derivatized with I3C (5-amino-2,4,6-triiodoisophthalic acid) using microseeding to generate new crystallization conditions in sparse matrix screens. The trays can be set up using liquid dispensing robots or by hand.
Protein structure elucidation using X-ray crystallography requires both high quality diffracting crystals and computational solution of the diffraction phase problem. Novel structures that lack a suitable homology model are often derivatized with heavy atoms to provide experimental phase information. The presented protocol efficiently generates derivatized protein crystals by combining random microseeding matrix screening with derivatization with a heavy atom molecule I3C (5-amino-2,4,6-triiodoisophthalic acid). By incorporating I3C into the crystal lattice, the diffraction phase problem can be efficiently solved using single wavelength anomalous dispersion (SAD) phasing. The equilateral triangle arrangement of iodine atoms in I3C allows for rapid validation of a correct anomalous substructure. This protocol will be useful to structural biologists who solve macromolecular structures using crystallography-based techniques with interest in experimental phasing.
In the field of structural biology, X-ray crystallography is regarded as the gold standard technique to determine the atomic-resolution structures of macromolecules. It has been utilized extensively to understand the molecular basis of diseases, guide rational drug design projects and elucidate the catalytic mechanism of enzymes1,2. Although structural data provide a wealth of knowledge, the process of protein expression and purification, crystallization and structure determination can be extremely laborious. Several bottlenecks that hinder the progress of these projects are commonly encountered, and these must be addressed to streamline the crystal structure determination pipeline.
Following recombinant expression and purification, preliminary conditions that are conducive to crystallization must be identified, which is often an arduous and time-consuming aspect of X-ray crystallography. Commercial sparse matrix screens that consolidate known and published conditions have been developed to ease this bottleneck3,4. However, it is common to generate few hits from these initial screens despite using highly pure and concentrated protein samples. Observing clear drops indicates that the protein may not be reaching the levels of supersaturation required to nucleate a crystal. To encourage crystal nucleation and growth, seeds produced from pre-existing crystals can be added to the conditions, which allows for increased sampling of the crystallization space. Ireton and Stoddard first introduced the microseed matrix screening method5. Poor quality crystals were crushed to make a seed stock and then added systematically to crystallization conditions containing different salts to generate new diffraction-quality crystals that would not have otherwise formed. This technique was further improved by D'Arcy et al., who developed random microseed matrix screening (rMMS), in which seeds were introduced into a sparse matrix crystallization screen6,7. This improved the quality of crystals and increased the number of crystallization hits on average by a factor of 7.
After crystals are successfully produced and an X-ray diffraction pattern is obtained, another bottleneck in the form of solving the 'phase problem' is encountered. During the data acquisition process, the intensity of diffraction (proportional to the square of the amplitude) is recorded but the phase information is lost, giving rise to the phase problem that halts immediate structure determination8. If the target protein shares high sequence identity to a protein with a previously determined structure, molecular replacement can be used to estimate the phase information9,10,11,12. Although this method is fast and inexpensive, model structures may not be available or suitable. The success of the homology model-based molecular replacement method drops significantly as sequence identity falls below 35%13. In the absence of a suitable homology model, ab initio methods, such as ARCIMBOLDO14,15 and AMPLE16, can be tested. These methods use computationally predicted models or fragments as starting points for molecular replacement. AMPLE, which uses predicted decoy models as starting points, struggles to solve structures of large (>100 residues) proteins and proteins containing predominately β-sheets. ARCIMBOLDO, which attempts to fit small fragments to extend into a larger structure, is limited to high resolution data (≤2 Å) and by the ability of algorithms to expand the fragments into a full structure.
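The loss of phase information can be illustrated numerically. The following is a minimal sketch, using made-up amplitude and phase values rather than real diffraction data, showing that two structure factors with the same amplitude but different phases give the same intensity. The function names are illustrative only.

```python
import cmath
import math


def structure_factor(amplitude, phase_deg):
    """Build a complex structure factor F = |F| * exp(i * phase)."""
    return cmath.rect(amplitude, math.radians(phase_deg))


def intensity(f):
    """Intensity is proportional to |F|^2, so the phase does not appear."""
    return abs(f) ** 2


# Illustrative values only (not taken from any data set).
f_a = structure_factor(10.0, 30.0)
f_b = structure_factor(10.0, 210.0)

# Both intensities are approximately 100 even though the phases differ.
print(intensity(f_a), intensity(f_b))
# The underlying complex structure factors are not the same.
print(cmath.isclose(f_a, f_b))
```

Because only the intensity is recorded, the two cases cannot be distinguished from the measurement alone, which is why additional phase information is needed.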
If molecular replacement methods fail, direct methods such as isomorphous replacement17,18 and anomalous scattering at a single wavelength (SAD19) or multiple wavelengths (MAD20) must be used. This is often the case for truly novel structures, where the crystal must be formed or derivatized with a heavy atom. This can be achieved by soaking or co-crystallizing with a heavy atom compound, chemical modification (such as 5-bromouracil incorporation in RNA) or labelled protein expression (such as incorporating selenomethionine or selenocysteine amino acids into the primary structure)21,22. This further complicates the crystallization process and requires additional screening and optimization.
A new class of phasing compounds, including I3C (5-amino-2,4,6-triiodoisophthalic acid) and B3C (5-amino-2,4,6-tribromoisophthalic acid), offers exciting advantages over pre-existing phasing compounds23,24,25. Both I3C and B3C feature an aromatic ring scaffold with an alternating arrangement of anomalous scatterers required for direct phasing methods and amino or carboxylate functional groups that interact specifically with the protein and provide binding site specificity. The resulting equilateral triangular arrangement of heavy atoms allows for simplified validation of the phasing substructure. At the time of writing, there are 26 I3C-bound structures in the Protein Data Bank (PDB), of which 20 were solved using SAD phasing26.
This protocol improves the efficacy of the structure determination pipeline by combining the methods of heavy atom derivatization and rMMS screening to simultaneously increase the number of crystallization hits and simplify the crystal derivatization process. We demonstrated that this method was extremely effective with hen egg white lysozyme and a domain of a novel lysin protein from bacteriophage P6827. Structure solution using the highly automated Auto-Rickshaw structure determination pipeline is described, specifically tailored for the I3C phasing compound. There exist other automated pipelines that can be used, such as AutoSol28, ELVES29 and CRANK230. Non-fully automated packages such as SHELXC/D/E can also be used31,32,33. This method is particularly beneficial to researchers who are studying proteins lacking homologous models in the PDB, as it significantly reduces the number of screening and optimization steps. A prerequisite for this method is protein crystals or a crystalline precipitate of the target protein, obtained from previous crystallization trials.
1. Experimental planning and considerations
2. Preparation of lithium I3C stock
3. Addition of I3C to the protein stock
4. Making a seed stock
5. Setting up an rMMS screen
6. Data collection
7. Data processing and structure solution
Incorporating I3C into rMMS can generate new conditions supporting derivatized crystal growth
The efficacy of simultaneous rMMS screening and I3C derivatization was demonstrated in two proteins, hen egg white lysozyme (HEWL, obtained as a lyophilized powder) and the putative Orf11 lysin N-terminal domain (Orf11 NTD) from bacteriophage P68. Each protein was screened against PEG/ION HT under four different conditions: unseeded, seeded, unseeded with I3C and seeded with I3C (Figure 1). For both proteins, the sole addition of I3C did not increase the number of conditions conducive to crystallization. In the case of Orf11 NTD, only one suitable condition was identified with and without I3C (Figure 1B). When I3C was added to the HEWL screens, the number of hits was reduced from 31 to 26, highlighting the added complexities of crystallization when introducing phasing compounds (Figure 1A). Consistent with other studies, adding seed to commercial sparse matrix screens to generate an rMMS screen significantly increased the number of possible crystallization conditions for both proteins, resulting in a 2.1-fold and 6-fold increase for HEWL and Orf11 NTD, respectively6,61 (Figure 1). Most importantly, simultaneous addition of I3C and seed increased the number of hits relative to an unseeded screen, demonstrating a 2.3-fold and 7-fold increase for HEWL and Orf11 NTD, respectively. Many of the crystals from rMMS in the presence of I3C show excellent crystal morphology (Figure 2).
Seeding allows careful control of crystal number in I3C rMMS screens
In microseeding experiments, the number of seeds introduced into a crystallization trial can be controlled by dilution of the seed stock and this allows for precise control of nucleation in the drop7,36. This often allows larger crystals to form since there is reduced competition of protein molecules at nucleation sites. This advantage also extends to the I3C-rMMS method and has been demonstrated successfully in both HEWL and Orf11 NTD. Recreation of a crystallization condition identified from the I3C-rMMS screen with a diluted seed stock yielded fewer but larger crystals (Figure 3).
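As an illustration of how a seed stock dilution series might be planned, the following minimal sketch computes the transfer and diluent volumes for a serial dilution. The step factor, number of steps and tube volume are hypothetical values chosen for the example and are not taken from this protocol; the choice of diluent is likewise left to the experimenter.

```python
def serial_dilution(step_factor, steps, tube_volume_ul):
    """Plan a serial dilution in which each tube is made from the previous one.

    Returns one dict per tube with the volume carried over from the previous
    tube (or the neat seed stock for tube 1), the diluent volume, and the
    overall dilution relative to the neat seed stock.
    """
    if step_factor <= 1 or steps < 1 or tube_volume_ul <= 0:
        raise ValueError("step_factor must be > 1, steps >= 1, volume > 0")
    transfer_ul = tube_volume_ul / step_factor
    diluent_ul = tube_volume_ul - transfer_ul
    return [
        {
            "tube": step,
            "transfer_ul": transfer_ul,
            "diluent_ul": diluent_ul,
            "overall_dilution": step_factor ** step,
        }
        for step in range(1, steps + 1)
    ]


# Hypothetical example: three tenfold steps in 50 uL tubes.
for tube in serial_dilution(step_factor=10, steps=3, tube_volume_ul=50.0):
    print(
        f"tube {tube['tube']}: {tube['transfer_ul']:.1f} uL from previous + "
        f"{tube['diluent_ul']:.1f} uL diluent (1:{tube['overall_dilution']})"
    )
```

A serial scheme is used here instead of diluting each tube directly from the neat stock so that no step requires pipetting an impractically small volume.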
SAD phasing can be used to solve the structures from crystals derived from rMMS I3C screen
Crystals grown using the diluted seed stock shown in Figure 3 were used to solve the structures of the proteins by SAD phasing, with diffraction data from a single crystal (Figure 4). Data were collected on the Australian Synchrotron MX1 beamline62. Data collection and structure solution details are described elsewhere27.
Figure 1 – rMMS was used to generate new conditions for crystal growth in the presence of I3C for two test proteins. 96 well vapor diffusion crystallization screens were carried out using commercial sparse matrix screens. (A) Hen egg white lysozyme was tested with the Index HT screen. Trays were seeded with HEWL crystals grown in 0.2 M ammonium tartrate dibasic pH 7.0, 20% (w/v) polyethylene glycol 3350. (B) Orf11 NTD from bacteriophage P68 was tested with the PEG/ION screen. Orf11 NTD trays were seeded from crystals from condition G12 from the unseeded screen, shown in blue. Conditions supporting crystal growth are shown in red. rMMS seeding in the presence and absence of I3C both gave significantly more crystal hits than unseeded trays. Figure adapted from Truong et al.27.
Figure 2 – Representative images of crystals grown from the vapor diffusion trials shown in Figure 1A and 1B. Figure adapted from Truong et al.27.
Figure 3 – Dilution of the seed stock is an effective way to reduce nucleation in a crystallization condition found using the I3C-rMMS method, to control the number of crystals that form. Reducing nucleation within a drop often results in crystals growing to larger dimensions. Figure adapted from Truong et al.27.
Figure 4 – Orf11 NTD (PDB ID 6O43) and HEWL (PDB ID 6PBB) were crystallized using the I3C-rMMS method and solved using Auto-Rickshaw SAD phasing. (A) Ribbon structures of HEWL and Orf11 NTD solved through experimental phasing. (B) I3C molecule bound to HEWL and Orf11 NTD. (C) Anomalous iodine atoms in I3C are arranged in an equilateral triangle with sides of 6 Å. Thus the presence of this triangle in the phasing substructure indicates that there is an I3C molecule in that position.
Structure determination of a novel protein in the absence of a suitable homology model for molecular replacement requires experimental phasing. These methods require incorporation of heavy atoms into the protein crystal, which adds a level of complexity to the structure determination pipeline and can introduce numerous obstacles that must be addressed. Heavy atoms can be incorporated directly into the protein through labelled expression using selenomethionine and selenocysteine. As this method is costly, laborious and can result in lower protein yields, labelled protein is often expressed after crystallization conditions have been found and optimized with unlabeled protein. Alternatively, crystals can be derivatized by soaking in a solution containing heavy atoms22,63,64. This method often uses high quality crystals and is therefore performed after a robust crystallization method has already been developed. Successfully obtaining a derivatized crystal using this method requires further optimization of soaking procedures and screening of different phasing compounds, therefore adding more time to an already laborious process.
Co-crystallization of the protein with the heavy atom can be performed at the screening stage, thus efficiently streamlining the process and reducing crystal manipulation steps that can cause damage. However, there still exists the potential scenario of obtaining few initial crystallization hits and the problem of choosing a compatible heavy atom compound. Many currently available phasing compounds are incompatible with precipitants, buffers and additives commonly found in crystallization conditions. They may be insoluble in sulphate and phosphate buffers, chelate to citrate and acetate, react unfavorably with HEPES and Tris buffers or become sequestered by DTT and β-mercaptoethanol21. As the I3C phasing compound does not suffer from these incompatibilities, it is a robust phasing compound that could be amenable to many different conditions.
In this study, a streamlined method of producing derivatized crystals ready for SAD phasing through simultaneous co-crystallization of the I3C phasing compound and rMMS is presented. The combination of both techniques increases the number of crystallization hits, with many of the conditions having improved morphology and diffraction characteristics. In both Orf11 NTD and HEWL test cases, new conditions in the I3C-rMMS screen were identified that were absent when I3C was not present. Potentially, I3C may bind favorably to the protein, facilitating the formation and stabilization of crystal contacts27. In turn, this may induce crystallization and possibly improve diffraction characteristics. Besides being a compound compatible with sparse matrix screens, I3C is also an attractive phasing compound due to its intrinsic properties. The functional groups that alternate with iodine on the aromatic ring scaffold allow specific binding to proteins. This leads to greater occupancy and potentially reduces background signal23. Furthermore, the arrangement of anomalous scatterers in an equilateral triangle is obvious in the substructure and can be used to rapidly validate binding of I3C (Figure 4B and 4C). Finally, it can produce an anomalous signal with tunable synchrotron radiation as well as chromium and copper rotating anode X-ray sources. Thus, it can be applied to many different workflows. As I3C is widely available and inexpensive to purchase, this approach is within reach for most structural biology laboratories.
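The triangle-based validation of the substructure can be sketched in code. The following is a minimal illustration, not the procedure used by Auto-Rickshaw or any other package: it searches a list of heavy atom sites for triplets whose pairwise distances are all close to the 6 Å I3C triangle side. It assumes the sites are already in orthogonal (Cartesian) coordinates in Å, ignores symmetry-related sites, and uses a hypothetical tolerance of 0.5 Å. The coordinates are made up for the example.

```python
from itertools import combinations
import math


def find_i3c_triangles(sites, side=6.0, tol=0.5):
    """Return index triples whose three pairwise distances are within tol of side.

    sites: list of (x, y, z) Cartesian coordinates in angstroms.
    side:  expected iodine-iodine distance in I3C (6 angstroms).
    tol:   hypothetical tolerance, to be tuned for the data at hand.
    """
    triangles = []
    for i, j, k in combinations(range(len(sites)), 3):
        distances = (
            math.dist(sites[i], sites[j]),
            math.dist(sites[j], sites[k]),
            math.dist(sites[i], sites[k]),
        )
        if all(abs(d - side) <= tol for d in distances):
            triangles.append((i, j, k))
    return triangles


# Made-up substructure: three sites forming a 6 A triangle plus one lone site.
example_sites = [
    (0.0, 0.0, 0.0),
    (6.0, 0.0, 0.0),
    (3.0, 5.196, 0.0),
    (20.0, 20.0, 20.0),
]
print(find_i3c_triangles(example_sites))  # [(0, 1, 2)]
```

In practice, some I3C molecules may contribute fewer than three located sites, so the absence of a complete triangle does not by itself rule out I3C binding.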
There are several experimental considerations that must be addressed when using the I3C-rMMS method. This method cannot be applied if initial crystalline material of the protein cannot be obtained. In difficult cases, crystalline material from a homologous protein can also be used to generate seed stock. This cross-seeding approach to rMMS has shown some promising results7. Optimizing crystal number through dilution of the seed stock is a crucial step, which should not be overlooked, to maximize the chance of producing high quality large crystals and acquiring suitable diffraction data. If there are few I3C sites identified in the asymmetric unit, conditions conducive to crystallization should be further optimized with an increased concentration of I3C. This may increase the occupancy of I3C to maximize the anomalous signal and aid crystal derivatization.
There can be cases where this technique may not be the optimal method to derivatize protein crystals. As the size of a protein or protein-complex increases, the limited number of I3C sites on the protein surface may not provide sufficient phasing power to solve the structure. In these scenarios where protein size is suspected to be impeding phasing, selenomethionine labelling may be a more viable approach to phasing the protein. If the protein has an adequate number of methionine residues (at least one methionine per 100 residues is recommended65) and high efficiency selenomethionine incorporation can be achieved (such as in bacterial expression systems66), multiple high occupancy selenium atoms will be present in the crystals to phase the structure.
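The methionine guideline above can be checked quickly from a protein sequence. The following minimal sketch counts methionine residues per 100 residues in a one-letter sequence; the function names and the example sequences are hypothetical and serve only to illustrate the arithmetic.

```python
def methionine_density(sequence):
    """Return the number of methionine residues per 100 residues."""
    # Remove whitespace and line breaks, then normalise to upper case.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * seq.count("M") / len(seq)


def meets_semet_guideline(sequence, min_per_100=1.0):
    """Check the guideline of at least one methionine per 100 residues."""
    return methionine_density(sequence) >= min_per_100


# Made-up sequences: 1 Met in 200 residues, and 2 Met in 100 residues.
sparse_met = "M" + "A" * 199
adequate_met = "M" + "A" * 98 + "M"

print(methionine_density(sparse_met), meets_semet_guideline(sparse_met))
print(methionine_density(adequate_met), meets_semet_guideline(adequate_met))
```

This count treats every methionine equally; an N-terminal methionine may be removed or disordered in the crystal, so the effective number of selenium sites can be lower than the sequence suggests.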
In addition, some proteins may inherently be unsuited for derivatization with I3C. I3C binding sites on proteins are dependent on protein structure. There may exist proteins that naturally have few exposed patches compatible with I3C binding. Thus, it is not unforeseeable that there may be difficulties in co-crystallizing some target proteins with I3C.
The authors have nothing to disclose.
This research was undertaken on the MX1 beamline at the Australian Synchrotron, part of ANSTO. The authors would like to acknowledge members of the Shearwin and Bruning laboratories for discussions on this work. The authors would also like to acknowledge Dr. Santosh Panjikar and Dr. Linda Whyatt-Shearwin who contributed to the original work that pioneered this protocol.
The following funding is acknowledged: Australian Research Council (grant Nos. DP150103009 and DP160101450 to Keith E. Shearwin); University of Adelaide (Australian Government Research Training Program stipend scholarship to Jia Quyen Truong and Stephanie Nguyen).
Name | Company | Catalog Number | Comments |
10 mL disposable luer lock syringes | Adelab Scientific | T3SS10LAT | Used for dispensing vacuum grease for hanging drop crystal tray wells |
24 well tissue culture plate | Sigma Aldrich | CLS3527 | Used for hanging drop crystal tray |
3 inch wide Crystal Clear Sealing Tape | Hampton Research | HR4-506 | For 96 well crystallization screens set up by robot |
5-amino-2,4,6-triiodoisophthalic acid | Alfa Aesar | B22178 | Commonly referred to as I3C in the article |
Art Robbins Intelli-Plate 96-2 Original | Hampton Research | HR3-297 | For 96 well crystallization screens set up by robot |
Coverslips | Thermo Fisher Scientific | 18X18-2 | Coverslips for hanging drop crystal tray wells |
Dow Corning vacuum grease | Hampton Research | HR3-510 | Used for sealing hanging drop crystal tray wells |
Eppendorf Pipette 0.1 μL-2.5 μL | Eppendorf | 3120000011 | |
Gilson Pipette 2 μL-20 μL | John Morris Group | 1153247 | |
Gilson Pipette 20 μL-200 μL | John Morris Group | 1152006 | |
Glass pasteur pipettes | Adelab Scientific | HIR92601.01 | |
Hen Egg White Lysozyme | Sigma-Aldrich | L6876 | Approximately 95% pure |
Index HT screen | Hampton Research | HR2-134 | |
Microscope illuminator | Meiji Techno | FT192/230 | Light source to illuminate crystallography experiments |
PEG/ION HT screen | Hampton Research | HR2-139 | |
Phoenix Liquid Dispenser | Art Robbins Instruments | 602-0001-10 | |
Scalpel with scalpel blade no. 15 | Adelab Scientific | LV-SMSCPO15 | |
Seed bead kit | Hampton Research | HR2-320 | Kit contains a glass probe for crushing crystals. A PTFE seed bead, designed for crushing crystals, is also part of the kit but not used in this protocol. |
Stereo microscope | Meiji Techno | EMZ-5TR | Microscope for visualising crystallography experiments |
Tweezers | Sigma-Aldrich | T5415 | |
Vortex mixer | Adelab Scientific | RAVM1 | |