Structurally related proteins frequently exert distinct biological functions. The exchange of equivalent regions of these proteins in order to create chimeric proteins constitutes an innovative approach to identify critical protein regions that are responsible for their functional divergence.
This protocol describes the design of chimeric proteins in which distinct regions of a protein are replaced by the corresponding sequences of a structurally similar protein, in order to determine the functional importance of these regions. Such chimeras are generated by means of a nested PCR protocol using overlapping DNA fragments and appropriately designed primers, followed by their expression within a mammalian system to ensure native secondary structure and post-translational modifications.
The functional role of a distinct region is then indicated by a loss of activity of the chimera in an appropriate readout assay. In consequence, regions harboring a set of critical amino acids are identified, which can be further screened by complementary techniques (e.g. site-directed mutagenesis) to increase molecular resolution. Although limited to cases in which a structurally related protein with differing functions can be found, chimeric proteins have been successfully employed to identify critical binding regions in proteins such as cytokines and cytokine receptors. This method is particularly suitable in cases in which the protein’s functional regions are not well defined, and constitutes a valuable first step in directed evolution approaches to narrow down the regions of interest and reduce the screening effort involved.
Several types of proteins, including cytokines and growth factors, are grouped in families whose members share similar three-dimensional structures but often exert distinct biological functions1,2. This functional diversity is usually the consequence of small differences in amino acid composition within the molecule's active sites3. Identification of such sites and functional determinants not only offers valuable evolutionary insights but also enables the design of more specific agonists and inhibitors4. However, the large number of differences in residue composition frequently found between structurally related proteins complicates this task. Even though constructing large libraries containing hundreds of mutants is nowadays feasible, assessing every single residue variation and their combinations remains a challenging and time-consuming effort5.
Techniques assessing the functional importance of large protein regions are thus of value to reduce the number of possible residues to a manageable number6. Truncated proteins have been the most used approach to tackle this issue. Accordingly, regions are considered to be functionally relevant if the protein function under study is affected by the deletion of a particular region7,8,9. However, a major limitation of this method is that deletions can affect the protein's secondary structure, leading to misfolding, aggregation and the inability to study the intended region. A good example is a truncated version of the cytokine oncostatin M (OSM), in which an internal deletion larger than 7 residues resulted in a misfolded mutant that could not be further studied10.
The generation of chimeric proteins constitutes an alternative and innovative approach that permits the analysis of larger protein regions. The goal of this method is to exchange regions of interest in a protein with structurally related sequences from another protein, in order to assess the contribution of the replaced sections to specific biological functions. Widely used in the field of signaling receptors to identify functional domains11,12, chimeric proteins are particularly useful to study protein families with little amino acid identity but conserved secondary structure. Appropriate examples can be found in the class of interleukin-6 (IL-6) type cytokines, such as interleukin-6 and ciliary neurotrophic factor (6% sequence identity)13 or leukemia inhibitory factor (LIF) and OSM (20% identity)6, on which the following protocol is based.
1. Chimeric Protein Design
2. Preparation for Molecular Cloning
3. Polymerase Chain Reaction (PCR) Amplification of the Individual DNA Fragments Forming the Chimera
4. PCR Amplification to Generate the Chimeric DNA Sequence
5. Insertion of the Chimeric DNA into an Expression Vector
Construction and generation of a chimeric protein (Figure 1) will be exemplified with two members of the interleukin-6 cytokine family, OSM and LIF, which were the subject of a recently published study6. Figure 2 shows the three-dimensional structure of these proteins. Both molecules adopt the characteristic secondary structure of class I cytokines, with four helices (termed A to D) packed in a bundle and joined by loops28. The aligned amino acid sequences of the human proteins can be seen in Figure 3A. In this example, the BC loop region of OSM was replaced with the corresponding LIF sequence to create an OSM-LIF chimera with the amino acid sequence shown in Figure 3B.
For this purpose, the DNA sequences of OSM and LIF were obtained, and the region encoding the BC loop was identified in both cytokines and replaced (Figure 4). A 6-histidine tag was additionally incorporated at the C-terminus to facilitate downstream protein purification. Next, a suitable vector for mammalian expression (pCAGGS) was chosen, and unique restriction sites within its multiple cloning site were selected (PacI and AscI) after ensuring that they were not present in the chimeric gene sequence (see step 2.1.1.).
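The check that the chosen restriction sites are absent from the chimeric gene can be done in any DNA editing software; as a minimal sketch, it could also be scripted as follows. The function name and the toy sequence are hypothetical and serve only as illustration; the recognition sequences are those of PacI and AscI. Both sites are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences of the two enzymes used in this example (both palindromic).
RESTRICTION_SITES = {"PacI": "TTAATTAA", "AscI": "GGCGCGCC"}


def find_internal_sites(gene, sites=RESTRICTION_SITES):
    """Return the 0-based start positions of each recognition site within the gene."""
    gene = gene.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = gene.find(site)
        while start != -1:
            positions.append(start)
            start = gene.find(site, start + 1)  # allow overlapping matches
        hits[enzyme] = positions
    return hits


# Hypothetical toy sequence (NOT the real chimera): a PacI site is placed
# directly after the first 27 bases to show what a positive hit looks like.
toy_gene = "ATGGGGGTACTGCTCACACAGAGGACG" + "TTAATTAA" + "GACTTGGAG"

for enzyme, positions in find_internal_sites(toy_gene).items():
    status = "absent, usable for cloning" if not positions else f"present at {positions}"
    print(f"{enzyme}: {status}")
```

An enzyme reported as present would cut inside the insert during digestion, so a different unique site of the vector would have to be selected.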
Primers were designed as shown in Table 6. The N-terminal forward OSM primer included a leading sequence of 9 base pairs, followed by the PacI restriction site, a plasmid-specific spacer, and the initial 27 base pairs of OSM. The C-terminal reverse primer incorporated the leading sequence followed by the AscI restriction site, a spacer, and the last 27 base pairs of the gene, which in this particular case corresponded to the C-terminal histidine tag. In addition, 30-base pair primers spanning the junction points of the BC loop were required in both forward and reverse orientations.
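The primer layout described above can be sketched in a few lines of Python. This is an illustration rather than the software used in the original study: the function names are hypothetical, the primer parts are copied from Table 6, and the coding-strand sequence of the last 27 bases was obtained here by reverse-complementing the corresponding part of the reverse primer. The 15 + 15 split of the junction primer is an assumption made for illustration, since only the total length of 30 base pairs is specified.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanking_primer(leader, site, spacer, annealing):
    """Concatenate leader, restriction site, spacer and gene-specific part (5' to 3')."""
    return leader + site + spacer + annealing


def junction_primers(upstream, downstream, arm=15):
    """Forward/reverse primer pair spanning a junction, `arm` bases on each side."""
    forward = upstream[-arm:] + downstream[:arm]
    return forward, reverse_complement(forward)


# N-terminal forward primer: leader + PacI + spacer + first 27 bases of OSM.
n_term_forward = flanking_primer(
    "AAAGGGAAA", "TTAATTAA", "GCTAGCGCATCGCCACC", "ATGGGGGTACTGCTCACACAGAGGACG"
)

# C-terminal reverse primer: the gene-specific part anneals to the coding strand,
# so the last 27 bases of the gene are reverse-complemented.
last_27_coding = "CTCGAGCACCACCACCACCACCACTGA"
c_term_reverse = flanking_primer(
    "TTTCCCTTT", "GGCGCGCC", "GCGGCCGCTA", reverse_complement(last_27_coding)
)

# First BC loop junction, split 15 + 15 (assumed split, see lead-in).
bc_start_fwd, bc_start_rev = junction_primers("AACATCACCCGGGAC", "TTAGAGCAGCGCCTC")

print(n_term_forward)
print(c_term_reverse)
print(bc_start_fwd, bc_start_rev)
```

Because the reverse junction primer is the exact reverse complement of the forward one, adjacent fragments from the first PCR share the full 30-base junction sequence, which is what allows them to anneal to each other in the second PCR.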
The first PCR amplification step consisted of three separate reactions. The N-terminal OSM fragment, which required N-terminal OSM forward and BC start reverse primers, used OSM as template. The LIF BC loop was obtained through BC start forward and BC end reverse primers using LIF as template. The C-terminal OSM fragment used BC end forward and C-terminal OSM reverse primers, as well as OSM as template. These three fragments, with expected sizes of 385, 75 and 321 base pairs respectively, can be seen in Figure 5A after separation in a 1% agarose gel.
These purified fragments were then used as a template in the second PCR reaction, along with N-terminal OSM forward and C-terminal OSM reverse primers. The result of this amplification, corresponding to the OSM-LIF BC loop gene sequence, is shown in Figure 5B. This step was followed by purification, a 4-hour digestion of the gene fragment and the chosen plasmid, gel electrophoresis and purification, overnight ligation at 16 °C, and transformation into E. coli XL1-Blue. Individual plasmids were isolated and screened by restriction enzyme digestion for proper insertion of the DNA fragment (Figure 6). Finally, positive hits were sent for sequencing to verify that the sequence corresponded to the intended OSM-LIF BC loop chimera before proceeding to protein expression, purification and testing in functional assays6.
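The logic of the second PCR, in which the three fragments are joined through their shared junction sequences, can be illustrated in silico. The sketch below is a simplified model, not a simulation of the reaction: the function names are hypothetical, the toy fragments are invented apart from the two junction sequences taken from Table 6, and the expected product length assumes that each overlap equals the 30-base junction primer.

```python
def join_by_overlap(left, right, min_overlap=15):
    """Merge two fragments on the longest suffix/prefix overlap they share."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("fragments do not share a sufficient overlap")


def assemble(fragments, min_overlap=15):
    """Join fragments in order, as the overlapping ends do in the second PCR."""
    product = fragments[0]
    for fragment in fragments[1:]:
        product = join_by_overlap(product, fragment, min_overlap)
    return product


def expected_length(sizes, overlap):
    """Product length: sum of fragment sizes minus one overlap per junction."""
    return sum(sizes) - overlap * (len(sizes) - 1)


# Junction sequences from Table 6; the flanking bases are hypothetical filler.
J1 = "AACATCACCCGGGACTTAGAGCAGCGCCTC"
J2 = "GACTTGGAGAAGCTGAACGCCACCGCCGAC"
toy_fragments = ["ATGGGGGTACTG" + J1, J1 + "AAATTT" + J2, J2 + "CACCACTGA"]

chimera = assemble(toy_fragments)
print(len(chimera), expected_length([len(f) for f in toy_fragments], 30))

# Fragment sizes reported for the OSM-LIF example, assuming 30-base overlaps.
print(expected_length([385, 75, 321], 30))
```

Under the stated assumption, the reported fragments of 385, 75 and 321 base pairs give a full-length product of 721 base pairs, which is a useful figure to compare against the band obtained after the second amplification.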
Figure 1: Schematic representation of chimeric protein generation. (A) Chimeric design process: after selection of the regions to be exchanged, the sequence of the desired chimera and the necessary primers are constructed by means of DNA editing software. (B) The key steps in the generation of chimeric proteins are depicted. Two steps of PCR amplification produce a chimeric gene sequence, which is then digested with the appropriate restriction enzymes and ligated into an expression vector.
Figure 2: Structural similarities between OSM and LIF. Representation of the crystal structures of OSM29 (PDB: 1EVS) and LIF30 (PDB: 2Q7N), along with an approximate representation of the designed chimera. These cytokines adopt a four-helical bundle conformation joined by loops. This research was originally published in the Journal of Biological Chemistry. Adrian-Segarra, J. M., Schindler, N., Gajawada, P., Lörchner, H., Braun, T. & Pöling, J. The AB loop and D-helix in binding site III of human Oncostatin M (OSM) are required for OSM receptor activation. J. Biol. Chem. 2018; 18:7017-7029. © the Authors6.
Figure 3: Comparison of OSM and LIF amino acid sequences. (A) Alignment of the full-length amino acid sequences of human OSM and LIF, with the BC loop region highlighted. Asterisks (*) indicate fully conserved residues, colons (:) correspond to amino acids with strongly similar properties and periods (.) denote those with weakly similar features. (B) Amino acid sequence of the OSM BC loop chimera, with the BC loop region of OSM replaced by its LIF equivalent.
Figure 4: DNA sequence of the OSM BC loop chimera. Sequence of the chimeric OSM protein. The region inserted from LIF is highlighted in orange.
Figure 5: Amplification of the OSM BC loop chimera DNA fragments. (A) Result from the first PCR amplification, with bands corresponding to the N-terminal region (lane 2), BC loop (lane 3) and C-terminal region (lane 4). (B) Result from the second PCR amplification, in which the three bands obtained in the first amplification are combined to generate the OSM chimera.
Figure 6: Insertion of the OSM BC loop chimera into plasmid vector. Restriction enzyme digestion of the generated plasmids, with a lower band present at ~700 base pairs indicating the correct insertion of the OSM chimera gene sequence.
Reagent | Stock Concentration | Volume (in 50 µL) |
Sterile Water | | to 50 µL |
PCR Buffer | 10X | 5 µL |
dNTPs | 10 mM | 1 µL |
DMSO | 100% | 1.5 µL |
Forward Primer | 10 µM | 1 µL |
Reverse Primer | 10 µM | 1 µL |
Template DNA | Variable | As required for 2.5-12.5 ng of plasmid DNA |
Phusion High Fidelity DNA Polymerase | 2 Units/µL | 0.5 µL |
Table 1: Reagents required for the PCR reaction mixture.
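The volumes in Table 1 leave two quantities to be calculated for each reaction: the template volume and the water needed to reach 50 µL. A minimal sketch of this arithmetic is given below; the function name and the example template concentration (5 ng/µL) are hypothetical, while the fixed volumes are those listed in Table 1.

```python
# Fixed component volumes in µL, as listed in Table 1.
FIXED_VOLUMES_UL = {
    "PCR buffer (10X)": 5.0,
    "dNTPs (10 mM)": 1.0,
    "DMSO (100%)": 1.5,
    "Forward primer (10 µM)": 1.0,
    "Reverse primer (10 µM)": 1.0,
    "DNA polymerase (2 U/µL)": 0.5,
}


def pcr_mix(template_ng, template_ng_per_ul, total_ul=50.0):
    """Return the volume of each component, with water filling up to total_ul."""
    template_ul = template_ng / template_ng_per_ul
    water_ul = total_ul - sum(FIXED_VOLUMES_UL.values()) - template_ul
    if water_ul < 0:
        raise ValueError("template too dilute: components exceed the reaction volume")
    mix = dict(FIXED_VOLUMES_UL)
    mix["Template DNA"] = template_ul
    mix["Sterile water"] = water_ul
    return mix


# Example: 10 ng of plasmid (within the 2.5-12.5 ng range) from a
# hypothetical 5 ng/µL working dilution.
mix = pcr_mix(template_ng=10, template_ng_per_ul=5)
for component, volume in mix.items():
    print(f"{component}: {volume:.1f} µL")
```

In this example the template takes up 2 µL and water makes up the remaining 38 µL. With these volumes, the final concentrations in the 50 µL reaction are 0.2 µM of each primer, 0.2 mM dNTPs and 3% DMSO.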
 | N-terminal fragment | Chimeric insertion | C-terminal fragment |
Forward Primer | N-terminal forward | First junction forward | Second junction forward |
Reverse Primer | First junction reverse | Second junction reverse | C-terminal reverse |
Template DNA | Recipient | Donor | Recipient |
Table 2: Primer and templates needed for the generation of a standard chimeric protein.
Cycle step | Temperature | Time | Number of cycles |
Initial Denaturation | 98 °C | 30 s | 1 |
Denaturation | 98 °C | 10 s | 23-25 |
Annealing | 68 °C* | 25 s | 23-25 |
Extension | 72 °C | 30 s per kilobase | 23-25 |
Final Extension | 72 °C | 300 s | 1 |
Hold | 4 °C | ∞ | 1 |
*At least 5 °C lower than the primer melting temperature |
Table 3: PCR protocol used to amplify the chimeric fragments and the full chimeric protein.
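Two entries in Table 3 depend on the construct: the extension time (30 s per kilobase) and the annealing temperature (at least 5 °C below the primer melting temperature). The following sketch shows one way to derive both; the function names and the melting temperatures in the example are hypothetical, and the 721-base pair full-length product assumes 30-base overlaps between the three fragments of the OSM-LIF example.

```python
import math


def extension_seconds(amplicon_bp, seconds_per_kb=30):
    """Extension time at the given rate, rounded up to a whole second."""
    return math.ceil(amplicon_bp * seconds_per_kb / 1000)


def max_annealing_temp(primer_tms_c, margin_c=5.0):
    """Highest annealing temperature that stays margin_c below the lowest primer Tm."""
    return min(primer_tms_c) - margin_c


# Fragment sizes from the OSM-LIF example, plus the assumed full-length product.
for name, size in [("N-terminal", 385), ("BC loop", 75), ("C-terminal", 321), ("chimera", 721)]:
    print(f"{name}: {size} bp, extension {extension_seconds(size)} s")

# Hypothetical melting temperatures for a primer pair, in °C.
print(max_annealing_temp([74.0, 76.5]))
```

The calculated times are lower bounds under the 30 s per kilobase rule; for short fragments such as the 75-base pair insert, the thermocycler's minimum step length may be the practical limit.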
Reagent | Stock Concentration | Volume (in 50 µL) |
Sterile Water | | to 50 µL |
Restriction Enzyme Buffer* | 10X | 5 µL |
Template DNA | Variable | As required for 1 µg of DNA |
Enzyme #1 | Variable | 1 µL |
Enzyme #2 | Variable | 1 µL |
*Ensure both enzymes are compatible with the buffer selected |
Table 4: Components of the restriction enzyme digestion reaction.
Reagent | Stock Concentration | Volume (in 20 µL) |
Sterile Water | | to 20 µL |
T4 DNA Ligase Buffer | 10X | 2 µL |
Vector DNA | Variable | As required for 40 ng of DNA |
Insert DNA | Variable | As required for a 3:1 molar ratio |
T4 DNA Ligase | Variable | 1-2 µL |
Table 5: Reagents required for the ligation reaction.
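The amount of insert required for the 3:1 molar ratio in Table 5 follows from the relative lengths of insert and vector, since the mass of a DNA fragment is proportional to its length at a fixed number of molecules. A minimal sketch is shown below; the function name is hypothetical, the insert length of 700 base pairs is the approximate size seen in Figure 6, and the vector length of 4,800 base pairs is a placeholder that should be replaced with the actual length of the digested vector.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio."""
    return vector_ng * molar_ratio * insert_bp / vector_bp


# 40 ng of vector (Table 5); vector length is a placeholder assumption.
needed = insert_mass_ng(vector_ng=40, vector_bp=4800, insert_bp=700)
print(f"Insert required: {needed:.1f} ng")
```

With these example values, 17.5 ng of insert would be combined with 40 ng of vector in the 20 µL ligation reaction.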
Name | Primer sequence |
N-terminal OSM forward primer (PacI) | AAAGGGAAA-TTAATTAA-GCTAGCGCATCGCCACC-ATGGGGGTACTGCTCACACAGAGGACG |
C-terminal HisTag reverse primer (AscI) | TTTCCCTTT-GGCGCGCC-GCGGCCGCTA-TCAGTGGTGGTGGTGGTGGTGCTCGAG |
BC loop start forward | AACATCACCCGGGACTTAGAGCAGCGCCTC |
BC loop start reverse | GAGGCGCTGCTCTAAGTCCCGGGTGATGTT |
BC loop end forward | GACTTGGAGAAGCTGAACGCCACCGCCGAC |
BC loop end reverse | GTCGGCGGTGGCGTTCAGCTTCTCCAAGTC |
Table 6: Primers used in the generation of the OSM BC loop chimera.
The generation of chimeric proteins constitutes a versatile technique, which is able to go beyond the limits of truncated proteins to address questions such as the modularity of cytokine-receptor binding domains13. The design of chimeras is a key step in this kind of study and requires careful consideration. Preliminary studies to establish functional domains will generally require substitution of broad regions in a first phase, while smaller replacements of variable lengths are better suited to detailed studies of a single region. Special attention should be given in this step to the presence of small conserved motifs within a protein family, since these are often indicative of functional sites31,32. Personal experience indicates that more than one round of chimeric protein design can be necessary to narrow down a key functional region, with each round requiring significant time (weeks to months) from initial design to functional assay testing.
As long as a structurally similar protein with diverging biological functions exists for the protein of interest, the method is applicable to any sequence, although it has to be optimized for each particular gene due to its reliance on PCR amplification. In particular, genes possessing GC-rich regions might prove challenging targets, since these types of sequences are known to reduce the efficiency of the amplification33. These issues can usually be solved by several means, such as the addition of additives (e.g. betaine) to the reaction, the use of specialized DNA polymerase buffers, or the modification of the annealing parameters34. Hence, some trial and error will generally be required before adequate conditions for the gene of interest are found.
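GC-rich stretches can be flagged before the first amplification by scanning the template with a sliding window. The sketch below is one simple way to do so; the function names, the 30-base window and the 65% threshold are illustrative assumptions rather than values established by this protocol, and the example sequence is a hypothetical toy.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq, window=30, threshold=0.65):
    """Return (start, gc_fraction) for every window whose GC fraction exceeds the threshold."""
    flagged = []
    for start in range(len(seq) - window + 1):
        gc = gc_fraction(seq[start:start + window])
        if gc > threshold:
            flagged.append((start, gc))
    return flagged


# Hypothetical toy sequence: an AT-rich stretch followed by a GC-only stretch.
toy = "ATATATGCGCGCGC"
print(gc_rich_windows(toy, window=4, threshold=0.9))
```

Regions flagged in this way are candidates for the additives, specialized buffers or modified annealing parameters mentioned above.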
The protocol provided is based on classic restriction enzyme-based cloning methods, which are generally accessible to every type of laboratory, but it can be further adapted to take advantage of more advanced cloning techniques. For example, Gateway cloning, which facilitates cloning the same insert into several different vectors (e.g. if different expression systems are to be tested in parallel), would merely require particular attB recombination sites in place of the restriction sites detailed in this protocol35. Other newer cloning methodologies can bypass the need for a second PCR reaction (e.g. USER36 or Gibson assembly37) and ligation (e.g. sequence and ligation-independent cloning (SLIC)38 or In-Fusion assembly39). Although these methods require different reagents and primer design strategies, readers with access to them are encouraged to apply them to significantly speed up the generation of chimeric constructs after following the basic design principles detailed in step 1 of this protocol.
Overall, the application of this method can supply valuable insight into the mechanisms underlying the biological functions of proteins, in particular those involving protein-protein or protein-nucleic acid interactions, and constitutes a useful tool to identify and specify unique structure-function relationships within a protein family6.
The authors have nothing to disclose.
This work was supported by the Max Planck Society and the Schüchtermann-Clinic (Bad Rothenfelde, Germany). Part of this research was originally published in the Journal of Biological Chemistry. Adrian-Segarra, J. M., Schindler, N., Gajawada, P., Lörchner, H., Braun, T. & Pöling, J. The AB loop and D-helix in binding site III of human Oncostatin M (OSM) are required for OSM receptor activation. J. Biol. Chem. 2018; 18:7017-7029. © the Authors.
Labcycler thermocycler | Sensoquest | 011-103 | Any conventional PCR machine can be employed to carry out this protocol |
NanoDrop 2000c UV-Vis spectrophotometer | ThermoFisher Scientific | ND-2000C | DNA quantification |
GeneRuler 100 bp DNA ladder | ThermoFisher Scientific | SM0241 | |
GeneRuler DNA Ladder Mix | ThermoFisher Scientific | SM0331 | |
AscI restriction enzyme | New England Biolabs | R0558 | |
PacI restriction enzyme | New England Biolabs | R0547 | |
Phusion Hot Start II DNA Polymerase | ThermoFisher Scientific | F-549S | |
dNTP set (100 mM) | Invitrogen | 10297018 | |
T4 DNA ligase | Promega | M1804 | |
NucleoSpin Gel and PCR clean-up kit | Macherey-Nagel | 740609 | |
MGC Human LIF Sequence-Verified cDNA (CloneId:7939578), glycerol stock | ThermoFisher Scientific | MHS6278-202857165 | |
LE agarose | Biozym | 840004 | |
Primers | Sigma-Aldrich | Custom order | |
Human Oncostatin M cDNA | Gift of Dr. Heike Hermanns (Division of Hepatology, University Hospital Würzburg, Germany) | ||
pCAGGS vector with PacI and AscI restriction sites | Gift of Dr. André Schneider (Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany) |