The objective of this protocol is to use molecular dynamics simulations to examine the dynamic structural changes that occur due to activating mutations of the EGFR kinase protein.
Numerous somatic mutations occurring in the epidermal growth factor receptor (EGFR) family (ErbB) of receptor tyrosine kinases (RTK) have been reported from cancer patients, although relatively few have been tested and shown to cause functional changes in ErbBs. The ErbB receptors are dimerized and activated upon ligand binding, and dynamic conformational changes of the receptors are inherent for induction of downstream signaling. For two mutations shown experimentally to alter EGFR function, A702V and the Δ746ELREA750 deletion mutation, we illustrate in the following protocol how molecular dynamics (MD) simulations can probe the (1) conformational stability of the mutant tyrosine kinase structure in comparison with wild-type EGFR; (2) structural consequences and conformational transitions and their relationship to observed functional changes; (3) effects of mutations on the strength of binding ATP as well as for binding between the kinase domains in the activated asymmetric dimer; and (4) effects of the mutations on key interactions within the EGFR binding site associated with the activated enzyme. The protocol provides a detailed step-by-step procedure as well as guidance that can be more generally useful for investigation of protein structures using MD simulations as a means to probe structural dynamics and the relationship to biological function.
The human epidermal growth factor receptor (EGFR) family (ErbB) of receptor tyrosine kinases (RTKs) includes four members – EGFR/ErbB1/HER1, ErbB2/HER2, ErbB3/HER3 and ErbB4/HER4. The ErbB receptors regulate fundamental cellular processes such as cell growth and proliferation, differentiation, migration and survival1,2, and are thus potent proto-oncogenes. Aberrant activity of the ErbB receptors, especially EGFR and ErbB2, has been frequently associated with human cancers making ErbB receptors key targets for cancer therapeutics2,3.
Several somatic alterations of ERBB genes have been reported from human malignancies3,4,5. The best characterized examples include the recurrent, activating point mutations and short in-frame deletions in the EGFR kinase domain in non-small cell lung cancer (NSCLC). These EGFR mutations represent key drivers of cancer growth, and predict sensitivity to EGFR targeting cancer drugs6,7,8. However, in most cancers, somatic mutations in EGFR occur outside of these recurrent "hotspots" and are distributed over the entire 1210-residue span of the receptor. Indeed, most of the residues along the EGFR primary sequence have been found to be mutated in human cancer9. Nevertheless, apart from the few hotspots, the functional significance of the vast majority of the cancer-associated EGFR mutations remains unknown.
The monomeric structure of ErbBs consists of a large amino terminal extracellular domain, followed by a single transmembrane helix leading to the intracellular tyrosine kinase domain and C-terminal tail region that contains docking sites for intracellular signaling proteins. Ligand binding triggers a dramatic conformational change in the extracellular domain, which facilitates the formation of receptor dimers by exposing the dimerization arms that symmetrically cross over each other and interact with their aromatic/hydrophobic surfaces. Upon receptor dimer formation the tyrosine kinase domains come into contact asymmetrically (Figure 1), resulting in the activation of the kinases that phosphorylate the C-terminal tails of the receptor monomers, and subsequently in activation of downstream signaling10,11.
Figure 1: Structure of the EGFR dimer. EGFR dimerizes when the extracellular domains bind growth factor (EGF, epidermal growth factor). The receiver kinase domain is then activated through asymmetric interaction with the activator kinase domain, and the C-terminal tails are autophosphorylated at tyrosine residues (Modified from Tamirat et al.12). Please click here to view a larger version of this figure.
Because of the dynamic structural rearrangements that occur during the monomer dimer transitions, along with kinase activation that is associated with the formation of an asymmetric dimer, mutations along the full length of the receptor structure can potentially have an effect on receptor function. Here we describe several examples from our previous studies in which modeling of the mutation and visualization were sufficient to explain the consequences for function.
Example 1: One reported mutation, D595V in ErbB413, led to increased ErbB4 dimerization and phosphorylation14. Visualization of the location of the mutation was a critical factor in understanding the observed functional effects: D595V occurred at the symmetrical crossover of the dimeric arms of the ectodomain (Figure 2A). The arms are largely aromatic and hydrophobic, and the replacement of polar aspartic acid by valine would be expected to increase the "sticky" hydrophobic interactions, stabilizing the dimer and hence increase the length of time when phosphorylation takes place14. It was a surprise at first to find aspartate in each arm, but in retrospect one might think of it as a timing mechanism for activity, where the polar acid side chains reduce the affinity and lifetime of the intact dimer and hence limit kinase-mediated phosphorylation and signaling. Replacement by valine would then remove this safeguard by further stabilizing the ErbB4 dimer.
Figure 2: Location of an ErbB4 activating mutation and mutations producing kinase-dead ErbB4. (A) D595 (activating D595V mutation) is located on the aromatic/hydrophobic dimeric arms of the ErbB4 ectodomain model; the arms associate on growth factor binding; (nearby residues are shown as sticks). (B) In ErbB4, G802 (inactivating G802dup mutation) helps form the binding pocket around the adenine ring of ATP and catalytic D861 (inactivating D861Y mutation) binds both Mg2+ (not shown) and the γ-phosphate group of ATP. Please click here to view a larger version of this figure.
Example 2: One might anticipate that somatic mutations that target the ATP-binding site of the kinase domain would alter or eliminate enzymatic activity leading to an impaired or kinase-dead receptor incapable of signaling. Of nine reported mutations from patients with breast, gastric, colorectal, or NSCLC15, two of the nine mutations when tested had highly diminished phosphorylation activity16: G802dup (G → GG) and D861Y. Both inactivating somatic mutations were found within the ATP binding site of the tyrosine kinase domain structure (Figure 2B): flexible glycine, duplicated, would alter the adenine ring site and small aspartic acid replaced by the bulky tyrosine near the terminal phosphates would physically prevent Mg2+-ATP from binding. However, since ErbB4 can form a heterodimer with ErbB2 – ErbB2 does not bind a growth factor and depends on association with an ErbB that does in order to heterodimerize – the ErbB2(active)-ErbB4(kinase-dead) heterodimer would stimulate cell proliferation via the Erk/Akt signaling pathway yet cells would not differentiate because of kinase-dead ErbB4 and lack of STAT5 pathway activation16.
In more recent studies, it became apparent that the dynamic movements of ErbBs were relevant to understanding the effects of some mutants on ErbB function, especially mutations that occur within the tyrosine kinase domain. The tyrosine kinase domain consists of an N-lobe (mainly β-sheets) and C-lobe (largely alpha helical), which are separated by the catalytic site where ATP binds. The N-lobe includes the αC helix and P-loop, whereas the activation (A-loop) and catalytic loops are present in the C-lobe17,18,19. Crystal structures of the tyrosine kinase domain revealed two inactive conformations, the majority of structures have the Src-like inactive state. In the active conformation the catalytic aspartate of the A-loop points towards the ATP binding site and the αC helix is oriented towards the ATP binding pocket ("αC-in" conformation), forming a strong glutamate-lysine ion-pair interaction.
Because ErbBs and the component kinase domain are highly dynamic entities, and especially for instances where the effects of mutations on function and biological activity are likely to be tightly linked to the conformational states of the ErbBs, it is important to assess mutations with respect to the range of dynamic changes they would experience. X-ray crystal structures of ErbBs provide static snapshots of the 3D structure, which may or may not be relevant to understanding the dynamic consequences of a mutation. In order to probe the range of dynamic changes corresponding to the "energy landscape" available to a three-dimensional (3D) structure, molecular dynamics (MD) simulations are widely used20. In the case of mutations that would lead to local conformational changes within the tyrosine kinase domain or stabilization of a complex, simulations on the order of 100 ns may be sufficient. However, larger scale conformational changes (e.g., transitions between the active and inactive conformations of the kinase domain) require a longer simulation time – on the order of microseconds21.
With respect to the protocol described below we consider two activating mutations within the tyrosine kinase domain (Figure 3). Both mutations are located within the kinase domain at locations that experience local conformational changes that dictate whether the kinase is active or not, and thus MD simulations were applied in both instances. In the first case, we consider changes that directly affect the ATP binding site and catalytic machinery of the EGFR receiver kinase domain, specifically examining the consequences of an exon 19 deletion mutation that is widely implicated in NSCLC4,7. The Δ746ELREA750 mutation, which reduces the length of the β3-αC loop preceding the αC helix – the helix that moves towards the binding/active site on kinase activation and participates in forming the critical electrostatic interaction between E762 of the helix and K745 by positioning the lysine for interaction with ATP – predisposes the domain for activation12. In the second case, we consider the A702V mutation of EGFR, shown to be a novel gain-of-function activating mutation revealed by the iScream platform9 and identified in an NSCLC patient22. Alanine-702 on the receiver kinase domain is located on juxtamembrane segment B at the interface of the receiver and the activator kinase domains, in which this asymmetric kinase dimer complex and kinase conformational changes are required for activation9.
Figure 3: The asymmetric kinase domain dimer of EGFR. The A702V mutation would be located at the critical interface of the activator and receiver kinase domains, adjacent to the αC helix and close to isoleucine 941 of the activator kinase. Conformational changes induced by formation of the asymmetric dimer lead to kinase activation. The β3-αC loop containing the ELREA sequence directly precedes the αC helix; during activation, the αC helix moves inward towards the ATP binding site. Please click here to view a larger version of this figure.
NOTE: The detailed steps taken to examine the effects of the ΔELREA and A702V mutation on the EGFR structure using MD simulations are discussed as follows:
1. Structure preparation
NOTE: In order to study the structural impacts of the ΔELREA mutation, wild-type and mutant forms of apo active, ATP-bound active and apo inactive EGFR monomer structures are prepared as follows.
Apo active EGFR | Apo inactive EGFR | ATP-bound active EGFR | |
Principal structure | 2GS2 | 2GS7 | 2ITX |
Structures used to build missing loops | 1M14 (723-725) | 3W2S (958-984) | 2GS6 (862-865) |
3W2S (967-981) | 4HJO (848-850) | 3W2S (990-1001) |
Table 1: Structures used to build composite models of apo active, apo inactive and ATP-bound active structures. Missing regions (amino acid range in parentheses) in the principal structure were constructed from the listed structures.
2. System setup
3. Molecular dynamics simulation
4. Analysis
The described protocol was used to study the structural effects of the ΔELREA and A702V mutations on the EGFR kinase structure. One application of the protocol was to investigate the effect of the mutations on local structural/conformational stability by computing RMSD and RMSF values from the MD simulations. As the A702V mutation is located at the juxtamemembrane B segment, the RMSD of this segment of the receiver kinase relative to the starting structure was calculated for both the wild-type and A702V EGFRs. The result (Figure 4A) revealed that the juxtramembrane B segment of the mutant has increased conformational stability during the 100 ns simulation (average RMSD 0.7 Å – 95% confidence interval (CI) 0.009) as compared to the wild-type EGFR kinase domain (average RMSD 1.1 Å – 95% CI 0.01). This is very likely the result of the tighter hydrophobic interactions at the dimer interface due to replacement of alanine 702 (methyl group side chain) to a bulkier hydrophobic residue, valine (isopropyl group side chain), leading to increased hydrophobic interactions of V702 on the receiver kinase domain with isoleucine 941 of the activator kinase domain.
The ΔELREA mutation is located at the β3-αC loop, adjacent to the functionally critical αC helix; the conformation of the αC helix is key to shifts between the active and inactive states of the EGFR kinase. The conformational stability of the αC helix in the active state was assessed by examining the RMSF over the Cα atoms of residues within the helix during MD simulations (Figure 4B): overall, there are lower fluctuations in the mutant (average RMSF 1.1 Å – 95% CI 0.4) in comparison to the wild-type (average RMSF 1.5 Å – 95% CI 0.57); with the highest difference in fluctuations recorded for the N-terminal residues. The sampled conformations respectively superposed on the median structure of the wild-type kinase domain and of the ΔELREA kinase domain also support these results (Figure 4C): both the wild-type and ΔELREA kinase domains have overall similar stability for the superposed conformations except for the β3-αC loop and the αC helix, which are clearly more stable in ΔELREA EGFR. These findings indicate that deletion of the ELREA sequence restrains the movement of the active state αC helix, hence restraining and thus stabilizing the active conformation. Furthermore, since the αC helix forms part of the asymmetric dimer interface, the restraints on the αC helix in the mutant would very likely stabilize the asymmetric dimer, prolonging the duration of the activated state.
Another application of the protocol is to investigate the behavior of key intra- and intermolecular interactions taking place during the simulation. Thus, the interaction between K745 and E762, which is fundamental to EGFR enzymatic activity, was analyzed for both the active form wild-type and ΔELREA EGFR kinase by measuring the percentage occupancy of hydrogen bonds formed between the side-chain polar atoms of the two residues during the MD simulations (Figure 5A): this key electrostatic interaction was formed more often in the ΔELREA kinase domain in comparison to the wild-type kinase domain, owing to the more stable αC helix that accommodates E762. The interactions between Mg2+-ATP and the wild-type and ΔELREA EGFR kinase domains (Figure 5B) during the simulation were also assessed (Figure 5C): the number of hydrogen bonds were greater for ΔELREA (average value 4.0 – 95% CI 0.03) than for the wild-type EGFR (average value 3.2 – 95% CI 0.04). Further analysis of the hydrogen bonds revealed that K745 interacts more frequently with the phosphate groups of ATP in ΔELREA EGFR, which is linked to the more stable K745-E762 interaction noted in the simulation of the ΔELREA mutant EGFR kinase domain.
MD simulations as described in the protocol are also useful in assessing the relative free energy of binding for protein-protein and protein-ligand interactions. The binding energies between the activator and receiver kinase domains of wild-type and A702V EGFRs, and between ATP and the wild-type and ΔELREA mutant EGFR kinase domains, were computed from molecular mechanics generalized Born surface area (MMGBSA) calculations (Figure 6A): the A702V mutant produced a lower average ΔGbind value (average ΔGbind = -76 kcal/mol – 95% CI 0.47), representing more favorable dimer interactions, in contrast to the wild-type EGFR domain (average ΔGbind = -61 kcal/mol – 95% CI 0.61). This observation is consistent with the more stable juxtamembrane B segment and tighter dimer interface due to increased hydrophobic interactions observed for the A702V EGFR kinase domain. In the case of ATP binding to the wild-type and ΔELREA EGFR kinase domains (Figure 6B), MMGBSA calculations predict stronger ATP binding with the ΔELREA mutant (average ΔGbind -57 kcal/mol – 95% CI 0.43) in comparison to wild-type EGFR (average ΔGbind -48 kcal/mol – 95% CI 0.33). This outcome is in line with the greater number of hydrogen bonds recorded between ATP and ΔELREA EGFR (Figure 5C) in comparison to the wild-type domain.
The protocol can also be employed to investigate conformational changes observed during a simulation. In the current study, the effects of the ΔELREA mutation on the inactive EGFR conformation was studied by visual inspection and superpositioning of the sampled conformations from the simulation. The analysis uncovered an inward movement of the αC helix in the ΔELREA EGFR kinase domain (Figure 7A), a structural change expected during the transition to the active state. In contrast, the αC helix of wild-type inactive EGFR maintained its initial conformation (Figure 7B). Thus, the MD simulations support the proposal that the deletion mutation, shown experimentally to increase kinase activity40,41, promotes a conformational shift from the inactive kinase towards the active state.
Figure 4: Wild-type and mutant conformational stability of the active EGFR kinase domain during MD simulations. (A) RMSD (backbone atoms) over the juxtamembrane B segment of the wild-type (blue) and A702V (red) receiver kinase domain. (B) RMSF (Cα atoms) over residues of the αC helix: wild-type (blue) and ΔELREA (gold). (C) Superposed sampled conformations of wild-type (left) and ΔELREA (right) EGFR kinase domain; chain traces colored based on RMSD (Cα atoms) of each residue with respect to the median structure. Coloring ranges from blue to white to red, representing regions of high to low conformational stability. Note that the "free" N-terminal regions of the isolated kinase domain, colored red, would not exhibit this level of mobility in the intact EGFR structure. Figures adapted from Chakroborty et al.9 (Figure 4A reproduced with permission of the Journal of Biological Chemistry) and Tamirat et al.12. Please click here to view a larger version of this figure.
Figure 5: Key features seen in the active receiver kinase during MD simulations: the K745-E762 salt bridge, the αC helix and interactions with ATP. (A) Percentage occupancy of the K745-E762 interaction during the simulation of the wild-type (blue) and ΔELREA (gold) EGFR kinase domains. (B) Residues of the wild-type and ΔELREA mutant interacting with ATP (sticks). Mg2+ (green) coordinates with ATP and D855. (C) The number of hydrogen bonds formed by ATP with both the wild-type and ΔELREA EGFR kinase domains during MD simulations. Figure from Tamirat et al.12. Please click here to view a larger version of this figure.
Figure 6: Lower relative free energies of binding are observed for the mutant kinase domains during the simulations. (A) Binding energies calculated for the interaction between the activator and receiver kinase domains of the wild-type (blue) and A702V (red) EGFRs. (B) ΔGbind of ATP to wild-type (blue) and ΔELREA (gold) EGFR kinase domains. Figures adapted from Chakroborty et al.9 (Figure 6A reproduced with permission of the Journal of Biological Chemistry) and Tamirat et al.12. Please click here to view a larger version of this figure.
Figure 7: Superposed conformations from the wild-type and ΔELREA inactive EGFR kinase domain. Conformation of the αC helix and A-loop helix of (A) wild-type (median structure in blue) and (B) ΔELREA EGFRs (gold). Other sampled conformations, faded white; initial structures prior to the MD simulations, pink. Figure from Tamirat et al.12. Please click here to view a larger version of this figure.
The protocol described in this study focuses on using molecular dynamics simulations to investigate local and global structural alterations that arise from activating somatic mutations of the EGFR kinase domain. Although X-ray crystal structures of wild-type and mutant EGFRs provide invaluable structural insight, they depict one or a few static representations. However, inherent to the biological function of ErbBs is the necessary transitions between the enzymatically inactive and active tyrosine kinase, invoking dynamic changes in both the structure and intramolecular interactions between kinase monomers. MD simulations were thus carried out to probe the dynamic nature of the EGFR tyrosine kinase domain, including the wild-type structure, the introduced ΔELREA deletion mutation, and the A702V mutation. These simulations were successful in elucidating the likely role of these mutations in the structures and how their effects on the conformation of the tyrosine kinase domain would lead to the experimentally observed increases in EGFR kinase activity.
A crucial step in this protocol is the use of a relevant structure to assess the impact of the mutation. One way to select a relevant simulation input structure is to visualize the location of the mutation in the static 3D structure and examine its possible impact with respect to the neighboring amino acids and structural units. In this study, for instance, since the A702V EGFR mutation is located at the juxtamembrane B segment that forms the asymmetric dimer interface, the use of the dimer structure for the simulation as opposed to the monomer is critical. The use of a monomeric structure would have exposed the juxtamembrane B segment of the receiver kinase to the solvent, depriving it from the stabilizing interactions, enhanced by the mutation to a larger hydrophobic residue and interactions with isoleucine 941 from the C-lobe residues of the activator kinase. Moreover, it is noteworthy that the 3D structure represented by the coordinates in a PDB file do not necessarily correspond to the biologically relevant structure that should be used for study. For example, with the structure of ErbB4, PDB code 3BCE, the PDB coordinates correspond to a trimer, but this is due to crystal contacts (few contacts between the monomers are seen when visualizing this structure). Matrices within the PDB file can be used (e.g., within Chimera) to reconstruct the crystallographically related structures, which can be visualized to identify chains that correspond to the biologically relevant 3D structure as reported in the original publication42. Another essential step of the protocol is to properly prepare the simulation input structure, such as building missing amino acids in different loop regions, and especially where located in the vicinity of the mutation. Although numerous wild-type EGFR structures exist in the PDB, only a limited number of mutant EGFR structures are available. Consequently, the mutant structures also need to be modeled; for a single residue mutation like A702V, Chimera was used to mutate the residue; whereas, for the ΔELREA deletion mutation, Modeller was used.
The various parameters utilized in the simulation input files – for example, the number of minimization cycles, heating the system to the desired temperature in one go or instead heating slowly through several intermediate temperatures, the time period for the equilibration and for the production simulations – can be modified based on the molecule of study, the aim of the work and one's own preferences. While carrying out MD simulations, it is also common to come across errors that can arise from the input files, issues related to the simulation software in use or even a user error. Hence, it is very important to understand the source of the errors by carefully examining any error messages. Most simulation programs have a mailing list where users can pose questions to the software developers and to other users by which most problems can be resolved. Additionally, user manuals provide significant assistance to understanding the details of the simulation protocol, including assumptions and limitations. Although MD simulation is an important tool to explore the dynamic properties of molecules, remember that computational results need to be carefully evaluated in conjunction with other sources of information to assess their validity. Whenever possible, work together with researchers that are experts on the proteins under study, especially where relevant wet-lab experimental studies are made, which serve to provide results for structural interpretation as well as to suggest experiments that may be made based on structural observations to test hypotheses.
In this study, the protocol was effective in examining the dynamic structural impacts of the ΔELREA and A702V mutations on the EGFR kinase structures. The simulations revealed that ΔELREA restrains the functionally essential αC helix and promotes a conformational shift from the inactive kinase to a stabilized active kinase. The simulation results are independently supported by drug response data that demonstrated the effects of tyrosine kinase inhibitors on lung cancer cell lines having the ΔELREA deletion mutation and wild-type EGFR, where greater inhibition by drugs recognizing the active kinase conformation was reported for ΔELREA than for wild-type EGFR12. With the A702V mutation, the MD simulations indicate, in comparison to the wild type, increased stabilization of the activator-receiver kinase interface as well as higher affinity of the activator and receiver kinase for each other, together supporting maintenance of the activated conformation of the EGFR kinase. The A702V mutation, located on the juxtamembrane B segment of the receiver kinase, would increase hydrophobic interactions with the activator kinase, functioning to prolong the duration of the activated state. The A702V mutation supports cell survival in the absence of growth factor and was identified in an in vitro screening for EGFR mutations9.
The authors have nothing to disclose.
This research is funded by grants to M.S.J from the Academy of Finland (308317, 320005), Sigrid Juselius Foundation and Tor, Joe and Pentti Borg memorial fund, and to K.E. from the Academy of Finland (274728, 316796), the Cancer Foundation of Finland, and the Turku University Central Hospital. M.Z.T. is funded by the Åbo Akademi Doctoral Network of Informational and Structural Biology. We thank the CSC IT Center for Science for the computing resources and Dr. Jukka Lehtonen for the IT support under the Biocenter Finland bioinformatics network; and Biocenter Finland structural biology infrastructure network.
Amber software | University of California, San Francisco | Version 2018 | Executable |
Chimera program | Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco | Version 1.13.1 | Executable |
EGFR struture files | The Protein Data Bank | 3D coordinates of EGFR structures | |
Maestro | Schrödinger LLC | Version 2018-3 | Executable |
Modeller program | The Andrej Šali Lab, Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry, University of California San Francisco | Included in the Chimera program | |
VMD software | Theoretical and Computational Biophysics Group, University of Illinois at Urbana-Champaign | Version 1.9.3 | Executable |