Genetic code expansion serves as a powerful tool to study a wide range of biological processes, including protein acetylation. Here we demonstrate a facile protocol to exploit this technique for generating homogeneously acetylated proteins at specific sites in Escherichia coli cells.
Post-translational modifications that occur at specific positions of proteins play important roles in a variety of cellular processes. Among them, reversible lysine acetylation is one of the most widely distributed across all domains of life. Although numerous mass spectrometry-based acetylome studies have been performed, further characterization of the putative acetylation targets they identified has been limited. One likely reason is that most classic biochemical approaches cannot readily generate proteins that are purely acetylated at desired positions. To overcome this challenge, the genetic code expansion technique uses an engineered pyrrolysyl-tRNA synthetase variant and its cognate tRNA from Methanosarcinaceae species to direct the cotranslational incorporation of acetyllysine at a specific site in the protein of interest. Since its first application in the study of histone acetylation, this approach has facilitated acetylation studies on a variety of proteins. In this work, we demonstrate a facile protocol to produce site-specifically acetylated proteins using the model bacterium Escherichia coli as the host, with malate dehydrogenase as the example.
Post-translational modifications (PTMs) of proteins occur after translation and arise from the covalent addition of functional groups to amino acid residues. They play important roles in almost all biological processes, including gene transcription, stress response, cellular differentiation, and metabolism1,2,3. To date, about 400 distinct PTMs have been identified4. PTMs greatly amplify the complexity of the genome and the proteome, as they regulate protein activity and localization and affect interactions with other molecules such as proteins, nucleic acids, lipids, and cofactors5.
Protein acetylation has been at the forefront of PTM studies over the last two decades6,7,8,9,10,11,12. Lysine acetylation was first discovered in histones more than 50 years ago13,14, has since been well scrutinized, and is known to occur in more than 80 transcription factors, regulators, and various other proteins15,16,17. Studies on protein acetylation have not only provided a deeper understanding of its regulatory mechanisms, but also guided treatments for a number of diseases caused by dysfunctional acetylation18,19,20,21,22,23. Lysine acetylation was once believed to occur only in eukaryotes, but recent studies have shown that it also plays key roles in bacterial physiology, including chemotaxis, acid resistance, and the activation and stabilization of pathogenicity islands and other virulence-related proteins24,25,26,27,28,29.
A commonly used method to characterize lysine acetylation biochemically is site-directed mutagenesis. Glutamine is used as a mimic of acetyllysine because of its similar size and polarity. Arginine is used as a mimic of non-acetylated lysine, since it retains its positive charge under physiological conditions but cannot be acetylated. However, neither mimic is a true isostere, and they do not always yield the expected results30. The most rigorous approach is to generate proteins that are homogeneously acetylated at specific lysine residues, which is difficult or impossible with most classical methods because of the low stoichiometry of lysine acetylation in nature7,11. This challenge has been addressed by the genetic code expansion strategy, which employs an engineered pyrrolysyl-tRNA synthetase variant from Methanosarcinaceae species to charge tRNAPyl with acetyllysine, uses the host translational machinery to suppress the UAG stop codon in the mRNA, and thereby directs the incorporation of acetyllysine at the designed position of the target protein31. Recently, we optimized this system with an improved EF-Tu-binding tRNA32 and an upgraded acetyllysyl-tRNA synthetase33, and applied the enhanced incorporation system in acetylation studies of malate dehydrogenase34 and tyrosyl-tRNA synthetase35. Herein, we demonstrate the protocol for generating purely acetylated proteins, from molecular cloning to biochemical identification, using malate dehydrogenase (MDH), which we have studied extensively, as the example.
1. Site-Directed Mutagenesis of the Target Gene
Note: MDH is expressed under the T7 promoter in the pCDF-1 vector, which has the CloDF13 origin and a copy number of 20 to 40 (ref. 34).
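As a rough illustration of what the mutagenesis step accomplishes at the sequence level, the sketch below replaces the codon for a chosen residue with the amber codon TAG. The function name `introduce_amber` and the toy coding sequence are hypothetical; they are not the mdh gene or the primers used in this work, and primer design itself is left to the tools supplied with the mutagenesis kit.

```python
def introduce_amber(cds: str, residue_number: int) -> str:
    """Return `cds` with the codon for `residue_number` (1-based) replaced by TAG."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= residue_number <= n_codons:
        raise ValueError("residue number is outside the coding sequence")
    start = 3 * (residue_number - 1)  # 0-based index of the first base of the codon
    return cds[:start] + "TAG" + cds[start + 3:]


# Toy example (hypothetical sequence encoding Met-Lys-Val-Ala): mutate Lys2.
toy_cds = "ATGAAAGTCGCA"
mutant = introduce_amber(toy_cds, 2)
print(mutant)  # ATGTAGGTCGCA
```

For the protein described here, the same call would be made with the mdh coding sequence and residue number 140, assuming the numbering starts at the initiator methionine.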
2. Expression of the Acetylated Protein
3. Purification of the Acetylated Protein
4. Biochemical Characterization of the Acetylated Protein
The yield of acetylated MDH was 15 mg per 1 L of culture, while that of wild-type MDH was 31 mg per 1 L of culture. Purified proteins were analyzed by SDS-PAGE as shown in Figure 1. Wild-type MDH was used as a positive control34. Protein purified from cells harboring the acetyllysine (AcK) incorporation system and the mutant mdh gene, but grown without AcK in the medium, was used as a negative control. Lysine acetylation of the purified proteins was detected by western blotting with an anti-acetyllysine antibody, as shown in Figure 2. Acetylation of lysine residue 140 of malate dehydrogenase was confirmed by tandem mass spectrometry, as shown in Figure 3.
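The two yields reported above give a rough estimate of how efficiently the amber codon was suppressed. The short calculation below is only a sketch: it assumes that expression and recovery are otherwise comparable between the wild-type and the AcK-containing construct, which was not measured separately here, and the variable and function names are illustrative.

```python
# Yields reported in the text, in mg of purified protein per litre of culture.
wild_type_yield_mg_per_l = 31.0
acetylated_yield_mg_per_l = 15.0


def relative_yield(variant: float, reference: float) -> float:
    """Yield of the variant as a fraction of the reference yield."""
    if reference <= 0:
        raise ValueError("reference yield must be positive")
    return variant / reference


ratio = relative_yield(acetylated_yield_mg_per_l, wild_type_yield_mg_per_l)
print(f"Relative yield of AcK-containing MDH: {ratio:.0%}")  # 48%
```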
The protein sequence of MDH (the fragment used for tandem MS analysis is residues 135-142, AGVYDKNK):
MKVAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGEDAT
PALEGADVVLISAGVARKPGMDRSDLFNVNAGIVKNLVQQVAKTCPKACIGIITNPVNTTVAIAA
EVLKKAGVYDKNKLFGVTTLDIIRSNTFVAELKGKQPGEVEVPVIGGHSGVTILPLLSQVPGV
SFTEQEVADLTKRIQNAGTEVVEAKAGGGSATLSMGQAAARFGLSLVRALQGEQGVVECAY
VEGDGQYARFFSQPLLLGKNGVEERKSIGTLSAFEQNALEGMLDTLKKDIALGEEFVNK
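Because the sequence is wrapped across several lines, residue numbers are easy to miscount. The sketch below joins the lines as printed above and looks up position 140 and the fragment analyzed by tandem MS, using 1-based numbering from the initiator methionine; the helper names `residue` and `fragment` are illustrative.

```python
# MDH sequence as listed above, joined across the line breaks.
mdh_sequence = (
    "MKVAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGEDAT"
    "PALEGADVVLISAGVARKPGMDRSDLFNVNAGIVKNLVQQVAKTCPKACIGIITNPVNTTVAIAA"
    "EVLKKAGVYDKNKLFGVTTLDIIRSNTFVAELKGKQPGEVEVPVIGGHSGVTILPLLSQVPGV"
    "SFTEQEVADLTKRIQNAGTEVVEAKAGGGSATLSMGQAAARFGLSLVRALQGEQGVVECAY"
    "VEGDGQYARFFSQPLLLGKNGVEERKSIGTLSAFEQNALEGMLDTLKKDIALGEEFVNK"
)


def residue(seq: str, position: int) -> str:
    """Return the residue at a 1-based position."""
    return seq[position - 1]


def fragment(seq: str, first: int, last: int) -> str:
    """Return the residues from `first` to `last`, both 1-based and inclusive."""
    return seq[first - 1:last]


site = residue(mdh_sequence, 140)
ms_peptide = fragment(mdh_sequence, 135, 142)
print(site, ms_peptide)  # K AGVYDKNK
```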
The protein sequence of optimized acetyllysyl-tRNA synthetase33:
MDKKPLDVLISATGLWMSRTGTLHKIKHYEISRSKIYIEMACGDHLVVNNSRSCRPARAFRYHKY
RKTCKRCRVSDEDINNFLTRSTEGKTSVKVKVVSEPKVKKAMPKSVSRAPKPLENPVSAKAST
DTSRSVPSPAKSTPNSPVPTSASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRR
KKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKN
FCLRPMMAPNLLNYARKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFFQMGSGCTRE
NLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGF
GLERLLKVKHDFKNIKRAARSESYYNGISTNL
The gene sequence of optimized tRNAPyl 32:
GGAAACGTGATCATGTAGATCGAATGGACTCTAAATCCGTTCAGTGGGGTTAGATTCCCC
ACGTTTCCGCCA
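An amber suppressor tRNA reads the UAG codon through a complementary anticodon, CUA (CTA in the gene). The sketch below derives that anticodon as the reverse complement of the codon and checks that the motif occurs in the tRNAPyl gene listed above. A substring match alone does not prove where the anticodon loop lies, so this is a quick sanity check rather than a structural annotation; the function name is illustrative.

```python
def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string containing only A, C, G and T."""
    complement = str.maketrans("ACGT", "TGCA")
    return dna.upper().translate(complement)[::-1]


amber_codon = "TAG"  # UAG in the mRNA
anticodon = reverse_complement(amber_codon)  # CTA, i.e. CUA in the tRNA

# tRNAPyl gene as listed above, joined across the line break.
trna_gene = (
    "GGAAACGTGATCATGTAGATCGAATGGACTCTAAATCCGTTCAGTGGGGTTAGATTCCCC"
    "ACGTTTCCGCCA"
)

has_anticodon_motif = anticodon in trna_gene
print(anticodon, has_anticodon_motif)  # CTA True
```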
Figure 1: The Coomassie blue-stained SDS-PAGE gel of purified full-length MDH and its AcK-containing variant. The same volumes of elution fractions were loaded onto the SDS-PAGE gel.
Figure 2: The western blotting of purified wild-type MDH and its AcK-containing variant. The same volumes of elution fractions were loaded.
Figure 3: LC-MS/MS analysis of the AcK-containing MDH variant. The tandem mass spectrum of the peptide (residues 135-142) AGVYDK(Ac)NK from the purified acetylated MDH variant. K(Ac) denotes the incorporated AcK. The partial sequence of the peptide containing the AcK can be read from the annotated b or y ion series. Matched peaks are shown in red.
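Acetylation of a lysine adds an acetyl group, which shifts the monoisotopic mass of the peptide by about 42.01 Da. The sketch below computes the neutral monoisotopic mass of the peptide covering residues 135-142 with and without that modification, using standard residue masses. It does not reproduce the database search behind Figure 3, and it reports neutral masses rather than the m/z values of particular charge states; the names are illustrative.

```python
# Standard monoisotopic residue masses (Da) for the amino acids in this peptide.
MONOISOTOPIC_RESIDUE_MASS = {
    "A": 71.03711,
    "G": 57.02146,
    "V": 99.06841,
    "Y": 163.06333,
    "D": 115.02694,
    "K": 128.09496,
    "N": 114.04293,
}
WATER = 18.01056   # added once per peptide for the free termini
ACETYL = 42.01057  # mass shift of one acetylation


def peptide_mass(sequence: str, n_acetyl: int = 0) -> float:
    """Neutral monoisotopic mass of a peptide carrying `n_acetyl` acetyl groups."""
    residues = sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER + n_acetyl * ACETYL


unmodified = peptide_mass("AGVYDKNK")
acetylated = peptide_mass("AGVYDKNK", n_acetyl=1)
print(f"unmodified: {unmodified:.4f} Da")
print(f"acetylated: {acetylated:.4f} Da")
print(f"shift:      {acetylated - unmodified:.4f} Da")
```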
The genetic incorporation of noncanonical amino acids (ncAAs) is based on the suppression of an assigned codon, most often the amber stop codon UAG36,37,38,39, by an ncAA-charged tRNA carrying the corresponding anticodon. In bacteria, the UAG codon is recognized by release factor 1 (RF1), and it can also be suppressed by near-cognate host tRNAs charged with canonical amino acids (cAAs) such as lysine and tyrosine40,41. The efficiency of ncAA incorporation at the UAG codon therefore depends on the competition between ncAA-charged tRNAs and RF1, while the purity of ncAA incorporation depends on the competition between ncAA-charged tRNAs and cAA-charged near-cognate tRNAs. Low yield and purity of the target acetylated protein may result from low incorporation efficiency of the orthogonal pair introduced into the host cells. Both the efficiency and the purity of acetyllysine incorporation can be improved by increasing the concentration of acetyllysine in the media and by using the recently optimized acetyllysine incorporation systems32,33, which increased UAG codon suppression 58-fold. Judging from Figure 1 and the protein yields, the efficiency of acetyllysine incorporation was about 50%. No protein was detectable when purified from cells harboring the AcK incorporation system and the mutant MDH gene but grown without AcK in the media, which indicates the high purity of acetyllysine incorporation. Moreover, mass spectrometry analysis did not show any cAAs at position 140 of MDH, indicating the homogeneity of the acetyllysine incorporation.
There are two main limitations of this approach. First, because acetyllysine-charged tRNA competes with both RF1 and cAA-charged near-cognate tRNAs as described above, the maximum number of acetyllysine residues that can currently be incorporated simultaneously into a single protein is three33,42. Second, cells have other types of deacetylases that are resistant to nicotinamide and may deacetylate certain target proteins, so those proteins may not reach 100% acetylation at the specific sites. Recently, we established an incorporation system for thio-acetyllysine, which can be used as a non-deacetylatable analog of acetyllysine43; this system could be a good alternative in such cases.
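One way to see why multi-site incorporation becomes difficult is a simple multiplicative model: if each UAG codon is read through independently with the same efficiency, the fraction of full-length protein falls geometrically with the number of sites. This independence assumption and the per-site efficiency of 0.5 (taken from the roughly 50% single-site estimate in this work) are simplifications made here for illustration, not measured values for multi-site constructs.

```python
def expected_full_length_fraction(per_site_efficiency: float, n_sites: int) -> float:
    """Fraction of full-length product if every amber site is suppressed independently."""
    if not 0.0 <= per_site_efficiency <= 1.0:
        raise ValueError("per-site efficiency must be between 0 and 1")
    if n_sites < 0:
        raise ValueError("number of sites cannot be negative")
    return per_site_efficiency ** n_sites


# Assumed per-site efficiency of 0.5 for one, two and three amber codons.
for n in (1, 2, 3):
    fraction = expected_full_length_fraction(0.5, n)
    print(f"{n} site(s): {fraction:.1%} of the wild-type yield")
```

Under these assumptions, three sites would leave about one eighth of the wild-type yield, which is consistent in spirit with the practical ceiling noted above, although real sites differ in sequence context and need not behave independently.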
As mentioned before, the classic approach to characterizing lysine acetylation biochemically is site-directed mutagenesis, with glutamine as a mimic of acetyllysine and arginine as a mimic of non-acetylated lysine. However, neither mimic is a true isostere, and they do not always yield the expected results30. The genetic code expansion strategy can generate proteins that are homogeneously acetylated at specific lysine residues, which is the most rigorous way to characterize acetylated proteins.
The genetic incorporation system for acetyllysine was derived from pyrrolysyl-tRNA synthetase variants and their cognate tRNA from Methanosarcinaceae species, a pair that is also known to be orthogonal in eukaryotes44. Previous studies have shown that this system can be applied in mammalian cells and certain animals for protein acetylation studies39. The present protocol could therefore be extended to mammalian cells and even animals for wider applications in medical research and industry. Furthermore, essentially the same protocol can be used to incorporate other kinds of ncAAs, requiring only a change of the orthogonal pair introduced into the host cells.
Lysine deacetylases (KDACs) remove the acetyl group from acetylated lysine residues in proteins45. The sirtuin-type CobB is the only well-known deacetylase in E. coli, and it can be inhibited by nicotinamide27. Therefore, to prevent deacetylation of the acetylated protein during cell growth and protein purification, 20 to 50 mM nicotinamide should be added to both the growth media and the purification buffers. Once the protein is purified, the acetylation of lysine residues is relatively stable because deacetylases are no longer present. In addition, to lower the background of nonspecific acetylation at other lysine residues in the protein, the BL21(DE3) strain was used as the expression strain because its level of protein acetylation is significantly lower than that of commonly used K12-derived strains46. As shown in Figure 2, wild-type MDH expressed in BL21(DE3) cells had no detectable acetylation by western blotting. This is another important factor in increasing the purity of acetylation in the target protein.
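For preparing media and buffers, the 20 to 50 mM nicotinamide range translates into grams per volume as sketched below. The molecular weight of 122.12 g/mol is the standard value for nicotinamide; the function name and the 1 L example volume are illustrative rather than taken from the protocol.

```python
NICOTINAMIDE_MW_G_PER_MOL = 122.12  # standard molecular weight of nicotinamide


def grams_needed(concentration_mm: float, volume_l: float,
                 mw_g_per_mol: float = NICOTINAMIDE_MW_G_PER_MOL) -> float:
    """Mass of solute (g) required for a given millimolar concentration and volume."""
    if concentration_mm < 0 or volume_l < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_mm / 1000.0 * volume_l * mw_g_per_mol


# Lower and upper ends of the recommended range for 1 L of medium or buffer.
for mm in (20, 50):
    print(f"{mm} mM in 1 L: {grams_needed(mm, 1.0):.2f} g nicotinamide")
```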
The authors have nothing to disclose.
This work was supported by the NIH (AI119813), the start-up from the University of Arkansas, and the award from Arkansas Biosciences Institute.
Name | Company | Catalog Number | Comments |
Bradford protein assay | Bio-Rad | 5000006 | Protein concentration |
4x Laemmli Sample Buffer | Bio-Rad | 1610747 | SDS sample buffer |
Coomassie G-250 Stain | Bio-Rad | 1610786 | SDS-PAGE gel staining |
4-20% SDS-PAGE ready gel | Bio-Rad | 4561093 | Protein determination |
Ac-K-100 (HRP Conjugate) | Cell Signaling | 6952 | Antibody |
IPTG | CHEM-IMPEX | 194 | Expression inducer |
Nε-Acetyl-L-lysine | CHEM-IMPEX | 5364 | Noncanonical amino acid |
PD-10 desalting column | GE Healthcare | 17085101 | Desalting |
Q5 Site-Directed Mutagenesis Kit | NEB | E0554 | Introducing the stop codon |
BL21 (DE3) cells | NEB | C2527 | Expressing strain |
QIAprep Spin Miniprep Kit | QIAGEN | 27106 | Extracting plasmids |
Ni-NTA resin | QIAGEN | 30210 | Affinity purification resin |
Nicotinamide | Sigma-Aldrich | N3376 | Deacetylase inhibitor |
β-Mercaptoethanol | Sigma-Aldrich | M6250 | Reducing agent |
BugBuster Protein Extraction Reagent | Sigma-Aldrich | 70584 | Breaking cells |
Benzonase nuclease | Sigma-Aldrich | E1014 | DNase |
ECL Western Blotting Substrate | ThermoFisher | 32106 | Chemiluminescence |
Premixed LB Broth | VWR | 97064 | Cell growth medium |
Bovine serum albumin | VWR | 97061-416 | Western blot blocking |