This manuscript describes a generic approach for tailor-made design of microbial cultivation media. This is enabled by an iterative workflow combining Kriging-based experimental design and microbioreactor technology for sufficient cultivation throughput, which is supported by lab robotics to increase reliability and speed in liquid handling media preparation.
A core business in industrial biotechnology using microbial production cell factories is the iterative process of strain engineering and optimization of bioprocess conditions. One important aspect is the improvement of cultivation medium to provide an optimal environment for microbial formation of the product of interest. It is well accepted that the media composition can dramatically influence overall bioprocess performance. Nutrition medium optimization is known to improve recombinant protein production with microbial systems and thus, this is a rewarding step in bioprocess development. However, very often standard media recipes are taken from literature, since tailor-made design of the cultivation medium is a tedious task that demands microbioreactor technology for sufficient cultivation throughput, fast product analytics, as well as support by lab robotics to enable reliability in liquid handling steps. Furthermore, advanced mathematical methods are required for rationally analyzing measurement data and efficiently designing parallel experiments such as to achieve optimal information content.
The generic nature of the presented protocol allows for easy adaption to different lab equipment, other expression hosts, and target proteins of interest, as well as further bioprocess parameters. Moreover, other optimization objectives like protein production rate, specific yield, or product quality can be chosen to fit the scope of other optimization studies. The applied Kriging Toolbox (KriKit) is a general tool for Design of Experiments (DOE) that contributes to improved holistic bioprocess optimization. It also supports multi-objective optimization which can be important in optimizing both upstream and downstream processes.
Modern recombinant gene technology enables the wide use of technical enzymes for various applications in the pharmaceutical industry, animal feeding, organic chemistry, and food processing1,2,3. The production of technical enzymes in bulk quantities is a major topic for industrial biotechnology, and optimized recombinant protein production requires both strain and bioprocess engineering. For the generation of efficiently engineered production strains, different genetic libraries are available, e.g., for balanced gene expression4 or increased secretion efficiency5.
Corynebacterium glutamicum is a major producer of amino acids at the industrial scale6,7 and represents an attractive non-conventional expression host for the secretory production of recombinant proteins8,9. Both the general secretory (Sec) and twin-arginine translocation (Tat) pathway are present in C. glutamicum and were successfully applied for recombinant protein secretion10. Extensive experience in bioprocess engineering regarding amino acid production at the industrial scale, as well as the ability to secrete proteins to g/L amounts11 and great robustness concerning bioprocess inhomogeneities found in large scale cultivations12,13, make C. glutamicum a promising platform organism for the secretory production of heterologous proteins at the industrial scale.
Nutrition medium optimization is known to improve recombinant protein production with microbial systems14,15,16,17 and consequently, the adjustment of medium composition is a rewarding step in bioprocess development with respect to optimal productivity18,19,20,21. Intense research on the application of microtiter plates (MTPs) for microbial cultivation22,23,24 paved the way for the development and design of MTPs for microbial cultivation25,26 and the development of MTP-based microbioreactor (MBR) systems with online monitoring and environmental control27,28. MBRs enable a significant increase in experimental cultivation throughput. Besides, MBR systems stemming from other types of bioreactors, e.g., bubble columns or stirred tank reactors, are available for microbial bioprocess optimization29,30,31,32.
In general, optimization studies benefit from increased experimental throughput, which becomes even more powerful in combination with DOE methodologies, such as to assess interactions between design variables or reduce high-dimensional search spaces. Consequently, the combined use of MBR systems, lab automation, and DOE has proved to be a powerful method in biotechnology8,16,33,34,35.
A protocol for media optimization is presented combining state-of-the-art lab automation, MBR technology with online process monitoring, and Kriging-based data analysis/experimental design. The Kriging methodology is implemented in a MATLAB Toolbox ("KriKit") which can be downloaded and used free of charge36. As an application example, maximization of secretory green fluorescent protein (GFP) production with C. glutamicum is shown by optimizing the composition of CgXII minimal medium. GFP titer was chosen as the optimization objective as it can be quantified easily and is widely applied as a model protein for studies on MBR systems37,38,39.
The presented framework is divided into four steps, which are illustrated in Figure 1. The steps are indicated by box frames and correspond to sections of the protocol. The first step (Figure 1A) is to define the project goals and to determine the required methods. The combination of DOE methodologies, MBR technology, and lab automation allows an increased experimental throughput that demands powerful data processing. The second step (Figure 1B) aims to detect sensitive design variables (i.e., medium components) with high influence on the optimization objective. This leads to a reduced number of design variables of interest. The third step (Figure 1C) comprises an iterative optimization for a more detailed investigation of the functional relationship between the remaining design variables and the objective of interest. Using the successively extended data set, the Kriging approach is applied for predicting the experimental outcome at unmeasured locations. The iterative cycle stops as soon as the Kriging model predicts an optimum or plateau with sufficient accuracy. The results are verified in the fourth step (Figure 1D), beginning with a further sensitivity analysis around the identified optimum. If the initially insensitive components are also found to be insensitive in the optimal region, it is reasonable to assume that this holds true during the iterative optimization procedure in the third step. Afterwards, it is advised to verify optimization results by application of orthogonal methods, like an activity assay or SDS-PAGE.
The generic nature of the presented protocol allows for easy adaption to different lab equipment, other expression hosts, and target proteins of choice, as well as further bioprocess variables like pH value or cultivation temperature. Furthermore, other optimization objectives like protein production rate, specific yield, or product quality can be chosen to fit the scope of other optimization studies.
Figure 1: Workflow of optimization study. The four frame boxes correspond to sections of the protocol, "Conceiving the Study and Definition of Methods" (Section 1), "Sensitivity Analysis" (Section 2), "Iterative Optimization" (Section 3), and "Validation" (Section 4).
1. Conceiving the Study and Definition of Methods (Figure 1: Part A)
NOTE: Definition of the optimization objective: Is a time course of product formation needed or is only a limited time interval or even a fixed time point relevant? Also, consider potential issues such as stability, effort of analytical quantification, or cultivation time. As an alternative to final protein titer, other objectives could be considered such as biomass or cultivation time. Biomass is reflected by biomass specific product yield, while cultivation time is reflected by space-time-yield. (Minimum) product quality could also be an aim. Multi-objective optimization may be required in certain situations, as discussed elsewhere40. In this study, GFP titer after 17 h of cultivation was chosen as the optimization objective. GFP fluorescence can be followed online using the equipment available in this study, which greatly simplifies determining the concentration of the model protein.
NOTE: Definition of the parameters to be optimized: CgXII medium consists of 16 individual components41 and investigating all of these in a two-level full factorial design would result in 2^16 ≈ 65,000 experiments. Consequently, the search space needs to be reduced on a rational and experience-driven basis. The selection of media components considered for optimization can be supported by available expert knowledge or literature data.
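The arithmetic behind this estimate can be checked with a minimal Python sketch. The function names are illustrative only; the plate capacity of 48 wells is the one of the MTP-based MBR system used in this study, and two levels per component are assumed, as implied by 2^16.

```python
def full_factorial_runs(n_factors: int, n_levels: int = 2) -> int:
    """Number of experiments in a full factorial design."""
    return n_levels ** n_factors


def plates_needed(n_runs: int, wells_per_plate: int = 48) -> int:
    """Number of cultivation plates required (ceiling division)."""
    return -(-n_runs // wells_per_plate)


# All 16 CgXII components at two levels each.
runs_all_components = full_factorial_runs(16)
plates_all_components = plates_needed(runs_all_components)
```

Even without replicates or reference wells, such a design would occupy more than a thousand 48-well plates, which illustrates why the search space is reduced before screening.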
2. Sensitivity Analysis (Figure 1: Part B)
NOTE: The goal of this part is to identify the important factors that have a significant effect on the objective.
3. Iterative Optimization (Figure 1: Part C)
NOTE: The MATLAB tool KriKit was used for the interpretation and statistical data analysis36. KriKit allows the user to construct a data-driven Kriging model. This Kriging model predicts the functional relationship between the media components and the objective. It also provides information on the prediction uncertainty. High uncertainty indicates noisy data and/or insufficient data density.
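To illustrate what a Kriging prediction with uncertainty looks like, the following is a minimal one-dimensional sketch in plain Python. It is not the KriKit implementation and does not use the KriKit API: it assumes simple Kriging around the sample mean, a Gaussian covariance function, and hand-picked covariance parameters (sill, length, nugget), whereas a real toolbox estimates such parameters from the data.

```python
import math


def gauss_cov(a, b, sill=1.0, length=1.0):
    """Gaussian covariance between two 1D locations (assumed model)."""
    return sill * math.exp(-((a - b) ** 2) / (2.0 * length ** 2))


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def krige(x_obs, y_obs, x_new, sill=1.0, length=1.0, nugget=1e-6):
    """Return (prediction, standard deviation) at location x_new."""
    mean = sum(y_obs) / len(y_obs)
    cov = [[gauss_cov(a, b, sill, length) + (nugget if i == j else 0.0)
            for j, b in enumerate(x_obs)]
           for i, a in enumerate(x_obs)]
    k = [gauss_cov(a, x_new, sill, length) for a in x_obs]
    weights = solve(cov, k)
    prediction = mean + sum(w * (y - mean) for w, y in zip(weights, y_obs))
    variance = sill - sum(w * ki for w, ki in zip(weights, k))
    return prediction, math.sqrt(max(variance, 0.0))


# Hypothetical data: relative concentration (x Ref) vs. normalized signal.
x_obs = [0.0, 1.0, 2.0, 4.0]
y_obs = [1.0, 1.4, 1.9, 2.1]
```

In this sketch the standard deviation is close to zero at measured locations, grows between them, and approaches the square root of the sill far away from all data, which mirrors the statement that high uncertainty indicates insufficient data density.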
4. Verification of Results (Figure 1: Part D)
NOTE: After finishing the iterative optimization, the initial assumptions need to be checked for validity.
The introduced protocol was applied for maximizing the titer of secreted GFP. Specifically, the GFP titer after 17 h of cultivation was chosen as the optimization objective. Online fluorescence detection of GFP allowed simple product quantification. However, the normalization of GFP signal with data from a reference cultivation is indispensable to ensure reproducibility and comparability of results. A pre-selection of media components was performed on a rational basis as described in Section 1. Experiments were performed following the instructions of Section 1: the parameters of the wet lab procedures were defined for the whole study ensuring consistency and reproducibility of results.
As described in Section 2, an initial screening was performed for identifying relevant components showing a significant impact on the optimization objective for the further, more detailed study. The MTP-based MBR system allows 48 experiments to be performed in parallel. Taking into account the maximal possible number of parallel experiments on one MTP (48) and the total number of media components (11) makes the 2^(11-6) fractional factorial design of resolution IV an appropriate choice. This experimental design comprises 2^5 = 32 experiments and allows the estimation of the main effect for each of the investigated media components. The remaining cultivation wells (16) were used for multiple replicates of experiments with the reference medium to assess reproducibility and positional effects. That is, each experiment is conducted once (no replicates), except for the reference experiment (five replicates).
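A 2^(11-6) design can be constructed from a full factorial in five base factors plus six generated columns. The sketch below is an illustration in Python: the generator set is one common textbook choice and an assumption here, since the generators actually used in the study are not stated. Factor letters A to L stand in for the 11 media components, and the coefficient estimate assumes coded levels of -1 and +1.

```python
from itertools import combinations, product

BASE = ["A", "B", "C", "D", "E"]
# Assumed generators: each extra factor is a product of three base factors.
GENERATORS = {"F": "ABC", "G": "BCD", "H": "CDE",
              "J": "ACD", "K": "ADE", "L": "BDE"}


def fractional_factorial():
    """Return the 32 runs as dicts mapping factor name to coded level."""
    runs = []
    for levels in product((-1, 1), repeat=len(BASE)):
        run = dict(zip(BASE, levels))
        for name, word in GENERATORS.items():
            value = 1
            for letter in word:
                value *= run[letter]
            run[name] = value
        runs.append(run)
    return runs


def main_effects_clear_of_pairs(design):
    """True if no main effect is aliased with a two-factor interaction."""
    names = list(design[0])
    for main in names:
        for a, b in combinations(names, 2):
            if main in (a, b):
                continue
            if sum(run[main] * run[a] * run[b] for run in design) != 0:
                return False
    return True


def main_effect_coefficients(design, response):
    """Least-squares coefficient per factor for coded (-1/+1) levels."""
    n = len(design)
    return {name: sum(run[name] * y for run, y in zip(design, response)) / n
            for name in design[0]}


design = fractional_factorial()
```

With coded levels, each coefficient is the average change in the response when moving a component from its center value to its maximum value, which is the quantity reported in Table 1.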
Table 1 summarizes the results of the screening analysis. In the considered concentration range, varying the majority of the media components did not show a noticeable effect on the objective. Component NH4+ shows a strong negative effect, while Ca2+ and Mg2+ show the strongest positive tendency. The effect of Mg2+ is not significant for the current concentration range but might be for a broader concentration range. Consequently, it was decided to omit NH4+ from the medium and to investigate the effect of Ca2+ and Mg2+ in further experiments.
Section 3 describes the iterative optimization procedure that is used for maximizing the GFP fluorescence signal while varying the concentrations of Ca2+ and Mg2+. In iteration 1, the hypothesis that NH4+ can be omitted was tested. The concentration range for Ca2+ and Mg2+ was adopted from the screening analysis. The minimum concentration of NH4+ was set to zero and the maximum concentration was adopted from the screening experiment. In the following experiments, the component concentrations were distributed over a 3 x 3 x 3 grid inside the defined concentration range, resulting in 27 experiments. During all cultivations, five replicates of reference medium were included, which served as internal standard and to ensure that no positional effects over the MTP occurred. For the remaining 16 wells, the concentrations of NH4+, Ca2+, and Mg2+ were randomly distributed inside the given ranges.
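The well budget of iteration 1 (27 grid points, 5 reference replicates, 16 random points, 48 wells in total) can be sketched as follows. This is an illustrative Python example only: the concentration ranges for Ca2+ and Mg2+ are hypothetical placeholders in units of x Ref, the grid is assumed to use the minimum, midpoint, and maximum of each range, and the random seed is arbitrary.

```python
import random
from itertools import product


def iteration_layout(ranges, n_wells=48, n_reference=5, seed=0):
    """Split one plate into grid, reference, and random experiments.

    ranges maps component name to (min, max) in units of x Ref.
    """
    rng = random.Random(seed)
    names = list(ranges)
    levels = [(lo, (lo + hi) / 2.0, hi) for lo, hi in ranges.values()]
    grid = [dict(zip(names, combo)) for combo in product(*levels)]
    # The reference medium is 1 x Ref for every component by definition.
    reference = [{name: 1.0 for name in names} for _ in range(n_reference)]
    n_random = n_wells - len(grid) - n_reference
    if n_random < 0:
        raise ValueError("grid and reference wells exceed plate capacity")
    randoms = [{name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
               for _ in range(n_random)]
    return grid, reference, randoms


# Hypothetical ranges; only the NH4+ minimum of zero is taken from the text.
example_ranges = {"NH4+": (0.0, 2.0), "Ca2+": (0.5, 2.0), "Mg2+": (0.5, 2.0)}
grid, reference, randoms = iteration_layout(example_ranges)
```

Filling the leftover wells with random points inside the ranges adds information between the grid nodes without requiring additional plates.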
Figure 5A visualizes the results of the first iteration. Axis labels refer to the component concentrations used in the original reference medium, indicated by x Ref. The blue surfaces represent the Kriging interpolations that were calculated using the KriKit software. Each surface is associated with a relative concentration level for NH4+ (dark blue: 0 x Ref, checkered: 1 x Ref, light blue: 2 x Ref). This visual representation reveals that it is favorable to omit NH4+. Interpolation surfaces also show the positive effects of both Mg2+ and Ca2+, as all planes rise with increasing concentrations.
Based on the results of iteration 1, it was decided to expand the concentration range of Ca2+ and Mg2+ by doubling the maximum concentrations and shifting the experimental design window to the upper right corner, see Figure 5B. Inside this range, the concentrations were distributed on a 6 x 6 grid. This ensures an even distribution over the full concentration range, leading to optimal Kriging interpolation results. Figure 5B shows the Kriging interpolation plot based on the combined data measured in both iterations (red dots and yellow squares). For both Ca2+ and Mg2+, the positive effect of increasing their concentrations continues. Consequently, the procedure was repeated by doubling the maximum concentration and thus, the experimental design window was moved to explore the boundaries of the upper right corner.
Figure 6A gives an overview of the remaining optimization procedure. The analysis of the collected data set up to iteration 3 revealed a limitation of the positive effect of Mg2+, i.e., an optimal concentration range of Mg2+ was identified. It was therefore decided to further expand the concentration range only for Ca2+ (iteration 4). This procedure was repeated twice (iterations 5 and 6) until a saturation of the GFP signal was found. This saturation is explained by the precipitation of Ca salts at the applied Ca2+ concentrations; the precipitated salts are not available to the cells.
As experimental results are always perturbed by noise, the resulting Kriging interpolation appears irregular and visual inspection might lead to false conclusions. However, the optimal concentration range of media components for the saturated GFP signal can be reliably identified with the statistical z-test, which is also implemented in KriKit. The z-test directly uses the intrinsic statistical information provided by the Kriging method, i.e., prediction values and prediction uncertainties. Figure 6B shows the identified plateau, as determined and visualized using the KriKit toolbox. The KriKit toolbox is freely available36 and comes with a detailed tutorial that explains how to use its features.
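The idea of a z-test on Kriging predictions can be sketched in a few lines of Python. This is a simplified illustration and not KriKit's implementation: it assumes a one-sided test against the best predicted point, treats the prediction errors of two points as independent (their covariance is ignored), and uses an assumed significance level of 0.05. The input values are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def plateau_points(predictions, alpha=0.05):
    """Return labels whose prediction is not significantly below the best.

    predictions is a list of (label, predicted mean, predicted std).
    """
    _, mu_best, sd_best = max(predictions, key=lambda p: p[1])
    plateau = []
    for label, mu, sd in predictions:
        pooled = math.sqrt(sd ** 2 + sd_best ** 2)
        if pooled > 0.0:
            z = (mu_best - mu) / pooled
        else:
            z = 0.0 if mu == mu_best else math.inf
        p_value = 1.0 - normal_cdf(z)  # one-sided: is mu below mu_best?
        if p_value > alpha:
            plateau.append(label)
    return plateau


# Hypothetical predictions: (label, normalized GFP signal, std).
example = [("low Ca", 1.00, 0.05), ("high Ca", 1.95, 0.10),
           ("very high Ca", 2.00, 0.10)]
example_plateau = plateau_points(example)
```

Points that cannot be statistically distinguished from the best prediction form the plateau, so the result depends on the prediction uncertainty and not only on the interpolated surface.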
If more than two relevant components are found, 3D visualization reaches its limit. KriKit provides several other visual representation methods such as movies or "screening plots". If the potential optimum lies inside the defined concentration range, new experiments are automatically designed using the expected improvement40,46. The experimental design based on the expected improvement is integrated in the KriKit toolbox. More detailed information can be found in the software documentation.
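For orientation, the expected improvement criterion for a maximization problem can be written as a short Python sketch. This uses the standard closed-form expression under a normally distributed prediction and is not taken from the KriKit source code; the helper `next_experiment` and the candidate values are hypothetical.

```python
import math


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


def next_experiment(candidates, best):
    """Pick the candidate (label, mean, std) with the largest EI."""
    return max(candidates,
               key=lambda c: expected_improvement(c[1], c[2], best))[0]


# Hypothetical candidates: (label, predicted mean, predicted std).
candidates = [("well explored", 1.9, 0.01), ("uncertain", 1.8, 0.50)]
chosen = next_experiment(candidates, best=2.0)
```

The criterion rewards both a high predicted value and a high prediction uncertainty, so a poorly explored region can be preferred over a well-characterized one with a slightly better mean.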
After the iterative procedure, a verification of the results was conducted, as described in Part D. The validity of initial assumptions was checked by performing an additional sensitivity analysis using the optimal medium composition. That is, all initial media components of interest were varied, but Ca2+ and Mg2+ were set to their optimal concentration levels. In this study, the optimal concentrations c(Ca2+) = 32 x Ref and c(Mg2+) = 6.8 x Ref were chosen. Table 2 shows the results of the validation screening. Similar to the initial sensitivity screening (cf. Table 1), NH4+ still has a significant negative influence and the remaining effects are still negligible.
Due to easy access, the GFP fluorescence signal from the cultivation suspension was used to quantify the extracellular GFP titer during all experiments. For verification reasons, GFP fluorescence was validated against other measurements. Because GFP is secreted via the Tat pathway, the fluorescence signal cannot discriminate between intra- and extracellular GFP. Thus, cultivations were reproduced using the reference medium and the optimized medium. Besides the fluorescence measurement from cell-free cultivation supernatants, protein content was quantified by the Bradford assay and (semi)-qualitative GFP improvement visualized by SDS-PAGE15. All resulting measurement signals were approximately doubled for cultivations with optimized medium compared to reference medium, which confirmed the approximately 100% improvement in secretion performance with the optimized medium. Consequently, GFP specific fluorescence of cultivation suspension can be considered a suitable metric for the optimization objective, i.e., the extracellular GFP titer.
Figure 2: Screenshot from the volume pipetting list for sensitivity analysis. Entries in the first column assign a unique identifier to all volumes of a row; this identifier is the MTP well number of the target cultivation MTP on the liquid handler worktable, cf. Figure 4C. Remaining columns encode volumes for different solutions ("Sln-01" to "Sln-15") to be pipetted. The cumulative volume of one row corresponds to the final cultivation volume of the corresponding well.
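How one row of such a pipetting list could be derived from a designed medium composition is sketched below in Python. All numbers and names are hypothetical: the final volume of 1000 µL, the stock concentration factors, and the fixed volumes for Rest Stock and inoculum are placeholders, and water is assumed to fill each well up to the final volume so that the row sum matches it.

```python
def pipetting_row(well, targets, stock_factors, final_volume=1000.0,
                  fixed_volumes=None):
    """Build one pipetting-list row (volumes in µL) for a target well.

    targets maps component to the desired concentration (x Ref);
    stock_factors maps component to the stock concentration (x Ref).
    """
    row = {"Well": well}
    for name, target in targets.items():
        row[name] = target / stock_factors[name] * final_volume
    row.update(fixed_volumes or {})
    used = sum(volume for key, volume in row.items() if key != "Well")
    water = final_volume - used
    if water < 0.0:
        raise ValueError(f"well {well}: stock volumes exceed final volume")
    row["Water"] = water
    return row


# Hypothetical example: 20-fold concentrated stocks for Ca2+ and Mg2+.
example_row = pipetting_row(
    well=1,
    targets={"Ca2+": 2.0, "Mg2+": 1.0},
    stock_factors={"Ca2+": 20.0, "Mg2+": 20.0},
    fixed_volumes={"Rest Stock": 500.0, "Inoculum": 100.0},
)
```

The negative-water check catches designs whose concentrations cannot be reached with the available stocks, which becomes relevant when concentration ranges are doubled repeatedly.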
Figure 3: Screenshot from the liquid handling control software "WinPREP". Left: Row-ordered commands, including a transfer command for each stock solution to be pipetted. Before the final command for inoculum addition, a user prompt is inserted to ensure the seed culture is placed at the table just in time. Right: Schematic of the worktable, including the source labware for Variation Stocks (two deep well plates with 12 column-like wells), the reagent trough for Rest Stock, water and inoculum, and the media preparation target cultivation MTP.
Figure 4: Compilation of detailed screenshots for setup of pipetting of a stock solution. (A) Unwrapped command for pipetting of Fe stock. Source labware and source well within are marked on the worktable by the red frame and red column of the corresponding deep well plate. Destination labware and destination wells within are marked by the blue frame around and blue wells of the target cultivation MTP. (B) Detailed example view on assignment of pipetting volumes for this step (Fe stock solution). Number of destinations is read from the pipetting list, which has 48 rows. The dispense volumes for all destination wells for Fe stock solution are found in column 4 in the pipetting list. Note that the first column in the pipetting list contains identifiers and not volumes to be transferred, see Figure 2. (C) Details on destination well numbering. Volumes written in the row #1 of the corresponding pipetting list will be pipetted into well marked as #1, and so on. Wells #01, #08, #41 and #48 correspond to wells A01, A08, F01 and F08 for the alpha-numeric coding, which is also printed into the cultivation MTP itself.
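The well numbering described in panel (C) corresponds to a row-wise count over a plate with 6 rows (A to F) and 8 columns. A small Python helper, with an illustrative function name, makes the mapping explicit:

```python
import string


def well_label(number, n_rows=6, n_columns=8):
    """Convert a running well number (1-based, row-wise) to a label."""
    if not 1 <= number <= n_rows * n_columns:
        raise ValueError(f"well number {number} is outside the plate")
    row, column = divmod(number - 1, n_columns)
    return f"{string.ascii_uppercase[row]}{column + 1:02d}"


# The four corner wells named in the figure caption.
corner_labels = [well_label(n) for n in (1, 8, 41, 48)]
```

Such a helper is useful when matching rows of the pipetting list to the alpha-numeric well names used by the cultivation device.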
Figure 5: Detailed results from the first iteration. (A) Kriging interpolation based on experimental data of iteration 1. Red dots indicate the data set. For comparison, all three interpolation surfaces are overlaid in one plot (dark blue: 0 x Ref, checkered: 1 x Ref, light blue: 2 x Ref). An alternative representation of the results can be found elsewhere15. (B) Kriging interpolation based on the experiments performed in iteration 1 (red dots) and iteration 2 (yellow squares). Parts of the data presented in this figure have been previously published15.
Figure 6: Depiction of iteratively collected optimization results. (A) Final Kriging model prediction. (B) Statistical identification of optimal area (red) based on the statistical z-test, which is provided by KriKit. Boxes indicate successive steps of iterative design and execution of experiments. Parts of the data presented in this figure have been previously published15.
Component | Normalized Coefficient Mean Value |
Fe2+ | -0.08 |
Mn2+ | -0.05 |
Zn2+ | -0.21 |
Cu2+ | -0.21 |
NH4+ | -2.04 |
Ni2+ | -0.11 |
Co2+ | -0.10 |
MoO4^2- | 0.03 |
BO3^3- | 0.06 |
Ca2+ | 1.00 |
Mg2+ | 0.45 |
Table 1: Results of the sensitivity analysis. Coefficient values represent the average effect when increasing the respective media component concentration from its center value to its maximum value. For optimal experimental designs, as found in the standard literature and used here, the standard deviation directly represents the experimental variation due to replication of the reference experiment. Coefficient values were normalized by the maximum value (0.0422 for component Ca2+). The normalized and absolute coefficient standard deviations are 0.54 and 0.0226, respectively.
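The relation between normalized coefficients, absolute coefficients, and the coefficient standard deviation in Table 1 can be reproduced with a short Python sketch. The values are copied from Table 1; expressing each effect in units of its standard deviation is an illustrative way to rank components and is not necessarily the significance criterion applied in the study.

```python
# Normalized coefficients from Table 1.
TABLE_1 = {
    "Fe2+": -0.08, "Mn2+": -0.05, "Zn2+": -0.21, "Cu2+": -0.21,
    "NH4+": -2.04, "Ni2+": -0.11, "Co2+": -0.10, "MoO4": 0.03,
    "BO3": 0.06, "Ca2+": 1.00, "Mg2+": 0.45,
}
MAX_COEFFICIENT = 0.0422  # absolute coefficient of Ca2+
NORMALIZED_STD = 0.54     # normalized coefficient standard deviation


def absolute_coefficient(normalized):
    """Convert a normalized coefficient back to its absolute value."""
    return normalized * MAX_COEFFICIENT


def effect_in_std_units(table, normalized_std):
    """Express each effect as a multiple of the coefficient std."""
    return {name: value / normalized_std for name, value in table.items()}


ratios = effect_in_std_units(TABLE_1, NORMALIZED_STD)
ranking = sorted(ratios, key=lambda name: abs(ratios[name]), reverse=True)
```

In these units, NH4+ stands out clearly, followed by Ca2+ and Mg2+, which is consistent with the choice of components carried into the iterative optimization.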
Component | Normalized Coefficient Mean Value |
Fe2+ | -1.00 |
Mn2+ | 1.00 |
Zn2+ | -3.48 |
Cu2+ | -0.52 |
NH4+ | -15.95 |
Ni2+ | 0.69 |
Co2+ | -0.51 |
MoO4^2- | -0.45 |
BO3^3- | -1.11 |
Table 2: Results of the final sensitivity analysis. Coefficient values represent the average effect when increasing the respective media component concentration from its center value to its maximum value. As an optimal experimental design was used, the standard deviation only depends on the experimental variation using the optimized medium composition. Experimental variation increased slightly in comparison to the variation using the reference medium. Coefficient values were normalized by the maximum value (0.0106 for component Mn2+). The normalized and absolute coefficient standard deviations are 3.63 and 0.0385, respectively.
The generic nature of the presented protocol allows various adaptions, e.g., for studying other microbial expression hosts9,47,48,49,50,51, or to optimize other properties of the target protein, like glycosylation pattern or disulphide bonds. The protocol may also need to be adapted to the available lab equipment. The integration of an MBR system allows increasing experimental throughput, which enables great savings in time. However, when replacing fully instrumented and controllable bioreactors by MBR systems, scalability of results must be considered8,37,52,53. The use of DOE methodologies and mathematical modeling helps to maximize the information content of measurement data with respect to the studied objective54 by efficient experimental planning and model-based data interpretation15.
Modifications to the Method
Next to multi-purpose and expandable robotic liquid handling systems like the one used in this study, it should be mentioned that there are several smaller liquid handling systems commercially available which are capable of performing this task and can be placed inside of laminar flow work benches. If no automated pipetting system is available, different media compositions according to the DOE plan can also be realized by manual pipetting using single and/or multi-channel pipettes. Since the manual preparation is more error-prone and will require highly focused work for quite a long time, it is recommended to prepare a lower number of different media compositions.
Depending on the capabilities of the employed MBR system, the corresponding cultivation protocol will vary. For instance, if no online measurement of biomass formation is available, it may be sufficient to measure biomass concentration after completion of the growth experiment. In combination with online monitoring of the pH and dissolved oxygen, which is implemented in several MBR systems, the growth saturation can be determined safely. In principle, the growth experiments can be conducted in MTPs alone placed inside shaking incubators, without the use of an MBR system. In this case, proper cultivation conditions have to be ensured: (1) Oxygen-limited cultivations can be avoided by using MTPs with suitable geometries, in combination with proper shaking frequencies and shaking diameters, e.g., square 96 or 24 deep well plates operated at 1,000 rpm at 3 mm throw or at 250 rpm at 25 mm throw, respectively. Importantly, the lower the achievable maximal oxygen transfer rate, the lower the concentration of the main carbon source should be. As mentioned above, for this study, the use of 10 g/L glucose was suitable to prevent oxygen limitation for the employed cultivation conditions; (2) Sampling of the MTP cultures for biomass and product quantification should be reduced to a minimum. Each time the MTP is removed from the shaking incubator, oxygen transfer will immediately break down, which may result in unfavorable cultivation conditions; (3) In the opinion of the authors, the use of MTP readers as cultivation devices is not recommended as these devices were not developed for this purpose. For example, shaking mechanics were built for occasional mixing of microplates after the reagent addition and thus, often lack robustness for long runs of continuous shaking lasting for days. Moreover, sufficient power input needed for microbial cultivations cannot be realized in these readers.
The integration of optical density readings in short time intervals requires stopping of the shaking motion, resulting in repeated periods of oxygen limitation. Furthermore, evaporation in such systems over long cultivation periods will distort results. For more details on the surprisingly complex topic on using MTPs for microbial cultivations, the reader is referred to the cited literature22,23,24,25,26 and references therein.
Further Considerations
To speed up iterative optimization steps, it is advised to carefully select the analytical method for product quantification. Fast and simple methods should be preferred at the cost of precision and accuracy, as the iterative experimental design strategy tolerates experimental inaccuracy. However, the final results must be verified against sufficiently precise and accurate product quantification methods that might be more complicated. In general, careful evaluation and decision making about the study procedures require effort in the beginning of the study, but pay off in the long run, after routine methods have been established.
It is highly recommended to define a reference experiment that is compared to all experiments during the optimization. That is, the applied medium component concentrations as well as measured output are normalized via dividing by reference values. This way, each applied and measured value can be interpreted as the x-fold of the reference value. To take into account variations between the plates, five reference experiments are performed on each plate. The mean value of the measured outcome is used for the normalization.
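The plate-wise normalization described here amounts to dividing every measured value by the mean of the reference wells on the same plate. A minimal Python sketch follows, with hypothetical well numbers and signal values:

```python
def normalize_plate(signals, reference_wells):
    """Express all signals of one plate as x-fold of the reference mean.

    signals maps well number to the measured value; reference_wells
    lists the wells that contain the reference medium.
    """
    if not reference_wells:
        raise ValueError("at least one reference well is required")
    reference_mean = (sum(signals[well] for well in reference_wells)
                      / len(reference_wells))
    return {well: value / reference_mean for well, value in signals.items()}


# Hypothetical plate excerpt: wells 1 to 5 hold the reference medium.
example_signals = {1: 9.0, 2: 10.0, 3: 11.0, 4: 10.5, 5: 9.5, 6: 20.0}
normalized = normalize_plate(example_signals, reference_wells=[1, 2, 3, 4, 5])
```

Because the reference mean is computed per plate, values from different plates and different days become comparable on the same x-fold scale.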
It can generally not be guaranteed that the developed medium is also optimal for other strains. However, the improved medium will most likely also be appropriate for cultivating expression strains with small genetic differences, e.g., when producing enzyme variants with single amino acid substitutions obtained from mutagenesis studies (although even single point mutations have been described to affect cellular metabolism and heterologous expression performance55,56). In this case, the presented protocol can be a first step, followed by protocols for high-throughput expression screenings57. If the protocol is used for medium development with subsequent scale-up to fed-batch cultivations, the optimized medium should be verified for the corresponding bioprocess conditions, as clone screening campaigns at the microscale identified different top performers for different feeding strategies and cultivation media52,58. Furthermore, the introduced KriKit36 can generally contribute to improved holistic bioprocess optimization. Only recently, the tool's abilities were extended to also support multi-objective optimization40, which can be important for optimizing both upstream and downstream processes59,60.
The authors have nothing to disclose.
The scientific activities of the Bioeconomy Science Center were financially supported by the Ministry of Innovation, Science, and Research within the framework of the NRW-Strategieprojekt BioSC (No. 313/323-400-002 13). The authors thank the Ministry of Innovation, Science, and Research of North Rhine-Westphalia and the Heinrich Heine University Düsseldorf for a scholarship to Lars Freier within the CLIB-Graduate Cluster Industrial Biotechnology. Further funding was received from the Enabling Spaces Program "Helmholtz Innovation Labs" of German Helmholtz Association to support the "Microbial Bioprocess Lab – A Helmholtz Innovation Lab".
BioLector | m2p-labs | G-BL-100 | |
Flowerplate | m2p-labs | MTP-48-BOH | For cultivation in the BioLector device |
Sealing foil | m2p-labs | F-GP-10 | Sterile sealing for Flowerplate |
MATLAB | Mathworks | 2016b | |
KriKit | Forschungszentrum Jülich | n/a | Freely available, MATLAB installation required |
Janus pipetting robot | Perkin Elmer | n/a | Includes "WinPREP" software installation |
12-column deep well microplate | E&K Scientific | EK-2034 | Container for medium stock solutions |
96 well microplates, transparent, F-bottom | Greiner | 655101 | For Bradford protein assay |
µclear 96 well microplates, black body, transparent F-bottom | Greiner | 655087 | For fluorescence measurement in cell-free supernatants |
Pipette Research plus multi-channel pipettes | Eppendorf | n/a | Facilitates manual liquid handling with microplates |
TruPAG Precast Gels | Sigma | PCG2002 | For SDS-PAGE analysis of cell-free supernatants |
Bradford Reagent | Sigma | B6916 | |
C. glutamicum pCGPhoDBs-GFP | n/a | n/a | Carries pEKEx2 plasmid with fusion of GFP gene and PhoD signal peptide from B. subtilis as expression insert. Plasmid provides kanamycin resistance. Described and published by Meissner et al. Appl Microbiol Biotechnol 76 (3), 633–42 (2007) |