3.3 Applications of ML algorithms for fermentation analysis and
optimization
ANNs have been used successfully in several studies in the field of
fermentation prediction and optimization (Table 3). Nelofer et al.
compared the predictive capacity of ANN and RSM for lipase
production by a recombinant Escherichia coli [99]. In this study, fermentation parameters
were optimized based on experimental lipase production data. ANN
outperformed RSM in terms of both R² and
adjusted R² values. Moreover, the absolute average deviation (AAD) and root
mean square error (RMSE) of the ANN model were lower, indicating
its higher accuracy. Instead of comparing ANN and RSM, integration
of these two strategies has also attracted much attention in recent
years. For instance, Wang et al. proposed an ANN-RSM
methodology to overcome the failure of pure RSM to predict complex
nonlinear systems. They used original experimental datasets to train and
validate an ANN model and produce response surface models to analyze the
effect of critical parameters in dark hydrogen fermentation. The
constructed model showed good and reliable results for this nonlinear
and noisy process [100]. The genetic algorithm (GA) is a global search
optimization method inspired by natural selection. GA is often
coupled with ANN to find the optimum values of the fermentation
parameters used in model training. Recently, Unni et al. employed ANN
together with GA to optimize medium composition for the production of
human interferon-gamma (hIFN-γ) using a recombinant Kluyveromyces
lactis [101]. In another study, an on-line μ-stat strategy was proposed for
controlling methanol feeding in a fed-batch process of recombinant Pichia pastoris. In this study, Tavasoli et al. employed an MLP3
neural network (a class of ANNs) to reconstruct and adjust the
controller’s performance. Consequently, a significant enhancement was
observed in the production of human recombinant alpha 1-antitrypsin
(A1AT) [102].
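The goodness-of-fit measures used in ANN versus RSM comparisons of this kind can be computed as in the following minimal Python sketch. The function name and the example numbers are illustrative assumptions and are not taken from [99]. AAD is written in a commonly used percentage form, the mean of |y - y_hat| / |y| multiplied by 100, which is also an assumption about the definition used there.

```python
import math

def regression_metrics(y_true, y_pred, n_predictors):
    """Return R^2, adjusted R^2, RMSE and AAD (%) for one fitted model.

    n_predictors is the number of input variables of the model; it is
    only needed for the adjusted R^2 correction.
    """
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = math.sqrt(ss_res / n)
    # Assumed AAD definition: mean relative absolute deviation, in percent.
    aad = 100.0 / n * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))
    return {"r2": r2, "adj_r2": adj_r2, "rmse": rmse, "aad": aad}

# Hypothetical measured and predicted responses (not data from any cited study).
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(measured, predicted, n_predictors=1)
print(metrics)
```

A model is judged better when R² and adjusted R² are closer to 1 and when RMSE and AAD are lower, which is the pattern reported for ANN relative to RSM above.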
SVMs are another popular method for learning from experimental fermentation
data and predicting the process outcome. One specific advantage of
SVMs over ANNs is that SVM training is a convex optimization problem, so for a given kernel and set of hyperparameters it reaches the global optimum,
while ANN training may become trapped in a local optimum. Moreover, SVMs are
effective for problems with a small number of samples. The predictive
capabilities of SVM and ANN were compared recently by Zhang et al. In
this investigation, SVM and ANN were used to build models for predicting
biomass yield, lipid production, and COD removal rate in a microbial
lipid fermentation. The results demonstrated that the SVM coupled with
the genetic algorithm outperformed ANN with a small number of
samples [103]. In another study, the least-squares SVM (LS-SVM), a
modified SVM, was coupled with orthogonal experimental design (OED) to
map the relationship between process parameters as inputs and cumulative
biogas production (CBP) as the output for the anaerobic
fermentation of corn stalks with only nine samples. In this study, the LS-SVM
parameters were optimized by the grid search method. The results showed
that using this optimization method as an alternative to pure OED
increased CBP by 14.13% [104].
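As a rough illustration of tuning an LS-SVM by grid search on a very small dataset, the sketch below fits an LS-SVM regressor by solving its linear system and selects the regularization parameter and kernel width by leave-one-out cross-validation. The RBF kernel, the leave-one-out criterion, the candidate grids, the nine synthetic data points and all function names are assumptions made for this example; they are not the settings or data of [104].

```python
import math

def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two input vectors."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term on the kernel diagonal
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(x, X, b, alpha, sigma):
    """Evaluate f(x) = b + sum_i alpha_i * k(x, x_i)."""
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X))

def loo_mse(X, y, gamma, sigma):
    """Leave-one-out mean squared error for one (gamma, sigma) pair."""
    err = 0.0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        b, alpha = lssvm_fit(X_tr, y_tr, gamma, sigma)
        err += (lssvm_predict(X[i], X_tr, b, alpha, sigma) - y[i]) ** 2
    return err / len(X)

def grid_search(X, y, gammas, sigmas):
    """Return (best LOO error, gamma, sigma) over the candidate grid."""
    return min((loo_mse(X, y, g, s), g, s) for g in gammas for s in sigmas)

# Synthetic nine-run design: two scaled factors at three levels each.
X = [(a, b) for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0)]
y = [10.0 + 4.0 * a - 3.0 * (b - 0.5) ** 2 for a, b in X]
best_mse, best_gamma, best_sigma = grid_search(
    X, y, gammas=[1.0, 10.0, 100.0], sigmas=[0.3, 1.0, 3.0])
print(best_mse, best_gamma, best_sigma)
```

With so few samples, leave-one-out validation uses every run for both fitting and testing, which is why it is chosen here. Once the hyperparameters are fixed, the fitted model can be evaluated over the factor space to locate promising operating conditions.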
Other ML methods have also shown reliable results in fermentation
prediction and optimization. Kennedy et al. investigated the
capabilities of fuzzy logic as a tool for media formulation. They found
that this method can save 63% of the experiments while the remaining
experiments are still adequate for media design. They also found that selecting
the correct number of fuzzy logic rules is critical for enhancing model
accuracy [105]. In another study, Melcher et al. utilized random
forest and ANN for biomass and recombinant protein modeling in a
fed-batch Escherichia coli process. Online fermentation
parameters and two-dimensional (2D) fluorescence spectroscopy were used
for dry cell mass and productivity prediction. The hybrid model accuracy
reached about ±4% for dry cell mass and ±12% for protein concentration
[106]. Masampally et al. employed Gaussian process regression (GPR)
in fed-batch fermentation of the yeast Saccharomyces cerevisiae to
predict biomass concentration. In this study, three cascade sub-models
were developed to predict gas hold-up, dissolved oxygen (DO), and
biomass storage, respectively. Validation experiments were eventually
performed [107]. Recently, the k-nearest-neighbor (KNN)
method was used to obtain a 1.64-fold improvement in a Penicillium brevicompactum fermentation producing mycophenolic acid (MPA)
[108].
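For readers unfamiliar with KNN, the following minimal sketch shows generic KNN regression, in which the response for a new condition is predicted as the average response of the k most similar past runs. The text above does not describe how KNN was applied in [108], so this is not that study's procedure; the function name, the two input factors and all numbers are hypothetical.

```python
import math

def knn_predict(query, X, y, k=3):
    """Predict a response as the mean target of the k nearest training runs."""
    if not 1 <= k <= len(X):
        raise ValueError("k must be between 1 and the number of training runs")
    # Pair each run's Euclidean distance to the query with its response.
    by_distance = sorted((math.dist(query, xi), yi) for xi, yi in zip(X, y))
    nearest = by_distance[:k]
    return sum(yi for _, yi in nearest) / k

# Hypothetical past runs: two process factors, already scaled to comparable ranges.
runs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
titers = [1.0, 2.0, 3.0, 10.0]
estimate = knn_predict((0.1, 0.1), runs, titers, k=3)
print(estimate)
```

Because KNN relies on distances, input factors with different units are normally scaled before use; the example assumes this has already been done.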