3.1.4 Validation and evaluation
Model evaluation is first carried out on the training dataset to estimate
how well the model will predict previously unseen data [87]. For
instance, k-fold cross-validation is a standard method for this purpose:
the training dataset is randomly divided into k equal parts, called
folds, and the model is trained and evaluated k times. In each round,
one fold serves as the test set and the remaining k-1 folds serve as
the training set [88]. A minimal sketch of this procedure is given
below.
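The following sketch illustrates k-fold cross-validation with
scikit-learn; the synthetic dataset, the logistic regression estimator,
and the choice of k = 5 are illustrative assumptions rather than details
taken from the cited studies.

```python
# Minimal k-fold cross-validation sketch (scikit-learn); the dataset and
# estimator are placeholders, not those used in the cited work.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Toy training data standing in for the real training set.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

model = LogisticRegression(max_iter=1000)

# Split the training data into k = 5 folds; each fold serves once as the
# test fold while the remaining k-1 folds are used for fitting.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=kfold, scoring="accuracy")

# One accuracy value per test fold; their spread indicates how sensitive
# the model is to the particular split of the training data.
print("fold accuracies:", scores)
print("mean +/- std: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```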
The accuracy of the model is computed for each test fold; the spread of
these scores shows how sensitive the model is to the particular
composition of the training data. Based on these accuracy estimates, the
model's hyperparameters can then be tuned to improve its generalization
performance on unseen data [89, 90], as sketched below.
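One common way to couple tuning with cross-validation is a grid search
over candidate hyperparameter values; the sketch below assumes a
scikit-learn workflow, and the parameter grid is a hypothetical example
chosen only to show the mechanics.

```python
# Illustrative hyperparameter tuning with cross-validation (scikit-learn);
# the grid of values is hypothetical and only meant to show the mechanics.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X_train, y_train = make_classification(n_samples=500, n_features=20,
                                        random_state=0)

# Each candidate value of the regularisation strength C is scored with
# 5-fold cross-validation on the training data; the best one is kept.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy: %.3f" % search.best_score_)
```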
The final evaluation of the model is conducted on held-out test data,
and its predictive ability is quantified by evaluation metrics such as
accuracy, sensitivity (recall), specificity, and precision [91, 92].
Figure 4 shows how these evaluation methods help to improve the model's
performance.
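For a binary classification problem, these metrics can all be derived
from the confusion matrix on the test set. The sketch below assumes a
simple 80/20 train-test split and a toy dataset; both are illustrative
assumptions, not details from the text.

```python
# Hedged sketch of the final evaluation on a held-out test set; the data,
# model, and 80/20 split are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

# The binary confusion matrix yields the counts needed for the metrics
# named in the text.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

accuracy    = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)          # also called recall
specificity = tn / (tn + fp)
precision   = tp / (tp + fp)

print(f"accuracy={accuracy:.3f}  sensitivity/recall={sensitivity:.3f}  "
      f"specificity={specificity:.3f}  precision={precision:.3f}")
```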