Overfitting vs Underfitting
In machine learning, the loss (or cost) function measures how well the model's predictions match the true target values for the input data. The loss function guides the optimization of the model, which aims to minimize the loss. The record of loss values over the course of training, known as the loss history, is used to evaluate how the model is performing while it trains.
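As a concrete illustration of a loss function, here is a minimal sketch of mean squared error, one common choice for regression. The article does not name a specific loss, so the function `mse_loss` and the sample numbers are assumptions made for this example only.

```python
def mse_loss(y_true, y_pred):
    """Mean squared error between target values and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical target values and two sets of predictions.
targets = [3.0, 5.0, 7.0]
good_predictions = [3.0, 5.0, 8.0]   # close to the targets
poor_predictions = [0.0, 0.0, 0.0]   # far from the targets

good_loss = mse_loss(targets, good_predictions)
poor_loss = mse_loss(targets, poor_predictions)

# Predictions closer to the targets give a lower loss.
print(f"good: {good_loss:.3f}, poor: {poor_loss:.3f}")
```

The optimizer's job is to adjust the model so that this number goes down.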
The loss history is a record of the values of the loss function at each iteration (or epoch) of the training process. It is typically plotted as a line graph, with the x-axis representing the number of iterations (or epochs) and the y-axis representing the value of the loss function.
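The sketch below shows one way a loss history might be recorded, assuming a toy one-parameter model `y = w * x` trained with plain gradient descent. The function `train`, the data, and the hyperparameters are all hypothetical and chosen only to keep the example small.

```python
def train(xs, ys, epochs=20, learning_rate=0.05):
    """Fit y = w * x by gradient descent and record the loss at every epoch."""
    w = 0.0
    loss_history = []
    n = len(xs)
    for _ in range(epochs):
        preds = [w * x for x in xs]
        # Mean squared error for the current weight.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        loss_history.append(loss)
        # Gradient of the mean squared error with respect to w.
        grad = 2 * sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        w -= learning_rate * grad
    return w, loss_history


# Hypothetical data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w, loss_history = train(xs, ys)

# One loss value per epoch: epoch index on the x-axis, loss on the y-axis.
for epoch, loss in enumerate(loss_history[:5]):
    print(epoch, loss)
```

The resulting `loss_history` list can be passed to any plotting library to draw the line graph described above.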
The loss history can be used to evaluate the performance of the model during training. A decreasing trend in the loss history indicates that the model is learning from the data and improving over time. If the loss plateaus at a high value, the model may not be learning from the data. If the training loss keeps decreasing while the loss on held-out validation data starts to increase, the model is likely overfitting. Overfitting occurs when an ML model becomes too specialized in the training data and doesn't generalize well to new data. You can liken overfitting to a student who has memorized all the answers to a test but doesn't understand the concepts and can't apply the knowledge to new situations.
On the other hand, underfitting occurs when an ML model is too simple and can't capture the complexity of the relationship between the features and the target variable. Similarly, you can think of underfitting as a student who hasn't studied enough and can't answer even the easiest questions. In ML, underfitting leads to poor performance on the training data and poor predictions on unseen data.
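The two failure modes described above can be told apart by comparing the training and validation loss histories. The following is a rough heuristic sketch, not a standard test: the function `diagnose_fit` and its thresholds `high_loss` and `gap_ratio` are hypothetical, and suitable values depend on the loss function and the dataset.

```python
def diagnose_fit(train_losses, val_losses, high_loss=1.0, gap_ratio=1.5):
    """Label a pair of loss histories with a rough heuristic.

    high_loss and gap_ratio are illustrative thresholds, not standard values.
    """
    final_train = train_losses[-1]
    final_val = val_losses[-1]
    best_val = min(val_losses)

    # Underfitting: the loss is still high even on the training data.
    if final_train > high_loss:
        return "underfitting"
    # Overfitting: validation loss has risen above its best value
    # and ends well above the training loss.
    if final_val > best_val and final_val > gap_ratio * final_train:
        return "overfitting"
    return "reasonable fit"


# Hypothetical loss histories, one value per epoch.
overfit = diagnose_fit([2.0, 1.0, 0.5, 0.2, 0.1], [2.1, 1.2, 0.9, 1.1, 1.4])
underfit = diagnose_fit([2.0, 1.9, 1.85, 1.8, 1.8], [2.1, 2.0, 1.95, 1.9, 1.9])
balanced = diagnose_fit([2.0, 1.0, 0.5, 0.3, 0.25], [2.1, 1.1, 0.6, 0.4, 0.35])

print(overfit, underfit, balanced)
```

In practice it is usually better to look at both curves on the same graph than to rely on fixed thresholds like these.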
In addition, the loss history can be used to monitor the convergence of the optimization process. The optimization process is considered to have converged when the loss settles near a minimum and no longer changes significantly from one iteration to the next. On the graph, this appears as a flattening of the loss history curve.
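A flattening curve can also be checked programmatically. The sketch below assumes a simple rule: training has converged when the loss has varied by less than a small tolerance over the last few epochs. The function `has_converged` and its `window` and `tolerance` defaults are hypothetical choices for illustration.

```python
def has_converged(loss_history, window=5, tolerance=1e-3):
    """Return True if the loss varied by less than `tolerance`
    over the last `window` steps."""
    if len(loss_history) < window + 1:
        return False  # not enough history to judge
    recent = loss_history[-(window + 1):]
    return max(recent) - min(recent) < tolerance


# Hypothetical loss histories.
still_improving = [1.0, 0.8, 0.6, 0.45, 0.35, 0.28]
flattened = [1.0, 0.5, 0.2, 0.1001, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

improving_result = has_converged(still_improving)
flattened_result = has_converged(flattened)

print(improving_result, flattened_result)
```

A flat loss only shows that the optimizer has stopped making progress. It does not by itself show that the model is a good one, so the value at which the loss flattens still matters.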
For instance, the loss history graph below suggests that the model has fitted the training data so closely that it performs poorly on the validation data, which is a sign of overfitting:
Updated on: 30/01/2023