In this graph, we can see that the loss starts out very high and decreases every epoch. Loss measures the 'inaccuracy' of the predictions, so minimizing loss increases the accuracy of the model. Because the model is trained directly on the training data, the training loss decreases more rapidly and smoothly, but this also means that the model will eventually become overfit. Overfitting begins at the point where the validation loss stops decreasing and even starts to increase slightly. The reason is that the model starts fitting patterns in the training data that do not actually exist in general, so it ends up performing better and better on the training data but worse on the validation data.
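A common way to act on this observation is early stopping: watch the validation loss during training and stop once it stops improving. The original text does not name a framework, so the following is a minimal sketch assuming a Keras-style workflow with synthetic data; the model, dataset, and hyperparameters here are purely illustrative.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression data (illustrative), split into training and validation sets
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 10))
y = x @ rng.normal(size=(10, 1)) + 0.1 * rng.normal(size=(1000, 1))
x_train, y_train = x[:800], y[:800]
x_val, y_val = x[800:], y[800:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop training once the validation loss stops decreasing, and keep the
# weights from the best epoch rather than the overfit final epoch.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",      # watch validation loss, not training loss
    patience=3,              # tolerate a few epochs of no improvement
    restore_best_weights=True,
)

history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=50,
    callbacks=[early_stop],
    verbose=0,
)

# history.history["loss"] and history.history["val_loss"] hold the
# per-epoch curves like the ones plotted in the graph above.
```

The key design choice is monitoring `val_loss` rather than `loss`: training loss keeps falling even after the model starts overfitting, so it cannot tell us when to stop.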