My data has 1751 sentences, but when training, a number appears under the epoch progress bars. Sometimes it is 1751, which makes sense since that is the number of sentences I have, but most of the time it is roughly 50% of that number (as shown in the figure below).
I tried to look in the documentation to understand whether this number should match my training set size, but I couldn't find an answer.
I am using Kaggle with a GPU backend. Does this mean the model is not actually training on all the data?
In short: no, it is training on all data.
First, let's look at some of the parameters:
num_of_train_epochs: 4
: your setting, meaning the whole dataset is passed through 4 times, which is why you see 4 progress bars in the output.
train_batch_size: 8
: the default setting, meaning that for each update of the weights, 8 records from your training data are used (out of a total of 1751).
So you have 1751 / 8 = 218.875 batches per epoch; the last, partial batch (7 records) still counts as one update, so the count rounds up to 219, which is the 219/219 you see in the output.
The 876 you see at the bottom simply means the trainer went through a total of 219 (batches per epoch) × 4 (number of epochs) = 876 batches/updates.
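To make the arithmetic concrete, here is a quick check in plain Python (just the math, no assumptions about your training library):

```python
import math

steps_per_epoch = math.ceil(1751 / 8)  # 219: the partial last batch still counts as one step
total_steps = steps_per_epoch * 4      # 876: the number shown at the bottom
print(steps_per_epoch, total_steps)    # -> 219 876
```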
One way to verify this is to change num_of_train_epochs to 1; you should then see 219 instead of 876.
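If you happen to be using a Hugging Face Trainer-style setup (an assumption on my part; your argument names num_of_train_epochs / train_batch_size suggest a similar API), the change would look roughly like this, with num_train_epochs and per_device_train_batch_size as the analogous parameters:

```python
from transformers import TrainingArguments

# Sketch only; adapt the argument names to the library you are actually using.
args = TrainingArguments(
    output_dir="out",               # placeholder output directory
    num_train_epochs=1,             # one full pass: the bar should stop at 219
    per_device_train_batch_size=8,  # same batch size as before
)
```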
Definitions of batch and epoch:
The batch size is the number of samples processed before the model is updated.
The number of epochs is the number of complete passes through the training dataset.
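To illustrate the two definitions, here is a minimal sketch of the epoch/batch loop, using a plain list as a stand-in for the 1751 sentences:

```python
data = list(range(1751))  # stand-in for the 1751 training sentences
batch_size = 8
num_epochs = 4

updates = 0
for epoch in range(num_epochs):                 # one epoch = one complete pass
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]  # last batch holds only 7 records
        updates += 1                            # one weight update per batch

print(updates)  # -> 876, matching 219 * 4
```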