I recently started doing some research on machine learning. I have code for assessing the suitability of SNNs and ANNs for classifying motion events in synthetic data, focusing on their speed (inference time) and accuracy. When I plotted the loss and accuracy curves, it looked to me like the model was overfitting, but I checked with ChatGPT and DeepSeek and both said it was not, and that those results were due to the task being too simple. Can I rely on them for this type of analysis? And how can I know each time whether the model has learned properly or not? Here are the figures for the curves.
I tried changing the number of epochs, but the results were very similar.
So, let's talk in simple terms. Overfitting basically means that if you create a model to detect cats, then when you use the model in a real-world scenario, it can only detect a specific cat or cats (the cat images used in your training dataset).
Overfitting usually comes down to one or more of the usual causes: too little or too uniform training data, a model with more capacity than the task needs, or training for too many epochs without any regularization.
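To make that concrete, here is a minimal, hypothetical sketch (plain scikit-learn on synthetic random data, not your SNN/ANN setup): an over-flexible model memorizes a small noisy training set, so training accuracy looks perfect while held-out accuracy stays at chance.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))        # 200 samples, 10 features
y = rng.integers(0, 2, size=200)      # labels are pure noise: nothing real to learn

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

model = DecisionTreeClassifier()      # unconstrained tree: enough capacity to memorize
model.fit(X_train, y_train)

print("train accuracy:", model.score(X_train, y_train))  # ~1.0: memorized the noise
print("test accuracy:", model.score(X_test, y_test))     # ~0.5: chance level on unseen data
```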
"how can I know each time when the model has learned properly or not?"
For now, let's set aside the values and metrics that report model accuracy and so on. The most practical way to check your model's performance, and to catch overfitting, is to actually test your model rather than relying only on the metric values (though they do provide accurate information).
Create a test dataset and make sure it is distinct from the training and validation datasets. In simple terms, if you are training a model on cats (x, y, z) in different backgrounds and positions, make sure the test dataset contains different cats (A, B, C) in backgrounds and positions not seen during training.
Then, after creating that dataset, run inference on it, that is, manually check whether the model can detect this unseen data rather than only the data it was trained on. A rough sketch of that step follows.
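As an illustration, here is a hedged sketch of that test step, assuming a PyTorch image classifier; the `test_cats/` folder and the `resnet18` placeholder are assumptions for the example, not your actual code:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import resnet18

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

# test_cats/ holds cats A, B, C: individuals and backgrounds that never
# appear in the training or validation folders (hypothetical path).
test_set = datasets.ImageFolder("test_cats", transform=tfm)
loader = DataLoader(test_set, batch_size=32, shuffle=False)

model = resnet18(num_classes=2)       # placeholder: load your trained model instead
model.eval()

correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()

print(f"accuracy on unseen cats: {correct / total:.2%}")
```

If this number is far below your training/validation accuracy, the model has memorized its training data rather than learned the general concept.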
As for the responses you got from the chatbots: yes, they could be right. Since you haven't mentioned the volume, variety, and veracity of your dataset, I can only assume that the dataset is the problem.
"task being too simple"
Let's say you need to train a model to perform mathematical operations (probability, word problems, permutations), but you only train it on simple addition and multiplication with small numbers; then yes, your model will overfit to addition and multiplication alone.
The example isn't realistic, but it may help you understand what I'm trying to say; the sketch below shows the same idea on synthetic data.
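Here is a minimal, hypothetical sketch of what a "task too simple" signature looks like (synthetic, trivially separable data, nothing to do with your actual dataset): train and validation accuracy both sit near 100%, which is easy to misread as overfitting.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X0 = rng.normal(loc=-5.0, size=(500, 2))   # class 0, far away from class 1
X1 = rng.normal(loc=+5.0, size=(500, 2))   # class 1: trivially separable
X = np.vstack([X0, X1])
y = np.array([0] * 500 + [1] * 500)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # ~1.0
print("val accuracy:", model.score(X_val, y_val))        # also ~1.0: easy task, not memorization
```

The tell is the validation score: with a too-easy task it tracks the training score, while with overfitting it falls behind.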
"can I rely on them for this type of analysis?"
Partially yes and partially no. Fundamentally, they are just models that process human language and produce responses, whether by searching their database or the web, or more likely through the patterns they learned during training.