pythondeep-learningconv-neural-networkthresholdimage-thresholding

Determine the best classification threshold value for deep learning model


How to Determine the best threshold value for deep learning model. I am working on predicting seizure epilepsy using CNN. I want to determine the best threshold for my deep learning model in order to get best results.

I am trying for more than 2 weeks to find how I can do it.

Any help would be appreciated.

code

history=model.fit_generator(generate_arrays_for_training(indexPat, filesPath, end=75), #end=75),
                                validation_data=generate_arrays_for_training(indexPat, filesPath, start=75),#start=75),
                                steps_per_epoch=int((len(filesPath)-int(len(filesPath)/100*25))),#*25), 
                                validation_steps=int((len(filesPath)-int(len(filesPath)/100*75))),#*75),
                                verbose=2,
                                epochs=50, max_queue_size=2, shuffle=True, callbacks=[callback,call])

Solution

  • In general, choosing right classification threshold depends on the use case. You should remember that choosing threshold is not a part of hyperparameters tuning. The value of classification threshold greatly impacts the behaviour of model after you train it.

    If you increase it, you want your model to be very sure about prediction which means you will be filtering out false positives - you will be targeting precision. This might be the case when your model is a part of a mission-critical pipeline where decision made based on positive output of model is costly (in terms of money, time, human resources, computational resources etc...)

    If you decrease it, your model will say that more examples are positives which will allow you to explore more examples that are potentially positive (you target recall. This is important when a false negative is disastrous e.g in medical cases (You would rather check whether low-probability patient has cancer rather than ignoring him and find out later that he was indeed sick)

    For more examples please see When is precision more important over recall?

    Now, choosing between recall and precision is a trade-off and you have to choose it based on you situation. Two tools to help you achieve this are ROC and Recall-Precision Curves How to Use ROC Curves and Precision-Recall Curves for Classification in Python which indicates how model handles false positives and false negatives depending on classification threshold