machine-learning deep-learning computer-vision false-positive

How does overfitting result in false positives in object detection?


I am doing TensorFlow object detection and I find that there are a lot of false positives. One of the main reasons I see suggested for this is overfitting. But my doubt is: how do false positives result from overfitting? Overfitting happens when the model learns an overly complex pattern in the data; in short, it leads to memorisation of the training data.

If it were memorisation, wouldn't it show more false negatives, since it has only memorised the training data and is unable to detect new cases? How can it classify other objects as belonging to a trained class? Isn't that counter-intuitive?


Solution

  • One reason I could think of would be outliers in your training data:

    Say you have some strong outliers in your training data in class A which, as a consequence, lie more in the domain of the other class B along some dimension. Overfitting will then shift the class boundary towards those outliers. This can effectively produce a lot of false positives, as the shifted boundary of class A now partially covers an area that should belong to class B.

    For an extreme example, an overfitted boundary might look like this:

    [Figure: an overfitted decision boundary that bulges into class B's region to capture a single class-A outlier, enclosing two nearby class-B points]

    Here, due to overfitting, we keep the outlier in the positive class, at the cost of also taking in two false positives. A well-generalized boundary between the two classes would discard the outlier as a false negative, but would still have a higher accuracy because it avoids those two false positives.

    The same could go for false negatives caused by outliers in the other class, by the way; that's why overfitting is generally considered bad. The toy sketch below illustrates the false-positive effect.
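
Below is a minimal, self-contained sketch of this effect (my own illustration, not from the original answer). It uses scikit-learn on made-up synthetic 2-D data: a 1-nearest-neighbour classifier, which memorises the training set, carves out a positive region around a single class-A outlier planted inside class B, and so flags fresh class-B test points as false positives; a smooth logistic-regression boundary instead gives the outlier up as a false negative.

```python
# Toy illustration (assumption: numpy + scikit-learn; data and names are made up).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Two 2-D Gaussian classes: A (positive) centred at (0, 0), B at (4, 0).
X_a = rng.normal(loc=(0.0, 0.0), scale=1.0, size=(100, 2))
X_b = rng.normal(loc=(4.0, 0.0), scale=1.0, size=(100, 2))

# A single strong class-A outlier planted deep inside class B's domain.
outlier = np.array([[4.0, 0.0]])

X_train = np.vstack([X_a, outlier, X_b])
y_train = np.array([1] * 101 + [0] * 100)  # 1 = class A (positive)

# Fresh test points drawn only from class B, so every positive
# prediction here is by definition a false positive.
X_test = rng.normal(loc=(4.0, 0.0), scale=1.0, size=(200, 2))

models = [
    ("1-NN (memorises the outlier)", KNeighborsClassifier(n_neighbors=1)),
    ("logistic regression (smooth boundary)", LogisticRegression()),
]
for name, model in models:
    model.fit(X_train, y_train)
    false_positives = int(model.predict(X_test).sum())
    print(f"{name}: {false_positives} false positives / {len(X_test)}")
```

Running this, the 1-NN model typically reports a few false positives clustered around the memorised outlier, while logistic regression reports none, mirroring the boundary shift described above.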