opencv cascade haar-classifier

On what dataset was haarcascade_frontalface_default.xml trained?


I'm trying to train my own cascade file, so far with little success. Because training one decent file takes about 2 days, I don't want to spend months failing. The OpenCV example files include haarcascade_frontalface_default.xml.

My question is: does anybody know what data it was trained on?

I can see that it has 22 stages, but I would like to see the images used, to better understand how it was made.

I have little hope, since this must be kept quite secret, but I thought I'd ask.


Solution

  • Originally, it was trained on the frontal Caltech dataset in 1999 (see J. Howse, "OpenCV 4 for Secret Agents", second edition, Packt, Chapter 3, pages 101-102). That dataset is not easy to find, but you can use the more recent Caltech 10,000 Web Faces dataset instead.

    Update: the GitHub repository of the book cited above has some details on how to find and process the dataset. The dataset itself can be found here
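
    As a side note on the stage count mentioned in the question, here is a minimal sketch of reading it straight from a cascade XML file with the Python standard library. It assumes the newer cascade layout (an `opencv_storage` root with a `cascade` element holding `featureType`, `width`, `height`, `stageNum`, and `stages`); older files use a different layout and would need different paths. The function name `summarize_cascade` and the tiny `SAMPLE` document are made up for illustration and are not taken from any real cascade.

```python
import xml.etree.ElementTree as ET


def summarize_cascade(xml_text):
    """Return basic facts about a cascade, assuming the newer XML layout."""
    root = ET.fromstring(xml_text)
    cascade = root.find("cascade")
    if cascade is None:
        # Assumption: files without <cascade> use the older layout.
        raise ValueError("no <cascade> element; file may use the older layout")
    stages = cascade.find("stages")
    return {
        "feature_type": cascade.findtext("featureType").strip(),
        "window": (int(cascade.findtext("width")),
                   int(cascade.findtext("height"))),
        # Stage count declared in the header of the file.
        "declared_stages": int(cascade.findtext("stageNum")),
        # Stage entries actually present (each stage is a <_> child).
        "stage_entries": len(stages.findall("_")) if stages is not None else 0,
    }


# Hypothetical, heavily shortened stand-in for a real cascade file.
SAMPLE = """<opencv_storage>
<cascade type_id="opencv-cascade-classifier">
  <stageType>BOOST</stageType>
  <featureType>HAAR</featureType>
  <height>24</height>
  <width>24</width>
  <stageNum>2</stageNum>
  <stages>
    <_><maxWeakCount>3</maxWeakCount></_>
    <_><maxWeakCount>5</maxWeakCount></_>
  </stages>
</cascade>
</opencv_storage>"""

info = summarize_cascade(SAMPLE)
print(info)
```

    To try it on a real file, read the file's text and pass it to `summarize_cascade`; with the `opencv-python` package the bundled cascades are usually located under `cv2.data.haarcascades`. This only tells you about the structure of the classifier, not about the training images.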