chainer, chainercv

What do the arguments of generate_anchor_base() mean?


Github page

I am looking at the generate_anchor_base method, which is a Faster R-CNN utility method in ChainerCV.

What is base_size = 16? The documentation describes it as

The width and the height of the reference window.

But what does "reference window" mean?

It also says that anchor_scales=[8, 16, 32] are the areas of the anchors, but I thought the areas were (128, 256, 512).

Another question:
If the base size is 16 and h = 128 and w = 128, does that mean anchor_base[index, 0] = py - h / 2 is a negative value, since py = 8 and h / 2 = 128 / 2 = 64?


Solution

  • This method is a utility function for Faster R-CNN, so I assume you already understand what an "anchor" is in Faster R-CNN.

    base_size and anchor_scales determine the size of the anchor. For example, when base_size=16 and anchor_scales=[8, 16, 32] (and ratio=1.0), the height and width of the anchors will be 16 * [8, 16, 32] = (128, 256, 512), as you expected. ratio determines the aspect ratio between height and width.

    (I might be wrong in the paragraph below; please correct me if so.)

    I think base_size needs to match the downsampling factor of the hidden layer whose feature map the anchors are placed on. In the chainercv Faster R-CNN implementation, the extractor's feature map is fed into the rpn (region proposal network), and generate_anchor_base is used in the rpn. So you need to check what the extractor outputs. chainercv uses VGG16 as the feature extractor and takes the conv5_3 layer as the extracted feature (see here). By that layer max_pooling_2d has been applied 4 times, so the feature map is 2^4 = 16 times smaller than the input image.

    As for your other question, I think your understanding is correct: py - h / 2 will be a negative value. But the anchor_base values are only relative offsets. anchor_base is prepared once when the model is initialized (here), and the actual anchors (with absolute coordinates) are created on each forward call (here) by the _enumerate_shifted_anchor method.
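
To make the numbers above concrete, here is a minimal pure-Python sketch of the idea. It is not the library code. The names sketch_anchor_base and shift_anchor are made up for illustration, and I am assuming that the ratio scales the height by sqrt(ratio) and the width by sqrt(1 / ratio), that boxes are stored as (y_min, x_min, y_max, x_max), and that the feature stride is 16.

```python
import math


def sketch_anchor_base(base_size=16, ratios=(0.5, 1.0, 2.0),
                       anchor_scales=(8, 16, 32)):
    """Illustrative stand-in for generate_anchor_base (hypothetical name)."""
    # Centre of the base_size x base_size reference window.
    py = base_size / 2.0
    px = base_size / 2.0
    anchors = []
    for ratio in ratios:
        for scale in anchor_scales:
            # Assumption: the ratio stretches the height by sqrt(ratio) and
            # shrinks the width by the same factor, so the area is unchanged.
            h = base_size * scale * math.sqrt(ratio)
            w = base_size * scale * math.sqrt(1.0 / ratio)
            # Stored as (y_min, x_min, y_max, x_max), relative to the window.
            anchors.append((py - h / 2.0, px - w / 2.0,
                            py + h / 2.0, px + w / 2.0))
    return anchors


def shift_anchor(anchor, feat_stride, row, col):
    """Move a relative anchor to feature-map cell (row, col) (hypothetical)."""
    dy = row * feat_stride
    dx = col * feat_stride
    y_min, x_min, y_max, x_max = anchor
    return (y_min + dy, x_min + dx, y_max + dy, x_max + dx)


base = sketch_anchor_base(ratios=(1.0,))

# Side lengths for ratio 1.0 are base_size * scale: 128, 256, 512.
print([a[2] - a[0] for a in base])

# The smallest anchor is relative to the window, so it has negative corners:
# (8 - 64, 8 - 64, 8 + 64, 8 + 64) = (-56, -56, 72, 72).
print(base[0])

# Shifted to cell (10, 20) with stride 16 it becomes (104, 264, 232, 392).
print(shift_anchor(base[0], 16, 10, 20))
```

Anchors near the image border can still end up with negative coordinates after shifting (for example at cell (0, 0)); the sketch does not clip them.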