deep-learningtensorflow2.0k-meansfeature-engineeringself-supervised-learning

How to use K means clustering to visualise learnt features of a CNN model?


Recently I was going through the paper : "Intriguing Properties of Contrastive Losses"(https://arxiv.org/abs/2011.02803). In the paper(section 3.2) the authors try to determine how well the SimCLR framework has allowed the ResNet50 Model to learn good quality/generalised features that exhibit hierarchical properties. To achieve this, they make use of K-means on intermediate features of the ResNet50 model (intermediate means o/p of block 2,3,4..) & quote the reason -> "If the model learns good representations then regions of similar objects should be grouped together".

Final Results : KMeans feature visualisation

I am trying to replicate the same procedure but with a different model (like VggNet, Xception), are there any resources explaining how to perform such visualisations ?


Solution

  • The procedure would be as follow:

    Let us assume that you want to visualize the 8th layer from VGG. This layer's output might have the shape (64, 64, 256) (I just took some random numbers, this does not correspond to actual VGG). This means that you have 4096 256-dimensional vectors (for one specific image). Now you can apply K-Means on these vectors (for example with 5 clusters) and then color your image corresponding to the clustering result. The coloring is easy, since the 64x64 feature map represents a scaled down version of your image, and thus you just color the corresponding image region for each of these vectors.

    I don't know if it might be a good idea to do the K-Means clustering on the combined output of many images, theoretically doing it on many images and one a single one should both give good results (even though for many images you probably would increase the number of clusters to account for the higher variation in your feature vectors).