PyTorch ImageNet dataset


I am unable to download the original ImageNet dataset from the official website. However, I found that PyTorch includes ImageNet as one of its torchvision datasets.

Q1. Is that the original ImageNet dataset?

Q2. How do I get the class names for the dataset, as is done for CIFAR-10?

classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

Solution

  • torchvision.datasets.ImageNet is just a class that lets you work with the ImageNet dataset; it does not ship the images. You have to download the dataset yourself (e.g. from http://image-net.org/download-images) and pass its path as the root argument when constructing the ImageNet object.

    Note that downloading it directly by passing download=True is no longer possible:

    if download is True:
        msg = ("The dataset is no longer publicly accessible. You need to "
               "download the archives externally and place them in the root "
               "directory.")
        raise RuntimeError(msg)
    elif download is False:
        msg = ("The use of the download flag is deprecated, since the dataset "
               "is no longer publicly accessible.")
        warnings.warn(msg, RuntimeWarning)
    

    (source)

    If you only need the class names and their indices without downloading the whole dataset (e.g. to map a pretrained model's predictions to labels), you can download them, e.g. from here or from this GitHub gist.
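Both routes to the class names can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact files the links above point to. The first helper assumes the ImageNet archives have already been placed in root, and reads the names from the dataset's classes attribute. The other helpers assume a JSON index file shaped like {"0": ["n01440764", "tench"], ...}, which is one common layout for such lists; the helper names are made up for this example.

```python
import json
from pathlib import Path


def classes_from_dataset(root, split="val"):
    """Read class names from an ImageNet copy that already sits in `root`."""
    # Imported lazily so the helpers below also work without torchvision.
    from torchvision.datasets import ImageNet

    # Assumption: the archives were downloaded manually into `root`.
    dataset = ImageNet(root=root, split=split)
    # dataset.classes[i] is a tuple of synonyms for class index i;
    # keep the first name of each tuple.
    return [names[0] for names in dataset.classes]


def parse_class_index(raw):
    """Turn {"0": [wnid, label], ...} into {0: label, ...}.

    The input layout is an assumption about the downloaded file.
    """
    return {int(idx): label for idx, (_wnid, label) in raw.items()}


def load_class_index(path):
    """Load and parse a JSON class index file (hypothetical path)."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    return parse_class_index(raw)


def top_k_labels(scores, idx_to_label, k=5):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(idx_to_label[i], scores[i]) for i in ranked[:k]]
```

With a pretrained model, scores would be the model's output for one image converted to a plain list (for example output[0].tolist()), and idx_to_label would come from load_class_index or be built from classes_from_dataset with dict(enumerate(...)).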