This is a technical question on preparing a dataset.
I'm trying to follow this official example
https://github.com/pytorch/examples/tree/master/imagenet
but I cannot even get started because I don't understand the requirements. It says
pip install -r requirements.txt
For the first requirement, I'm working on Colab, so I don't think I need to install PyTorch again on my local pc.
The second one doesn't work, as there's obviously no module named "requirements.txt". This is where I'm beginning to realize there's something about this git repo that I completely don't understand how to use. Anyway, I could just open the text file from the git repo directly, and it just says to use torch and torchvision. Okay, I have no problem importing them.
The third requirement. So I went to the ImageNet website and signed the agreement for research use. Now the requirement tells me to download THE ImageNet data, but I see a bunch of options there (by publication year, by purpose such as a competition, by resolution, etc.). Which one is THE DATASET?
I'm new to PyTorch, and I think I'm missing some protocol about how the PyTorch dev community provides examples via this way...
Any help will be appreciated. Thank you.
there's obviously no module named "requirements.txt"
It's the requirements.txt file in that repo. You can list package names in a file like this and install them all at once with pip; that's what pip install -r requirements.txt does. Note that this is a shell command, not Python, so in a Colab cell it needs a leading "!". Of course, since the file only contains torch and torchvision, you don't need to run it, as both are already installed on Google Colab.
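If you want to see what the -r flag is doing conceptually, here is a minimal sketch in plain Python. It assumes the file contains just torch and torchvision, as described in the question, and the helper names parse_requirements and missing_packages are made up for illustration. It only handles bare package names (no version pins), and it relies on the package name matching the import name, which happens to be true for these two.

```python
import importlib.util

def parse_requirements(text):
    """Return the package names in requirements-style text, skipping blanks and comments."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        if line:
            names.append(line)
    return names

def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Assumed file contents, based on the question's description of the repo file.
requirements_text = "torch\ntorchvision\n"

names = parse_requirements(requirements_text)
print("listed:", names)
print("not installed here:", missing_packages(names))
```

If the second list comes back empty in your Colab session, there is nothing left for pip to install.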
Which one is THE DATASET?
I can't access that page without signing up, but you can download any of the datasets (from any year, etc.). What matters is that, to train on it in PyTorch with the ImageFolder API (the one used in the repo you mentioned), the directory structure should look like this:
train/
    dog/
        xxx.png
        xxy.png
    cat/
        xxz.png
val/
    ...
You can use the script mentioned in that repo for the ImageNet data to arrange the files this way.
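To make the one-folder-per-class layout concrete, here is a rough sketch in plain Python that builds a toy tree and derives labels from the folder names: sorted subfolder names, each mapped to an integer index, which is how ImageFolder assigns its class indices. The helpers make_toy_layout and find_classes are written here for illustration, and the files are empty placeholders, not real images.

```python
import os
import tempfile

def make_toy_layout(root):
    """Create a tiny train/ and val/ tree with empty placeholder files."""
    layout = {"dog": ["xxx.png", "xxy.png"], "cat": ["xxz.png"]}
    for split in ("train", "val"):
        for cls, files in layout.items():
            class_dir = os.path.join(root, split, cls)
            os.makedirs(class_dir, exist_ok=True)
            for name in files:
                open(os.path.join(class_dir, name), "wb").close()

def find_classes(split_dir):
    """Return sorted class names and a name -> index mapping, one class per subfolder."""
    classes = sorted(e.name for e in os.scandir(split_dir) if e.is_dir())
    return classes, {name: i for i, name in enumerate(classes)}

root = tempfile.mkdtemp()
make_toy_layout(root)
classes, class_to_idx = find_classes(os.path.join(root, "train"))
print(classes, class_to_idx)

# With torchvision installed and real images in the folders, the dataset
# itself would be created roughly like this:
#   from torchvision import datasets
#   train_set = datasets.ImageFolder(os.path.join(root, "train"))
#   print(train_set.class_to_idx)
```

The point is that the labels come entirely from the directory names, so whichever ImageNet download you pick, it has to end up in this shape before training.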
If you're just getting started with pytorch, I'd advise you to go through pytorch tutorials such as this one.