FYI, I can only spare the initial $300 free credit since I'm a student, so I need to minimize the trial-and-error phase.
I have a PyTorch-based model that currently runs on a local GPU against a ~100 GB dataset of frames in local storage. I'm looking for a guide that shows how to set up a machine to train and test the model on TPUs, with the dataset in my Google Drive (or any other recommended cloud storage).
The guides I found don't match this description: most of them run on GPU, or on TPU with a dataset that ships with a dataset library. I'd rather not waste time and budget trying to assemble a puzzle from those pieces.
First, to use TPUs on Google Cloud you have to use the PyTorch/XLA library, which is what enables TPU support in PyTorch.
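In practice the library changes very little in a training script: you ask for an XLA device instead of a CUDA one and step the optimizer through xm. A minimal sketch (the model and data here are placeholders, not anything from your project):

import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # the TPU core, used like torch.device("cuda")

model = nn.Linear(10, 2).to(device)                      # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 10, device=device)                    # placeholder batch
y = torch.randint(0, 2, (8,), device=device)

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
# Replaces optimizer.step(); barrier=True also flushes the lazily
# recorded XLA graph so the step actually executes on the TPU
xm.optimizer_step(optimizer, barrier=True)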
There are a couple of options: you can use Colab, or create an environment on GCP. I understand that you may want to see what working in a "real environment" is like rather than Colab, but there won't be much difference, and Colab is often used as a main environment for ML development. It also has a free tier.
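If you go the Colab route, select the TPU runtime and install PyTorch/XLA first. This is only a sketch: the wheel URL and version pins change between releases, so copy the current install command from the PyTorch/XLA README rather than this one.

# Runtime > Change runtime type > Hardware accelerator > TPU
# Wheel URL and versions below are illustrative, not current:
!pip install cloud-tpu-client==0.10 https://storage.googleapis.com/tpu-pytorch/wheels/colab/torch_xla-1.13-cp38-cp38-linux_x86_64.whl

import os
# Colab exposes the TPU address to the VM through this variable
assert "COLAB_TPU_ADDR" in os.environ, "TPU runtime is not selected"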
For the dataset, store it in a Google Cloud Storage bucket instead of Google Drive; your code can then address files directly with paths like gs://bucket_name/data.csv. Also, keep in mind that a TPU instance and a notebook on GCP will drain your $300 in a few days (or hours): just a TPU v3 ready for PyTorch costs around $6k/month.
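Since the question is specifically about a ~100 GB frames dataset, here is a hedged sketch of that workflow: upload once with gsutil, then read frames lazily from the bucket inside a torch Dataset. The bucket name, prefix, and PIL-based decoding are my assumptions for illustration.

# One-time upload from your machine (bucket name is a placeholder):
#   gsutil -m cp -r ./frames gs://my-frames-bucket/frames/
import io

import numpy as np
import torch
from google.cloud import storage   # pip install google-cloud-storage
from PIL import Image              # pip install pillow
from torch.utils.data import Dataset


class GcsFrameDataset(Dataset):
    """Streams image frames out of a GCS bucket, one object per item."""

    def __init__(self, bucket_name, prefix):
        client = storage.Client()
        self.bucket = client.bucket(bucket_name)
        # Lists object names once; only metadata is fetched here
        self.names = [b.name for b in client.list_blobs(bucket_name, prefix=prefix)]

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        # Downloads a single frame on demand
        raw = self.bucket.blob(self.names[idx]).download_as_bytes()
        img = Image.open(io.BytesIO(raw)).convert("RGB")
        return torch.from_numpy(np.array(img)).permute(2, 0, 1)  # CHW uint8


ds = GcsFrameDataset("my-frames-bucket", "frames/")

Per-item downloads keep the VM's disk empty but are network-bound; at 100 GB you would usually pack frames into larger shards, though the access pattern stays the same.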
Create a notebook to write your code.
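The next step assumes a TPU node already exists and you know its internal IP. A sketch with gcloud; the name, zone, accelerator type, and pytorch-* runtime version are all placeholders, so check the currently supported values first:

gcloud compute tpus create my-tpu \
    --zone=us-central1-b \
    --accelerator-type=v2-8 \
    --version=pytorch-1.13 \
    --network=default

# The node's internal IP (used in the next step) is in the output of:
gcloud compute tpus describe my-tpu --zone=us-central1-b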
Set the XRT_TPU_CONFIG env variable to the IP of your TPU in the code:

import os
os.environ["XRT_TPU_CONFIG"] = "tpu_worker;0;10.0.200.XX:8470"
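Note that this assignment has to run before the first call to xm.xla_device(), since PyTorch/XLA reads XRT_TPU_CONFIG when it initializes its TPU client; the XX octet stays a placeholder for the internal IP that gcloud compute tpus describe reports.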