Tags: python, kaggle, google-drive, shared-drive

Is there still no way to access Google Drive folders from Kaggle?


I have a relatively large dataset (approx. 5 GB) of images stored in a folder on Google Drive. I want to do some processing and apply deep learning algorithms to it, which means the dataset must be available in the Kaggle environment. I searched online and realised there is no direct way, or at least that is what I have been able to gather so far.

This answer makes use of the gdown library, but Google Drive apparently denies access, probably because of a cookies issue. I tried mounting my cookies in the Kaggle environment, but that did not help.
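
For context, the failing attempt looked roughly like this; a minimal sketch, where the folder URL is a placeholder for your own shared folder:

    import gdown

    # Placeholder URL of the shared Google Drive folder.
    url = "https://drive.google.com/drive/folders/<FOLDER_ID>"

    # Tries to download every file in the folder into ./data.
    # For me, this is the call where Google Drive denied access.
    gdown.download_folder(url, output="data", quiet=False, use_cookies=True)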

I might need to write some other script to download the data itself: maybe first store the links of the individual files in the Google Drive folder and then traverse those links from the Kaggle environment. But I was lazy.
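
If you do try that route, a minimal sketch could look like this, assuming you have already collected the individual file IDs from the Drive folder (the IDs and output filenames here are placeholders):

    import gdown

    # Placeholder file IDs gathered from the Drive folder beforehand.
    file_ids = ["<FILE_ID_1>", "<FILE_ID_2>"]

    for i, file_id in enumerate(file_ids):
        # gdown resolves each ID to a direct-download URL.
        gdown.download(id=file_id, output=f"image_{i}.jpg", quiet=False)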

I want to know: is there some way this can be done?


Solution

  • I could not find a better way, but here is what worked for me, as @PaoloJ42 suggested:

    1. Download the dataset from Google Drive (it will already be in zipped form).

    2. Instead of uploading the zipped folder through the Upload Data option in the Kaggle environment, it is better to create your own dataset on Kaggle via Dataset > New Dataset. You can choose to make it private.

    3. After the upload, the zipped folder is unzipped automatically and you can use the dataset's link. To authenticate with the Kaggle API, add the following snippet:

      import os

      # Kaggle API credentials; replace both values with your own.
      os.environ['KAGGLE_USERNAME'] = 'username'
      os.environ['KAGGLE_KEY'] = 'kaggle_key'
      

    You can get both values from Settings > Create New Token (under the API section).

    This way you don't have to upload the data again and again; a short sketch of pulling the dataset via the Kaggle API follows.
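
    Once the environment variables are set, you can download the dataset programmatically with the official kaggle package. A minimal sketch, assuming your dataset slug is username/my-images (a placeholder):

      from kaggle.api.kaggle_api_extended import KaggleApi

      api = KaggleApi()
      api.authenticate()  # reads KAGGLE_USERNAME / KAGGLE_KEY from the environment

      # 'username/my-images' is a placeholder slug for your private dataset.
      api.dataset_download_files("username/my-images", path="data", unzip=True)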