Tags: python, linux, jupyter-notebook, google-drive-api, google-colaboratory

How to upload files to Colab from Google Drive so that the notebook can be shared?


I want to use a Colab notebook that I share with a colleague. The notebook should load files from a folder in Google Drive into the workspace. That folder also contains the notebook itself. The files to be loaded are not very big. Access to the folder is granted to the colleague through the standard Google sharing procedure and is restricted to that colleague.

There seem to be many ways to do something like this, but they often involve a manual step for the colleague, which I want to keep minimal or, better, eliminate if possible. I am not sure whether I should use Google's own functionality or use PyDrive.

A basic example.

On Google Drive we have the following file tree:

Colab Notebooks/Test1
    notebook1.ipynb
    file1.py
    file2.mat

notebook1 looks something like this:

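from google.colab import drive
from scipy.io import loadmat

drive.mount('/content/drive')

# (1) Path into my own Drive; this is the line that differs per user.
%cd /content/drive/My Drive/Colab Notebooks/Test1

# (2) Load the data file relative to the current directory.
data = loadmat('file2.mat')

# Run the helper script in the notebook's namespace.
%run -i file1.py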

The code works when I run it myself, because at (1) "My Drive" is my own Drive, which is where the folder I share with the colleague by link lives. For the colleague, "My Drive" refers to their own Drive, so the script does not work. I need an absolute path to the folder, not one that is relative to each user's Drive.

I want to change "My Drive" so that it works for both me and my colleague. That may call for a change at (2) as well.
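One way to avoid hard-coding "My Drive" at (1) is to search the mounted Drive for the project folder instead. The following is only a minimal sketch: the helper name `find_project_dir` and its parameters are hypothetical, and it assumes the shared folder is actually visible under the colleague's mount point (for example through a shortcut added to their Drive), which the question does not confirm.

```python
from pathlib import Path


def find_project_dir(mount_root, folder_name, marker_file, max_depth=4):
    """Return the first directory called folder_name that contains marker_file.

    Hypothetical helper: searches breadth-first below mount_root and gives up
    after max_depth levels so that a large Drive is not scanned in full.
    Returns None when nothing matches.
    """
    level = [Path(mount_root)]
    for _ in range(max_depth + 1):
        next_level = []
        for directory in level:
            if directory.name == folder_name and (directory / marker_file).is_file():
                return directory
            try:
                next_level.extend(p for p in directory.iterdir() if p.is_dir())
            except OSError:
                continue  # unreadable directory, skip it
        level = next_level
    return None
```

In the notebook this might be used as `project = find_project_dir('/content/drive', 'Test1', 'notebook1.ipynb')` followed by `os.chdir(project)` when the result is not `None`, after which the relative path at (2) stays unchanged.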

I got some inspiration from a post here on Stack Overflow and the answer by eemilk; see "Google Colab: how to read data from my google drive?"

But it is not enough.

I think the problem is rather generic and there ought to be a simple solution. I appreciate your help!


Solution

  • The answer from "teslatrader" is good, but I had trouble with the CSV file since there are many different CSV variants. Therefore I switched to an Excel file.

    Steps 1 and 2: as above.

    Step 3: extract the file ID from the link you obtained and append '/export?format=xlsx'.

    Step 4: as above.

    The script will then be as follows.

    import pandas as pd
    import openpyxl  # engine pandas uses to read .xlsx files

    file_url = ...  # export link built in step 3
    df_xlsx = pd.read_excel(file_url, engine='openpyxl')
    df_xlsx
    

    Thanks again "teslatrader"!
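Step 3 can be sketched in code. This is an illustration under stated assumptions, not part of the original answer: the helper name `sheet_export_url` is hypothetical, and it assumes the file is a Google Sheets document whose share link has the form `https://docs.google.com/spreadsheets/d/<FILE_ID>/edit?usp=sharing`.

```python
import re


def sheet_export_url(share_link, fmt="xlsx"):
    """Build an export URL from a Google Sheets share link (hypothetical helper).

    Assumes the file ID is the path segment that follows '/d/' in the link.
    """
    match = re.search(r"/d/([A-Za-z0-9_-]+)", share_link)
    if match is None:
        raise ValueError(f"no file ID found in link: {share_link!r}")
    file_id = match.group(1)
    return f"https://docs.google.com/spreadsheets/d/{file_id}/export?format={fmt}"
```

The returned string could then be assigned to `file_url` in the script above. Whether the download works without a sign-in depends on how the file is shared, so for a file restricted to one colleague this may still need an authenticated request.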