I'm trying to import a dataset from **Kaggle** into **Databricks** (Community Edition) using the Kaggle API, but I keep failing and have already lost 3 days on it. Could a kind soul please help me?
Attempt 1:
!pip install kaggle
import os
import kaggle
os.environ['KAGGLE_USERNAME'] = 'xxxxxx'
os.environ['KAGGLE_KEY'] = 'xxxxx'
from kaggle.api.kaggle_api_extended import KaggleApi
api = KaggleApi()
api.authenticate()
api.dataset_download_files('taricov/mobile-wallets-in-egypt-2020')
Attempt 1 error:
Attempt 2:
import os
import kaggle
from kaggle.api.kaggle_api_extended import KaggleApi
api = KaggleApi()
api_token_path = '/FileStore/tables/Kaggle_token/kaggle-2.json'
os.environ['KAGGLE_CONFIG_DIR'] = os.path.dirname(api_token_path)
api.dataset_download_files('taricov/mobile-wallets-in-egypt-2020', path='/FileStore/mobile-wallets-in-egypt-2020', unzip=True)
Attempt 2 error:
My kaggle.json credentials in Databricks:
I tried two ways of connecting, but either something is missing or my credentials are wrong, because the error is:
"Reason: Unauthorized".
Try the following:
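import os
# Set the config directory before importing the Kaggle client,
# because importing the kaggle package already triggers an authentication attempt.
# Note: the client looks for a file named kaggle.json inside KAGGLE_CONFIG_DIR.
api_token_path = '/FileStore/tables/Kaggle_token/kaggle-2.json'
os.environ['KAGGLE_CONFIG_DIR'] = os.path.dirname(api_token_path)
from kaggle.api.kaggle_api_extended import KaggleApi
api = KaggleApi()
# authenticate() reads the credentials from the environment or the config directory
api.authenticate()
api.dataset_download_files('taricov/mobile-wallets-in-egypt-2020', path='/FileStore/mobile-wallets-in-egypt-2020', unzip=True)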
It seems that your first snippet is missing the environment variables, and the second one never calls api.authenticate(). That method calls read_config_environment, which is responsible for reading those keys.
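If the token file keeps a name other than kaggle.json, one way around it is to read the file yourself and export the two keys before the client is imported. Here is a minimal sketch of that idea. The helper names `load_kaggle_credentials` and `download_dataset` are made up for illustration, and it assumes the token file has the usual `username` and `key` fields.

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(token_path):
    """Copy the username and key from a Kaggle token file into the environment.

    Hypothetical helper: assumes the JSON file has "username" and "key" fields.
    """
    token = json.loads(Path(token_path).read_text())
    os.environ["KAGGLE_USERNAME"] = token["username"]
    os.environ["KAGGLE_KEY"] = token["key"]
    return token["username"]


def download_dataset(dataset, target_dir, token_path):
    """Hypothetical wrapper: set the credentials first, then import and use the client."""
    load_kaggle_credentials(token_path)
    # Imported here on purpose: the credentials must be in the environment
    # before the kaggle package is loaded.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()
    api.dataset_download_files(dataset, path=target_dir, unzip=True)
```

You would then call something like `download_dataset('taricov/mobile-wallets-in-egypt-2020', '/FileStore/mobile-wallets-in-egypt-2020', '/FileStore/tables/Kaggle_token/kaggle-2.json')`. On Databricks, files under DBFS may need to be addressed with a `/dbfs` prefix when opened from plain Python, so adjust the paths if the token file is not found.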