I am new to Databricks and currently learning about managed tables. I have created a managed table in Databricks and, on inspection, it is created in the following location:
dbfs:/user/hive/warehouse/demo.db/race_results_python
The Microsoft documentation states that this is part of the DBFS root.
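For anyone checking the same thing, the storage location of a table can be confirmed with something along these lines (using the table name from above; in a Databricks notebook `spark` is already defined):

```python
# Show where the table's data files live. "Location" is one of the rows
# returned by DESCRIBE EXTENDED.
location_row = (
    spark.sql("DESCRIBE EXTENDED demo.race_results_python")
    .filter("col_name = 'Location'")
    .collect()
)
print(location_row)
```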
So I have two questions:
Does this mean that Databricks is storing tables in the default storage account created during the creation of the Databricks workspace?
If the answer to the above question is yes, is it good practice to store tables here, or should we store them in a separate storage account?
For reference, I have looked at the following documentation: https://learn.microsoft.com/en-us/azure/databricks/dbfs/root-locations
Answering your two sub-questions individually below:
If you are using the default schema/database within the hive_metastore, then yes, it stores the tables in that default location. If you create a table without specifying an external location for the data, it is created as a managed table; if you provide an external path, it is created as an external table.
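As a minimal sketch of that distinction (the abfss:// path, the demo database, and the _ext table name are just placeholders for this example):

```python
# Runs in a Databricks notebook, where `spark` is already available.
spark.sql("CREATE DATABASE IF NOT EXISTS demo")

df = spark.createDataFrame(
    [(1, "Hamilton"), (2, "Verstappen")], ["position", "driver"]
)

# Managed table: no path supplied, so the data lands under the metastore's
# default location (dbfs:/user/hive/warehouse/demo.db/...).
df.write.format("delta").mode("overwrite").saveAsTable("demo.race_results_python")

# External table: an explicit path is supplied, so the metastore only holds
# the metadata while the data files live at that location.
(
    df.write.format("delta").mode("overwrite")
    .option("path", "abfss://container@storageaccount.dfs.core.windows.net/race_results")
    .saveAsTable("demo.race_results_python_ext")
)
```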
Now, if you are creating the tables in any other custom schema/database that you have created within the hive_metastore catalog, then you have two options: either create the schema/database with a LOCATION pointing at your own storage account, so that its managed tables are stored there, or create it without a LOCATION, in which case its managed tables still end up under the default root location (a sketch of the first option is below).
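Assuming the first option (the schema name demo_custom and the abfss:// path are placeholders), that could look roughly like this:

```python
# Create a schema whose default location is in your own storage account
# instead of the DBFS root.
spark.sql("""
    CREATE DATABASE IF NOT EXISTS demo_custom
    LOCATION 'abfss://container@storageaccount.dfs.core.windows.net/demo_custom'
""")

# Tables written here without an explicit path are still managed tables,
# but their files are stored under the LOCATION above.
df = spark.createDataFrame([(1, "Hamilton")], ["position", "driver"])
df.write.format("delta").saveAsTable("demo_custom.race_results_python")
```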
Just to make sure you understand the difference between managed and external tables, I am describing both below:
Managed tables are ones where dropping the table deletes both the table's metadata in the hive_metastore and the actual data files. You do not need to handle the deletion of the data files separately, if that is what your use case requires.
External tables are ones where dropping the table deletes only the table's metadata and leaves the actual data files in place. If you want to delete the data files of an external table, you need an external process to remove them, if that is what your use case requires.
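To illustrate, assuming the managed and external tables from the earlier snippets:

```python
# Dropping the managed table removes the metastore entry AND the data files
# under dbfs:/user/hive/warehouse/demo.db/race_results_python.
spark.sql("DROP TABLE IF EXISTS demo.race_results_python")

# Dropping the external table removes only the metastore entry; the data
# files remain at the external path and can still be listed afterwards.
spark.sql("DROP TABLE IF EXISTS demo.race_results_python_ext")
display(dbutils.fs.ls("abfss://container@storageaccount.dfs.core.windows.net/race_results"))
```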