I've added an external volume named "logs" to my Databricks Unity Catalog. Within a Databricks notebook I can verify that it exists (os.path.exists('/Volumes/my_catalog/schema_name/logs')) and even write a file to it that I can see in the Databricks UI, using the following syntax:
with open('/Volumes/my_catalog/schema_name/logs/test_file.txt', 'w') as file:
    file.write('my test')
But when I try to use loguru to write logs to the volume, the logs don't appear in the volume. I've used the following syntax:
from datetime import datetime
import os
from loguru import logger
LOGS_FOLDER_PATH = '/Volumes/my_catalog/schema_name/logs/'
DATE_TIME_STRING = datetime.now().strftime('%Y-%m-%d__%H_%M_%S')
file_handler_path = LOGS_FOLDER_PATH + DATE_TIME_STRING + '.log'
logger.add(file_handler_path)
logger.info((f'Logging is set up including the file handler that saves the logs to the '
             f'following destination: {file_handler_path}'))
logger.error('an error has happened')
logger.warning('my warning')
logger.info('my info')
The first logger.info statement returns the following path as the destination: /Volumes/my_catalog/schema_name/logs/2025-04-30__10_01_24.log
But within the Databricks UI I can't see a log file! os.listdir(path='/Volumes/my_catalog/schema_name/logs') tells me that the file exists, but after restarting my cluster it tells me the file does not exist any more (so it probably never really existed on the volume at all).
In another Databricks project I've used the same syntax, but the file handler path pointed to a mounted Azure Blob Storage container. In that project saving the logs worked. So I guess the problem lies somewhere with the external volume that I added in the Databricks Unity Catalog...
Update: When I add a folder to the path of the file handler (e.g. '/Volumes/my_catalog/schema_name/logs/additional_folder/2025-04-30__10_01_24.log' instead of '/Volumes/my_catalog/schema_name/logs/2025-04-30__10_01_24.log'), the folder "additional_folder" is actually created within the external volume "logs", but the log file is still not created.
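One thing that might be relevant: loguru's file sink keeps the log file handle open for the lifetime of the handler, so the content may only materialize on the volume (if at all) once the sink is actually closed. A quick way to test that theory is to close the sink explicitly via the handler id returned by logger.add() and then check the volume again (just a sketch, with a throwaway file name):

from loguru import logger

# hypothetical test file name, only used to check when content becomes visible
handler_id = logger.add('/Volumes/my_catalog/schema_name/logs/close_test.log')
logger.info('a single test message')
logger.remove(handler_id)  # closes the file sink; now check the volume in the UI

Either way, instead of writing log files to the volume, I ended up writing each log record to a Unity Catalog table.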
Here is what I have done.
First my basic handler class:
import logging

class DatabricksTableHandler(logging.Handler):
    """Logging handler that writes every log record as a row into a Unity Catalog table."""

    def __init__(self, table):
        super().__init__()
        self.table = table
        # Create the target table on first use if it does not exist yet.
        if not spark.catalog.tableExists(self.table):
            query = f'CREATE TABLE {self.table} (TIMESTAMP TIMESTAMP, LEVEL STRING, MESSAGE STRING);'
            spark.sql(query)

    def emit(self, record):
        # record.created is seconds since the epoch; TIMESTAMP() turns it into a timestamp.
        query = (
            f'INSERT INTO {self.table} ('
            f'  SELECT '
            f'    TIMESTAMP({record.created}) AS TIMESTAMP, '
            f'    "{record.levelname}" AS LEVEL, '
            f'    "{record.getMessage()}" AS MESSAGE '
            f');'
        )
        spark.sql(query)
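As a side note, the message is interpolated straight into the SQL string, so a message containing double quotes would break the INSERT. On recent Databricks runtimes that support parameterized spark.sql(), a variant of emit() along these lines avoids that; this is only a sketch, not part of the handler above:

    def emit(self, record):
        # Hypothetical drop-in replacement for emit() above, using named parameter
        # markers so quotes in the message cannot break or inject into the statement.
        query = (
            f'INSERT INTO {self.table} '
            'SELECT TIMESTAMP(:created) AS TIMESTAMP, :level AS LEVEL, :message AS MESSAGE'
        )
        spark.sql(query, args={
            'created': record.created,
            'level': record.levelname,
            'message': record.getMessage(),
        })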
Then set up a logger:
my_table_logger = logging.getLogger("my_table_logger")
# optionally set logging level
my_table_logger.setLevel(logging.DEBUG)
# optionally stop propagation of entries to other loggers
my_table_logger.propagate = False
my_table_logger.addHandler(DatabricksTableHandler("my_catalog.my_schema.my_table"))
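Since the question uses loguru: loguru also accepts a standard logging.Handler as a sink, so the same handler can be plugged into loguru directly (a small sketch, untested on my side):

from loguru import logger

# loguru accepts a logging.Handler instance as a sink, so the DatabricksTableHandler
# defined above can receive loguru records as well.
logger.add(DatabricksTableHandler("my_catalog.my_schema.my_table"))
logger.info("routed through loguru into the Unity Catalog table")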
Finally, log stuff:
my_table_logger.debug("This is a debug message")
my_table_logger.info("This is an info message")
my_table_logger.warning("This is a warning message")
my_table_logger.error("This is an error message")
my_table_logger.critical("This is a critical message")
And inspect the table:
spark.sql("SELECT * FROM my_catalog.my_schema.my_table").display()
TIMESTAMP                     LEVEL     MESSAGE
2025-05-13T18:16:54.939535Z   CRITICAL  This is a critical message
2025-05-13T18:16:50.476143Z   WARNING   This is a warning message
2025-05-13T18:16:52.763707Z   ERROR     This is an error message
2025-05-13T18:16:39.388126Z   DEBUG     This is a debug message
2025-05-13T18:16:48.141574Z   INFO      This is an info message