If I run a snakemake workflow, the start time for each job is logged to the console, for example
[Fri Jul 12 10:39:15 2024]
How can I adapt the logging to also show the duration for each job and the total duration in ms?
a) Is there a logging config file where I could adapt a format string, etc.?
b) Is there a profiling tool for snakemake that helps to monitor and optimize the performance of snakemake workflows?
My use case is to compare different solvers for pypsa-eur and find out more about their limitations. I could manually subtract the start times of the jobs, but that does not make sense to me.
Assuming unrestricted shared filesystem usage.
Building DAG of jobs...
Provided cores: 8
Rules claiming more threads will be scaled down.
Job stats:
job count
-------- -------
run_main 1
total 1
Select jobs to execute...
Execute 1 jobs...
[Fri Jul 12 10:39:15 2024]
localrule run_main:
input: input/input.txt
output: output/output.txt
jobid: 0
reason: Code has changed since last execution
resources: tmpdir=C:\Users\eis\AppData\Local\Temp\PyCharmPortableTemp
Running snake_script.py at C:\python_env\workspace\resilient\snakemake_demo
Processed data written to output/output.txt
[Fri Jul 12 10:39:15 2024]
Finished job 0.
1 of 1 steps (100%) done
The documentation of snakemake contains a section about logging:
https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files
However, that section only describes how to redirect the logs for jobs to some file instead of printing them all to the console.
A. Benchmark Rules
Snakemake supports a benchmark keyword in its rules:
https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#benchmark-rules
If a file path for a benchmark is specified, snakemake creates a corresponding text file containing the benchmark information.
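A minimal sketch of how the benchmark directive could look in a Snakefile (rule and file names are taken from the log above; the benchmark path is an assumption):

```
rule run_main:
    input: "input/input.txt"
    output: "output/output.txt"
    # snakemake writes a tab-separated benchmark file to this path
    benchmark: "benchmarks/run_main.benchmark.txt"
    script: "snake_script.py"
```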
PyPSA-Eur applies this feature to create benchmarking information under /results/benchmarks
Example content:
s h:m:s max_rss max_vms max_uss max_pss io_in io_out mean_load cpu_time
27121.7106 7:32:01 4083.46 6278.57 3738.98 3873.53 0.12 153.27 100.29 27196.06
Explanation of parameters from https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#benchmark-rules
s: Wall clock time (in seconds),
h:m:s: Wall clock time (in hour:minutes:seconds),
max_rss: Max RSS memory usage (in megabytes),
max_vms: Max VMS memory usage (in megabytes),
max_uss: Max USS memory usage (in megabytes),
max_pss: Max PSS memory usage (in megabytes),
io_in: I/O read (in bytes),
io_out: I/O written (in bytes),
mean_load: CPU load, i.e. CPU time (cpu_time) divided by wall clock time (s),
cpu_time: CPU time user+system (seconds),
(Since version 8.11.0 of snakemake, it is possible to have extra benchmark metrics, see snakemake documentation.)
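Since the benchmark file is tab-separated, its columns can be read programmatically, e.g. to compare wall clock times across solver runs. A minimal sketch (the content is taken from the example above; in practice you would open the file under results/benchmarks instead of an in-memory string):

```python
import csv
import io

# Example benchmark content as written by snakemake (tab-separated)
benchmark_tsv = (
    "s\th:m:s\tmax_rss\tmax_vms\tmax_uss\tmax_pss\tio_in\tio_out\tmean_load\tcpu_time\n"
    "27121.7106\t7:32:01\t4083.46\t6278.57\t3738.98\t3873.53\t0.12\t153.27\t100.29\t27196.06\n"
)

# In practice: open("results/benchmarks/<rule>.txt") instead of io.StringIO
reader = csv.DictReader(io.StringIO(benchmark_tsv), delimiter="\t")
row = next(reader)

wall_clock_s = float(row["s"])
max_rss_mb = float(row["max_rss"])
print(f"wall clock: {wall_clock_s:.1f} s, peak RSS: {max_rss_mb:.0f} MB")
```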
The benchmark is based on a custom implementation within Snakemake:
https://snakemake.readthedocs.io/en/v7.1.0/_modules/snakemake/benchmark.html
B. html report feature
Snakemake has a report feature:
snakemake --report report.html
https://snakemake.readthedocs.io/en/stable/snakefiles/reporting.html#rendering-reports
It is able to create plots about the job runtimes.
Example report:
https://snakemake.github.io/resources/report.html# => Statistics
(I find the snakemake documentation about the report feature a little confusing because it goes into detail first, while the most relevant section on how to actually create the report is located at the end of the page.
You can use the above command without adapting the Snakefile. The usage of report directives inside the Snakefile is optional and helpful if you want to include additional information in the report.)
C. Python logging module
Snakemake uses the python logging module under the hood and you could adapt the logging format in your Snakefile to log additional information to the console:
a) use predefined logging parameters
import logging

# doc on logging format:
# https://docs.python.org/3/howto/logging.html#changing-the-format-of-displayed-messages
# https://docs.python.org/3/library/logging.html#logrecord-attributes
logging.basicConfig(
    format="[%(relativeCreated)d] %(message)s",
)
The parameter relativeCreated represents a relative time:
Time in milliseconds when the LogRecord was created, relative to the time the logging module was loaded.
https://docs.python.org/3/library/logging.html#logrecord-attributes
b) use custom log handler
In order to log the time difference to the last log message, a custom log handler could be used:
class DeltaTimeHandler(logging.StreamHandler):
    def __init__(self):
        super().__init__()
        self.last_time = None

    def emit(self, record):
        current_time = record.relativeCreated
        if self.last_time is not None:
            delta = current_time - self.last_time
            record.delta_time = delta
        else:
            record.delta_time = 0
        self.last_time = current_time
        super().emit(record)

logging.basicConfig(
    format="[%(relativeCreated)d] [%(delta_time)d ms] %(message)s",
    handlers=[DeltaTimeHandler()],
)
Here is another variant that only considers specific messages for the time difference:
class DeltaTimeHandler(logging.StreamHandler):
    def __init__(self):
        super().__init__()
        self.last_time = None

    def emit(self, record):
        # record.msg may be any object; only string messages
        # starting with '[' (the snakemake timestamps) are considered
        message = record.msg
        if isinstance(message, str) and message.startswith('['):
            current_time = record.relativeCreated
            if self.last_time is not None:
                delta = int(current_time - self.last_time)
                if delta > 0:
                    record.msg = f"[duration: {delta} ms]"
                    super().emit(record)
            self.last_time = current_time

logging.basicConfig(
    format="[%(relativeCreated)d] %(message)s",
    handlers=[DeltaTimeHandler()],
)
c) Adapt snakemake logging code
If the above solutions are not sufficient, you could adapt the logging part of snakemake:
https://github.com/snakemake/snakemake/blob/main/snakemake/logging.py
D. Python profiling tools
For general python tools about benchmarks and profiling see
https://realpython.com/python-profiling/
https://docs.python.org/3/library/profile.html
https://github.com/sumerc/yappi
PyCharm Professional Edition and Visual Studio provide GUI profiling features, while PyCharm Community Edition and VS Code don't.
Also see
Is there a visual profiler for Python?
https://jiffyclub.github.io/snakeviz/
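As a quick start without extra tooling, Python's built-in cProfile can already break down where the time is spent. A minimal sketch (the solve function below is a hypothetical stand-in for an expensive solver call):

```python
import cProfile
import io
import pstats

def solve():
    # hypothetical stand-in for an expensive solver call
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
solve()
profiler.disable()

# Render the top 5 entries sorted by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```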
E. Memory profiling in PyPSA-Eur
PyPSA-Eur includes a memory_logger function to log memory usage while solving:
https://github.com/PyPSA/pypsa-eur/blob/master/scripts/_benchmark.py
which is based on memory_profiler
Example usage:
with memory_logger(
    filename=getattr(snakemake.log, "memory", None), interval=30.0
) as mem:
    n = solve_network(
        n,
        config=snakemake.config,
        solving=snakemake.params.solving,
        log_fn=snakemake.log.solver,
    )
logger.info(f"Maximum memory usage: {mem.mem_usage}")
Source: https://github.com/PyPSA/pypsa-eur/blob/master/scripts/solve_network.py
Example output:
elec_s_37_lvopt___2050_memory.log
MEM 563.062500 1721309193.8355
MEM 563.562500 1721309193.8607
MEM 2816.871094 1721309223.9120
MEM 3454.433594 1721309253.9648
MEM 3897.300781 1721309284.0169
The numbers represent the memory usage (in MiB) and a Unix timestamp for each sample.
In one case I was wondering why the numbers shown by the GNOME System Monitor were significantly higher than the logged numbers: I still had some zombie python processes from canceled snakemake workflows running that were consuming memory.
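The memory log can be parsed to derive the peak usage and sampling duration. A minimal sketch using the log lines shown above (in practice you would read the log file instead of an in-memory string):

```python
# Parse a memory_profiler-style log: one "MEM <MiB> <unix timestamp>" per line
log_text = """\
MEM 563.062500 1721309193.8355
MEM 563.562500 1721309193.8607
MEM 2816.871094 1721309223.9120
MEM 3454.433594 1721309253.9648
MEM 3897.300781 1721309284.0169
"""

samples = []
for line in log_text.splitlines():
    _, mem_mib, timestamp = line.split()
    samples.append((float(mem_mib), float(timestamp)))

peak_mib = max(mem for mem, _ in samples)
duration_s = samples[-1][1] - samples[0][1]
print(f"peak: {peak_mib:.0f} MiB over {duration_s:.0f} s")
```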