I've used the following code to log a metric
wandb.log({"metric": [4, 5, 6]})
but then found out that Wandb doesn't support plotting a list by default. I want to create a line plot where the y-axis is the first element of the metric array, and the x-axis is the step.
I've read the Custom Charts section of the documentation, and I think I should use Vega to access the first element of the array. Here are the steps I've taken so far. For the custom chart, I set the data source to "history" and selected the "metric" key:
query {
    runSets(runSets: "${runSets}") {
        id
        name
        history(keys: ["metric"])
    }
}
In the Vega script, I tried to flatten the array, using this part of the documentation
"transform": {
    ...
    {"type": "flatten", "fields": ["${field:metric}"]},
}
This gives me a warning that the "type" and "fields" arguments are not allowed, which suggests that the flattening block belongs somewhere other than the transform section.
I'm afraid I don't know where it should go, or how to achieve this.
Is this even possible?
If not, I think I should write a script in my notebook that accesses the wandb.run log data and transforms it for each run. If so, any tips for that solution are also appreciated.
EDIT: A better solution for this problem would be the Wandb "Weave Table". Check here for the complete explanation.
As others have pointed out, the correct way to log a list for it to be plotted correctly with Wandb is to log each item separately.
But my problem was that I had already trained a model while logging the metrics as lists, and I didn't want to train it all over again just to have the plots displayed correctly.
The Wandb API supports retrieving the log history of a run and adding new logs to it. Using this approach, we can retrieve the history of the metric that was incorrectly logged as a list and re-log it in the correct format (each item of the list separately).
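The re-logging idea can be sketched without touching Wandb at all. The helper below is a minimal illustration and not part of the Wandb API: `split_list_rows` and the `original_step` field are hypothetical names, and the rows are assumed to be plain dictionaries like the ones yielded by `scan_history`, with the step available under `_step` when that key is requested.

```python
from typing import Any, Dict, Iterable, Iterator


def split_list_rows(
    rows: Iterable[Dict[str, Any]],
    key: str,
    step_key: str = "_step",
) -> Iterator[Dict[str, Any]]:
    """Turn rows like {"metric": [4, 5, 6]} into {"metric_0": 4, "metric_1": 5, ...}."""
    for row in rows:
        value = row.get(key)
        if not isinstance(value, (list, tuple)):
            continue  # skip rows where the key is missing or is not a list
        payload = {f"{key}_{i}": v for i, v in enumerate(value)}
        if step_key in row:
            # Carry the original step as an ordinary field (hypothetical name).
            payload["original_step"] = row[step_key]
        yield payload


# Stand-in for the rows of an already-logged run.
rows = [{"_step": 0, "metric": [4, 5, 6]}, {"_step": 1, "metric": [7, 8, 9]}]
payloads = list(split_list_rows(rows, "metric"))
```

Each payload could then be passed to `wandb.log(payload)` inside a resumed run. The original step is kept as a regular field rather than passed as `step=`, since a resumed run continues from its latest step and Wandb expects steps to only increase.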
As an example, the following code block can be used to plot an already-logged list metric. It assumes that a run in a project called project_name has logged a key named metric as a list. It retrieves the log history and logs the result as a new (properly working) plot in Wandb.
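import matplotlib.pyplot as plt
import wandb

# The public API client is needed to read the history of existing runs.
api = wandb.Api()
runs = api.runs('workspace/project_name')

# Only the first run is processed here; remove the [:1] slice to handle all runs.
for i in range(len(runs))[:1]:
    run = runs[i]
    # Each row holds the list that was logged under "metric" at one step.
    run_history = run.scan_history(keys=["metric"])
    values = [row['metric'] for row in run_history]

    # Start a fresh figure; one line is drawn per position in the list.
    plt.figure()
    plt.plot(values)
    plt.ylabel("metric")

    # Resume the original run so that the plot is attached to it.
    wandb.init(
        project="project_name",
        id=run.id,
        resume=True
    )
    wandb.log({"metric_plot": plt})
    wandb.finish()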
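The code above draws one line per list position. To plot only the first element against the step, as asked in the question, the rows can be reduced first. This is a minimal sketch: `first_element_series` is a hypothetical helper, and it assumes the rows are dictionaries that contain the step under `_step` (for example when `_step` is included in the requested keys), falling back to the row index otherwise.

```python
from typing import Any, Dict, Iterable, List, Tuple


def first_element_series(
    rows: Iterable[Dict[str, Any]],
    key: str = "metric",
    step_key: str = "_step",
) -> Tuple[List[int], List[Any]]:
    """Return (steps, values), where each value is the first item of row[key]."""
    steps: List[int] = []
    values: List[Any] = []
    for fallback_step, row in enumerate(rows):
        value = row.get(key)
        if not isinstance(value, (list, tuple)) or len(value) == 0:
            continue  # skip rows without a non-empty list under the key
        # Use the logged step if present, otherwise the row index.
        steps.append(row.get(step_key, fallback_step))
        values.append(value[0])
    return steps, values


# Stand-in for rows retrieved from a run's history.
rows = [
    {"_step": 0, "metric": [4, 5, 6]},
    {"_step": 1, "metric": [7, 8, 9]},
    {"_step": 2, "metric": [10, 11, 12]},
]
steps, values = first_element_series(rows)
```

With these two lists, `plt.plot(steps, values)` would replace `plt.plot(values)` in the code above.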
To avoid this issue in the future, I use the following function. It makes sure that all lists in a dictionary are converted to distinct items.
from copy import deepcopy

def to_wandb_format(d: dict) -> dict:
    """
    Unpack list values in the dictionary, as wandb can't plot list values.
    Example:
        Input: {"metric": [99, 88, 77]}
        Output: {"metric_0": 99, "metric_1": 88, "metric_2": 77}
    """
    new_d = deepcopy(d)
    for key, val in d.items():
        if isinstance(val, list):
            new_d.pop(key)
            new_d.update({
                f'{key}_{i}': v for i, v in enumerate(val)
            })
    return new_d
Which can be used as follows:
wandb.log(to_wandb_format(epoch_train_metrics))
assuming epoch_train_metrics is a dictionary that may have lists as values.