I am trying to create a Gantt chart in Python. Some of the tasks that I have to include in the chart have a duration of 0 days, meaning they have to be completed on the same day.
I've tried this code which I've found online that creates a basic Gantt chart with plotly:
import pandas as pd
import plotly.express as px

df = pd.DataFrame([
    dict(Task="1", Start='2023-03-15', End='2023-03-15'),
    dict(Task="2", Start='2023-03-03', End='2023-03-10'),
    dict(Task="3", Start='2023-03-10', End='2023-03-15'),
])
print(df)
fig = px.timeline(df, x_start="Start", x_end="End", y="Task")
fig.update_yaxes(autorange="reversed")
fig.show()
It works fine for tasks that have a duration of at least 1 day (like Task 2 and 3). However, tasks that have to be completed on the same day, like Task 1 in the example above, are not displayed in the Gantt chart after plotting it. The resulting chart only contains Task 2 and 3. The space next to the label of Task 1 stays empty.
Is there a way to display Task 1 (and other tasks that have to be completed on the same day) in the same Gantt chart as Task 2 and 3?
The Gantt chart doesn't necessarily have to be created with Plotly; Matplotlib would be fine too. Whatever works best and is easiest to use.
Grateful for any help!!
The example below provides similar functionality using matplotlib. It is adapted from the similar case at https://stackoverflow.com/a/76836805/21896093.
When there's a task that has a duration of 0 days, a small duration is assigned (0.1 days) so that it shows up. You can adjust it as desired.
import pandas as pd
from matplotlib import patches
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.dates as mdates
#
# Example data
#

# Original data
df = pd.DataFrame(
    {'Task': ['1', '2', '3'],
     'Start': ['2023-03-15', '2023-03-03', '2023-03-10'],
     'End': ['2023-03-15', '2023-03-10', '2023-03-15'],
     }
)

# Convert to datetime, as we'll do some simple arithmetic between dates
for date_col in ['Start', 'End']:
    df[date_col] = pd.to_datetime(df[date_col], format='%Y-%m-%d')
print(df)
#
# Create plot
#
height = 0.9
f, ax = plt.subplots(figsize=(10, 6))
for idx in range(len(df)):
    # Bar is centred vertically on the task's row (rows are 1, 2, 3, ...)
    y0 = (idx + 1) - height / 2
    x0 = df.iloc[idx].Start
    width = df.iloc[idx].End - x0
    # Zero-duration task: give it a small width so it is visible
    if not width:
        width = pd.Timedelta(days=0.1)
    ax.add_patch(patches.Rectangle((x0, y0), width, height))
    # Dotted guide line from the earliest start date to this bar
    ax.hlines(y0 + height / 2,
              xmin=df.Start.min(),
              xmax=x0,
              color='k', linestyles=':', linewidth=0.5)

# DateFormatter required as we're building the plot using patches,
# rather than supplying entire series
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator())
# Rotate the date labels without replacing the formatter set above
plt.xticks(rotation=30)
ax.set_xlabel('Date')
ax.set_ylabel('Task')
ax.set_yticks(range(1, len(df) + 1))
ax.set_yticklabels(df.Task)
plt.show()
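The same minimum-width idea can also be applied to the data itself before plotting, which is one way the Plotly code from the question might be made to show Task 1. The snippet below is only a sketch under that assumption: `widen_zero_duration` and `df_plot` are hypothetical names introduced here, and the 0.1-day default simply mirrors the value used above.

```python
import pandas as pd


def widen_zero_duration(df, min_width=pd.Timedelta(days=0.1)):
    """Return a copy of df in which tasks with End == Start get a small width.

    Hypothetical helper: the column names 'Start' and 'End' follow the
    question's data, and min_width is an arbitrary, adjustable choice.
    """
    out = df.copy()
    out["Start"] = pd.to_datetime(out["Start"])
    out["End"] = pd.to_datetime(out["End"])
    # Boolean mask of the zero-duration tasks
    zero = out["End"] == out["Start"]
    # Push the end of those tasks forward by min_width
    out.loc[zero, "End"] = out.loc[zero, "Start"] + min_width
    return out


df = pd.DataFrame([
    dict(Task="1", Start="2023-03-15", End="2023-03-15"),
    dict(Task="2", Start="2023-03-03", End="2023-03-10"),
    dict(Task="3", Start="2023-03-10", End="2023-03-15"),
])

df_plot = widen_zero_duration(df)
print(df_plot)
```

Passing `df_plot` instead of `df` to the `px.timeline(...)` call from the question should then give Task 1 a thin bar. The original `df` is left unchanged, so the true dates remain available for labels or hover text.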
Update: version with segmented bars, as requested in the comments.
import pandas as pd
from matplotlib import patches
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.dates as mdates
#
# Example data
#

# Original data
df = pd.DataFrame(
    {'Task': ['1', '2', '3'],
     'Start': ['2023-03-15', '2023-03-03', '2023-03-10'],
     'End': ['2023-03-15', '2023-03-10', '2023-03-15'],
     }
)

# Convert to datetime, as we'll do some simple arithmetic between dates
for date_col in ['Start', 'End']:
    df[date_col] = pd.to_datetime(df[date_col], format='%Y-%m-%d')
print(df)
#
# Create plot
#
height = 0.9
zero_width = pd.Timedelta(days=0.1)
segmentation_width = pd.Timedelta(days=1)
gap_between_days = pd.Timedelta(days=0.05)
one_day = pd.Timedelta(days=1)
f, ax = plt.subplots(figsize=(10, 6))
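for idx in range(len(df)):
    y0 = (idx + 1) - height / 2
    x0 = df.iloc[idx].Start
    width = df.iloc[idx].End - x0
    # Zero-duration task: give it a small width so it is visible
    if not width:
        width = zero_width

    # Number of whole-day segments, plus whatever is left over
    n_days = width // segmentation_width
    days_remainder = width % segmentation_width

    # One rectangle per whole day, each slightly narrower than a day
    # so that a gap separates consecutive days
    for day in range(n_days):
        day_td = pd.Timedelta(days=day)
        ax.add_patch(patches.Rectangle((x0 + day_td, y0),
                                       one_day - gap_between_days, height))

    # Final rectangle for the partial day (zero width if there is none)
    n_days_td = pd.Timedelta(days=n_days)
    ax.add_patch(patches.Rectangle((x0 + n_days_td, y0),
                                   days_remainder,
                                   height))

    # Dotted guide line from the earliest start date to this bar
    ax.hlines(y0 + height / 2,
              xmin=df.Start.min(),
              xmax=x0,
              color='k', linestyles=':', linewidth=0.5)
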
#DateFormatter required as we're building the plot using patches,
#rather than supplying entire series
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator())
plt.xticks(rotation=30)
ax.set_xlabel('Date')
ax.set_ylabel('Task')
ax.set_yticks(range(1, len(df) + 1))
ax.set_yticklabels(df.Task)
plt.show()
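To make the segmentation arithmetic easier to check in isolation, the splitting step can be pulled out of the plotting loop. This is a rough sketch rather than part of the code above: `split_into_days` is a hypothetical helper, it uses only the standard library's `datetime`, and it returns full one-day widths (the small gap between days would still be subtracted at drawing time).

```python
from datetime import datetime, timedelta


def split_into_days(start, end, min_width=timedelta(days=0.1)):
    """Split the interval [start, end] into (segment_start, segment_width) pairs.

    Hypothetical helper: yields one pair per whole day plus one pair for
    any remainder. A zero-length interval is treated as min_width long.
    """
    width = end - start
    if width == timedelta(0):
        width = min_width
    one_day = timedelta(days=1)
    # divmod on timedeltas gives (whole days as int, leftover as timedelta)
    n_days, remainder = divmod(width, one_day)
    segments = [(start + i * one_day, one_day) for i in range(n_days)]
    if remainder:
        segments.append((start + n_days * one_day, remainder))
    return segments


# Task 3 from the example data: five whole days
task3 = split_into_days(datetime(2023, 3, 10), datetime(2023, 3, 15))
# Task 1 from the example data: zero duration, so a single short segment
task1 = split_into_days(datetime(2023, 3, 15), datetime(2023, 3, 15))
print(len(task3), len(task1))
```

Each returned pair could be passed to `patches.Rectangle` as the x position and width of one segment.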