python matplotlib plotly gantt-chart

How to create a Gantt chart in Python (with plotly) including tasks of a duration of 0 days (start and end date of the task are the same date)


I am trying to create a Gantt chart in Python. Some of the tasks that I have to include in the chart have a duration of 0 days, meaning they have to be completed on the same day.

I tried this code, which I found online; it creates a basic Gantt chart with Plotly:

import pandas as pd
import plotly.express as px

df = pd.DataFrame([
    dict(Task="1", Start='2023-03-15', End='2023-03-15'),
    dict(Task="2", Start='2023-03-03', End='2023-03-10'),
    dict(Task="3", Start='2023-03-10', End='2023-03-15'),
])

print(df)

fig = px.timeline(df, x_start="Start", x_end="End", y="Task")
fig.update_yaxes(autorange="reversed") 
fig.show()

It works fine for tasks that have a duration of at least 1 day (like Task 2 and 3). However, tasks that have to be completed on the same day, like Task 1 in the example above, are not displayed in the Gantt chart after plotting it. The resulting chart only contains Task 2 and 3. The space next to the label of Task 1 stays empty.

Is there a way to display Task 1 (and other tasks that have to be completed on the same day) in the same Gantt chart as Task 2 and 3?

The Gantt chart doesn't necessarily have to be created with Plotly; Matplotlib would be fine too. Whichever option is easiest and most useful.

Grateful for any help!!


Solution

  • The example below provides similar functionality using Matplotlib. It is adapted from a similar case at https://stackoverflow.com/a/76836805/21896093

    A task with a duration of 0 days is assigned a small width (0.1 days) so that it shows up in the chart. You can adjust that width as desired.

    import pandas as pd
    from matplotlib import patches
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates
    
    #
    # Example data
    #
    
    #Original data
    df = pd.DataFrame(
        {'Task': ['1', '2', '3'],
         'Start': ['2023-03-15', '2023-03-03', '2023-03-10'],
         'End': ['2023-03-15', '2023-03-10', '2023-03-15'],
         }
    )
    
    #Convert to datetime, as we'll do some simple arithmetic between dates
    for date_col in ['Start', 'End']:
        df[date_col] = pd.to_datetime(df[date_col], format='%Y-%m-%d')
    
    #
    # Create plot
    #
    height = 0.9
    
    f, ax = plt.subplots(figsize=(10, 6))
    for idx in range(len(df)):
        y0 = (idx + 1) - height / 2
        x0 = df.iloc[idx].Start
        width = df.iloc[idx].End - x0
        if not width:
            #Zero-duration task: give it a small visible width
            width = pd.Timedelta(days=0.1)
        ax.add_patch( patches.Rectangle((x0, y0), width, height) )
        ax.hlines(y0 + height / 2,
                  xmin=df.Start.min(),
                  xmax=x0,
                  color='k', linestyles=':', linewidth=0.5)
    
    #DateFormatter required as we're building the plot using patches,
    #rather than supplying entire series    
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
    ax.xaxis.set_major_locator(mdates.DayLocator())
    #Rotate the tick labels without overwriting the date formatter
    plt.xticks(rotation=30)
    
    ax.set_xlabel('Date')
    ax.set_ylabel('Task')
    ax.set_yticks(range(1, len(df) + 1))
    ax.set_yticklabels(df.Task)
    plt.show()
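
The same padding idea can also be applied to the Plotly code from the question. The following is a minimal sketch, not part of the original answer: `pad_zero_duration`, `show_timeline`, and the `min_width` parameter are hypothetical names, and it assumes that `px.timeline` draws a bar whenever `End` is strictly later than `Start`.

```python
import pandas as pd


def pad_zero_duration(df, min_width=pd.Timedelta(days=0.1)):
    """Return a copy of df where zero-length tasks get a small visible width.

    Hypothetical helper: the name and the min_width default are assumptions.
    """
    out = df.copy()
    out["Start"] = pd.to_datetime(out["Start"])
    out["End"] = pd.to_datetime(out["End"])
    # Rows whose task starts and ends at the same instant
    zero = out["End"] == out["Start"]
    # Keep End where the task has a duration, otherwise use Start + min_width
    out["End"] = out["End"].where(~zero, out["Start"] + min_width)
    return out


def show_timeline(padded):
    """Plot the padded frame with the same calls the question uses."""
    import plotly.express as px  # imported here so the helper needs only pandas

    fig = px.timeline(padded, x_start="Start", x_end="End", y="Task")
    fig.update_yaxes(autorange="reversed")
    fig.show()


df = pd.DataFrame([
    dict(Task="1", Start="2023-03-15", End="2023-03-15"),
    dict(Task="2", Start="2023-03-03", End="2023-03-10"),
    dict(Task="3", Start="2023-03-10", End="2023-03-15"),
])
padded = pad_zero_duration(df)
```

Calling `show_timeline(padded)` should then display Task 1 as a thin bar. Note that the padded `End` value is what would appear in hover text, so the original frame is left untouched in case the true dates are needed elsewhere.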
    

    Update: version with segmented bars, as requested in the comments.

    import pandas as pd
    from matplotlib import patches
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates
    
    #
    # Example data
    #
    
    #Original data
    df = pd.DataFrame(
        {'Task': ['1', '2', '3'],
         'Start': ['2023-03-15', '2023-03-03', '2023-03-10'],
         'End': ['2023-03-15', '2023-03-10', '2023-03-15'],
         }
    )
    
    #Convert to datetime, as we'll do some simple arithmetic between dates
    for date_col in ['Start', 'End']:
        df[date_col] = pd.to_datetime(df[date_col], format='%Y-%m-%d')
    
    #
    # Create plot
    #
    height = 0.9
    zero_width = pd.Timedelta(days=0.1)
    segmentation_width = pd.Timedelta(days=1)
    gap_between_days = pd.Timedelta(days=0.05)
    one_day = pd.Timedelta(days=1)
    
    f, ax = plt.subplots(figsize=(10, 6))
    for idx in range(len(df)):
        y0 = (idx + 1) - height / 2
        x0 = df.iloc[idx].Start
        width = df.iloc[idx].End - x0
    
        if not width:
            #Zero-duration task: give it a small visible width
            width = zero_width
        
        n_days = width // segmentation_width
        days_remainder = width % segmentation_width
        
        for day in range(n_days):
            day_td = pd.Timedelta(days=day)
            ax.add_patch( patches.Rectangle((x0 + day_td, y0),
                                            one_day - gap_between_days, height) )
        
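        #Draw the leftover part, e.g. the thin bar of a zero-duration task;
        #skip it when the task spans a whole number of days
        if days_remainder:
            n_days_td = pd.Timedelta(days=n_days)
            ax.add_patch(patches.Rectangle((x0 + n_days_td, y0),
                                           days_remainder,
                                           height))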
        
        ax.hlines(y0 + height / 2,
                  xmin=df.Start.min(),
                  xmax=x0,
                  color='k', linestyles=':', linewidth=0.5)
    
    #DateFormatter required as we're building the plot using patches,
    #rather than supplying entire series    
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
    ax.xaxis.set_major_locator(mdates.DayLocator())
    plt.xticks(rotation=30)
    
    ax.set_xlabel('Date')
    ax.set_ylabel('Task')
    ax.set_yticks(range(1, len(df) + 1))
    ax.set_yticklabels(df.Task)
    plt.show()
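
The segmentation arithmetic in the loop above can be pulled out into a small function, which makes it easier to check without drawing anything. This is only a sketch under assumptions: `split_into_segments` and its parameter names are hypothetical and do not come from the original answer; the defaults mirror `segmentation_width` and `zero_width` from the code above.

```python
import pandas as pd


def split_into_segments(start, end,
                        segment=pd.Timedelta(days=1),
                        zero_width=pd.Timedelta(days=0.1)):
    """Return a list of (segment_start, segment_width) pairs for one task.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    width = end - start
    if not width:
        # Zero-duration task: fall back to a small visible width
        width = zero_width

    n_full = width // segment    # number of whole segments (an integer)
    remainder = width % segment  # leftover Timedelta shorter than one segment

    segments = [(start + i * segment, segment) for i in range(n_full)]
    if remainder:
        segments.append((start + n_full * segment, remainder))
    return segments


week_task = split_into_segments(pd.Timestamp("2023-03-03"),
                                pd.Timestamp("2023-03-10"))
same_day_task = split_into_segments(pd.Timestamp("2023-03-15"),
                                    pd.Timestamp("2023-03-15"))
```

Each returned pair can be passed to `patches.Rectangle((segment_start, y0), segment_width, height)`; subtracting `gap_between_days` from the width of the full segments reproduces the visual gap between days.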