python pandas plotly pypdf

How to save all Plotly express graphs created using a for loop in a PDF?


I am trying to save all the charts generated through a for loop in a SINGLE PDF file.

My sample data:

import pandas as pd
import numpy as np
import plotly.io as pio
import plotly.express as px
import plotly.graph_objects as go

np.random.seed(0)
df = pd.DataFrame({'State' : np.repeat(['NY', 'TX', 'FL', 'PA'], 12),
                   'Month' : np.tile(pd.date_range('2023-09-01', '2024-08-01', freq = 'MS'), 4),
                   'Actual' : np.random.randint(1000, 1500, size = 48),
                   'Forecast' : np.random.randint(1000, 1500, size = 48)})

df['Month'] = pd.to_datetime(df['Month'])
df.set_index('Month', inplace = True)

I am able to generate the charts in a Jupyter notebook using:

for s in df['State'].unique():
    d = df.loc[df['State'] == s, ['Actual', 'Forecast']]
    fig = px.line(d, x = d.index, y = d.columns)
    fig.update_layout(title = 'Actuals vs Forecast for ' + s, template = 'plotly_dark', xaxis_title = 'Month')
    fig.update_xaxes(tickformat = '%Y-%B', dtick = 'M1')
    fig.show()

Can someone please let me know how to get all the charts in a single PDF file?

I tried using the add_trace and make_subplots approach, which uses the Graph Objects go.Scatter in place of px.line, but it's not working.


Solution

  • At the core, you need to install Kaleido to save the Plotly Express graphs/plots to images. See here for more about that.

    Then you can take the images and make a composite image and save as PDF. I used Pillow but there are other ways spelled out more here.

    You didn't say whether you wanted a single-page or a multi-page PDF file. (Both are single files.)

    Single page output example here.

    Multi-page output example here.

    At the top of each example is a link that launches a MyBinder-served session where the notebook works as presented.

    Here's the essential code for the single-page example. The comments cite the sources that the additional code is based on:

    import pandas as pd
    import numpy as np
    import plotly.io as pio
    import plotly.express as px
    import plotly.graph_objects as go
    
    np.random.seed(0)
    df = pd.DataFrame({'State' : np.repeat(['NY', 'TX', 'FL', 'PA'], 12),
                       'Month' : np.tile(pd.date_range('2023-09-01', '2024-08-01', freq = 'MS'), 4),
                       'Actual' : np.random.randint(1000, 1500, size = 48),
                       'Forecast' : np.random.randint(1000, 1500, size = 48)})
    
    df['Month'] = pd.to_datetime(df['Month'])
    df.set_index('Month', inplace = True)
    images_made = []
    for s in df['State'].unique():
        d = df.loc[df['State'] == s, ['Actual', 'Forecast']]
        fig = px.line(d, x = d.index, y = d.columns)
        fig.update_layout(title = 'Actuals vs Forecast for ' + s, template = 'plotly_dark', xaxis_title = 'Month')
        fig.update_xaxes(tickformat = '%Y-%B', dtick = 'M1')
        image_filename = "img_file_"+str(s)+".png"
        images_made.append(image_filename)
        fig.write_image(image_filename)
        #fig.show()
    
    '''
    # make them into PDF using Pillow according to https://stackoverflow.com/a/63436357/8508004
    from PIL import Image
    composite_image = Image.open(images_made[0]).convert("RGB") #start what will be composite after rest appended
    images_to_append = [Image.open(x).convert("RGB") for x in images_made[1:]]
    composite_image.save("composite_of_plots.pdf", save_all=True, append_images=images_to_append)
    '''
    # make them into single page PDF using Pillow based on https://stackoverflow.com/a/59042517/8508004
    from PIL import Image
    # open the saved images
    images = [Image.open(x) for x in images_made]
    
    # collect the (width, height) of each image
    images_width_height_specs = [x.size for x in images]
    
    # each slot in the composite is as wide as the widest image
    # and as tall as the tallest one
    w = max(width for width, _ in images_width_height_specs)
    h = max(height for _, height in images_width_height_specs)
    
    # create a big empty image with room for all images stacked vertically
    composite_image = Image.new('RGB', (w, h*len(images)))
    
    # paste each image below the previous one
    for idx,x in enumerate(images):
        composite_image.paste(x, (0, h*idx))
    
    # save it
    composite_image.save('composite_as_image.png')
    composite_image.save("composite_of_plots.pdf")