A function I am working on takes a dictionary with some data, plots the data onto the axes of a non-interactive matplotlib figure (so without showing it), and renders that figure to a PIL image that is stored in the dictionary. The updated dictionary is returned and then converted to a pandas.DataFrame, where I want to be able to view the image very quickly, which should work because it has already been rendered.
Writing to a BytesIO() buffer has proven to perform well, but it behaves unexpectedly: the saved image cannot be displayed.
If I open the buffer using a "with" statement, which I thought was the standard way to read and write files and buffers in Python, the rendered image cannot be displayed afterwards. For example, IPython.display.display(image) only returns the handle, and image.show() gives an I/O error. I have reproduced this behaviour in the to_PIL_direct() function; see the error/unwanted behaviour at the very bottom of the code.
It is possible to circumvent this by:
converting the buffer to bytes and reading the PIL image from those bytes, as done in to_PIL_bytes(), or
not using a "with" statement for the buffer at all, as done in to_PIL_noclose().
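The gist itself is not reproduced here, so the following is only a rough sketch of what the two workarounds might look like. The names render_png, via_bytes and via_open_buffer are hypothetical, and render_png is a Pillow-only stand-in for fig.savefig(buffer) so that the snippet runs without matplotlib.

```python
import io

from PIL import Image


def render_png(target):
    """Hypothetical stand-in for fig.savefig(target): writes a tiny PNG."""
    Image.new("RGB", (2, 2), color=(0, 128, 255)).save(target, format="PNG")


def via_bytes():
    # Workaround 1 (assumed shape): copy the PNG bytes out of the buffer,
    # close it, then open the image from a fresh BytesIO that stays open.
    with io.BytesIO() as buffer:
        render_png(buffer)
        data = buffer.getvalue()
    return Image.open(io.BytesIO(data))


def via_open_buffer():
    # Workaround 2 (assumed shape): never close the buffer, so Pillow can
    # still read from it when the pixel data is first needed.
    buffer = io.BytesIO()
    render_png(buffer)
    buffer.seek(0)
    return Image.open(buffer)


img_a = via_bytes()
img_b = via_open_buffer()
# Accessing a pixel forces Pillow to read the image data.
pixels = (img_a.getpixel((0, 0)), img_b.getpixel((0, 0)))
```

In both variants the image still depends on an open in-memory buffer, which is why neither one closes and deletes the buffer in the way asked for above.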
I would like a function that gives the desired output, as seen for the images from to_PIL_bytes() and to_PIL_noclose(), that is fast and that also closes and deletes the buffer. The latter does not happen with to_PIL_noclose(), which seems odd to me as well as to others who have looked at this topic: see kotchwane's answer on "How to convert Matplotlib figure to PIL Image object (without saving image)" and the comment by Anton Troitsky. Converting to bytes and then reading them with a different PIL function, as in to_PIL_bytes(), feels like an unnecessary detour and also slows down the code (see the gist below).
The original spark for wanting to implement this came from dizcza's answer on "Save plot to numpy array", specifically the plot2() function they provide:
import io
import matplotlib
matplotlib.use('agg')  # turn off interactive backend
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot(range(10))

def plot1():
    ...

def plot2():
    with io.BytesIO() as buff:
        fig.savefig(buff, format='png')
        buff.seek(0)
        im = plt.imread(buff)

def plot3():
    ...
My question is: how can I make the to_PIL_direct() function work without resorting to a different approach or a large workaround, e.g. making a canvas and drawing it, saving to a file, etc.?
The MWE is in this link to the gist with the notebook. Happy about any help!
It raises ValueError: I/O operation on closed file. because PIL.Image.open() is "lazy": it does not load the image data at once, only when you first try to use the image. You try to display it after leaving the with io.BytesIO() as buffer: block, when buffer is already closed, so PIL can no longer read from it.
You can use .load() to force it to load the image while the buffer is still open:
img = PIL.Image.open(buffer)
img.load() # <-- force PIL to load image
import copy
import io

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image

def to_PIL_direct(figure_dict_passed):
    figure_dict = copy.deepcopy(figure_dict_passed)
    figure_dict["info"] = "info"
    mpl.use('agg')  # non-interactive backend
    fig, ax = plt.subplots(1, 1)
    ax.scatter(np.arange(100_000), np.random.randn(100_000))
    with io.BytesIO() as buffer:
        fig.savefig(buffer, bbox_inches='tight')
        buffer.seek(0)
        img = PIL.Image.open(buffer)
        img.load()  # <-- force PIL to load image before the buffer closes
    figure_dict["figure_img_direct"] = img
    plt.close(fig)  # release the figure once it has been rendered
    return figure_dict
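To see the lazy loading in isolation, here is a minimal sketch that uses only Pillow. The helpers make_png_bytes, open_lazy and open_loaded are hypothetical names introduced for illustration, and the small generated PNG stands in for the rendered matplotlib figure. The question reports a ValueError; OSError is also caught here only as a precaution, in case a different Pillow version reports the closed buffer differently.

```python
import io

from PIL import Image


def make_png_bytes():
    """Hypothetical stand-in for fig.savefig(...): returns a small PNG as bytes."""
    out = io.BytesIO()
    Image.new("RGB", (4, 4), color=(255, 0, 0)).save(out, format="PNG")
    return out.getvalue()


def open_lazy(png_bytes):
    # Image.open() only reads the header; the buffer is closed before
    # any pixel data has been read.
    with io.BytesIO(png_bytes) as buffer:
        img = Image.open(buffer)
    return img


def open_loaded(png_bytes):
    # load() reads the pixel data while the buffer is still open.
    with io.BytesIO(png_bytes) as buffer:
        img = Image.open(buffer)
        img.load()
    return img


png = make_png_bytes()

lazy = open_lazy(png)
try:
    lazy.load()  # first real use of the image, after the buffer is closed
    lazy_ok = True
except (ValueError, OSError):
    lazy_ok = False

loaded = open_loaded(png)
pixel = loaded.getpixel((0, 0))  # works: the data is already in memory
```

Note that lazy.size is still available in the failing case, because the dimensions come from the header that Image.open() did read; only the pixel data is missing.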
For other readers: a useful page in the Pillow documentation (found by @vboettcher): File Handling in Pillow