python, python-2.7, python-pptx

Extracting images from presentation file


I am working with the python-pptx package. For my code, I need to extract all the images contained in a presentation file. Can anybody help me with this?

Thanks in advance for help.

My code looks like this:

import pptx

prs = pptx.Presentation(filename)

for slide in prs.slides:
    for shape in slide.shapes:
        print(shape.shape_type)

Printing shape_type shows PICTURE (13) for the pictures in the presentation, but I want the pictures themselves extracted into the folder where the script is located.


Solution

  • A Picture (shape) object in python-pptx provides access to the image it displays:

    from pptx import Presentation
    from pptx.enum.shapes import MSO_SHAPE_TYPE
    
    def iter_picture_shapes(prs):
        for slide in prs.slides:
            for shape in slide.shapes:
                if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                    yield shape
    
    for picture in iter_picture_shapes(Presentation(filename)):
        image = picture.image
        # ---get image "file" contents---
        image_bytes = image.blob
        # ---make up a name for the file, e.g. 'image.jpg'---
        image_filename = 'image.%s' % image.ext
        with open(image_filename, 'wb') as f:
            f.write(image_bytes)
    

    Generating a unique file name is left to you as an exercise. All the other bits you need are here.

    More details on the Image object are available in the documentation here:
    https://python-pptx.readthedocs.io/en/latest/api/image.html#image-objects
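
One possible way to handle the unique-file-name step left as an exercise above is to number the pictures per slide. The following is only a minimal sketch under that assumption: `unique_image_name`, `extract_images`, and the `out_dir` parameter are hypothetical names that do not come from the answer, and the `slideNN_imageNN` pattern is just one naming scheme among many. Like the answer's code, it only visits top-level picture shapes on each slide.

```python
import os


def unique_image_name(slide_no, picture_no, ext):
    """Build a name such as 'slide03_image02.png' (hypothetical naming scheme)."""
    return 'slide%02d_image%02d.%s' % (slide_no, picture_no, ext)


def extract_images(pptx_path, out_dir='.'):
    """Write every top-level picture in the presentation to out_dir."""
    # Imported here so the naming helper above can be used without python-pptx.
    from pptx import Presentation
    from pptx.enum.shapes import MSO_SHAPE_TYPE

    saved = []
    prs = Presentation(pptx_path)
    for slide_no, slide in enumerate(prs.slides, start=1):
        picture_no = 0
        for shape in slide.shapes:
            if shape.shape_type != MSO_SHAPE_TYPE.PICTURE:
                continue
            picture_no += 1
            image = shape.image
            # image.ext is the file extension, image.blob the raw image bytes
            name = unique_image_name(slide_no, picture_no, image.ext)
            path = os.path.join(out_dir, name)
            with open(path, 'wb') as f:
                f.write(image.blob)
            saved.append(path)
    return saved
```

Calling `extract_images(filename)` with the default `out_dir='.'` writes the files to the current working directory, which matches the question's requirement when the script is run from its own folder.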