Tags: python, pytorch, apple-m1, huggingface-transformers, stable-diffusion

Why does StableDiffusionPipeline return black images when generating multiple images at once?


I am using the StableDiffusionPipeline from the Hugging Face Diffusers library in Python 3.10.2, on an M2 Mac (I tagged it because this might be the issue). When I try to generate 1 image from 1 prompt, the output looks fine, but when I try to generate multiple images using the same prompt, the images are all either black squares or a random image (see example below). What could be the issue?

My code is as follows (where I change n_imgs from 1 to more than 1 to break it):

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("mps")  # for M1/M2 chips
pipe.enable_attention_slicing()

prompt = "a photo of an astronaut driving a car on mars"

# First-time "warmup" pass (because of weird M1 behaviour)
_ = pipe(prompt, num_inference_steps=1)

# generate images
n_imgs = 1
imgs = pipe([prompt] * n_imgs).images

I also tried setting num_images_per_prompt instead of creating a list of repeated prompts in the pipeline call, but this gave the same bad results.

Example output (for multiple images):

[Image: white-noise output generated by the model]

[edit/update]: When I generate the images in a loop surrounding the pipe call instead of passing an iterable to the pipe call, it does work:

# generate images
n_imgs = 3
for i in range(n_imgs):
    img = pipe(prompt).images[0]
    # do something with img

But it is still a mystery to me as to why.


Solution

  • This does appear to be an Apple Silicon (M1/M2) issue. Hugging Face has not yet determined why it happens; see this GitHub issue for more details.
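
Until the underlying issue is resolved, the loop workaround from the question can be wrapped in a small helper. This is only a sketch: the name generate_sequentially is hypothetical and not part of Diffusers, and it assumes that pipe is an already-loaded pipeline whose call returns an object with an .images list, as in the code above.

```python
def generate_sequentially(pipe, prompt, n_imgs, **pipe_kwargs):
    """Generate n_imgs images by calling the pipeline once per image.

    Hypothetical helper (not a Diffusers API). It avoids passing a batch
    of prompts, which is the case that produced black images on MPS.
    """
    images = []
    for _ in range(n_imgs):
        # One prompt per call, so the pipeline always runs with batch size 1
        result = pipe(prompt, **pipe_kwargs)
        images.append(result.images[0])
    return images


# Usage, assuming `pipe` and `prompt` are set up as in the question:
# imgs = generate_sequentially(pipe, prompt, 3)
```

This trades batching for correctness, so it will likely be slower than a working batched call would be, but each call stays on the single-image path that was reported to work.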