
Duplicate timesteps when using Karras sigmas can break Img2Img with low strength #5687

@nhnt11


Describe the bug

When using Karras sigmas with a high num_inference_steps (e.g. 100) and a low strength (e.g. 0.1), image-to-image breaks with an error like "a Tensor with 2 elements cannot be converted to Scalar".

The root cause appears to be duplicate timesteps: the final few Karras sigmas are so small that they map to the same discrete timestep values, producing a schedule that ends in something like [... 5 4 4 3 2 2 1 1 1 0 0].

This breaks logic in a few places that assumes timesteps are unique, for example when inferring the step index in add_noise.
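For illustration, here is a minimal sketch (not the actual diffusers source) of why that step-index lookup fails once the schedule contains duplicates: the equality match returns more than one index, and calling .item() on the result raises the error above.

import torch

# Tail of a schedule with duplicated timesteps, as produced with Karras sigmas
schedule_timesteps = torch.tensor([5, 4, 4, 3, 2, 2, 1, 1, 1, 0, 0])
timestep = torch.tensor(2)

# The usual "find the index of this timestep" pattern
index_candidates = (schedule_timesteps == timestep).nonzero()
print(index_candidates.flatten())     # tensor([4, 5]) -> two matches
step_index = index_candidates.item()  # RuntimeError: a Tensor with 2 elements cannot be converted to Scalar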

Reproduction

import torch
from typing import cast
from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLImg2ImgPipeline

sdxl_img2img_model = cast(StableDiffusionXLImg2ImgPipeline, StableDiffusionXLImg2ImgPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    revision="76d28af79639c28a79fa5c6c6468febd3490a37e",
)).to('cuda')

common_config = {'beta_start': 0.00085, 'beta_end': 0.012, 'beta_schedule': 'scaled_linear'}
scheduler = DPMSolverMultistepScheduler(**common_config, use_karras_sigmas=True)

init_img_url = 'https://storage.googleapis.com/pai-images/ca5dae9cef0541798c2d03f244365fde.jpeg'
from urllib.request import urlopen
from PIL import Image

init_img = Image.open(urlopen(init_img_url))
init_img = init_img.resize((1024, 1024))

generator = torch.Generator(device='cuda')
generator.manual_seed(12345)
img2img_params = {
    'prompt': ['a frog'],
    'negative_prompt': [''],
    "num_inference_steps": 100,
    "guidance_scale": 7,
    "image": init_img,
    "strength": 0.1
}

sdxl_img2img_model.scheduler = scheduler
sdxl_img2img_res = sdxl_img2img_model(**img2img_params, generator=generator, output_type='pil')

display(sdxl_img2img_res.images[0])  # notebook display; never reached because the pipeline call above errors out
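
A smaller check that skips the pipeline entirely (assuming the scheduler's public set_timesteps / timesteps API, which the pipeline uses internally): build the same scheduler and inspect the tail of its schedule for duplicates. The expected output below is an assumption based on the behavior described above.

import torch
from diffusers import DPMSolverMultistepScheduler

scheduler = DPMSolverMultistepScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule='scaled_linear',
    use_karras_sigmas=True,
)
scheduler.set_timesteps(num_inference_steps=100)

timesteps = scheduler.timesteps
print(timesteps[-12:])  # expected to end in repeated values, e.g. [... 2 2 1 1 1 0 0]
print("has duplicates:", len(timesteps) != len(torch.unique(timesteps)))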

Logs

No response

System Info

  • diffusers version: 0.22.1
  • Platform: Linux-5.4.0-166-generic-x86_64-with-glibc2.31
  • Python version: 3.11.5
  • PyTorch version (GPU?): 2.1.0+cu121 (True)
  • Huggingface_hub version: 0.17.1
  • Transformers version: 4.34.0
  • Accelerate version: 0.22.0
  • xFormers version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

@yiyixuxu @patrickvonplaten
