Closed
Labels
bug, scheduler, stale
Description
Describe the bug
When using the LMS scheduler (LMSDiscreteScheduler) with the SDXL Img2Img pipeline, a lot of noise is left over in the image, especially when strength is close to 0. In other words, when the total number of performed steps is low (e.g. num_inference_steps=50 and strength=0.1), the resulting images are unusably noisy.
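To make "number of performed steps" concrete, here is a minimal sketch of how the img2img pipeline appears to turn num_inference_steps and strength into the number of denoising steps it actually runs. The helper name img2img_steps_run is hypothetical, and the arithmetic is my reading of the pipeline's timestep selection rather than a call into diffusers, so treat it as an assumption.

```python
def img2img_steps_run(num_inference_steps: int, strength: float) -> int:
    """Estimate how many denoising steps an img2img call performs.

    Assumption: the pipeline keeps only the last
    int(num_inference_steps * strength) entries of the scheduler's
    timestep list and skips the rest.
    """
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start


if __name__ == "__main__":
    # Compare the low-strength case from this report with larger strengths.
    for strength in (0.1, 0.5, 1.0):
        print(strength, img2img_steps_run(50, strength))
```

Under that assumption, the settings in this report leave only a handful of scheduler steps to remove the noise that was added to the input image.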
Reproduction
Here's some code that first does a prompt-to-image generation and then an image-to-image pass on that result with strength=0.1. The image-to-image result looks like an intermediate latent, while the prompt-to-image result looks completely fine. This is reproducible with any input image; I just used a prompt-to-image generation because it was easier to share here.
from typing import cast

import torch
from diffusers import (
    LMSDiscreteScheduler,
    StableDiffusionXLImg2ImgPipeline,
    StableDiffusionXLPipeline,
)
from IPython.display import display  # display() is only predefined inside notebooks

# Load the text-to-image and image-to-image pipelines from the same pinned revision.
sdxl_model = cast(StableDiffusionXLPipeline, StableDiffusionXLPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    revision="76d28af79639c28a79fa5c6c6468febd3490a37e",
)).to('cuda')
sdxl_img2img_model = cast(StableDiffusionXLImg2ImgPipeline, StableDiffusionXLImg2ImgPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    revision="76d28af79639c28a79fa5c6c6468febd3490a37e",
)).to('cuda')

# Use one LMS scheduler instance for both pipelines.
common_config = {'beta_start': 0.00085, 'beta_end': 0.012, 'beta_schedule': 'scaled_linear'}
scheduler = LMSDiscreteScheduler(**common_config)
sdxl_model.scheduler = scheduler
sdxl_img2img_model.scheduler = scheduler
sdxl_model.watermark = None

generator = torch.Generator(device='cuda')
generator.manual_seed(12345)

# Step 1: prompt-to-image. This result looks fine.
params = {
    'prompt': ['evening sunset scenery blue sky nature, glass bottle with a galaxy in it'],
    # The original dict listed 'negative_prompt' twice ('text, watermark', then '');
    # Python keeps the last value, so the empty prompt is what actually ran.
    'negative_prompt': [''],
    'num_inference_steps': 50,
    'guidance_scale': 7,
    'width': 1024,
    'height': 1024,
}
sdxl_res = sdxl_model(**params, generator=generator, output_type='pil')
sdxl_img = sdxl_res.images[0]
display(sdxl_img)

# Step 2: image-to-image on that result with a low strength. This result is noisy.
img2img_params = {
    'prompt': ['evening sunset scenery blue sky nature, glass bottle with a galaxy in it'],
    'negative_prompt': [''],  # same duplicate-key note as above
    'num_inference_steps': 50,
    'guidance_scale': 7,
    'image': sdxl_img,
    'strength': 0.1,
}
sdxl_img2img_res = sdxl_img2img_model(**img2img_params, generator=generator, output_type='pil')
display(sdxl_img2img_res.images[0])
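For context on how much noise the image-to-image pass starts from, the sketch below estimates the noise level (sigma) at the first timestep used when strength=0.1. It is pure Python and does not call diffusers. The helper names are hypothetical. The sigma formula, the evenly spaced timesteps, and the assumption of 1000 training timesteps reflect my understanding of how the scheduler is configured, not something confirmed in this report.

```python
import math


def scaled_linear_sigmas(beta_start=0.00085, beta_end=0.012, n_train=1000):
    """Sigma per training timestep for a 'scaled_linear' beta schedule.

    Assumption: betas are linear in sqrt-space and
    sigma_t = sqrt((1 - alpha_cumprod_t) / alpha_cumprod_t).
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    sigmas, alpha_cumprod = [], 1.0
    for i in range(n_train):
        beta = (lo + (hi - lo) * i / (n_train - 1)) ** 2
        alpha_cumprod *= 1.0 - beta
        sigmas.append(math.sqrt((1.0 - alpha_cumprod) / alpha_cumprod))
    return sigmas


def start_sigma(num_inference_steps, strength, sigmas):
    """Sigma at the first timestep an img2img run would use.

    Assumption: timesteps are evenly spaced over the training range and
    only the last int(num_inference_steps * strength) of them are run.
    """
    n_train = len(sigmas)
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    if steps_run == 0:
        return 0.0
    if num_inference_steps == 1:
        t = 0.0
    else:
        t = (steps_run - 1) * (n_train - 1) / (num_inference_steps - 1)
    # Linear interpolation between neighbouring training timesteps.
    lo_i = int(math.floor(t))
    hi_i = min(lo_i + 1, n_train - 1)
    frac = t - lo_i
    return sigmas[lo_i] * (1.0 - frac) + sigmas[hi_i] * frac


if __name__ == "__main__":
    sigmas = scaled_linear_sigmas()
    print("sigma at full strength:", start_sigma(50, 1.0, sigmas))
    print("sigma at strength=0.1:", start_sigma(50, 0.1, sigmas))
```

If these assumptions hold, the low-strength run should begin from a small sigma compared with the full-strength one, so a heavily noisy output suggests the added noise and the steps that remove it are not matched.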
Logs
No response
System Info
- diffusers version: 0.21.4
- Platform: Linux-5.4.0-163-generic-x86_64-with-glibc2.31
- Python version: 3.11.5
- PyTorch version (GPU?): 2.1.0+cu121 (True)
- Huggingface_hub version: 0.17.1
- Transformers version: 4.34.0
- Accelerate version: 0.22.0
- xFormers version: not installed
- Using GPU in script?: yes
- Using distributed or parallel set-up in script?: no
Who can help?