-
Notifications
You must be signed in to change notification settings - Fork 6.6k
Description
Describe the bug
I was trying the demo from https://huggingface.co/docs/diffusers/en/using-diffusers/controlnet#inpainting
Only change I made was that I have used a different scheduler, i.e. instead of UniPCMultistepScheduler I have used EulerAncestralDiscreteScheduler
And the results are horrible, here is the code:
from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel, EulerAncestralDiscreteScheduler
controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16, use_safetensors=True)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
"runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True
)
# Setup VAE
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16)
pipe.vae = vae
# Setup scheduler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()output = pipe(
"corgi face with large ears, detailed, pixar, animated, disney",
num_inference_steps=20,
eta=1.0,
image=init_image,
mask_image=mask_image,
control_image=control_image,
).images[0]
make_image_grid([init_image, mask_image, output], rows=1, cols=3)Everything else is same as the tutorial:
Here are the results:

Am I missing something?
Reproduction
from diffusers.utils import load_image, make_image_grid
init_image = load_image(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet-inpaint.jpg"
)
init_image = init_image.resize((512, 512))
mask_image = load_image(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet-inpaint-mask.jpg"
)
mask_image = mask_image.resize((512, 512))
make_image_grid([init_image, mask_image], rows=1, cols=2)
import numpy as np
import torch
def make_inpaint_condition(image, image_mask):
image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
image_mask = np.array(image_mask.convert("L")).astype(np.float32) / 255.0
assert image.shape[0:1] == image_mask.shape[0:1]
image[image_mask > 0.5] = -1.0 # set as masked pixel
image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
image = torch.from_numpy(image)
return image
control_image = make_inpaint_condition(init_image, mask_image)
from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel, EulerAncestralDiscreteScheduler
controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16, use_safetensors=True)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
"runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True
)
Setup VAE
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16)
pipe.vae = vae
# Setup scheduler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
output = pipe(
"corgi face with large ears, detailed, pixar, animated, disney",
num_inference_steps=20,
eta=1.0,
image=init_image,
mask_image=mask_image,
control_image=control_image,
).images[0]
make_image_grid([init_image, mask_image, output], rows=1, cols=3)
Logs
No response
System Info
RTX 3090 24GB, Runpod instance running ubuntu
Who can help?
Diffusers version?
v0.27.0