Closed
Labels
bug: Something isn't working
Description
Describe the bug
I tried to use the ControlNet-XS pipeline with depth control, but it fails with the error shown under Logs. I cannot find any instructions in diffusers on how to use it with a depth map (only with a canny image). It would be great if the author could provide instructions for using ControlNet-XS with a depth map. @sayakpaul
My diffusers version: 2a111bc [origin/main] [Advanced Training Script] Fix pipe example (#6106)
Reproduction
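import numpy as np
import torch
from PIL import Image
from transformers import pipeline
from diffusers import AutoencoderKL, ControlNetXSModel, StableDiffusionXLControlNetXSPipeline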
prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
negative_prompt = "low quality, bad quality, sketches"
depth_estimator = pipeline('depth-estimation')
image = Image.open('images_stormtrooper.png')
depth_image = depth_estimator(image)['depth']
# convert the estimated depth map (not the original RGB image) to a 3-channel image
image = np.array(depth_image)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
depth_image = Image.fromarray(image)
depth_image.save('depth.png')
# initialize the models and pipeline
controlnet_conditioning_scale = 0.5 # recommended for good generalization
controlnet = ControlNetXSModel.from_pretrained("UmerHA/ConrolNetXS-SDXL-depth", torch_dtype=torch.float16)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetXSPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, vae=vae, torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()
image = pipe(
    prompt, controlnet_conditioning_scale=controlnet_conditioning_scale, image=depth_image
).images[0]
image.save('test.png')
Logs
File "/home/josha/reference/diffusers/src/diffusers/models/controlnetxs.py", line 741, in forward
h_base = h_base + next(it_up_convs_out)(hs_ctrl.pop()) * next(scales) # add info from ctrl encoder
RuntimeError: The size of tensor a (30) must match the size of tensor b (29) at non-singleton dimension 3
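The mismatch between 30 and 29 in the last dimension suggests, though nothing in this report confirms it, that the width of the input image does not divide evenly through the U-Net's downsampling stages. A minimal sketch of one possible workaround is to snap the depth map's size down to a multiple of some factor before calling the pipeline. The helper name `snap_size` and the default factor of 64 are assumptions for illustration and are not part of diffusers:

```python
def snap_size(width: int, height: int, multiple: int = 64) -> tuple[int, int]:
    """Round each dimension down to the nearest multiple (hypothetical helper).

    The default of 64 is an assumed safe factor, not a value taken from diffusers.
    """
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    snapped_width = (width // multiple) * multiple
    snapped_height = (height // multiple) * multiple
    if snapped_width == 0 or snapped_height == 0:
        raise ValueError("image is smaller than the requested multiple")
    return snapped_width, snapped_height


# Example: a 950x630 image would be snapped to 896x576.
print(snap_size(950, 630))
```

With PIL, the result could be applied as `depth_image = depth_image.resize(snap_size(*depth_image.size))` before passing `depth_image` to the pipeline, since both `Image.size` and `Image.resize` use (width, height) order.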
### System Info
- diffusers version: 0.25.0.dev0
- Platform: Linux-6.2.0-34-generic-x86_64-with-glibc2.35
- Python version: 3.10.9
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- Huggingface_hub version: 0.19.4
- Transformers version: 4.33.2
- Accelerate version: 0.21.0
- xFormers version: 0.0.21
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
### Who can help?
@sayakpaul Hi Sayak, thanks for your support on ControlNet-XS. It would be great if you could reply to this issue.