Description
Describe the bug
The CLIP-guided pipeline uses these text embeddings for guidance:
diffusers/examples/community/clip_guided_stable_diffusion.py
Lines 285 to 288 in 2345481
    # perform clip guidance
    if clip_guidance_scale > 0:
        text_embeddings_for_guidance = (
            text_embeddings.chunk(2)[0] if do_classifier_free_guidance else text_embeddings
where `text_embeddings` is defined earlier as:
diffusers/examples/community/clip_guided_stable_diffusion.py
Lines 234 to 237 in 2345481
    # For classifier free guidance, we need to do two forward passes.
    # Here we concatenate the unconditional and text embeddings into a single batch
    # to avoid doing two forward passes
    text_embeddings = torch.cat([uncond_embeddings, text_embeddings])
which I read as using the unconditional (i.e. null-prompt) embeddings: `chunk(2)[0]` returns the first half of the concatenated batch, and `uncond_embeddings` comes first in the `torch.cat` call.
Is that how it is supposed to work? It doesn't seem right to me. When classifier-free guidance is turned off, the same expression uses the embeddings from the text prompt, not the null ones.
But this pipeline was added without any tests, samples, or other reference material, so I really don't know.
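To check that reading in isolation, here is a minimal sketch that mirrors only the concatenation order and the `chunk(2)[0]` call from the snippets above. The tensor shapes and the zero/one fill values are made up for illustration (they are not real CLIP text-encoder outputs), and `prompt_embeddings` is a hypothetical name for the prompt-conditioned tensor before it is overwritten by the `torch.cat` result.

```python
import torch

# Hypothetical stand-ins for the text-encoder outputs (shapes are arbitrary):
# zeros mark the unconditional (empty-prompt) embeddings, ones the prompt embeddings.
uncond_embeddings = torch.zeros(1, 4, 8)
prompt_embeddings = torch.ones(1, 4, 8)

# Same order as in the pipeline: unconditional first, prompt second.
text_embeddings = torch.cat([uncond_embeddings, prompt_embeddings])

# chunk(2) splits along dim 0 into two equal halves, in order.
first_half, second_half = text_embeddings.chunk(2)

first_is_uncond = torch.equal(first_half, uncond_embeddings)
second_is_prompt = torch.equal(second_half, prompt_embeddings)
print(first_is_uncond, second_is_prompt)
```

If both checks hold, `chunk(2)[0]` is selecting the unconditional half and `chunk(2)[1]` would be the prompt-conditioned half. Whether the pipeline intends the former is still the open question.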
Reproduction
No response
Logs
No response
System Info
👀