Fix wrong conditioning used #3595
This may just be a case of me copying an incorrect comment from the diffusers guess mode code or the sd-webui-controlnet control mode code. We should probably check the appropriate places in those two code bases to see whether it's just a comment issue or whether my implementation is actually wrong relative to them.
That would be a really strange technique then: using the unconditioning (negative) embeddings to generate the noise that is then used as the conditioning (positive) noise %_%
lstein
left a comment
LGTM
I'm going ahead and approving this, but I'll wait for Gregg to say whether it should be merged.
Yes this looks good! Thanks for finding this bug.
And apologies it took so long to get to approving this fix!
I was on the road the last few weeks and away from my dev box. So although this fix looked right, my main debugging tool was staring at (many) previously generated ControlNet images that used cfg_injection ("more control" and "mega control" modes). And they all seemed to behave as expected -- paying more attention to the ControlNet than the prompt. Now that I'm back and testing this PR, I can see differences between the results with and without this fix, but in many cases they are much more subtle than I would have thought. Canny especially seems little changed:

Left is the Canny edge detection preprocessed image. The prompt for all images is "old man", with the canny ControlNet model. Right top (moving left to right) is controlnet mode = "balanced", "more prompt", "more control", with this PR's cfg_injection fix.
Right bottom is controlnet mode = "balanced", "more prompt", "more control", without this PR's cfg_injection fix.
The only one where this PR should apply is "more control"; the rest should be the same with or without the PR fix.
There is some difference for "more control" (easiest to see in the gap between the neck and the hat/scarf), but it is very subtle for something I would have expected to have a large effect.
But here's something less subtle -- same as previous image but with Midas depth estimator and ControlNet depth model:
Difference between PR and non-PR "more control" for this one is pretty obvious.

As the comment in this branch says, we want to use the conditioning run here, but the code uses the unconditioning embeddings (conditioning_data.unconditioned_embeddings). Later code confirms that the conditioning generation is what we want to run, both by its comment and by the tensor concatenation order (all the code expects to get a [uc, c] tensor).
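To make the intent concrete, here is a minimal sketch of the behaviour described above. It is not the project's actual code: it uses NumPy arrays in place of torch tensors, and the function and parameter names (select_controlnet_embeddings, pad_residual_to_uc_c, uncond_emb, cond_emb) are hypothetical. It assumes that with cfg_injection enabled the ControlNet runs on the conditioned (positive) embeddings only, and that its output is then padded with zeros so the rest of the code still receives a [uc, c] batch.

```python
import numpy as np


def select_controlnet_embeddings(uncond_emb, cond_emb, cfg_injection):
    """Pick the text embeddings fed to the ControlNet (hypothetical helper).

    With cfg_injection the ControlNet should see only the conditioned
    (positive) embeddings; passing uncond_emb here is the bug discussed above.
    Without cfg_injection it sees the usual [uc, c] batch.
    """
    if cfg_injection:
        return cond_emb
    return np.concatenate([uncond_emb, cond_emb], axis=0)


def pad_residual_to_uc_c(cond_residual):
    """Expand a conditioned-only residual to a [uc, c] batch.

    The unconditioned half is zero, so adding the residual leaves the
    unconditioned prediction unchanged.
    """
    return np.concatenate([np.zeros_like(cond_residual), cond_residual], axis=0)


if __name__ == "__main__":
    uc = np.zeros((1, 77, 8))  # stand-in for negative-prompt embeddings
    c = np.ones((1, 77, 8))    # stand-in for positive-prompt embeddings
    print(select_controlnet_embeddings(uc, c, cfg_injection=True).shape)
    print(pad_residual_to_uc_c(np.ones((1, 4, 8, 8))).shape)
```

The ordering matters: if the zero block and the residual were swapped, the control signal would be applied to the unconditioned half instead, which is the same kind of mix-up this PR fixes on the embeddings side.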