Skip to content

Conversation

@StAlKeR7779
Copy link
Contributor

@StAlKeR7779 StAlKeR7779 commented Jun 26, 2023

As it said in comment to this branch we want to use conditioning run:

if cfg_injection:  # only applying ControlNet to conditional instead of in unconditioned

But in code used unconditioning embeddings(conditioning_data.unconditioned_embeddings).

Later in code confirms that we want to run conditioning generation by comment and tensor concatenation order(as all code expect to get [uc, c] tensor):

if cfg_injection:
    # Inferred ControlNet only for the conditional batch.
    # To apply the output of ControlNet to both the unconditional and conditional batches,
    #   add 0 to the unconditional batch to keep it unchanged.
    down_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_samples]
    mid_sample = torch.cat([torch.zeros_like(mid_sample), mid_sample])

@GreggHelt2
Copy link
Contributor

GreggHelt2 commented Jun 27, 2023

This may just be a case of me copying an incorrect comment in diffusers guess mode code or sd-webui-controlnet control mode code. Should probably check the appropriate places in those two code bases to see if its just comment issue or if my implementation is actually wrong relative to those.
Most important thing here I think is to match up implementation options with stated intent in UI:

  • "Prompt" mode should favor the prompt
  • "Control" mode should favor the control
  • "Balanced" should be balance of prompt and control
  • "Mega Control" should be way in favor of control (or if code changes to way favor prompt, should be renamed "Mega Prompt" -- and matching up with diffusers or sd-webui-control is irrelevant for this one)

@StAlKeR7779
Copy link
Contributor Author

StAlKeR7779 commented Jun 27, 2023

It should be really strange technique then - use unconditioning(negative) embeddings to generate noise which used as conditioning(positive) noise %_%

Copy link
Collaborator

@lstein lstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lstein lstein enabled auto-merge June 28, 2023 19:44
@lstein lstein disabled auto-merge June 28, 2023 19:45
@lstein
Copy link
Collaborator

lstein commented Jun 28, 2023

I'm going ahead and approving this, but I'll wait for Gregg to say whether it should be merged.

@lstein lstein enabled auto-merge July 1, 2023 18:36
@lstein lstein disabled auto-merge July 1, 2023 18:36
@lstein lstein enabled auto-merge July 3, 2023 16:37
Copy link
Contributor

@GreggHelt2 GreggHelt2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this looks good! Thanks for finding this bug.
And apologies it took so long to get to approving this fix!
I was on the road the last few weeks and away from my dev box. So although this fix looked right, my main debugging tool was staring at (many) previously generated ControlNet images that used cfg_injection ("more control" and "mega control" modes). And they all seemed to behave as expected -- paying more attention to the ControlNet than the prompt. Now that I'm back and testing this PR, I can see differences between the results with and without this fix. But in many cases much more subtle than I would have thought. Canny especially seems little changed:

canny_seed0_pr3595_top
Left is Canny edge detection preprocessed image. Prompt for all images is "old man", with canny ControlNet model. Right top (moving left to right) is controlnet mode = "balanced", "more prompt", "more control", with this PR's cfg_injection fix.
Right bottom is controlnet mode = "balanced", "more_prompt", "more control", without this PR's injection fix.
Only one where this PR should apply is for "more control", rest should be the same for with or without PR fix.
There is some difference for "more_control" (easiest to see is in gap between neck and hat/scarf), but very subtle for what I would have expected to have a large effect.

But here's something less subtle -- same as previous image but with Midas depth estimator and ControlNet depth model:

depth_seed0_pr3595_top

Difference between PR and non-PR "more control" for this one is pretty obvious.

@lstein lstein merged commit 8fc0ce7 into main Jul 13, 2023
@lstein lstein deleted the fix/controlnet_cfg_inj_cond branch July 13, 2023 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants