
Conversation

@GreggHelt2 (Contributor)

This PR adds support for ControlNet (and multiple ControlNets) within the nodes backend and nodes UI.
It adds:

  • image processing nodes for each of the ControlNet v1.1 annotators
  • a ControlNet node (see the sketch after this list) with options to specify:
    • ControlNet model
    • preprocessed image
    • weight: how much influence the ControlNet has on the generated image
    • start and end of the range of diffusion steps over which the ControlNet is applied (specified as fractions of total steps)
  • a "control" input port on the TextToLatents node

Usage:

Single ControlNet using a preprocessed image:

[Screenshot from 2023-05-12 10-17-42]

Multiple ControlNets using image preprocessors:

[Screenshot from 2023-05-12 10-28-18]

One limitation of the current implementation is that there must be a Collect node between the ControlNet node(s) control output and the TextToLatents control input. Directly connecting a ControlNet node to a TextToLatents node will result in an error. This is because I haven't figured out how to set up a polymorphic input port that can take either a single ControlField item or a list of ControlField items. I'm pretty sure it can be done, but everything I've tried so far results in errors. I will reach out on Discord for help on this, but I don't think it's a big enough issue to block the PR.

@psychedelicious (Contributor)

It works!

I've just done some playing around and found one issue - the SD 2.1 768x768 model causes an error:

```
Traceback (most recent call last):
  File "/home/bat/Documents/Code/InvokeAI/invokeai/app/services/processor.py", line 70, in __process
    outputs = invocation.invoke(
  File "/home/bat/Documents/Code/InvokeAI/invokeai/app/invocations/latent.py", line 326, in invoke
    result_latents, result_attention_map_saver = model.latents_from_embeddings(
  File "/home/bat/Documents/Code/InvokeAI/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 545, in latents_from_embeddings
    result: PipelineIntermediateState = infer_latents_from_embeddings(
  File "/home/bat/Documents/Code/InvokeAI/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 212, in __call__
    for result in self.generator_method(*args, **kwargs):
  File "/home/bat/Documents/Code/InvokeAI/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 600, in generate_latents_from_embeddings
    step_output = self.step(
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/bat/Documents/Code/InvokeAI/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 682, in step
    down_samples, mid_sample = control_datum.model(
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/controlnet.py", line 526, in forward
    sample, res_samples = downsample_block(
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 867, in forward
    hidden_states = attn(
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/transformer_2d.py", line 265, in forward
    hidden_states = block(
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/attention.py", line 331, in forward
    attn_output = self.attn2(
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 267, in forward
    return self.processor(
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 733, in __call__
    key = attn.to_k(encoder_hidden_states)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (154x1024 and 768x320)
```

Tried with Canny and OpenPose models.

@GreggHelt2 (Contributor, Author)

@psychedelicious

> I've just done some playing around and found one issue - the SD 2.1 768x768 model causes an error:

Looking at your node graph, what should be happening is that the OpenPose preprocessor node resizes the image to 512x512 pixels before it is fed to the OpenPose algorithm, and then redundantly resizes the resulting image again to 512x512 pixels before it is sent to the ControlNet node, which doesn't do any resizing. However, the OpenPose node is also sending width=512 and height=512 params to the Noise node. So possibly the 768x768 SD2.1 model doesn't like the 64x64 latent being passed in from Noise? Or it could be something in the ControlNet code in latent.py -- I'll do some testing...

@hipsterusername (Member) commented May 16, 2023

I think it is likely because he's not using 2.1 ControlNet models, but is using SD 2.1.
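For context: SD 1.x models produce 768-dimensional text embeddings while SD 2.x produces 1024-dimensional ones, which matches the (154x1024 and 768x320) shape mismatch in the traceback above. A defensive check along these lines could surface the mismatch early; this is just a sketch (not something this PR adds) and it assumes diffusers-style model configs:

```python
def check_controlnet_compatibility(unet, controlnet) -> None:
    """Raise a clear error when a ControlNet expects a different
    text-embedding width than the base model's UNet produces
    (e.g. an SD 1.5 ControlNet paired with an SD 2.1 base model)."""
    unet_dim = unet.config.cross_attention_dim          # 768 for SD 1.x, 1024 for SD 2.x
    control_dim = controlnet.config.cross_attention_dim
    if unet_dim != control_dim:
        raise ValueError(
            f"ControlNet expects {control_dim}-dim text embeddings, but the base "
            f"model produces {unet_dim}-dim embeddings; use a ControlNet trained "
            f"for this base model."
        )
```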

@psychedelicious (Contributor)

@GreggHelt2 I'll have to get back to you on whether this breaks when I use the correct 2.1 ControlNet models.

However, this is unexpected: why is the ControlNet preprocessor resizing my image? The control image is not 512x512, and neither are the results.

Also, what are the two integer parameters on the preprocessor used for? If we are expressing some image size or resolution, wouldn't we want to include both width and height?

@GreggHelt2 (Contributor, Author)

> @GreggHelt2 I'll have to get back to you on whether this breaks when I use the correct 2.1 ControlNet models.
>
> However, this is unexpected: why is the ControlNet preprocessor resizing my image? The control image is not 512x512, and neither are the results.
>
> Also, what are the two integer parameters on the preprocessor used for? If we are expressing some image size or resolution, wouldn't we want to include both width and height?

The image preprocessors are definitely a mixed bag. Some have detect_resolution and image_resolution parameters that I've exposed. The idea is that, depending on the image, for some processors it makes sense to internally resize up/down before doing the core image analysis, then resize back down/up for output. The convention with these preprocessors is to assume uniform height/width scaling: the single resolution specified is applied to min(height, width), and the size of the other dimension is calculated by uniform scaling (see the sketch below).
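To illustrate the convention, here's a minimal sketch of the uniform-scaling rule these annotators follow (controlnet_aux's own resizing also snaps both sides to multiples of 64, approximated here; the function name is mine):

```python
import cv2  # opencv-python
import numpy as np


def resize_to_resolution(image: np.ndarray, resolution: int) -> np.ndarray:
    """Uniformly scale `image` so that min(height, width) ~= `resolution`,
    rounding both sides to multiples of 64 as the annotators expect."""
    h, w = image.shape[:2]
    k = resolution / min(h, w)                  # uniform scale factor
    new_h = int(np.round(h * k / 64.0)) * 64    # snap to multiple of 64
    new_w = int(np.round(w * k / 64.0)) * 64
    interpolation = cv2.INTER_LANCZOS4 if k > 1 else cv2.INTER_AREA
    return cv2.resize(image, (new_w, new_h), interpolation=interpolation)
```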

See Mikubill/sd-webui-controlnet#924 for discussion of the resizing complexity that led to a "pixel_perfect" auto-resizing option being added to the auto1111 ControlNet extension. I have not yet implemented that option, but it is on my TODO list (probably in another PR after this one gets merged).

@hipsterusername (Member) left a review:

Tested - Aside from minor nits I've called out previously (follow-on fixes) this thing is ready for main!

@hipsterusername (Member)

@GreggHelt2 - Seeing a number of conflicts in here - May be due to other PRs being merged in?

@psychedelicious (Contributor)

Just realized, this is going to need some care while rebasing due to the images refactor PR. I'm happy to have a go at that tomorrow since I'm familiar with the changes, and the resolution may not be trivial.

@blessedcoolant (Collaborator)

> Tested - Aside from minor nits I've called out previously (follow-on fixes) this thing is ready for main!

+1 .. I ran it quite a bit today. The ControlNet part itself is pretty solid. There's some UI/UX stuff that might need refining, but that can happen in a future PR. Also, the Model Management PR needs to be merged in so it can handle the ControlNet models from disk rather than loading from HF.

@blessedcoolant (Collaborator)

Whoever merges this, maybe do a squash merge, because there's a large number of commits and tracing changes back through them might get harder. Unless @GreggHelt2 wants to retain the commit history.

@GreggHelt2 (Contributor, Author)

Update: recent modifications to the PR include pinning to the newly released controlnet_aux v0.0.4, reinstating the Zoe depth preprocessor node, and adding a MediaPipe face preprocessor node. Also, thanks to a great session with @psychedelicious, we got polymorphic input ports on nodes working. So now the control input port on TextToLatents can take either a single ControlField input or a list of ControlFields, and a single ControlNet node can connect directly to TextToLatents without going through a Collect node, like:
[Screenshot from 2023-05-23 18-17-47]
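Under the hood, the polymorphic port boils down to a Union-typed pydantic field on the invocation. A rough sketch, using stand-in classes for InvokeAI's actual ones (the exact field name and description in the merged code may differ):

```python
from typing import Optional, Union

from pydantic import BaseModel, Field


# Stand-ins: BaseInvocation lives in InvokeAI's nodes backend, and
# ControlField is sketched earlier in this thread.
class ControlField(BaseModel):
    ...


class BaseInvocation(BaseModel):
    ...


class TextToLatentsInvocation(BaseInvocation):
    # Accepts either one ControlField or a list of them, so a single
    # ControlNet node can connect directly without a Collect node.
    control: Optional[Union[ControlField, list[ControlField]]] = Field(
        default=None, description="The control(s) to apply while denoising"
    )
```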

@GreggHelt2 (Contributor, Author) commented May 24, 2023

If you've manually installed controlnet_aux v0.0.4 to test this PR, you may want to check which version of the timm package is installed. It needs to be <= 0.6.13 in order for the Zoe processors to work (see issue isl-org/ZoeDepth#26). Currently in the InvokeAI pyproject.toml I'm pinning the timm version to 0.6.13, so I think a pip install -e . on this branch will fix it too.
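A quick way to check the installed version before running the Zoe nodes (just a small sketch using the packaging library; not part of the PR):

```python
from packaging import version

import timm

if version.parse(timm.__version__) > version.parse("0.6.13"):
    # Zoe depth preprocessors are known to break on newer timm releases
    # (see isl-org/ZoeDepth#26); pin with: pip install "timm==0.6.13"
    print(f"timm {timm.__version__} is too new for the Zoe processors")
```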

@GreggHelt2 (Contributor, Author) commented May 24, 2023

> @GreggHelt2 - Seeing a number of conflicts in here - May be due to other PRs being merged in?

I'm not worried about the conflicts. Most of the currently reported conflicts are in autogenerated code, I think from doing a yarn api:web. So the only "real" conflict is in latent.py, which makes sense.

@GreggHelt2 (Contributor, Author)

> Whoever merges this, maybe do a squash merge, because there's a large number of commits and tracing changes back through them might get harder. Unless @GreggHelt2 wants to retain the commit history.

I do like retaining commit history; "git bisect" with more precise history has saved me more than once. Maybe see how messy a final rebase is before deciding? I've been rebasing pretty much every time another PR is merged to main, just to make sure this PR isn't straying too far away. Though I haven't rebased since the image refactoring PR (mentioned by @psychedelicious above) was merged yesterday/last night.

@hipsterusername (Member)

@GreggHelt2 - I think mediapipe may need to be added to pyproject.toml after your latest commits.

Were you going to address conflicts or were you waiting for @psychedelicious to do that?

@GreggHelt2 (Contributor, Author)

> @GreggHelt2 - I think mediapipe may need to be added to pyproject.toml after your latest commits.

Thanks for catching the issue with mediapipe. For now, as you suggested, I added the requirement to pyproject.toml. I'll put in a PR to the controlnet_aux repo to add mediapipe to its requirements instead, so eventually we should be able to remove it from pyproject.toml again.

> Were you going to address conflicts or were you waiting for @psychedelicious to do that?

I'll deal with those conflicts today.

@hipsterusername (Member)

Sounds great - pushing the big ol' merge button once they are! :)

GreggHelt2 and others added 26 commits May 26, 2023 14:26

  • …Txt2Img in backend/generator. Although backend/generator will likely disappear by v3.x, right now they are very useful for testing core ControlNet and MultiControlNet functionality while the node codebase is rapidly evolving.
  • MidasDepth, ZoeDepth, MLSD, NormalBae, Pidi, LineartAnime, ContentShuffle
  • Removed pil_output options; ControlNet preprocessors should always output as PIL. Removed diagnostics and other general cleanup.
  • … node, stripped controlnet stuff from image processing/analysis nodes.
  • …data struct. Also redid how multiple controlnets are handled.
  • each ControlNet, and which step to end using each controlnet (specified as fraction of total steps)
  • …gnostic printing. Also fixed error when there is no controlnet input.
  • …urned off pre-processor params that were added post v0.0.3. Also changed defaults for shuffle.
  • …extToLatents.invoke(), and make upcoming integration with LatentsToLatents easier.
  • Also hacked in ability to specify HF subfolder when loading ControlNet models from string.
  • …ntrolnet_aux package adds mediapipe to its requirements.

@GreggHelt2 force-pushed the feat/controlnet-nodes branch from 3a23956 to a4b0140 on May 26, 2023 at 23:55.
@GreggHelt2 (Contributor, Author)

> Sounds great - pushing the big ol' merge button once they are! :)

All working now and rebased to main!

@hipsterusername merged commit 9a79636 into main on May 27, 2023.
@hipsterusername deleted the feat/controlnet-nodes branch on May 27, 2023 at 01:44.