Overcoming the 77 token limit in diffusers #2136

@jslegers

Description of the problem

CLIP's text encoder has a 77-token limit, which is far too short for many prompts.

Several GUIs have found a way to overcome this limit, but not the diffusers library.

The solution I'd like

I would like diffusers to be able to run longer prompts and overcome the 77 token limit of CLIP for any model, much like the AUTOMATIC1111/stable-diffusion-webui already does.

Alternatives I've considered

  • I tried reverse-engineering the prompt interpretation logic from one of the other GUIs out there (not sure which one), but I couldn't find the code responsible.

  • I tried running BAAI/AltDiffusion in diffusers, which uses AltCLIP instead of CLIP. Since AltCLIP's text encoder has a max_position_embeddings value of 514 instead of 77, I had hoped I could just replace the text encoder and tokenizer of my models with those of BAAI/AltDiffusion to overcome the 77-token limit, but I couldn't get BAAI/AltDiffusion to work in diffusers.

Additional context

This is how AUTOMATIC1111 overcomes the token limit, according to its documentation:

Typing past standard 75 tokens that Stable Diffusion usually accepts increases prompt size limit from 75 to 150. Typing past that increases prompt size further. This is done by breaking the prompt into chunks of 75 tokens, processing each independently using CLIP's Transformers neural network, and then concatenating the result before feeding into the next component of stable diffusion, the Unet.

For example, a prompt with 120 tokens would be separated into two chunks: first with 75 tokens, second with 45. Both would be padded to 75 tokens and extended with start/end tokens to 77. After passing those two chunks through CLIP, we'll have two tensors with shape of (1, 77, 768). Concatenating those results in (1, 154, 768) tensor that is then passed to Unet without issue.
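To make the chunking described above concrete, here is a minimal sketch of the splitting and padding step only, in plain Python. It is not the AUTOMATIC1111 code and not an existing diffusers API. The function name `chunk_token_ids` is hypothetical. The start/end token ids (49406 and 49407) and the use of the end token as padding are assumptions about the CLIP tokenizer that Stable Diffusion v1 models use. The input is assumed to be a list of token ids that has already been tokenized without special tokens.

```python
from typing import List

# Assumed CLIP special-token ids (not taken from the issue text).
BOS_ID = 49406  # <|startoftext|>
EOS_ID = 49407  # <|endoftext|>, also used as padding in this sketch
CHUNK_SIZE = 75  # prompt tokens per chunk, before start/end tokens are added


def chunk_token_ids(
    token_ids: List[int],
    chunk_size: int = CHUNK_SIZE,
    bos_id: int = BOS_ID,
    eos_id: int = EOS_ID,
    pad_id: int = EOS_ID,
) -> List[List[int]]:
    """Split token ids into chunks of chunk_size + 2 ids each.

    Each chunk is: start token, up to chunk_size prompt tokens,
    end token, then padding up to the fixed length.
    """
    chunks = []
    # max(..., 1) ensures an empty prompt still yields one fully padded chunk.
    for start in range(0, max(len(token_ids), 1), chunk_size):
        body = token_ids[start:start + chunk_size]
        padding = [pad_id] * (chunk_size - len(body))
        chunks.append([bos_id] + body + [eos_id] + padding)
    return chunks


# A 120-token prompt (dummy ids) becomes two chunks of 77 ids each.
prompt_ids = list(range(1000, 1120))
chunks = chunk_token_ids(prompt_ids)
chunk_lengths = [len(chunk) for chunk in chunks]
total_sequence_length = sum(chunk_lengths)
```

Under the scheme quoted above, each 77-id chunk would then be passed through the text encoder on its own, and the resulting (1, 77, 768) tensors concatenated along the sequence dimension. For the 120-token example this gives a sequence length of 154, which matches `total_sequence_length` here.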

Labels: stale (issues that haven't received updates)
