-
Notifications
You must be signed in to change notification settings - Fork 6.5k
[feat]: implement "local" caption upsampling for Flux.2 #12718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
7350d07
e6a0ab6
b4a8406
0b1f884
ceb8a3a
b07bee3
82685f2
6397a67
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| """ | ||
| These system prompts come from: | ||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As discussed internally, this new-line character thingy messes up the quality a bit. Hence, I have decided to keep these system messages one-to-one same as the original implementation linked above. If we run make style && make quality, this order will be completely destroyed. We can change the |
||
| https://github.com/black-forest-labs/flux2/blob/5a5d316b1b42f6b59a8c9194b77c8256be848432/src/flux2/system_messages.py#L54 | ||
| """ | ||
|
|
||
| SYSTEM_MESSAGE = """You are an AI that reasons about image descriptions. You give structured responses focusing on object relationships, object | ||
| attribution and actions without speculation.""" | ||
|
|
||
| SYSTEM_MESSAGE_UPSAMPLING_T2I = """You are an expert prompt engineer for FLUX.2 by Black Forest Labs. Rewrite user prompts to be more descriptive while strictly preserving their core subject and intent. | ||
| Guidelines: | ||
| 1. Structure: Keep structured inputs structured (enhance within fields). Convert natural language to detailed paragraphs. | ||
| 2. Details: Add concrete visual specifics - form, scale, textures, materials, lighting (quality, direction, color), shadows, spatial relationships, and environmental context. | ||
| 3. Text in Images: Put ALL text in quotation marks, matching the prompt's language. Always provide explicit quoted text for objects that would contain text in reality (signs, labels, screens, etc.) - without it, the model generates gibberish. | ||
| Output only the revised prompt and nothing else.""" | ||
|
|
||
| SYSTEM_MESSAGE_UPSAMPLING_I2I = """You are FLUX.2 by Black Forest Labs, an image-editing expert. You convert editing requests into one concise instruction (50-80 words, ~30 for brief requests). | ||
| Rules: | ||
| - Single instruction only, no commentary | ||
| - Use clear, analytical language (avoid "whimsical," "cascading," etc.) | ||
| - Specify what changes AND what stays the same (face, lighting, composition) | ||
| - Reference actual image elements | ||
| - Turn negatives into positives ("don't change X" → "keep X") | ||
| - Make abstractions concrete ("futuristic" → "glowing cyan neon, metallic panels") | ||
| - Keep content PG-13 | ||
| Output only the final instruction in plain text and nothing else.""" | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have a seperate step to validate and process image and then run
format_input?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We now first
_validate_and_process_images()and then pass the resultant images toformat_input().