Add Argmax DiffusionKit Snippet #869
Conversation
Looking great, thanks @ardaatahan! I've added some comments to improve the integration and then we should be good. Don't forget to tag models on the Hub as diffusionkit
(we usually prefer to have a few examples before merging 😉)
@@ -905,4 +905,33 @@ whisperkit-cli transcribe --audio-path /path/to/audio.mp3
# Or use your preferred model variant
whisperkit-cli transcribe --model "large-v3" --model-prefix "distil" --audio-path /path/to/audio.mp3 --verbose`,
];

export const diffusionkit = (): string[] => [
Can you move this section up in the file to preserve alphabetical order as much as possible? (just after `diffusers`, for instance)
prettyLabel: "DiffusionKit",
repoName: "DiffusionKit",
repoUrl: "https://github.com/argmaxinc/DiffusionKit",
docsUrl: "https://github.com/argmaxinc/DiffusionKit?tab=readme-ov-file#-image-generation-with-python-mlx",
docsUrl: "https://github.com/argmaxinc/DiffusionKit?tab=readme-ov-file#-image-generation-with-python-mlx", |
No need to provide a `docsUrl` when it's the same as `repoUrl`.
@@ -174,6 +174,13 @@ export const MODEL_LIBRARIES_UI_ELEMENTS = {
	filter: true,
	/// diffusers has its own more complex "countDownloads" query
},
diffusionkit: {
Would be nice to have some models listed in https://huggingface.co/models?other=diffusionkit before merging this PR. You can do that by tagging models as `diffusionkit` in the model card metadata.
I opened a first PR to do that: https://huggingface.co/argmaxinc/mlx-FLUX.1-schnell/discussions/2
Merged 👍
I noticed that the SD3 models have the `DiffusionKit` tag: https://huggingface.co/models?other=DiffusionKit. The correct tag should be `diffusionkit` (lowercased). Make sure to update them to benefit from this PR.
(The correct key is the key defined here, so `diffusionkit` in this case. Below we are defining `prettyLabel` and `repoName`, but that's just for aesthetics in the UI.)
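A sketch of that distinction (illustrative, not the actual registry): the object key is what gets matched against the model's library tag on the Hub, while the other fields only affect display.

```typescript
// Illustrative registry entry: the key ("diffusionkit", lowercase) is the
// tag matched against models on the Hub; prettyLabel/repoName are UI-only.
const librariesSketch = {
	diffusionkit: {
		prettyLabel: "DiffusionKit",
		repoName: "DiffusionKit",
		repoUrl: "https://github.com/argmaxinc/DiffusionKit",
	},
};

// The filterable Hub tag comes from the key, never from prettyLabel.
const hubTag = Object.keys(librariesSketch)[0];
```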
We generated these models with DiffusionKit well before we had an MLX implementation, and in fact we don't yet have a public Core ML inference implementation of DiffusionKit (it's in the works). These models are actually more directly usable by the Hugging Face swift-coreml-diffusers app, even though they were generated with DiffusionKit. So in this case we might want to just remove the lowercase `diffusionkit` tag? @pcuenca, what do you think about a local-app deeplink for Diffusers.app? We had discussed adding arbitrary repos in the past.
Updated our tags and model cards: https://huggingface.co/models?other=diffusionkit
`# Install CLI with pip
pip install diffusionkit-cli

# View all available options
diffusionkit-cli --help

# Generate image using default FLUX.1-schnell and save it to out.png
diffusionkit-cli --prompt "a beautiful sunset over the ocean"

# To use Stable Diffusion 3, accept the terms before downloading the checkpoint: https://huggingface.co/stabilityai/stable-diffusion-3-medium
# Once you accept the terms, sign in with your Hugging Face Hub token with read access to the contents of all public gated repos you can access:
huggingface-cli login --token YOUR_HF_HUB_TOKEN
The "code snippet" part is generally a very short snippet to help users get started. It should not contain installation or configuration steps and should be focused on "how to use this model with diffusionkit". Code snippets are shown on each model page and are customized to be ready to use.
Agree with @Wauplin. In addition, I'd recommend that different examples or use cases are returned as multiple strings, like in the audioseal example. This way snippets are separated visually and can be copied independently:

Even better yes!
Great example, was curious how to do this!
# Use specific model and set custom output path
diffusionkit-cli --prompt "a futuristic cityscape" --model-version stable-diffusion-3-medium --output-path /path/to/output.png
# Generate an image and save it to out.png
diffusionkit-cli --prompt "a futuristic cityscape" --model-version ${model.id}
# Set seed for reproducibility, specify number of steps, and set custom output image dimensions
diffusionkit-cli --prompt "detailed cinematic dof render of a \
detailed MacBook Pro on a wooden desk in a dim room with items \
around, messy dirty room. On the screen are the letters 'FLUX on \
DiffusionKit' glowing softly. High detail hard surface render" \
--height 768 \
--width 1360 \
--seed 1001 \
--step 4 \
--output ~/Desktop/flux_on_mac.png`,
# Specify dimensions, seed, number of steps and destination file
diffusionkit-cli \
  --prompt "detailed cinematic dof render of a \
detailed MacBook Pro on a wooden desk in a dim room with items \
around, messy dirty room. On the screen are the letters 'FLUX on \
DiffusionKit' glowing softly. High detail hard surface render" \
  --model-version ${model.id} \
  --height 768 \
  --width 1360 \
  --seed 1001 \
  --step 4 \
  --output ~/Desktop/flux_on_mac.png`,
Add `--model-version ${model.id}` + some nits.
Hey @ardaatahan - thanks for the PR. 🤗
Quick thought: for the libraries PR, let's use the actual modeling code snippets, i.e. https://github.com/argmaxinc/DiffusionKit?tab=readme-ov-file#code
This also makes it easy for people to quickly take the snippet and run it (similar to how we show snippets for transformers).
Then, for the CLI, we can do a separate LocalApps integration. The local apps integration would be similar to how we currently show drawthings and other text-to-image apps:
drawthings: {
I think this would give you the best distribution overall, since you promote your library as a modeling/conversion library plus as an app as well.
cc: @atiorh @enzostvs @pcuenca @Wauplin
Thoughts?
If you think this is a good idea then the next step would be to:
- Replace the CLI code snippet here with Python inference snippets.
- Open a new PR similar to how the drawthings integration I linked above is.
Agree with everything here, will adjust based on review. @Vaibhavs10 thanks for your comment, also curious about thoughts from the rest of the team. I think it's a great suggestion. One caveat is that the MLX implementation of DiffusionKit is not really a fully fledged app, but more of a CLI utility into the Python library (similar to how whisperkit-cli provides an interface into the Swift library). On the other hand, it will work with the source repos of some very specific text-to-image models that we support so far (flux and sd3), since we do the model dict adjustments inside the library, so I will investigate how to filter that down similar to drawthings.
Ah, makes sense. The current structure with actual model inference snippets over the CLI works for me. We can potentially discuss a CLI LocalApps integration a bit later, once DiffusionKit matures a bit more.
Reviewing the PR real quick.
pipeline = DiffusionPipeline(
	shift=3.0,
	use_t5=False,
	model_version="argmaxinc/mlx-stable-diffusion-3-medium",
So is it not possible to make it model-specific at this time, as @Wauplin mentioned?
Agree that it would be best to showcase only the relevant code snippet here. How does the CLI determine which pipeline to instantiate? I would reuse the same logic here. Otherwise, it's also possible to tag the models on the Hub as `flux` or `stable-diffusion` (using the `tags` list in the model card metadata) and use this information to generate the snippets.
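The tag-based approach can be sketched like this (illustrative names only; the real `ModelData` type and pipeline class names in the snippets may differ):

```typescript
// Minimal stand-in for the model data passed to snippet functions.
interface ModelDataSketch {
	id: string;
	tags: string[];
}

// Pick the pipeline snippet based on a "flux" tag in the model card
// metadata; anything else falls back to the Stable Diffusion 3 variant.
const selectPipelineSnippet = (model: ModelDataSketch): string =>
	model.tags.includes("flux")
		? `pipeline = FluxPipeline(model_version="${model.id}")`
		: `pipeline = DiffusionPipeline(model_version="${model.id}")`;
```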
Thank you for your feedback! Both of your proposals are excellent suggestions. I will be implementing the latter approach, using tags in the model card metadata to determine which snippets to generate.
	latent_size=(HEIGHT // 8, WIDTH // 8),
)`;

return [sd3Snippet, fluxSnippet, generateSnippet];
Since we have multiple snippets, it may be useful to include a comment line at the beginning of each one to explain what they are about.
Agreed and added, thanks!
Hey! Sorry for being late to the party. Let's add it only as a library for now. Note that since we are displaying multiple snippets, it is still possible to have "Python" snippets followed by "CLI" snippets.
Force-pushed from 1c0d5a9 to 6055b69.
Hey @Vaibhavs10, thank you for your feedback! After your original comment about opening a new PR similar to the drawthings integration, I made similar changes to the local-apps file for DiffusionKit and I'm curious about your thoughts:

+ const snippetDiffusionKit = (model: ModelData): LocalAppSnippet[] => {
+ const command = (binary: string) =>
+ [
+ "# Load and run the model:",
+ `${binary} \\`,
+      ` --model-version ${model.id} \\`,
+ ' --prompt "a futuristic cityscape" \\',
+ ` --height 768 \\`,
+ ` --width 1360 \\`,
+ ` --seed 1001 \\`,
+ ` --step 4 \\`,
+ ` --output ~/Desktop/out.png`,
+ ].join("\n");
+ return [
+ {
+ title: "Install from pip",
+ setup: "pip install diffusionkit",
+ content: command("diffusionkit-cli"),
+ },
+ {
+ title: "Build from source code",
+ setup: [
+ // prettier-ignore
+ "git clone https://github.com/argmaxinc/DiffusionKit.git",
+ "cd DiffusionKit",
+ "pip install -e .",
+ ].join("\n"),
+ content: command("diffusionkit-cli"),
+ },
+ ];
+ };
Inside LOCAL_APPS:
+ diffusionkit: {
+ prettyLabel: "DiffusionKit",
+ docsUrl: "https://github.com/argmaxinc/DiffusionKit",
+ mainTask: "text-to-image",
+ macOSOnly: true,
+    displayOnModelPage: (model) => model.library_name === "diffusionkit" && model.pipeline_tag === "text-to-image",
+ snippet: snippetDiffusionKit,
+  },

Given that these local app changes are already implemented, would you recommend creating a new PR to handle them?
Thanks for the changes! :)
NUM_STEPS = 4  # 4 for FLUX.1-schnell, 50 for SD3
CFG_WEIGHT = 0.  # for FLUX.1-schnell, 5. for SD3
NUM_STEPS = ${model.tags.includes("flux") ? 4 : 50}
CFG_WEIGHT = ${model.tags.includes("flux") ? 0. : 5.}
Let's reuse the `flux` tag to generate the pipeline config as well :) (disclaimer: not tested)
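Deriving both values from the same tag can be sketched as a small helper (illustrative names, matching the default values quoted above):

```typescript
// Derive generation defaults from the model's tags: FLUX.1-schnell uses
// 4 steps with no CFG, while SD3 uses 50 steps with a CFG weight of 5.
const generationDefaults = (tags: string[]): { numSteps: number; cfgWeight: number } =>
	tags.includes("flux") ? { numSteps: 4, cfgWeight: 0 } : { numSteps: 50, cfgWeight: 5 };
```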
Thanks for the suggestion! I just tested and made the changes.
@Vaibhavs10 @pcuenca @Wauplin Let me know if there are any other changes needed here or if it is good to go. Open to any suggestions you think would help the users!
Everything looks good to me!
@Vaibhavs10 and/or @pcuenca please have a last check as well and we should be good to merge :)
Looks good to me, thanks a lot for iterating here 🙌 Let's wait for @Wauplin and @Vaibhavs10 thoughts!
Edit: had the review tab open while @Wauplin was submitting his review.
const pipelineSnippet = model.tags.includes("flux") ? fluxSnippet : sd3Snippet;

return [pipelineSnippet, generateSnippet];
We usually show a single snippet per task (in this case it would be pipeline preparation followed by generation), but I'm not opposed to using two to differentiate instantiation from generation.
Linter fixed, but two tests failing. Is it OK to merge, @Wauplin @SBrandeis?
Thanks for the linter :) I would say yes, since the failing tests are unrelated to this PR (already happening on …).
This PR:
- Adds a `diffusionkit` snippet in model-libraries-snippets.ts.
- Adds the `diffusionkit` library with the necessary information in `MODEL_LIBRARIES_UI_ELEMENTS` in model-libraries.ts.