mtmd : Support jinja in libmtmd (Only for QwenVL and Qwen Omni) #14730

Open. Wants to merge 3 commits into base: master.

Conversation

alielmorsy

This code is part of a private repo I've been working on. It provides essential support for Jinja in a multimodal setup.
The PR adds two new optional metadata fields for GGUF:

  1. tokenizer.ggml.image_token_id: For the image token, if it exists.
  2. tokenizer.ggml.audio_token_id: For the audio token, if it exists.

If these fields are not set, a fallback lookup is used, similar to the FIM token lookup. The fallback tokens currently used for images are <|IMAGE|> and <IMAGE>.
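To make the lookup order concrete, here is a minimal Python sketch of the idea (not the actual C++ code in the PR). The metadata keys and the two image fallback strings come from the description above; the function name `resolve_token_id` and the plain-dict representation of the metadata and vocabulary are assumptions made for illustration only.

```python
from typing import Mapping, Optional, Sequence

# Key names and image fallbacks are taken from the PR description.
IMAGE_KEY = "tokenizer.ggml.image_token_id"
AUDIO_KEY = "tokenizer.ggml.audio_token_id"
IMAGE_FALLBACKS = ("<|IMAGE|>", "<IMAGE>")


def resolve_token_id(
    metadata: Mapping[str, int],
    vocab: Mapping[str, int],
    key: str,
    fallbacks: Sequence[str] = (),
) -> Optional[int]:
    """Hypothetical helper: prefer the explicit metadata field, then fall
    back to searching the vocabulary for known token texts."""
    token_id = metadata.get(key)
    if token_id is not None:
        return int(token_id)
    # Fallback: first candidate text that exists in the vocabulary wins.
    for text in fallbacks:
        if text in vocab:
            return vocab[text]
    # Neither the field nor any fallback text is available.
    return None
```

With this shape, a model that sets the metadata field always wins over the text-based guess, and a model with neither simply reports that it has no such token.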

For the MTMD tokenizer, I maintained backward compatibility and updated the split function to support multiple delimiters, allowing it to work with both the old marker and the preserved tokens.
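A rough Python sketch of what splitting on several delimiters could look like is shown below. It is only an illustration of the technique, not the libmtmd implementation: the function name `split_on_markers`, the `("text" | "media", chunk)` output shape, and the marker strings used in the usage example are all assumptions.

```python
import re
from typing import List, Sequence, Tuple


def split_on_markers(text: str, delimiters: Sequence[str]) -> List[Tuple[str, str]]:
    """Hypothetical helper: split a prompt into text chunks and media
    placeholders, accepting any of several delimiter strings."""
    if not delimiters:
        return [("text", text)] if text else []
    # Try longer delimiters first so a short one cannot shadow a longer one.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(d) for d in ordered) + ")"
    chunks = []
    # The capture group makes re.split keep the delimiters in the result.
    for part in re.split(pattern, text):
        if not part:
            continue  # skip empty strings between adjacent delimiters
        kind = "media" if part in delimiters else "text"
        chunks.append((kind, part))
    return chunks
```

For example, assuming `<__media__>` stands in for the old marker and `<|image_pad|>` for a preserved model token, `split_on_markers("a<__media__>b<|image_pad|>", ["<__media__>", "<|image_pad|>"])` yields two text chunks interleaved with two media chunks, so prompts written with either style go through the same path.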

One final change (only for Qwen models): I removed the image_start and image_end tokens as the model has its own special tokens already.

@alielmorsy
Author

@ggerganov Could you please check this one?

Collaborator

@ngxson ngxson left a comment

IMO this is unnecessarily complicated. The current system is already able to support image/audio input with Jinja templates, without even using image/audio tokens. Why do we need this PR?

@alielmorsy
Author

@ngxson It simplifies the API by directly converting a dictionary of messages into a full prompt. That is all the legacy function really does: it surprisingly depends on the template under the hood too, but with some extra layers to clean the messages, add the marker, etc.
