@ngxson ngxson commented Apr 25, 2025

I don't much like abstract names such as "resample" and "merger". They would be useful if one projector could be reused by various vision models, but unfortunately that has hardly ever been the case.

The cumbersome bool has_*_projector pattern is also removed. The only variable kept is has_llava_projector, because MLP, MLP_NORM, LDP, and LDPV2 are all considered variants of the llava projector.
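To illustrate the grouping described above, here is a minimal sketch of how a single check could stand in for several per-projector booleans. The enum is a trimmed-down stand-in, and the `PROJECTOR_TYPE_MLP*` / `PROJECTOR_TYPE_LDP*` spellings and the `is_llava_projector` helper are assumptions made for illustration, not code taken from this PR.

```cpp
// Sketch only: a reduced enum, not the full list in clip.cpp.
enum projector_type {
    PROJECTOR_TYPE_MLP,
    PROJECTOR_TYPE_MLP_NORM,
    PROJECTOR_TYPE_LDP,
    PROJECTOR_TYPE_LDPV2,
    PROJECTOR_TYPE_MINICPMV,
    PROJECTOR_TYPE_GLM_EDGE,
    PROJECTOR_TYPE_QWEN2VL,
};

// Hypothetical helper: the four llava variants are treated as one family,
// so one predicate replaces a set of has_*_projector flags.
static bool is_llava_projector(projector_type t) {
    switch (t) {
        case PROJECTOR_TYPE_MLP:
        case PROJECTOR_TYPE_MLP_NORM:
        case PROJECTOR_TYPE_LDP:
        case PROJECTOR_TYPE_LDPV2:
            return true;
        default:
            return false;
    }
}
```

With model-specific enum values, every other projector is identified directly by its own type, so no extra flag is needed for it.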

Test result:

OK:   llama-mtmd-cli ggml-org/SmolVLM-500M-Instruct-GGUF:Q8_0
OK:   llama-mtmd-cli ggml-org/SmolVLM2-2.2B-Instruct-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/SmolVLM2-500M-Video-Instruct-GGUF:Q8_0
OK:   llama-mtmd-cli ggml-org/gemma-3-4b-it-GGUF:Q4_K_M
OK:   llama-mtmd-cli guinmoon/MobileVLM-3B-GGUF:Q4_K_M
OK:   llama-mtmd-cli THUDM/glm-edge-v-5b-gguf:Q4_K_M
OK:   llama-mtmd-cli second-state/Llava-v1.5-7B-GGUF:Q2_K
OK:   llama-mtmd-cli cjpais/llava-1.6-mistral-7b-gguf:Q3_K
OK:   llama-mtmd-cli ibm-research/granite-vision-3.2-2b-GGUF:Q4_K_M
OK:   llama-mtmd-cli second-state/MiniCPM-Llama3-V-2_5-GGUF:Q2_K
OK:   llama-mtmd-cli openbmb/MiniCPM-V-2_6-gguf:Q2_K
OK:   llama-mtmd-cli openbmb/MiniCPM-o-2_6-gguf:Q4_0
OK:   llama-qwen2vl-cli bartowski/Qwen2-VL-2B-Instruct-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/pixtral-12b-GGUF:Q4_K_M

@ngxson ngxson requested a review from ggerganov April 25, 2025 22:18
PROJECTOR_TYPE_MINICPMV,
PROJECTOR_TYPE_GLM_EDGE,
PROJECTOR_TYPE_MERGER,
PROJECTOR_TYPE_QWEN2VL,
@ngxson ngxson Apr 25, 2025

cc @HimariO: PROJECTOR_TYPE_RESAMPLER is renamed to PROJECTOR_TYPE_QWEN2VL

For qwen2.5, we can add PROJECTOR_TYPE_QWEN25VL. Code paths shared by both qwenvl versions will then need to check ctx->proj_type == PROJECTOR_TYPE_QWEN2VL || ctx->proj_type == PROJECTOR_TYPE_QWEN25VL

But tbh the best way is to have a dedicated builder function for qwenvl, since it makes the code much easier to read. I'll make a proposal in the next few days.
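The combined check mentioned in this comment could be wrapped in a small helper so that shared qwenvl code paths do not repeat the condition. This is only a sketch: `clip_ctx_sketch` and `is_qwenvl` are hypothetical names, the enum is reduced to a few values, and `PROJECTOR_TYPE_QWEN25VL` is the value proposed here, not one added by this PR.

```cpp
// Sketch only: a trimmed-down enum and context, not the real clip.cpp definitions.
enum projector_type {
    PROJECTOR_TYPE_MINICPMV,
    PROJECTOR_TYPE_GLM_EDGE,
    PROJECTOR_TYPE_QWEN2VL,
    PROJECTOR_TYPE_QWEN25VL, // proposed in the comment, not part of this PR
};

// Hypothetical stand-in for the real context struct; only proj_type matters here.
struct clip_ctx_sketch {
    projector_type proj_type;
};

// Hypothetical helper: true for any projector that shares the qwenvl code paths.
static bool is_qwenvl(const clip_ctx_sketch * ctx) {
    return ctx->proj_type == PROJECTOR_TYPE_QWEN2VL
        || ctx->proj_type == PROJECTOR_TYPE_QWEN25VL;
}
```

A dedicated builder function, as suggested above, would make most of these checks unnecessary, since the qwenvl-specific logic would live in one place.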

@ngxson ngxson merged commit 4753791 into ggml-org:master Apr 26, 2025
48 checks passed
pockers21 pushed a commit to pockers21/llama.cpp that referenced this pull request Apr 28, 2025
* clip : improve projector naming

* no more kv has_llava_projector

* rm unused kv

* rm more unused