diff --git a/docs/source/models/supported_models.md b/docs/source/models/supported_models.md
index 1fde1761672e..c9140bd06e89 100644
--- a/docs/source/models/supported_models.md
+++ b/docs/source/models/supported_models.md
@@ -847,7 +847,7 @@ See [this page](#generative-models) for more information on how to use generative models.
   * ✅︎
   * ✅︎
 - * `PaliGemmaForConditionalGeneration`
-  * PaliGemma (see note), PaliGemma 2 (see note)
+  * PaliGemma ⚠️, PaliGemma 2 ⚠️
   * T + I<sup>E</sup>
   * `google/paligemma-3b-pt-224`, `google/paligemma-3b-mix-224`, `google/paligemma2-3b-ft-docci-448`, etc.
   *
@@ -917,6 +917,12 @@ See [this page](#generative-models) for more information on how to use generative models.
 <sup>E</sup> Pre-computed embeddings can be inputted for this modality.
 <sup>+</sup> Multiple items can be inputted per text prompt for this modality.
 
+:::{warning}
+vLLM does not currently support the PrefixLM attention mask, so our PaliGemma implementation uses regular causal attention, which causes the model output to be unstable.
+
+We may deprecate this model series in a future release.
+:::
+
 :::{note}
 `h2oai/h2ovl-mississippi-2b` will be available in V1 once we support backends other than FlashAttention.
 :::
@@ -930,10 +936,6 @@ The official `openbmb/MiniCPM-V-2` doesn't work yet, so we need to use a fork (`
 For more details, please see:
 :::
 
-:::{note}
-Currently, the PaliGemma model series is implemented without the PrefixLM attention mask. This model series may be deprecated in a future release.
-:::
-
 :::{note}
 To use the Qwen2.5-VL series models, you have to install the Hugging Face Transformers library from source via `pip install git+https://github.com/huggingface/transformers`.
 :::