1 file changed: +9 -1 lines changed

@@ -334,6 +334,14 @@ The following modalities are supported depending on the model:
 - **V**\ ideo
 - **A**\ udio
 
+Any combination of modalities joined by :code:`+` are supported.
+
+- e.g.: :code:`T + I` means that the model supports text-only, image-only, and text-with-image inputs.
+
+On the other hand, modalities separated by :code:`/` are mutually exclusive.
+
+- e.g.: :code:`T / I` means that the model supports text-only and image-only inputs, but not text-with-image inputs.
+
 .. _supported_vlms:
 
 Text Generation
Text Generation
@@ -492,7 +500,7 @@ Multimodal Embedding
     - ✅︎
   * - :code:`Phi3VForCausalLM`
     - Phi-3-Vision-based
-    - T / I / T + I
+    - T + I
     - :code:`TIGER-Lab/VLM2Vec-Full`
     - 🚧
     - ✅︎
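
The notation introduced in the first hunk can be read as a small grammar. The sketch below is illustrative only and is not part of this change: `expand_modalities` is a hypothetical helper, and it assumes that `/` separates alternatives and binds more loosely than `+`. Under that reading, the old table entry `T / I / T + I` expands to the same set of inputs as the new entry `T + I`, which is presumably why the entry could be shortened.

```python
from __future__ import annotations

from itertools import combinations


def expand_modalities(spec: str) -> set[frozenset[str]]:
    """Expand a spec such as "T + I" into the set of supported input combinations.

    Hypothetical helper for illustration; not an existing library function.
    """
    supported: set[frozenset[str]] = set()
    # Assumption: "/" separates mutually exclusive alternatives and binds
    # more loosely than "+".
    for alternative in spec.split("/"):
        members = [m.strip() for m in alternative.split("+") if m.strip()]
        # Modalities joined by "+" may be supplied in any non-empty combination.
        for size in range(1, len(members) + 1):
            for combo in combinations(members, size):
                supported.add(frozenset(combo))
    return supported


if __name__ == "__main__":
    # Print each supported combination, smallest first.
    for combo in sorted(expand_modalities("T + I"), key=lambda c: (len(c), sorted(c))):
        print(" + ".join(sorted(combo)))
```

Each returned `frozenset` is one permitted input combination, so `T / I` yields only the two single-modality sets, while `T + I` also includes the combined text-with-image set.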