
@zucchini-nlp (Member) commented Oct 30, 2024

What does this PR do?

We have updated all the configs for VLMs on the hub, so this PR removes the legacy path for those models; it has been deprecated for three releases now, since v4.44. The PR also fixes a few things that broke along the way, such as generating from text-only input in LLaVA models.
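As an illustration, this is the kind of text-only call the fix enables (a minimal sketch; the checkpoint and prompt are only examples):

```python
# Minimal sketch of text-only generation with a LLaVA checkpoint.
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # any llava-hf checkpoint should behave the same
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# No <image> token and no pixel_values: a pure text prompt.
prompt = "USER: What is the tallest mountain on Earth? ASSISTANT:"
inputs = processor(text=prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=30)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```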

For Video-LLaVA the hub configs cannot be updated, as the hub owner has been silent for several months already. Since there is only one model with this architecture, we can hardcode the default value for patch_num and remove the legacy path there as well.
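Roughly the idea, as a sketch (the helper, attribute name, and default value below are illustrative assumptions, not the actual diff):

```python
# Sketch of the idea: with a single Video-LLaVA checkpoint in existence and
# its hub config frozen, fall back to a hardcoded default instead of keeping
# the legacy code path. Attribute name and value are illustrative only.
DEFAULT_NUM_IMAGE_TOKENS = 256  # assumed default, not taken from the PR


def get_num_image_tokens(config) -> int:
    """Read the value from the config, falling back to the hardcoded default
    when the (un-updatable) hub config does not define it."""
    return getattr(config, "num_image_tokens", DEFAULT_NUM_IMAGE_TOKENS)
```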

fixes #34824, fixes #35169 and fixes #35450, fixes #35424

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment


I don't think we need this; we deprecated the legacy path, so we can just remove it now, no?
I don't remember what we said for 4.46, but better to go with non-legacy now if we can!

@zucchini-nlp (Member, Author) commented Oct 30, 2024

We can remove it only after updating the files on the hub, and that means we also need to change the warning to an error so users have a chance to see the reason for the failure.
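In other words, something like the following, turning the soft deprecation into a hard failure (an illustrative sketch; the real messages and call sites differ):

```python
from transformers.utils import logging

logger = logging.get_logger(__name__)

# Before: the legacy path still runs and the user only sees a warning.
logger.warning_once(
    "Using a config without expanded image tokens is deprecated."  # illustrative
)

# After: the legacy path is removed, so fail loudly and explain why.
raise ValueError(
    "Your checkpoint's config is outdated and no longer supported; "
    "please update it on the hub."  # illustrative
)
```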

I think the earliest we can remove it is the next release, because the blocking PR will probably be merged next week. After that I will take the time to update all the hub configs. Maybe then we'll wait for the blocking PR and remove all the deprecation warnings?

@zucchini-nlp changed the title from "Fix llava tests" to "VLMs: major clean up 🧼" on Nov 24, 2024
@ArthurZucker (Collaborator)

Sounds good, let's wait a bit!

@zucchini-nlp (Member, Author)

@ArthurZucker I think this can be reviewed now :)

@lcxrocks

Any updates on this PR? Looking forward to getting llava-hf/llava-v1.6-mistral-7b-hf running for inference without images.
Related to this issue on HF.

@ArthurZucker (Collaborator) left a comment


Most welcome, related to #35534 where the same is done!

@ArthurZucker (Collaborator)

O o 👁️ 👁️

@zucchini-nlp (Member, Author)

Triggered slow tests on some models; will merge when those pass (or at least when they don't add more failing tests...).

@Rocketknight1 (Member)

This is amazing! Let me know when it's merged so I can rebase Pixtral onto it

@zucchini-nlp (Member, Author)

OK, so BLIP models apparently have one extra model class that was not modified for some reason, and their official checkpoints on the hub are also not updated. Therefore I am not adding BLIP to this PR; it will only remove the legacy path from the LLaVA models.

Slow tests are passing on my end compared to the main branch; there are tests failing on main due to tiny logit inconsistencies. I believe it might also be my setup/hardware, as we usually try to match the runner's outputs.

I will merge this in an hour, after one last slow test run.

@zucchini-nlp merged commit d1681ec into huggingface:main on Jan 8, 2025 (25 checks passed)