Skip to content

Conversation

rahul-tuli
Copy link
Collaborator

@rahul-tuli rahul-tuli commented Feb 14, 2025

This PR restores sparse compression for our 2of4 examples, which was previously disabled due to a bug in the vLLM Cutlass integration.

Background

A bug in the Cutlass integration caused certain sparse-only compressed models to produce gibberish results. To mitigate this issue, we temporarily turned off sparse compression for our 2of4 examples.

The bug has since been fixed by @tlrmchlsmth in vllm-project/vllm#13198. With this fix in place, we can safely re-enable sparse compression for these examples.

Changes

  • Re-enable sparse compression for 2of4 examples.

Testing

  • Verified that sparse-only compressed models now produce expected outputs.

@rahul-tuli rahul-tuli self-assigned this Feb 14, 2025
Copy link

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@rahul-tuli rahul-tuli marked this pull request as ready for review February 14, 2025 15:43
kylesayrs
kylesayrs previously approved these changes Feb 14, 2025
Copy link
Collaborator

@kylesayrs kylesayrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a comment about prev vllm versions

Copy link
Member

@tlrmchlsmth tlrmchlsmth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥳

Copy link
Collaborator

@brian-dellabetta brian-dellabetta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

horheynm
horheynm previously approved these changes Feb 14, 2025
Copy link
Collaborator

@dsikka dsikka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update the E2E test case as well?

Copy link
Collaborator

@dsikka dsikka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider adding a comment to the example about disable_sparse_compression (just that it exists/can be used)?

@rahul-tuli
Copy link
Collaborator Author

It is there in the readme as a separate section, if anyone wants to try

Signed-off-by: Rahul Tuli <[email protected]>
Signed-off-by: Rahul Tuli <[email protected]>
@rahul-tuli
Copy link
Collaborator Author

Can we update the E2E test case as well?

Done and tested!

@dsikka dsikka added the ready When a PR is ready for review label Feb 14, 2025
@dsikka dsikka enabled auto-merge (squash) February 14, 2025 21:48
@dsikka dsikka merged commit cdf686f into main Feb 14, 2025
7 checks passed
@dsikka dsikka deleted the update-2of4-example branch February 14, 2025 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready When a PR is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants