
[Feature] Support QuaRot quantization scheme #1489

Open
serser opened this issue Apr 24, 2024 · 1 comment

Comments

serser commented Apr 24, 2024

Motivation

QuaRot (https://arxiv.org/abs/2404.00456) has been out for three weeks, and the preliminary results are convincing. See also the discussion with the QuaRot authors in llama.cpp. It would be great to have it supported in LMDeploy by default.

Best.

Related resources

ggml-org/llama.cpp#6444
https://arxiv.org/abs/2404.00456

Additional context

No response

lvhan028 assigned lzhangzz and unassigned lzhangzz Apr 26, 2024

lvhan028 (Collaborator) commented

@pppppM @AllentDan @lzhangzz may investigate the QuaRot quantization algorithm. It looks very promising.
