
Conversation

@kylesayrs kylesayrs (Collaborator) commented Oct 28, 2025

Purpose

  • Create a pathway that can quantize model weights without needing a model definition or a calibration pipeline. Such a pathway provides fast and reliable support for models that:
    • Do not yet have a HF model definition
    • Have complications with sequential pipelines (very large vision towers, tracing failures, long calibration runtimes)

Usage

# import path assumed; model_free_ptq is the entrypoint added in this PR
from llmcompressor import model_free_ptq

model_free_ptq(
    model_stub="meta-llama/Llama-3.2-1B-Instruct",
    save_directory="Llama-3.2-1B-Instruct-FP8_block",
    scheme="FP8_BLOCK",
    ignore=["model.embed_tokens", "lm_head"],
    max_workers=15,
    device="cuda:0",
)
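For intuition, here is a minimal, self-contained sketch of the general idea rather than the PR's actual implementation: walk the checkpoint's 2D weight tensors, compute per-block FP8 scales, and quantize each tensor in a thread pool, with no forward passes and no calibration data. The block size, FP8 dtype handling, helper names, and ignore-matching logic below are illustrative assumptions.

# Conceptual sketch only (not this PR's code); assumes torch >= 2.1 for float8 support
from concurrent.futures import ThreadPoolExecutor

import torch

FP8_MAX = torch.finfo(torch.float8_e4m3fn).max  # 448.0 for e4m3fn
BLOCK = 128  # assumed block size for an FP8_BLOCK-style scheme


def quantize_block_fp8(name: str, weight: torch.Tensor):
    """Quantize a 2D weight with one scale per BLOCK x BLOCK tile."""
    rows, cols = weight.shape
    pad_r, pad_c = -rows % BLOCK, -cols % BLOCK
    w = torch.nn.functional.pad(weight.float(), (0, pad_c, 0, pad_r))
    tiles = w.reshape(w.shape[0] // BLOCK, BLOCK, w.shape[1] // BLOCK, BLOCK)
    scale = tiles.abs().amax(dim=(1, 3), keepdim=True).clamp(min=1e-12) / FP8_MAX
    q = (tiles / scale).clamp(-FP8_MAX, FP8_MAX).to(torch.float8_e4m3fn)
    q = q.reshape(w.shape)[:rows, :cols]  # undo padding
    return name, q, scale.squeeze(1).squeeze(-1)


def model_free_ptq_sketch(state_dict, ignore=(), max_workers=8):
    """Quantize every 2D weight in a plain state dict, skipping ignored prefixes."""
    targets = {
        n: w for n, w in state_dict.items()
        if w.ndim == 2 and not any(n.startswith(prefix) for prefix in ignore)
    }
    quantized = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda item: quantize_block_fp8(*item), targets.items())
        for name, q, scale in results:
            quantized[name] = q
            quantized[name + "_scale"] = scale
    return quantized

The real entrypoint also writes the quantized tensors and a matching quantization config to save_directory (per the Testing section below); the sketch above only returns them.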

Testing

  • Added test_model_free_ptq_matches_oneshot, which verifies that the saved tensors and configs exactly match between the model_free_ptq and oneshot entrypoints for the same arguments (a rough sketch of this check follows). The test takes about 10 seconds to run.
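A rough sketch of that check, not the test's actual code: run both entrypoints with the same scheme and ignore list, then compare the saved safetensors. The import paths, output paths, and shard name are assumptions, and the real test presumably uses a much smaller model than the one below to finish in ~10 seconds.

import torch
from safetensors.torch import load_file

from llmcompressor import model_free_ptq, oneshot  # top-level import paths assumed
from llmcompressor.modifiers.quantization import QuantizationModifier

stub = "meta-llama/Llama-3.2-1B-Instruct"
ignore = ["model.embed_tokens", "lm_head"]

model_free_ptq(model_stub=stub, save_directory="out_model_free", scheme="FP8_BLOCK", ignore=ignore)
oneshot(
    model=stub,
    output_dir="out_oneshot",
    recipe=QuantizationModifier(targets="Linear", scheme="FP8_BLOCK", ignore=ignore),
)

a = load_file("out_model_free/model.safetensors")  # shard name assumed
b = load_file("out_oneshot/model.safetensors")
assert a.keys() == b.keys()
# cast to float32 to avoid direct float8 comparisons, which older torch versions reject
assert all(torch.equal(a[k].to(torch.float32), b[k].to(torch.float32)) for k in a)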

Future Extensions

  • Mixed-precision quantization (multiple recipes/targets)
  • Multi-GPU support (the work is already parallelized across threads; if the GPU becomes the bottleneck, the work can be split across GPUs)
  • Multi-process support (if Python-side processing is the bottleneck, multithreading can be replaced with multiprocessing; see the hypothetical sketch after this list)
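To illustrate the last two bullets, a hypothetical sketch of how the dispatch could be made executor-agnostic; the function name and flag are assumptions, not code from this PR.

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def run_ptq(work_items, quantize_fn, max_workers=8, use_processes=False):
    # Threads suffice while GPU kernels dominate (they release the GIL);
    # processes would help if Python-side packing became the bottleneck.
    # Note: with ProcessPoolExecutor, quantize_fn and work_items must be picklable.
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(quantize_fn, work_items))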

@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.

@kylesayrs kylesayrs force-pushed the kylesayrs/weights-only branch from f4423c1 to 294a78a on October 30, 2025 20:21
@kylesayrs kylesayrs changed the base branch from main to 03_untie_fix October 31, 2025 02:41
Base automatically changed from 03_untie_fix to main October 31, 2025 16:22
@kylesayrs kylesayrs changed the title [PTQ] weights_ptq pathway for day-zero weight quantization support [Weights-only] weights_ptq pathway for day-zero weight quantization support Nov 3, 2025
@kylesayrs kylesayrs changed the title [Weights-only] weights_ptq pathway for day-zero weight quantization support [Weights-only] ptq_weights pathway for day-zero weight quantization support Nov 3, 2025
@kylesayrs kylesayrs force-pushed the kylesayrs/weights-only branch from 1c56a75 to 6fe9db9 on November 3, 2025 16:32
@kylesayrs kylesayrs marked this pull request as ready for review November 3, 2025 18:02
@kylesayrs kylesayrs changed the title [Weights-only] ptq_weights pathway for day-zero weight quantization support [model_free_ptq] Add pathway for day-zero weight quantization support Nov 3, 2025
@dsikka dsikka (Collaborator) left a comment

I think keeping this as a separate entrypoint makes sense, given the point is speed of execution / not wanting it to get bogged down by other flows in llm-compressor, but it seems like the lifecycle / steps we're applying are very similar to the data-free pipeline?

@kylesayrs kylesayrs (Collaborator, Author) commented

@dsikka Yes

@brian-dellabetta brian-dellabetta (Collaborator) left a comment

Great addition! One nit, and we probably want to add an example or README explaining when one would use this over oneshot, but that can also be tackled in a follow-up as a good first issue.

@shanjiaz shanjiaz (Collaborator) left a comment

Great work!!

@kylesayrs kylesayrs enabled auto-merge (squash) November 6, 2025 21:05
@kylesayrs kylesayrs added the ready label Nov 6, 2025
@kylesayrs kylesayrs merged commit 1c85a66 into main Nov 6, 2025
12 checks passed
@kylesayrs kylesayrs deleted the kylesayrs/weights-only branch November 6, 2025 21:59
@kylesayrs kylesayrs restored the kylesayrs/weights-only branch November 7, 2025 18:58
