Update torchao.prototype.parq and add 4-bit Llama 3.2 1B benchmark #2017


Merged

lisjin merged 1 commit into pytorch:main on Apr 4, 2025

Conversation


@lisjin lisjin commented Apr 4, 2025

We would like to merge recent changes from our open-source library at https://github.com/facebookresearch/parq.

We have also benchmarked 4-bit Llama 3.2 1B fine-tuned for 25K steps on fineweb-edu using torchtune. We used PARQ's MaxUnifQuantizer together with the ProxHardQuant proximal mapping, which is equivalent to the straight-through estimator (STE). Below are the relevant training config changes to the llama3_2/1B_full.yaml recipe; a sketch of how the optimizer is wired up follows the config.

batch_size: 8
epochs: 1
optimizer:
  _component_: torch.optim.AdamW
  lr: 4e-5
  weight_decay: 0.0
  betas: [0.9, 0.95]
  fused: True

lr_scheduler:
  _component_: torchtune.training.lr_schedulers.get_cosine_schedule_with_warmup
  num_warmup_steps: 2500
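For context on how these pieces fit together, here is a minimal sketch of the PARQ optimizer wiring. It assumes the import paths and call signatures follow the PARQ README; the toy model and the rule for choosing which weights to quantize are illustrative, not taken from this PR:

```python
import torch
from torchao.prototype.parq.optim import ProxHardQuant, QuantOptimizer
from torchao.prototype.parq.quant import MaxUnifQuantizer

# Toy stand-in for the Llama 3.2 1B model used in the benchmark.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 4)
)

# Quantize weight matrices; keep 1D params (biases, and in a real model,
# norms/embeddings) in full precision. This split is an illustrative assumption.
params_quant = [p for p in model.parameters() if p.dim() >= 2]
params_no_quant = [p for p in model.parameters() if p.dim() < 2]
param_groups = [
    {"params": params_quant, "quant_bits": 4},  # 4-bit, as in the benchmark
    {"params": params_no_quant},
]

# Base optimizer matches the torchtune config above.
base_optimizer = torch.optim.AdamW(
    param_groups, lr=4e-5, weight_decay=0.0, betas=(0.9, 0.95)
)

# Wrap it with PARQ: MaxUnifQuantizer plus the STE-equivalent ProxHardQuant.
optimizer = QuantOptimizer(base_optimizer, MaxUnifQuantizer(), ProxHardQuant())
```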

As shown in the table below, the resulting 4-bit model loses less than 10% accuracy relative to the pre-trained 16-bit model on most commonsense reasoning benchmarks (% diff = (4-bit - 16-bit) / 16-bit x 100; hellaswag is the one exception at -10.8%).

| Task | 16-bit acc | 4-bit acc | % diff |
|:--|--:|--:|--:|
| arc_challenge | 0.3805 | 0.3575 | -6.0 |
| arc_easy | 0.6309 | 0.6077 | -3.7 |
| hellaswag | 0.6081 | 0.5423 | -10.8 |
| piqa | 0.7410 | 0.7122 | -3.9 |
| winogrande | 0.6022 | 0.5549 | -7.8 |
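The task names match EleutherAI's lm-evaluation-harness. Assuming that harness produced these numbers (the PR does not say which eval tool was used), a reproduction sketch might look like the following; the checkpoint path is a placeholder:

```python
# Sketch only: assumes the fine-tuned checkpoint was exported to Hugging Face
# format and that lm-evaluation-harness (pip install lm_eval) is installed.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=/path/to/llama3_2_1b_parq",  # placeholder path
    tasks=["arc_challenge", "arc_easy", "hellaswag", "piqa", "winogrande"],
)
for task, metrics in results["results"].items():
    print(f"{task}: acc={metrics['acc,none']:.4f}")  # 'acc,none' in lm_eval>=0.4
```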


pytorch-bot bot commented Apr 4, 2025

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2017

✅ No failures as of commit f4ee2d7 with merge base 6922733

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 4, 2025
@andrewor14 (Contributor)

Hi @lisjin, thanks for the update! The results look great. As discussed offline, we usually only add submodules if other parts of torchao are using the submodules, which is not the case here. Do you mind making the latest changes to PARQ in the prototype version here in torchao instead?

@lisjin lisjin changed the title Replace torchao.prototype.parq with facebookresearch/parq submodule Update torchao.prototype.parq and add 4-bit Llama 3.2 1B benchmark Apr 4, 2025
@lisjin (Contributor, Author) commented Apr 4, 2025

> Do you mind making the latest changes to PARQ in the prototype version here in torchao instead?

Of course! I just removed the submodule and updated the prototype version instead.

@lisjin lisjin added the topic: improvement Use this tag if this PR is an improvement (doesn't fit into any of the other categories) label Apr 4, 2025
@lisjin lisjin requested a review from andrewor14 April 4, 2025 18:32
@lisjin lisjin merged commit 3bbf42a into pytorch:main Apr 4, 2025
18 of 19 checks passed
@lisjin lisjin deleted the parq branch April 4, 2025 21:42
jainapurva pushed a commit that referenced this pull request Apr 8, 2025
Update torchao.prototype.parq and add 4-bit Llama 3.2 1B benchmark (#2017)

Replace torchao.prototype.parq with facebookresearch/parq submodule