[ET-VK][Ops] quantization op shaders and impl #11369

Merged

facebook-github-bot merged 16 commits into gh/ahmtox/11/base from gh/ahmtox/11/head

Jun 17, 2025

Contributor

ahmtox commented Jun 4, 2025

Stack from ghstack (oldest at bottom):

Adds the shaders and implementation for quantize_per_tensor and quantize_per_token, linked with the testing framework.

NOTE: Currently the only supported input types are half (fp16) and float (fp32). The only supported output types are byte (uint8), char (int8), short (int16), and int (int32).

Differential Revision: D75959064
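
For readers unfamiliar with the two ops named in the description, the following is a minimal CPU reference sketch in Python of conventional affine quantization. It is not the shader code from this PR, which is not shown on this page. It assumes the usual formula `clamp(round(x / scale) + zero_point, quant_min, quant_max)`; the function names, argument order, and list-based tensors are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def quantize_value(x: float, scale: float, zero_point: int,
                   quant_min: int, quant_max: int) -> int:
    """Affine-quantize one value (assumed formula, not taken from the PR)."""
    # Python's round() rounds exact halves to the nearest even integer.
    q = round(x / scale) + zero_point
    # Clamp into the representable range of the output type,
    # e.g. [-128, 127] for char (int8) or [0, 255] for byte (uint8).
    return max(quant_min, min(quant_max, q))


def quantize_per_tensor(values: Sequence[float], scale: float, zero_point: int,
                        quant_min: int, quant_max: int) -> List[int]:
    """One scale and zero_point shared by every element of the tensor."""
    return [quantize_value(v, scale, zero_point, quant_min, quant_max)
            for v in values]


def quantize_per_token(tokens: Sequence[Sequence[float]],
                       scales: Sequence[float],
                       zero_points: Sequence[int],
                       quant_min: int, quant_max: int) -> List[List[int]]:
    """One scale and zero_point per token (here: per row of a 2-D input)."""
    if not (len(tokens) == len(scales) == len(zero_points)):
        raise ValueError("need exactly one scale and zero_point per token")
    return [quantize_per_tensor(row, s, zp, quant_min, quant_max)
            for row, s, zp in zip(tokens, scales, zero_points)]


# Hypothetical inputs quantized to the int8 range.
per_tensor = quantize_per_tensor([0.0, 1.0, -1.5, 100.0], 0.5, 10, -128, 127)
per_token = quantize_per_token([[1.0, 2.0], [1.0, 2.0]],
                               [0.5, 0.25], [0, -5], -128, 127)
```

In this sketch the last per-tensor input (100.0) saturates at `quant_max`, and the two identical rows passed to `quantize_per_token` produce different outputs because each row uses its own scale and zero point.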


          [ET-VK][Ops] quantization op shaders and impl

0c9c7a6

ahmtox requested a review from SS-JIA as a code owner

June 4, 2025 18:03

This was referenced Jun 4, 2025

[ET-VK] double, short, and uint16 dtype runtime support #11365

Merged

[ET-VK][Ops] quantize ops skeleton test framework #11366

Merged

[ET-VK][Ops] quantize_per_token.default test setup #11367

Merged

pytorch-bot bot commented Jun 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11369

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 9dc4092 with merge base 3b1c7fd:

NEW FAILURE - The following job has failed:

Build Presets / linux (pybind, linux.2xlarge, executorch-ubuntu-22.04-clang12) / build (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ahmtox mentioned this pull request

[ET-VK][Ops] quantize_per_tensor.default test setup #11368

Merged

ahmtox pushed a commit that referenced this pull request


          [ET-VK][Ops] quantization op shaders and impl

36f2cb5

ghstack-source-id: 288187842
Pull Request resolved: #11369

facebook-github-bot added the CLA Signed label

Contributor

facebook-github-bot commented Jun 4, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064

facebook-github-bot added the fb-exported label


          Update on "[ET-VK][Ops] quantization op shaders and impl"

f2c2380

This was referenced Jun 9, 2025

[ET] enabling half dtype input for quantization #11479

Merged

[ET-VK][Ops] dequantize ops skeleton test framework #11480

Merged

[ET-VK][Ops] dequantize_per_tensor.default test setup #11481

Merged

[ET-VK][Ops] dequantize_per_token.default test setup #11482

Merged

[ET-VK][Ops] dequantization op shaders and impl #11483

Merged


          Update on "[ET-VK][Ops] quantization op shaders and impl"

e2cb320


          Update on "[ET-VK][Ops] quantization op shaders and impl"

cb4bcfe

This was referenced Jun 11, 2025

[ET] enabling half dtype output for dequantization and making logic consistent #11552

Merged

[ET-VK][Ops] enabling double support for quantization and dequantization ops #11553

Merged

[ET-VK][Ops] choose_qparams ops skeleton test framework #11554

Merged

[ET-VK][Ops] choose_qparams.tensor test setup #11555

Merged

[ET-VK][Ops] choose_qparams_per_token_asymmetric.default test setup #11556

Merged

[ET-VK][Ops] choose_qparams op shaders and impl #11557

Merged


          Update on "[ET-VK][Ops] quantization op shaders and impl"

3615a76

ahmtox mentioned this pull request

[ET-VK][Ops] common test utils for converting aten types to vulkan types #11575

Merged


          Update on "[ET-VK][Ops] quantization op shaders and impl"

9f7d105


          Update on "[ET-VK][Ops] quantization op shaders and impl"

d49d3a2


          Update on "[ET-VK][Ops] quantization op shaders and impl"

499dbfd


          Update on "[ET-VK][Ops] quantization op shaders and impl"

de2298b


          Update on "[ET-VK][Ops] quantization op shaders and impl"

06734c3

SS-JIA approved these changes


          Update on "[ET-VK][Ops] quantization op shaders and impl"

10bcfe7


          Update on "[ET-VK][Ops] quantization op shaders and impl"

67e425b


          Update on "[ET-VK][Ops] quantization op shaders and impl"

15a7258


          Update on "[ET-VK][Ops] quantization op shaders and impl"

c09bd60

SS-JIA approved these changes

SS-JIA approved these changes


          Update on "[ET-VK][Ops] quantization op shaders and impl"

9dc4092

facebook-github-bot merged commit 4ffc98a into gh/ahmtox/11/base

95 of 98 checks passed

facebook-github-bot deleted the gh/ahmtox/11/head branch

June 17, 2025 22:03

facebook-github-bot temporarily deployed to cherry-pick-bot with GitHub Actions (Inactive)

June 17, 2025 22:03

pytorchbot mentioned this pull request

[ET-VK][Ops] quantization op shaders and impl #11767

Merged

cccclai pushed a commit that referenced this pull request


          [ET-VK][Ops] quantization op shaders and impl (#11767)

d984a2c

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #11369 by @ahmtox
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/11/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/11/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/11/orig
@diff-train-skip-merge

Co-authored-by: morelos <[email protected]>

Labels

CLA Signed, fb-exported, release notes: vulkan