Conversation

@jerryzh168 (Contributor)

Summary:
As titled: this adds support for quantizing conv3d weights with Float8DynamicActivationFloat8WeightConfig.

API:

```
config = Float8DynamicActivationFloat8WeightConfig(
    granularity=PerTensor(),
)

_is_conv3d = lambda m, fqn: isinstance(m, torch.nn.Conv3d)

quantize_(quantized_model, config, filter_fn=_is_conv3d)
```
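For context, a fuller end-to-end sketch (the model definition, shapes, and import paths below are illustrative assumptions, not code from this PR):

```
# Minimal sketch, assuming the public torchao APIs; the Conv3d model
# and its shapes are hypothetical stand-ins.
import torch
from torchao.quantization import (
    Float8DynamicActivationFloat8WeightConfig,
    PerTensor,
    quantize_,
)

class TinyConv3dModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # conv3d weight is 5-D: (out_channels, in_channels, kD, kH, kW)
        self.conv = torch.nn.Conv3d(16, 32, kernel_size=3)

    def forward(self, x):
        return self.conv(x)

model = TinyConv3dModel().to(torch.bfloat16).cuda()
config = Float8DynamicActivationFloat8WeightConfig(granularity=PerTensor())

# Restrict quantization to conv3d modules only.
_is_conv3d = lambda m, fqn: isinstance(m, torch.nn.Conv3d)
quantize_(model, config, filter_fn=_is_conv3d)

x = torch.randn(1, 16, 8, 8, 8, dtype=torch.bfloat16, device="cuda")
out = model(x)
```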

Test Plan:
pytest test/quantization/quantize_/workflows/float8/test_float8_tensor.py -k test_fp8_conv_variants

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot bot commented Oct 20, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3215

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 94c2e60 with merge base 7e5d907:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the "CLA Signed" label on Oct 20, 2025
@jerryzh168 (Contributor, Author)

Currently waiting for the fbgemm conv op to be available in the nightly build:

```
uv pip install --pre fbgemm-gpu-genai --index-url https://download.pytorch.org/whl/nightly/cu128
```
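A quick way to confirm the wheel installed and registers its torch ops (a rough sketch; specific conv op names are not asserted here since they may change across nightlies):

```
# Importing fbgemm_gpu registers its operators under torch.ops.fbgemm.
import torch
import fbgemm_gpu  # noqa: F401

# The ops namespace object; individual ops resolve lazily on access.
print(torch.ops.fbgemm)
```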

@jerryzh168 jerryzh168 changed the title from "Add per tensor fp8 quantization support conv3d" to "Add per tensor fp8 quantization support for conv3d" on Oct 21, 2025
@jerryzh168 jerryzh168 force-pushed the add-fp8-conv-weight-quant-support branch 4 times, most recently from 2569be1 to b5c8ca5 on October 27, 2025 20:53
@jerryzh168 jerryzh168 added the "topic: new feature" label on Oct 27, 2025
@jerryzh168 jerryzh168 force-pushed the add-fp8-conv-weight-quant-support branch from b5c8ca5 to 2ccc619 on October 27, 2025 20:54
@jerryzh168 jerryzh168 marked this pull request as ready for review on October 27, 2025 20:55
@jerryzh168 jerryzh168 force-pushed the add-fp8-conv-weight-quant-support branch from 2ccc619 to 4c6e979 on October 28, 2025 00:35
```diff
 activation_granularity, weight_granularity = granularity

-if not _fp8_mm_compat(weight):
+if weight.dim() != 5 and not _fp8_mm_compat(weight):
```
Contributor:

5 seems a bit arbitrary here without the context, should we add a comment that this is for conv3d?

Contributor (Author):

OK added
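For readers without the diff context, the 5 comes from conv3d weight dimensionality, e.g. (a small illustration, not PR code):

```
import torch

# conv3d weights are 5-D: (out_channels, in_channels, kD, kH, kW),
# which is what the weight.dim() != 5 guard above special-cases.
w = torch.nn.Conv3d(16, 32, kernel_size=3).weight
assert w.dim() == 5  # shape: (32, 16, 3, 3, 3)
```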

"is_MI300",
"is_sm_at_least_89",
"is_sm_at_least_90",
"is_sm_at_least_100",
Contributor:

not super related to this PR, but I wonder if we should stop exposing these; we don't expect users to call these themselves

Contributor (Author):

oh, what should users call?

Contributor (Author), Oct 30, 2025:

oh should we use is_sm_version? but I think we need is_sm_at_least

Contributor:

oh I mean should users even call these helper functions? They know what GPUs they're running on. If they really want to check then maybe they should just check torch.cuda.get_device_capability() >= (10, 0) themselves instead of importing our utils. I'd like to keep them private if possible (in a separate PR)

Contributor (Author):

oh I see, makes sense, yeah this should just be dev only, not user facing
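To illustrate the suggestion (a sketch of the user-side alternative, not code from this PR):

```
import torch

# Instead of importing torchao's is_sm_at_least_100 helper, users can
# check the CUDA compute capability directly.
if torch.cuda.is_available() and torch.cuda.get_device_capability() >= (10, 0):
    print("SM 10.0+ detected")
```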

```
stride,
padding,
dilation,
):
```
Contributor:

do we also need to check kernel preference? Like if it's "torch", maybe we should throw an exception since we don't support that yet?
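A sketch of the suggested guard (hypothetical: the enum below is a stand-in mirroring torchao's KernelPreference, and the function name is invented for illustration):

```
from enum import Enum

class KernelPreference(Enum):  # stand-in for torchao's enum
    AUTO = "auto"
    TORCH = "torch"
    FBGEMM = "fbgemm"

def _check_conv_kernel_preference(pref: KernelPreference) -> None:
    # Only the fbgemm conv kernel is implemented so far, so an explicit
    # "torch" preference should fail loudly rather than silently fall back.
    if pref is KernelPreference.TORCH:
        raise NotImplementedError(
            "torch kernel preference is not supported for fp8 conv3d yet"
        )
```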

@jerryzh168 jerryzh168 force-pushed the add-fp8-conv-weight-quant-support branch 2 times, most recently from 3339fd7 to 642cec4 on October 30, 2025 22:40
@jerryzh168 jerryzh168 force-pushed the add-fp8-conv-weight-quant-support branch from 642cec4 to 94c2e60 on October 31, 2025 00:58
@jerryzh168 jerryzh168 merged commit 258387a into main Oct 31, 2025
18 of 19 checks passed
namgyu-youn pushed a commit to namgyu-youn/ao that referenced this pull request Nov 21, 2025
Add per tensor fp8 quantization support conv3d
