Commit 5bd4e3b

Update
[ghstack-poisoned]
1 parent 821bd2b commit 5bd4e3b

1 file changed: +1 -0 lines changed

torchao/testing/training/roofline_utils.py

Lines changed: 1 addition & 0 deletions
@@ -207,6 +207,7 @@ def get_tensor_memory_traffic_ovhd_s(
         # across dim0 and dim1. input and grad_output still 1x32.
 
         if tensor_role in ("input", "grad_output"):
+            # TODO(future): update all of the mx rooflines to just read once
             # kernel 1: x_bf16 -> x_mxfp8_dim0
             # kernel 2: x_bf16 -> x_mxfp8_dim1
             if fuse_with_prev:
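
For context on the TODO, here is a rough sketch of the byte counting it seems to refer to: two separate cast kernels each read the bf16 input, whereas a "read once" model would read it a single time and still write both mxfp8 outputs. This is not the code in `roofline_utils.py`. The function name `mx_cast_bytes` and its parameters are hypothetical, the `fuse_with_prev` branch is ignored, and the sketch assumes 1-byte fp8 elements with one 1-byte scale per 1x32 block.

```python
# Hypothetical sketch, not torchao's implementation.
# Assumptions: bf16 = 2 bytes/element, fp8 = 1 byte/element,
# one 1-byte scale per 32-element block (1x32 scaling).
BYTES_BF16 = 2
BYTES_FP8 = 1
BYTES_SCALE = 1
MX_BLOCK = 32


def mx_cast_bytes(numel: int, read_once: bool) -> int:
    """Estimate bytes moved when casting x_bf16 to x_mxfp8_dim0 and x_mxfp8_dim1."""
    # Each output holds the fp8 data plus one scale per block.
    write_one = numel * BYTES_FP8 + (numel // MX_BLOCK) * BYTES_SCALE
    # Two separate kernels read the bf16 input twice; a single pass reads it once.
    num_reads = 1 if read_once else 2
    reads = num_reads * numel * BYTES_BF16
    # Both layouts (dim0 and dim1) are written in either case.
    writes = 2 * write_one
    return reads + writes


two_kernels = mx_cast_bytes(1024, read_once=False)
single_read = mx_cast_bytes(1024, read_once=True)
```

Under these assumptions, 1024 elements cost 6208 bytes with two reads and 4160 bytes with one, so the difference is exactly one extra bf16 read of the input.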
