torch.backends.cuda.matmul.fp32_precision
1 parent a154c20 commit 117f11a
torch/_inductor/kernel/flex/flex_attention.py
@@ -54,6 +54,8 @@ def flex_attention_grid(batch_size, q_heads, num_queries, d_model, meta, *, cdiv
 def get_float32_precision():
     if (
         torch.backends.cuda.matmul.fp32_precision == "ieee"
+        if torch.backends.cuda.matmul.fp32_precision != "none"
+        else torch.get_float32_matmul_precision() == "highest"
         or torch.version.hip
         or torch.mtia.is_available()
     ):
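
For readers following along, here is a minimal standalone sketch of the condition the two added lines implement. The helper name wants_full_fp32_matmul is hypothetical; only the torch calls come from the diff. Note that Python's conditional expression binds loosest, so A if B else C or D or E groups as A if B else (C or D or E): the ROCm and MTIA checks only participate in the legacy fallback branch.

import torch

# Hypothetical helper sketching the added condition (assumes a recent
# PyTorch build that ships the fp32_precision setting).
def wants_full_fp32_matmul() -> bool:
    if torch.backends.cuda.matmul.fp32_precision != "none":
        # New per-backend API: "ieee" requests true fp32 matmuls.
        return torch.backends.cuda.matmul.fp32_precision == "ieee"
    # Legacy global knob, with full precision also forced on ROCm
    # (torch.version.hip is a version string there, None on CUDA builds)
    # and when MTIA is available.
    return (
        torch.get_float32_matmul_precision() == "highest"
        or bool(torch.version.hip)
        or torch.mtia.is_available()
    )

The sketch preserves the parse of the original expression: when the new per-backend setting has been set explicitly, it alone decides the outcome, and the ROCm/MTIA overrides are not consulted.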