fix: handling of default attrs in SimplifiedLayerNormalization + LayerNormalization🐛 #2396
Conversation
Codecov Report. Attention: Patch coverage is …

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #2396      +/-   ##
==========================================
- Coverage   70.37%   69.91%   -0.47%
==========================================
  Files         199      200       +1
  Lines       25216    25470     +254
  Branches     2686     2688       +2
==========================================
+ Hits        17747    17807      +60
- Misses       6540     6735     +195
+ Partials      929      928       -1
skip_sum_pattern_2 = op.Add(input, skip)
skip_sum = pattern.OrValue([skip_sum_pattern_1, skip_sum_pattern_2], name="skip_sum")

skip_sum = op.Add(input, skip)
if self._has_bias and not self._bias_pre_add:
    skip_sum = op.Add(skip_sum, bias)
I chose to enable commute(...), since we didn't check for all variants in this addition, only in the lines above.
I am curious, did you see patterns that this missed? In principle this is ok, but it could increase the fusion time. We also need to update the implementation of commute() to make use of pattern-disjunction, which will be more efficient.
Thanks for asking, I've only looked at a limited number of models so far, which all had the bias term as second input. I can undo this change to avoid performance regressions 👍
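For reference, here is a minimal sketch of what the explicit disjunction could look like for the bias addition, mirroring the `pattern.OrValue` handling of `skip_sum` above instead of relying on `commute()`; the function and value names are illustrative, not the PR's actual code.

```python
from onnxscript.rewriter import pattern


def skip_sum_with_bias_pattern(op, input, skip, bias):
    # Match Add(input, skip) in either operand order via explicit disjunction.
    skip_sum_1 = op.Add(input, skip)
    skip_sum_2 = op.Add(skip, input)
    skip_sum = pattern.OrValue([skip_sum_1, skip_sum_2], name="skip_sum")
    # Likewise match the post-add bias in either operand order, so no
    # commute() pass over the whole rule is needed for this node.
    biased_1 = op.Add(skip_sum, bias)
    biased_2 = op.Add(bias, skip_sum)
    return pattern.OrValue([biased_1, biased_2], name="biased_skip_sum")
```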
Pull Request Overview
This PR fixes how default attributes (`epsilon`, `stash_type`) are handled in both `LayerNormalization` and `SimplifiedLayerNormalization` fusions, adds a BART encoder model to the fusion tests, and introduces commuted-input support for `SkipLayerNormalization` rules.
- Extract the default `epsilon` from the matched node instead of requiring it in the pattern signature (a sketch follows this list)
- Add `test_bart_encoder` to validate fusion with default-attribute cases
- Enable commuted-input variants by applying `.commute()` to fusion rules
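A minimal sketch of the first bullet, assuming `attributes.get_float` returns `None` for a missing attribute; the rewrite-function name and the exact replacement call are illustrative rather than the PR's code:

```python
def rewrite_skip_simplified_layer_norm(op, input, skip, gamma, simplified_layer_norm, **_):
    # epsilon is no longer bound in the pattern signature; read it off the
    # matched node and fall back to the ONNX default when it was omitted.
    epsilon = simplified_layer_norm.producer().attributes.get_float("epsilon")
    if epsilon is None:
        epsilon = 1e-5  # ONNX default epsilon for (Simplified)LayerNormalization
    return op.SkipSimplifiedLayerNormalization(
        input, skip, gamma, epsilon=epsilon, _domain="com.microsoft"
    )
```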
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
File | Description
---|---
skip_normalization_test.py | Added `test_bart_encoder` to cover default-attribute fusions
skip_normalization.py | Refactored patterns to drop default attrs, extract `epsilon` in the rewrite, and apply rule commutation
Comments suppressed due to low confidence (2)
onnxscript/rewriter/ort_fusions/skip_normalization_test.py:73

The test uses `fuse_skip_layer_normalization(model)` but there is no import for that symbol in this file. Please add `from onnxscript.rewriter.ort_fusions.skip_normalization import fuse_skip_layer_normalization` (or adjust the import path) to ensure the function is available.

    fuse_skip_layer_normalization(model)
onnxscript/rewriter/ort_fusions/skip_normalization.py:231

The new `.commute()` calls are applied only to the full `SkipLayerNormalization` rules. To allow commuted inputs for `SkipSimplifiedLayerNormalization` as well, you should apply `.commute()` to the simplified-layer ruleset (if defined) or include those rules here before applying `apply_fusion_rules`.

    skip_layer_normalization_ruleset = pattern.RewriteRuleSet(
    **_,
):
    epsilon = simplified_layer_norm.producer().attributes.get_float("epsilon")
You extract `epsilon` from the matched node but do not extract or forward `stash_type`. If a non-default `stash_type` was used, it will be lost in the fused op. Consider retrieving `stash_type = simplified_layer_norm.producer().attributes.get_int("stash_type")` and passing it into `SkipSimplifiedLayerNormalization`.

Suggested change:

    epsilon = simplified_layer_norm.producer().attributes.get_float("epsilon")
    stash_type = simplified_layer_norm.producer().attributes.get_int("stash_type")
I guess there is no stash type for fused layer norm ops? https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#com.microsoft.SkipLayerNormalization
I assume you mean also for SimplifiedLayerNorm and SkipSimplifiedLayerNorm? Unfortunately, I don't see the doc for the first op. But if it is absent in both ops, it seems safe to ignore it.
Let's consider the two cases separately:
(a) For LayerNorm (the rule down below), we are starting with the ONNX op, which supports stash_type. But the SkipLayerNorm in ORT doesn't seem to support stash_type. Is my understanding correct? If so, the rewrite should have a check condition to see if the stash_type has the value supported by the SkipLayerNorm ... if not, we should skip the optimization.
(b) For SimplifiedLayerNorm, if stash_type is not supported by either op, we can ignore it.
However, for (a): we should understand what the default behavior of the ORT op is: does it use stash_type == FP32, or does it use stash_type == input-type? The two are different.
Thank you. 👍 Yes, neither `SkipLayerNorm` nor `SkipSimplifiedLayerNorm` seems to support an external stash_type (see op.LayerNormalization + SkipLayerNormalization). I'm not very proficient with C/C++, but to my understanding the internal precision for the statistics depends on the `strict` mode and on the input type `T`, e.g., float or bfloat16. In strict mode the computation should be done in fp32; in non-strict mode, the precision follows the input precision: https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/contrib_ops/cuda/bert/skip_layer_norm.cc
@gramalingam Should we then check for stash_type=None or 1 to be safe?
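A sketch of such a guard, assuming a standalone condition function in the style of the pattern rewriter (the function name and exact signature are hypothetical): fuse only when `stash_type` is absent or equals 1 (FLOAT), since the ORT contrib op exposes no `stash_type` attribute.

```python
def _stash_type_is_default(context, layer_norm, **_):
    # Hypothetical check condition: skip the fusion unless the matched
    # LayerNormalization stashes statistics in FLOAT (the ONNX default),
    # because SkipLayerNormalization cannot express any other stash_type.
    stash_type = layer_norm.producer().attributes.get_int("stash_type")
    return stash_type is None or stash_type == 1  # 1 == TensorProto.FLOAT
```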
if self._has_bias and not self._bias_pre_add:
    skip_sum = op.Add(skip_sum, bias)

normalized = op.LayerNormalization(
    skip_sum,
    gamma,
    beta,
@gramalingam `beta` is an optional input. I'd lean toward matching both variants (with and without bias).
Sorry, do you mean all 4 combinations (with and without beta, with and without bias)? That can be done by dropping beta here and specifying `_allow_other_inputs=True`. The rewriter should then forward the corresponding inputs to the rewritten node.
Sorry, I meant with and without `beta`. I'm wondering, if I drop `beta` here, would it still be forwarded correctly, since beta is the third input to `LayerNormalization` but the fourth input to `SkipLayerNormalization`? (https://onnx.ai/onnx/operators/onnx__LayerNormalization.html and https://github.com/microsoft/onnxruntime/blob/rel-1.20.0/docs/ContribOperators.md#commicrosoftskiplayernormalization) In my quick tests, I got different outputs.
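For illustration, a hypothetical rewrite that places each operand explicitly in the fused op, which is one way to avoid the index mismatch (beta is `LayerNormalization`'s third input but `SkipLayerNormalization`'s fourth); this is a sketch, not the PR's code, and it omits attributes such as epsilon.

```python
def rewrite_skip_layer_norm(op, input, skip, gamma, beta=None, bias=None, **_):
    # Explicit operand placement for com.microsoft.SkipLayerNormalization:
    # (input, skip, gamma, beta, bias). Blind positional forwarding from
    # LayerNormalization (X, Scale, B) would drop beta into the wrong slot.
    return op.SkipLayerNormalization(
        input,
        skip,
        gamma,
        beta,   # optional; LayerNormalization's B
        bias,   # optional; bias on the skip path
        _domain="com.microsoft",
    )
```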
`SkipLayerNormFusion` currently does not fuse ops if `stash_type` is at its default (=1) or `epsilon` is at its default (=1e-5) for `LayerNormalization` and `SimplifiedLayerNormalization`.

This PR:
- `LayerNormalization`, `SimplifiedLayerNormalization`
- `EmbedLayerNormalization`

Closes #2378.
@shubhambhokare1 @justinchuby Could you please review? Any feedback is greatly appreciated.
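For completeness, a usage sketch of running this fusion over a model, assuming the import path suggested in the review comment above and that the fusion entry point takes an `onnxscript.ir` model in place (file names are placeholders):

```python
import onnx
from onnxscript import ir
from onnxscript.rewriter.ort_fusions.skip_normalization import fuse_skip_layer_normalization

# Load an ONNX model and convert it to the onnxscript IR.
model = ir.serde.deserialize_model(onnx.load("bart_encoder.onnx"))  # placeholder file name
fuse_skip_layer_normalization(model)  # apply the SkipLayerNormalization fusion rules
onnx.save(ir.serde.serialize_model(model), "bart_encoder_fused.onnx")
```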