Skip to content

[mlir][Transforms] Dialect conversion: Fix bug in UnresolvedMaterializationRewrite rollback #105949

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 29, 2024

Conversation

matthias-springer
Copy link
Member

When an unresolved materialization (unrealized_conversion_cast op) is rolled back, the mapping should be rolled back as well, regardless of whether it is a source, target or argument materialization. Otherwise, we accumulate pointers to erased IR in the mapping. This is harmless in most cases, but can cause issues when a new operation is allocated at the same memory location and the pointer is "reused".

It is not possible to write a test case for this because I cannot trigger the pointer reuse programmatically.

@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir labels Aug 24, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 24, 2024

@llvm/pr-subscribers-mlir-core

@llvm/pr-subscribers-mlir

Author: Matthias Springer (matthias-springer)

Changes

When an unresolved materialization (unrealized_conversion_cast op) is rolled back, the mapping should be rolled back as well, regardless of whether it is a source, target or argument materialization. Otherwise, we accumulate pointers to erased IR in the mapping. This is harmless in most cases, but can cause issues when a new operation is allocated at the same memory location and the pointer is "reused".

It is not possible to write a test case for this because I cannot trigger the pointer reuse programmatically.


Full diff: https://github.com/llvm/llvm-project/pull/105949.diff

1 Files Affected:

  • (modified) mlir/lib/Transforms/Utils/DialectConversion.cpp (+2-4)
diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index 4058ed39621198..11e40b74b00424 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -1062,10 +1062,8 @@ void CreateOperationRewrite::rollback() {
 }
 
 void UnresolvedMaterializationRewrite::rollback() {
-  if (getMaterializationKind() == MaterializationKind::Target) {
-    for (Value input : op->getOperands())
-      rewriterImpl.mapping.erase(input);
-  }
+  for (Value input : op->getOperands())
+    rewriterImpl.mapping.erase(input);
   op->erase();
 }
 

Copy link
Member

@zero9178 zero9178 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine to me, as in being a step in the right direction, but it doesn't seem guaranteed to me that the input operand was actually the operand that was originally mapped to the materilization:

mapping.map(mapping.lookupOrDefault(newOperand), castValue);

That said, this seems to be more of a pre-existing issue to me.

@matthias-springer
Copy link
Member Author

This seems fine to me, as in being a step in the right direction, but it doesn't seem guaranteed to me that the input operand was actually the operand that was originally mapped to the materilization:

mapping.map(mapping.lookupOrDefault(newOperand), castValue);

That said, this seems to be more of a pre-existing issue to me.

You are right, it's not a perfect rollback and there are probably other issues that are not fixed by this PR.

It's actually not just that line of code that you pointed out. Here's another one:

rewriterImpl.mapping.map(result, castValue);

I think this is due to the fact that:

  • An SSA value may be converted to multiple different types. E.g., different patterns with different type converters may create multiple conversions of the same SSA value to different types during remapValues. (And we want to convert the original value, not an already converted value.)
  • The mapping data structure is a "flat" IRMapping. An SSA value can be mapped only to one other SSA value. We emulate the fact that there may be multiple conversions by chaining mappings. E.g., the first conversion will map to the second conversion. Kind of like a linked list. When we should really have a mapping of Value to SmallVector<Value>.

I'm thinking about fixing that. But not sure if it's worth it, because ultimately I want to get to a state where all IR changes are materialized immediately, and there is no need for a mapping anymore.

(The purpose of this change was mostly to cut down on code size. Bonus points if it fixes something. Anything that reduces complexity and the lines of code will make it easier to maintain the code base and/or build a new conversion driver.)

@matthias-springer matthias-springer force-pushed the users/matthias-springer/rollback_unmat branch from 0b9a69b to 6a76672 Compare November 28, 2024 06:06
@matthias-springer
Copy link
Member Author

matthias-springer commented Nov 28, 2024

@zero9178 I never merged this PR, but this issue was now triggered by a Google-internal test case when I asked @jpienaar to test #116470 internally before merging it. (I originally put these changes into #116470, but I think it's better to merge two smaller PRs with a day in-between.)

My original fix may not have been enough. (Or maybe it was, I don't want to take any chances...) This is now rolling back the mapping exactly to the previous state, addressing your comment here. I'm still not able to write a test case for this. (Because ReplaceOperationRewrite::rollback() and ReplaceOperationRewrite::rollback() also roll back the mapping. The dangling mapping is for target materialization that converts the result of an argument materialization.) I think this can only be triggered by accidental pointer reuse.

Can you take an other brief look at this before I merge it?

Copy link
Member

@zero9178 zero9178 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you! Greatly prefer this more principled approache.

One minor possible enhancement in the comment.

@matthias-springer matthias-springer merged commit f2d500c into main Nov 29, 2024
8 checks passed
@matthias-springer matthias-springer deleted the users/matthias-springer/rollback_unmat branch November 29, 2024 02:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mlir:core MLIR Core Infrastructure mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants