Skip to content

Conversation

JDevlieghere
Copy link
Member

Currently, two SupportFiles with the same FileSpec are considered different if one of them has a Checksum and the other doesn't. However, this is overly strict. It's totally valid to mix LineTables that do and do not contain Checksums. This patch makes it so that the Checksum is only compared if both SupportFiles have a valid Checksum.

Currently, two SupportFiles with the same FileSpec are considered
different if one of them has a Checksum and the other doesn't. However,
this is overly strict. It's totally valid to mix LineTables that do and
do not contain Checksums. This patch makes it so that the Checksum is
only compared if both SupportFiles have a valid Checksum.
@JDevlieghere
Copy link
Member Author

Credit to @jasonmolenda for uncovering this bug.

@llvmbot
Copy link
Member

llvmbot commented Jun 14, 2024

@llvm/pr-subscribers-lldb

Author: Jonas Devlieghere (JDevlieghere)

Changes

Currently, two SupportFiles with the same FileSpec are considered different if one of them has a Checksum and the other doesn't. However, this is overly strict. It's totally valid to mix LineTables that do and do not contain Checksums. This patch makes it so that the Checksum is only compared if both SupportFiles have a valid Checksum.


Full diff: https://github.com/llvm/llvm-project/pull/95606.diff

1 Files Affected:

  • (modified) lldb/include/lldb/Utility/SupportFile.h (+5-1)
diff --git a/lldb/include/lldb/Utility/SupportFile.h b/lldb/include/lldb/Utility/SupportFile.h
index 7505d7f345c5d..d65156cea768f 100644
--- a/lldb/include/lldb/Utility/SupportFile.h
+++ b/lldb/include/lldb/Utility/SupportFile.h
@@ -30,8 +30,12 @@ class SupportFile {
 
   virtual ~SupportFile() = default;
 
+  /// Return true if both SupportFiles have the same FileSpec and, if both have
+  /// a valid Checksum, the Checksum is the same.
   bool operator==(const SupportFile &other) const {
-    return m_file_spec == other.m_file_spec && m_checksum == other.m_checksum;
+    if (m_checksum && other.m_checksum)
+      return m_file_spec == other.m_file_spec && m_checksum == other.m_checksum;
+    return m_file_spec == other.m_file_spec;
   }
 
   bool operator!=(const SupportFile &other) const { return !(*this == other); }

Copy link
Collaborator

@jimingham jimingham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous check does seem overly restrictive.

LGTM

Copy link
Collaborator

@jasonmolenda jasonmolenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for coming up with the right fix, this looks good to me.

@JDevlieghere
Copy link
Member Author

Thanks for the overwhelming support :-)

@JDevlieghere JDevlieghere merged commit 19ad7a0 into llvm:main Jun 14, 2024
@JDevlieghere JDevlieghere deleted the checksum branch June 14, 2024 21:25
JDevlieghere added a commit to JDevlieghere/llvm-project that referenced this pull request Jun 14, 2024
When Jason was looking into the issue caused by llvm#95606 he suggested
using the Checksum from the original file in LineEntry. I like the idea
because it makes sense semantically, but also allows us to get rid of
the Update method and ensures we make a new copy, in case someone else
is holding onto the old SupportFile.
@labath
Copy link
Collaborator

labath commented Jun 17, 2024

Thanks for the overwhelming support :-)

I hate to ruin a party, but I don't think this is a good use of operator==, precisely because it does not define an equivalence relation (it's not transitive). Might I suggest named function instead? IsCompatible ?

JDevlieghere added a commit that referenced this pull request Jun 17, 2024
When Jason was looking into the issue caused by #95606 he suggested
using the Checksum from the original file in LineEntry. I like the idea
because it makes sense semantically, but also allows us to get rid of
the Update method and ensures we make a new copy, in case someone else
is holding onto the old SupportFile.
@jasonmolenda
Copy link
Collaborator

Thanks for the overwhelming support :-)

I hate to ruin a party, but I don't think this is a good use of operator==, precisely because it does not define an equivalence relation (it's not transitive). Might I suggest named function instead? IsCompatible ?

The place I hit a problem with this was in ThreadPlanStepRange::InRange where we're doing

    if (m_addr_context.line_entry.IsValid() &&
        new_context.line_entry.IsValid()) {
      if (*m_addr_context.line_entry.original_file_sp ==
          *new_context.line_entry.original_file_sp) {
        if (m_addr_context.line_entry.line == new_context.line_entry.line) {
          m_addr_context = new_context;
          const bool include_inlined_functions =
              GetKind() == eKindStepOverRange;
...

and only one of the SupportFile's here had a checksum, so the == test failed (and an inlined stepping bug came up)

If I read m_addr_context.line_entry.original_file_sp->IsCompatible(new_context.line_entry.original_file_sp) I would think "Oh is this considering maybe a remapped filename?" or something, I'm not sure I would see that as "the same file, ignoring checksum if only one has it".

There are probbly some other uses of this operator== and maybe I shouldn't focus on this one, but I'm trying to think of how the intention could be clearly expressed with a method name here.

@jasonmolenda
Copy link
Collaborator

I hate to ruin a party, but I don't think this is a good use of operator==, precisely because it does not define an equivalence relation (it's not transitive). Might I suggest named function instead? IsCompatible ?

I'd be fine with SupportFile::IsSameFile(SupportFile&), I think that would make it clear what is being evaluated in caller locations. But is it any different than operator==, which you'd expect to behave the same?

@labath
Copy link
Collaborator

labath commented Jun 19, 2024

I think that operator== should represent the strictest possible way to compare two objects, and ideally mean "the two objects are interchangable". Interchangeability implies transitivity, and that doesn't hold here because an object with no checksum can match two other objects with different checksums, but those two objects won't match each other directly.

This can have implications for this bug as well. I don't know how we can end up in a different line table here, but if that can happen twice, and the middle line table has no checksum, then it can make a difference in whether you compare to the previous line table, or the initial/starting one. (I don't think this is likely to happen, I'm just using it to illustrate the point.)

With that in mind, I think IsSameFile is still slightly better than operator== (because it's harder to use it by accident, so you're more likely to look up the definition and see the comment next to it -- which hopefully explains the subtlety), but it still implies some form of interchangability. And fundamentally, I think it's not correct -- without the checksum (and maybe even with it) we can't definitively say whether the two objects describe the same file. (we may decide we're going to treat them as if they were, but I think that's something different)

Unfortunately, I don't know of a concise way to express this difference, maybe because the concept is not concise to begin with. IsProbablySameFile, MayDescribeSameFile, PotentiallySameFile ?

@jasonmolenda
Copy link
Collaborator

Unfortunately, I don't know of a concise way to express this difference, maybe because the concept is not concise to begin with. IsProbablySameFile, MayDescribeSameFile, PotentiallySameFile ?

I was thinking about this too, and the best idea I had was IsProbablySame.

@JDevlieghere
Copy link
Member Author

IsProbablySame works for me

JDevlieghere added a commit to JDevlieghere/llvm-project that referenced this pull request Jun 29, 2024
This is an improved attempt to improve the semantics of SupportFile
equivalence, taking into account the feedback from llvm#95606.

Pavel's comment about the lack of a concise name because the concept
isn't trivial made me realize that I don't want to abstract this concept
away behind a helper function. Instead, I opted for a rather verbose
enum that forces the caller to consider exactly what kind of comparison
is appropriate for every call.
JDevlieghere added a commit that referenced this pull request Jul 1, 2024
This is an improved attempt to improve the semantics of SupportFile
equivalence, taking into account the feedback from #95606.

Pavel's comment about the lack of a concise name because the concept
isn't trivial made me realize that I don't want to abstract this concept
away behind a helper function. Instead, I opted for a rather verbose
enum that forces the caller to consider exactly what kind of comparison
is appropriate for every call.
cpiaseque pushed a commit to cpiaseque/llvm-project that referenced this pull request Jul 3, 2024
This is an improved attempt to improve the semantics of SupportFile
equivalence, taking into account the feedback from llvm#95606.

Pavel's comment about the lack of a concise name because the concept
isn't trivial made me realize that I don't want to abstract this concept
away behind a helper function. Instead, I opted for a rather verbose
enum that forces the caller to consider exactly what kind of comparison
is appropriate for every call.
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
This is an improved attempt to improve the semantics of SupportFile
equivalence, taking into account the feedback from llvm#95606.

Pavel's comment about the lack of a concise name because the concept
isn't trivial made me realize that I don't want to abstract this concept
away behind a helper function. Instead, I opted for a rather verbose
enum that forces the caller to consider exactly what kind of comparison
is appropriate for every call.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants