[C++20][Modules] Fix crash when function and lambda inside loaded from different modules #109167

Merged

dmpolukhin
Contributor

Summary:
Because AST loading code is lazy and happens in unpredictable order, a function and a lambda inside it can be loaded from different modules. As a result, the DeclRefExpr for a captured variable won’t match the corresponding VarDecl inside the function. This situation is reflected in the AST as follows (the DeclRefExpr refers to Var 0x555564f7d0c8, while the VarDecl in the function body is 0x555564f7cef8):

FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline
|-also in ./folly-conv.h
`-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1>
  |-DeclStmt 0x555564f7ced8 <line:34:3, col:17>
  | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit
  |   `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0
  |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>'
  | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0
  | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing
  | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)'
  |   |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition
  |   | |-also in ./thrift_cpp2_base.h
  |   | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init
  |   |   |-DefaultConstructor defaulted_is_constexpr
  |   |   |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
  |   |   |-MoveConstructor exists simple trivial needs_implicit
  |   |   |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
  |   |   |-MoveAssignment
  |   |   `-Destructor simple irrelevant trivial constexpr needs_implicit
  |   `-CompoundStmt 0x555564f7d1a8 <col:58, col:75>
  |     `-ReturnStmt 0x555564f7d198 <col:60, col:67>
  |       `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture
  `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11>
    `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void'

This diff modifies the AST deserialization process to load lambdas within the canonical function declaration sooner, immediately following the function, ensuring that they are loaded from the same module.

Re-land of #104512. Added a test case for the crash caused by deserializing multiple enclosed lambdas.

Test Plan: check-clang

@llvmbot added the clang and clang:modules labels Sep 18, 2024
@llvmbot
Member

llvmbot commented Sep 18, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-modules

Author: Dmitry Polukhin (dmpolukhin)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/109167.diff

7 Files Affected:

  • (modified) clang/include/clang/Serialization/ASTReader.h (+9)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+13-1)
  • (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+10)
  • (modified) clang/lib/Serialization/ASTWriterDecl.cpp (+42)
  • (added) clang/test/Headers/crash-instantiated-in-scope-cxx-modules.cpp (+76)
  • (added) clang/test/Headers/crash-instantiated-in-scope-cxx-modules2.cpp (+30)
  • (added) clang/test/Headers/crash-instantiated-in-scope-cxx-modules3.cpp (+26)
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 898f4392465fdf..1b3812e7fd4523 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -1188,6 +1188,15 @@ class ASTReader
   /// once recursing loading has been completed.
   llvm::SmallVector<NamedDecl *, 16> PendingOdrMergeChecks;
 
+  /// Lambdas that need to be loaded immediately after the function they belong
+  /// to. It is necessary to have a canonical declaration for the lambda class
+  /// from the same module as the enclosing function. This requirement ensures
+  /// the correct resolution of captured variables in the lambda. Without this,
+  /// due to lazy deserialization, canonical declarations for the function and
+  /// lambdas can be from different modules, and DeclRefExprs may refer to AST
+  /// nodes that do not exist in the function.
+  SmallVector<GlobalDeclID, 4> PendingLambdas;
+
   using DataPointers =
       std::pair<CXXRecordDecl *, struct CXXRecordDecl::DefinitionData *>;
   using ObjCInterfaceDataPointers =
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 7efcc81e194d95..d7bad6bbad0c90 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -9777,7 +9777,8 @@ void ASTReader::finishPendingActions() {
       !PendingDeducedVarTypes.empty() || !PendingIncompleteDeclChains.empty() ||
       !PendingDeclChains.empty() || !PendingMacroIDs.empty() ||
       !PendingDeclContextInfos.empty() || !PendingUpdateRecords.empty() ||
-      !PendingObjCExtensionIvarRedeclarations.empty()) {
+      !PendingObjCExtensionIvarRedeclarations.empty() ||
+      !PendingLambdas.empty()) {
     // If any identifiers with corresponding top-level declarations have
     // been loaded, load those declarations now.
     using TopLevelDeclsMap =
@@ -9922,6 +9923,17 @@ void ASTReader::finishPendingActions() {
       }
       PendingObjCExtensionIvarRedeclarations.pop_back();
     }
+
+    // Load any pending lambdas. During the deserialization of pending lambdas,
+    // more lambdas can be discovered, so swap the current PendingLambdas with a
+    // local empty vector. Newly discovered lambdas will be deserialized in the
+    // next iteration.
+    if (!PendingLambdas.empty()) {
+      SmallVector<GlobalDeclID, 4> DeclIDs;
+      DeclIDs.swap(PendingLambdas);
+      for (auto ID : DeclIDs)
+        GetDecl(ID);
+    }
   }
 
   // At this point, all update records for loaded decls are in place, so any
diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp b/clang/lib/Serialization/ASTReaderDecl.cpp
index 9272e23c7da3fc..20e577404d997d 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -1155,6 +1155,16 @@ void ASTDeclReader::VisitFunctionDecl(FunctionDecl *FD) {
   for (unsigned I = 0; I != NumParams; ++I)
     Params.push_back(readDeclAs<ParmVarDecl>());
   FD->setParams(Reader.getContext(), Params);
+
+  // For the first decl add all lambdas inside for loading them later,
+  // otherwise skip them.
+  unsigned NumLambdas = Record.readInt();
+  if (FD->isFirstDecl()) {
+    for (unsigned I = 0; I != NumLambdas; ++I)
+      Reader.PendingLambdas.push_back(Record.readDeclID());
+  } else {
+    Record.skipInts(NumLambdas);
+  }
 }
 
 void ASTDeclReader::VisitObjCMethodDecl(ObjCMethodDecl *MD) {
diff --git a/clang/lib/Serialization/ASTWriterDecl.cpp b/clang/lib/Serialization/ASTWriterDecl.cpp
index 555f6325da646b..732a6e21f340d6 100644
--- a/clang/lib/Serialization/ASTWriterDecl.cpp
+++ b/clang/lib/Serialization/ASTWriterDecl.cpp
@@ -18,6 +18,7 @@
 #include "clang/AST/Expr.h"
 #include "clang/AST/OpenMPClause.h"
 #include "clang/AST/PrettyDeclStackTrace.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/Basic/SourceManager.h"
 #include "clang/Serialization/ASTReader.h"
 #include "clang/Serialization/ASTRecordWriter.h"
@@ -625,6 +626,33 @@ void ASTDeclWriter::VisitDeclaratorDecl(DeclaratorDecl *D) {
                                            : QualType());
 }
 
+static llvm::SmallVector<const Decl *, 2> collectLambdas(FunctionDecl *D) {
+  struct LambdaCollector : public ConstStmtVisitor<LambdaCollector> {
+    llvm::SmallVectorImpl<const Decl *> &Lambdas;
+
+    LambdaCollector(llvm::SmallVectorImpl<const Decl *> &Lambdas)
+        : Lambdas(Lambdas) {}
+
+    void VisitLambdaExpr(const LambdaExpr *E) {
+      VisitStmt(E);
+      Lambdas.push_back(E->getLambdaClass());
+    }
+
+    void VisitStmt(const Stmt *S) {
+      if (!S)
+        return;
+      for (const Stmt *Child : S->children())
+        if (Child)
+          Visit(Child);
+    }
+  };
+
+  llvm::SmallVector<const Decl *, 2> Lambdas;
+  if (D->hasBody())
+    LambdaCollector(Lambdas).VisitStmt(D->getBody());
+  return Lambdas;
+}
+
 void ASTDeclWriter::VisitFunctionDecl(FunctionDecl *D) {
   static_assert(DeclContext::NumFunctionDeclBits == 44,
                 "You need to update the serializer after you change the "
@@ -764,6 +792,19 @@ void ASTDeclWriter::VisitFunctionDecl(FunctionDecl *D) {
   Record.push_back(D->param_size());
   for (auto *P : D->parameters())
     Record.AddDeclRef(P);
+
+  // Store references to all lambda decls inside function to load them
+  // immediately after loading the function to make sure that canonical
+  // decls for lambdas will be from the same module.
+  if (D->isCanonicalDecl()) {
+    llvm::SmallVector<const Decl *, 2> Lambdas = collectLambdas(D);
+    Record.push_back(Lambdas.size());
+    for (const auto *L : Lambdas)
+      Record.AddDeclRef(L);
+  } else {
+    Record.push_back(0);
+  }
+
   Code = serialization::DECL_FUNCTION;
 }
 
@@ -2239,6 +2280,7 @@ getFunctionDeclAbbrev(serialization::DeclCode Code) {
   //
   // This is:
   //         NumParams and Params[] from FunctionDecl, and
+  //         NumLambdas, Lambdas[] from FunctionDecl, and
   //         NumOverriddenMethods, OverriddenMethods[] from CXXMethodDecl.
   //
   //  Add an AbbrevOp for 'size then elements' and use it here.
diff --git a/clang/test/Headers/crash-instantiated-in-scope-cxx-modules.cpp b/clang/test/Headers/crash-instantiated-in-scope-cxx-modules.cpp
new file mode 100644
index 00000000000000..80844a58ad825a
--- /dev/null
+++ b/clang/test/Headers/crash-instantiated-in-scope-cxx-modules.cpp
@@ -0,0 +1,76 @@
+// RUN: rm -fR %t
+// RUN: split-file %s %t
+// RUN: cd %t
+// RUN: %clang_cc1 -std=c++20 -emit-header-unit -xc++-user-header -Werror=uninitialized folly-conv.h
+// RUN: %clang_cc1 -std=c++20 -emit-header-unit -xc++-user-header -Werror=uninitialized thrift_cpp2_base.h
+// RUN: %clang_cc1 -std=c++20 -emit-header-unit -xc++-user-header -Werror=uninitialized -fmodule-file=folly-conv.pcm -fmodule-file=thrift_cpp2_base.pcm logger_base.h
+
+//--- Conv.h
+#pragma once
+
+template <typename _Tp, typename _Up = _Tp&&>
+_Up __declval(int);
+
+template <typename _Tp>
+auto declval() noexcept -> decltype(__declval<_Tp>(0));
+
+namespace folly {
+
+template <class Value, class Error>
+struct Expected {
+  template <class Yes>
+  auto thenOrThrow() -> decltype(declval<Value&>()) {
+    return 1;
+  }
+};
+
+struct ExpectedHelper {
+  template <class Error, class T>
+  static constexpr Expected<T, Error> return_(T) {
+    return Expected<T, Error>();
+  }
+
+  template <class This, class Fn, class E = int, class T = ExpectedHelper>
+  static auto then_(This&&, Fn&&)
+      -> decltype(T::template return_<E>((declval<Fn>()(true), 0))) {
+    return Expected<int, int>();
+  }
+};
+
+template <class Tgt>
+inline Expected<Tgt, const char*> tryTo() {
+  Tgt result = 0;
+  // In build with asserts:
+  // clang/lib/Sema/SemaTemplateInstantiate.cpp: llvm::PointerUnion<Decl *, LocalInstantiationScope::DeclArgumentPack *> *clang::LocalInstantiationScope::findInstantiationOf(const Decl *): Assertion `isa<LabelDecl>(D) && "declaration not instantiated in this scope"' failed.
+  // In release build compilation error on the line below inside lambda:
+  // error: variable 'result' is uninitialized when used here [-Werror,-Wuninitialized]
+  ExpectedHelper::then_(Expected<bool, int>(), [&](bool) { return result; });
+  return {};
+}
+
+} // namespace folly
+
+inline void bar() {
+  folly::tryTo<int>();
+}
+// expected-no-diagnostics
+
+//--- folly-conv.h
+#pragma once
+#include "Conv.h"
+// expected-no-diagnostics
+
+//--- thrift_cpp2_base.h
+#pragma once
+#include "Conv.h"
+// expected-no-diagnostics
+
+//--- logger_base.h
+#pragma once
+import "folly-conv.h";
+import "thrift_cpp2_base.h";
+
+inline void foo() {
+  folly::tryTo<unsigned>();
+}
+// expected-no-diagnostics
diff --git a/clang/test/Headers/crash-instantiated-in-scope-cxx-modules2.cpp b/clang/test/Headers/crash-instantiated-in-scope-cxx-modules2.cpp
new file mode 100644
index 00000000000000..5b1a904e928a68
--- /dev/null
+++ b/clang/test/Headers/crash-instantiated-in-scope-cxx-modules2.cpp
@@ -0,0 +1,30 @@
+// RUN: rm -fR %t
+// RUN: split-file %s %t
+// RUN: cd %t
+// RUN: %clang_cc1 -std=c++20 -emit-header-unit -xc++-user-header header.h
+// RUN: %clang_cc1 -std=c++20 -fmodule-file=header.pcm main.cpp
+
+//--- header.h
+template <typename T>
+void f(T) {}
+
+class A {
+  virtual ~A();
+};
+
+inline A::~A() {
+  f([](){});
+}
+
+struct B {
+  void g() {
+    f([](){
+      [](){};
+    });
+  }
+};
+// expected-no-diagnostics
+
+//--- main.cpp
+import "header.h";
+// expected-no-diagnostics
diff --git a/clang/test/Headers/crash-instantiated-in-scope-cxx-modules3.cpp b/clang/test/Headers/crash-instantiated-in-scope-cxx-modules3.cpp
new file mode 100644
index 00000000000000..646ff9f745710b
--- /dev/null
+++ b/clang/test/Headers/crash-instantiated-in-scope-cxx-modules3.cpp
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 %s -std=c++11 -emit-pch -o %t
+// RUN: %clang_cc1 %s -std=c++11 -include-pch %t -fsyntax-only -verify
+
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+// No crash or assertion failure on multiple nested lambdas deserialization.
+template <typename T>
+void b() {
+  [] {
+    []{
+      []{
+        []{
+          []{
+          }();
+        }();
+      }();
+    }();
+  }();
+}
+
+void foo() {
+  b<int>();
+}
+#endif

@ChuanqiXu9
Member

Could you explain in more detail why this failed previously?

Also, I am wondering whether we can make the process more efficient:
(1) Can we avoid the visitor in the writing process?
(2) Can we delay the loading of lambdas to the load of definitions of the functions?

My immediate expectation is that (1) is possible but (2) may not be safe.

For (1), my idea is to record the information while writing the LambdaExpr in ASTWriterStmt, and then write this record after we have written all the decls and types. We would then need to read this record eagerly. Maybe we can optimize it further by adding a slot in FunctionDecl that records the offset of this information, so we can avoid the eager linear reading process.

For (2), the question of safety is: can we load the lambdas without loading the definition of the function? If not, we can add code like the following when getting the definition of the function:

if (lambdas not loaded)
     getCanonicalDecl()->loadLambdas();

Then we can delay the loading of the lambdas to make it more efficient.

@dmpolukhin
Contributor Author

Could you explain in more detail why this failed previously?

Original code in ASTReader::finishPendingActions looked like this:

    for (auto ID : PendingLambdas)
      GetDecl(ID);
    PendingLambdas.clear();

The issue is that the range-based for loop iterates over PendingLambdas with implicit iterators, but GetDecl may insert more elements into PendingLambdas, which can invalidate those iterators. In the good case, when the vector does not reallocate, the new elements are simply skipped. In the reproducer, however, the insertion caused the vector to reallocate, resulting in a crash from reading invalid values out of deallocated memory. To cover this, the new test cases enclose more than 5 lambdas to force a reallocation.

In the new code, before reading the lambdas, I copy the existing lambdas to a new array. Alternatively, I could use an integer index for iteration and read the size of the vector on each iteration. Both approaches work fine, but I decided that running other pending actions might be better before starting to deserialize new lambdas. I can change it if you think iteration with an integer index is better.
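
As an illustration of the two options discussed above, here is a minimal, self-contained sketch that uses std::vector in place of SmallVector. The Worklist type, its load function, and the limit parameter are hypothetical stand-ins for ASTReader, GetDecl, and nested lambdas; none of them are Clang APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the reader: `pending` plays the role of
// PendingLambdas and `load` plays the role of GetDecl, which may push
// more IDs onto `pending` while it runs.
struct Worklist {
  std::vector<std::uint64_t> pending;
  std::vector<std::uint64_t> loaded; // order in which IDs were processed

  // Assumed behaviour for illustration: loading ID n discovers ID n + 1
  // until `limit` is reached (this mimics nested lambdas).
  void load(std::uint64_t id, std::uint64_t limit) {
    assert(id < limit && "IDs are expected to stay below the limit");
    loaded.push_back(id);
    if (id + 1 < limit)
      pending.push_back(id + 1);
  }

  // Option 1: swap into a local batch, so pushes performed by load()
  // never touch the container that is being iterated.
  void drainWithSwap(std::uint64_t limit) {
    while (!pending.empty()) {
      std::vector<std::uint64_t> batch;
      batch.swap(pending);
      for (std::uint64_t id : batch)
        load(id, limit);
    }
  }

  // Option 2: index-based loop that re-reads size() on every iteration.
  void drainWithIndex(std::uint64_t limit) {
    for (std::size_t i = 0; i != pending.size(); ++i) {
      std::uint64_t id = pending[i]; // copy before load() can reallocate
      load(id, limit);
    }
    pending.clear();
  }
};
```

Both variants process every ID, including the ones discovered along the way. The swap variant handles newly discovered IDs in a later batch, which in the real reader would leave room for other pending actions to run in between.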

Also, I am wondering whether we can make the process more efficient: (1) Can we avoid the visitor in the writing process? (2) Can we delay the loading of lambdas to the load of definitions of the functions?

My immediate expectation is that (1) is possible but (2) may not be safe.

For (1), my idea is to record the information while writing the LambdaExpr in ASTWriterStmt, and then write this record after we have written all the decls and types. We would then need to read this record eagerly. Maybe we can optimize it further by adding a slot in FunctionDecl that records the offset of this information, so we can avoid the eager linear reading process.

Yeah, it would complicate things a lot: we visit statements only after serializing the FunctionDecl. I ran some profiling, and collectLambdas takes only about 0.25% of cycles inclusively; its own time is almost 0. I'll try to reduce it even further. I think we can visit scopes instead of statements. It should be more efficient; we already do this for enumerating anonymous definitions, and it seems to be fast enough.

For (2), the question of safety is: can we load the lambdas without loading the definition of the function? If not, we can add code like the following when getting the definition of the function:

if (lambdas not loaded)
     getCanonicalDecl()->loadLambdas();

Then we can delay the loading of the lambdas to make it more efficient.

That would be too late in my example. Without my change, lambdas are loaded during function body deserialization, but by then it is too late: loading a template specialization has already caused the canonical record for the lambda to be chosen from another module. The function itself is selected by lookup from the last loaded module (i.e. in reverse order of module loading), but template specializations are loaded in direct order. I tried reversing the order of specialization loading, but in the real world that does not always work because some modules can be loaded, for example, in the case of -fmodule-file=<module-name>=<path/to/BMI>.

@dmpolukhin
Contributor Author

dmpolukhin commented Sep 19, 2024

With a DeclContext visitor, collectLambdas takes 0.03% or less; sometimes I don't even see it in the sampling profile.

@ChuanqiXu9
Member

Could you explain in more detail why this failed previously?

Original code in ASTReader::finishPendingActions looked like this:

    for (auto ID : PendingLambdas)
      GetDecl(ID);
    PendingLambdas.clear();

The issue is that the range-based for loop iterates over PendingLambdas with implicit iterators, but GetDecl may insert more elements into PendingLambdas, which can invalidate those iterators. In the good case, when the vector does not reallocate, the new elements are simply skipped. In the reproducer, however, the insertion caused the vector to reallocate, resulting in a crash from reading invalid values out of deallocated memory. To cover this, the new test cases enclose more than 5 lambdas to force a reallocation.

In the new code, before reading the lambdas, I copy the existing lambdas to a new array. Alternatively, I could use an integer index for iteration and read the size of the vector on each iteration. Both approaches work fine, but I decided that running other pending actions might be better before starting to deserialize new lambdas. I can change it if you think iteration with an integer index is better.

Thanks, it is clear.

Also, I am wondering whether we can make the process more efficient: (1) Can we avoid the visitor in the writing process? (2) Can we delay the loading of lambdas to the load of definitions of the functions?
My immediate expectation is that (1) is possible but (2) may not be safe.
For (1), my idea is to record the information while writing the LambdaExpr in ASTWriterStmt, and then write this record after we have written all the decls and types. We would then need to read this record eagerly. Maybe we can optimize it further by adding a slot in FunctionDecl that records the offset of this information, so we can avoid the eager linear reading process.

Yeah, it would complicate things a lot: we visit statements only after serializing the FunctionDecl. I ran some profiling, and collectLambdas takes only about 0.25% of cycles inclusively; its own time is almost 0. I'll try to reduce it even further. I think we can visit scopes instead of statements. It should be more efficient; we already do this for enumerating anonymous definitions, and it seems to be fast enough.

It won't be much more complicated. You only need to:
(1) Add a new map in ASTWriter.
(2) In ASTWriterDecl, when writing a CXXRecord, check whether it is a lambda and whether its enclosing decl context is a first function decl. If all the conditions are true, insert the ID of the function decl and the record ID into that map. (I thought we could do this in ASTWriterStmt, but it may be more straightforward in ASTWriterDecl.)
(3) In ASTWriter::WriteDeclAndTypes, after we set DoneWritingDeclsAndTypes = true, there are plenty of examples of similar things (writing out information recorded during writing). We can add the new logic below them.
(4) Convert the map to RecordData (a vector), and then emit the record.
(5) On the reader side, convert the read RecordData back into a map.
(6) When we read the corresponding function (or under other possible conditions), load the lambdas from the map.
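
The steps above can be sketched roughly as follows. This is only an illustration with standard containers: DeclID, RecordData, FunctionToLambdas, the flatten and unflatten helpers, and the [function ID, count, lambda IDs...] layout are assumptions made for the example, not the actual ASTWriter/ASTReader record format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using DeclID = std::uint64_t;                  // stand-in for a serialized decl ID
using RecordData = std::vector<std::uint64_t>; // flat record of integers
using FunctionToLambdas = std::map<DeclID, std::vector<DeclID>>;

// Writer side (steps 1-4): flatten the map into a single record laid out
// as [function ID, lambda count, lambda IDs...] repeated per function.
RecordData flatten(const FunctionToLambdas &table) {
  RecordData record;
  for (const auto &entry : table) {
    record.push_back(entry.first);
    record.push_back(entry.second.size());
    for (DeclID lambda : entry.second)
      record.push_back(lambda);
  }
  return record;
}

// Reader side (step 5): rebuild the map from the flat record.
FunctionToLambdas unflatten(const RecordData &record) {
  FunctionToLambdas table;
  std::size_t i = 0;
  while (i < record.size()) {
    assert(record.size() - i >= 2 && "truncated entry header");
    DeclID function = record[i++];
    std::uint64_t count = record[i++];
    assert(count <= record.size() - i && "truncated lambda list");
    std::vector<DeclID> &lambdas = table[function];
    for (std::uint64_t k = 0; k != count; ++k)
      lambdas.push_back(record[i++]);
  }
  return table;
}
```

With such a table available on the reader side, step (6) amounts to looking up the function's ID when the function is read and queuing the associated lambda IDs for loading.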

I think the steps here are clear enough. On the one hand, 0.3% is not that small; we consider 0.5% significant in the middle end. On the other hand, these steps are pretty common in Serialization. I think it will be helpful for you to understand the process.

Sorry in advance that I approved the previous solution but am now asking for something more complex. But let's try to do things better if possible.

For (2), the question of safety is: can we load the lambdas without loading the definition of the function? If not, we can add code like the following when getting the definition of the function:

if (lambdas not loaded)
     getCanonicalDecl()->loadLambdas();

Then we can delay the loading of the lambdas to make it more efficient.

That would be too late in my example. Without my change, lambdas are loaded during function body deserialization, but by then it is too late: loading a template specialization has already caused the canonical record for the lambda to be chosen from another module. The function itself is selected by lookup from the last loaded module (i.e. in reverse order of module loading), but template specializations are loaded in direct order. I tried reversing the order of specialization loading, but in the real world that does not always work because some modules can be loaded, for example, in the case of -fmodule-file=<module-name>=<path/to/BMI>.

So, in short, can I take your answer to mean that we can load a lambda, or choose a lambda as the canonical decl, without deserializing the corresponding body of the function decl? If so, can we explore in which situations this happens? That would help us optimize our strategy.

@dmpolukhin
Contributor Author

@ChuanqiXu9 thank you for the detailed steps. I uploaded a new version. Does it look like what you had in mind?

Member

@ChuanqiXu9 ChuanqiXu9 left a comment

LGTM with nits

Contributor Author

@dmpolukhin dmpolukhin left a comment

All comments are resolved. I also added a TODO noting that we might want to use an on-disk hash table to allow lazy deserialization of the new record. I think the performance difference should be negligible, but I haven't tested this hypothesis.

Member

@ChuanqiXu9 ChuanqiXu9 left a comment

LGTM then

@dmpolukhin dmpolukhin merged commit 2ccac07 into llvm:main Sep 25, 2024
8 checks passed
@eaeltsin
Contributor

Heads-up - this commit might have introduced compilation non-determinism.

Will try to cook the reproducer soon. Meanwhile, can you please double-check there are no non-deterministic pieces, such as sorting by address, etc.?

@eaeltsin
Contributor

A comment on another non-determinism issue that might be related - #106855 (comment)

@dmpolukhin
Contributor Author

@eaeltsin do you know which artifacts become non-deterministic? Is it produced object file or serialized AST?

@eaeltsin
Contributor

We see non-determinism in pcm files when compiling some stl stuff.

Bisection pointed to this commit, but there is still some chance this is because of our internal changes though.

@dmpolukhin
Contributor Author

dmpolukhin commented Sep 26, 2024

I think it is due to this change :( I'm iterating over Decl* in FunctionToLambdasMap. I'll fix it by replacing the pointers with LocalDeclID that should be stable. Will send pull request soon.
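
A minimal sketch of the underlying problem and one way around it. It assumes a hypothetical Decl struct with a stable id field (standing in for a LocalDeclID) and uses std::unordered_map in place of DenseMap. The fix described above keys the map by ID; the variant shown here keeps a pointer-keyed map but orders the output by ID before emitting it, which illustrates the same idea of never letting addresses decide the serialized order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical AST node: `id` is stable across runs, the address is not.
struct Decl {
  std::uint64_t id;
};

// Pointer-keyed map: its iteration order depends on hashing of addresses,
// so it can differ between runs with different memory layouts.
using PtrMap = std::unordered_map<const Decl *, std::vector<std::uint64_t>>;

// Emit [function ID, lambda count, lambda IDs...] ordered by stable ID
// instead of by the map's own iteration order. IDs are assumed unique.
std::vector<std::uint64_t> emitSortedById(const PtrMap &table) {
  std::vector<std::pair<std::uint64_t, const std::vector<std::uint64_t> *>>
      entries;
  for (const auto &kv : table) {
    assert(kv.first != nullptr && "map keys must be valid declarations");
    entries.emplace_back(kv.first->id, &kv.second);
  }
  std::sort(entries.begin(), entries.end(),
            [](const auto &a, const auto &b) { return a.first < b.first; });

  std::vector<std::uint64_t> out;
  for (const auto &entry : entries) {
    out.push_back(entry.first);
    out.push_back(entry.second->size());
    out.insert(out.end(), entry.second->begin(), entry.second->end());
  }
  return out;
}
```

Because the output depends only on the IDs and the stored values, two maps with the same contents produce the same byte sequence regardless of insertion order or where the Decl objects happen to live in memory.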

@eaeltsin
Contributor

Perfect, thanks!

dmpolukhin added a commit to dmpolukhin/llvm-project that referenced this pull request Sep 26, 2024
Summary:
llvm#109167 serialized
FunctionToLambdasMap in the order of pointers in DenseMap. It gives
different order with different memory layouts. Fix this issue by using
LocalDeclID instead of pointers.

Test Plan: check-clang
dmpolukhin added a commit that referenced this pull request Sep 27, 2024
Summary:
#109167 serializes
FunctionToLambdasMap in the order of pointers in DenseMap. It gives
different order with different memory layouts. Fix this issue by using
LocalDeclID instead of pointers.

Test Plan: check-clang
@aeubanks
Contributor

aeubanks commented Oct 4, 2024

this has also caused the following issue to pop up in our modules build

error: variable 'param' cannot be implicitly captured in a lambda with no capture-default specified

hopefully the error provides some guidance for where to look, but I can try to get a reduced repro

@dmpolukhin
Contributor Author

> this has also caused the following issue to pop up in our modules build
>
> error: variable 'param' cannot be implicitly captured in a lambda with no capture-default specified
>
> hopefully the error provides some guidance for where to look, but I can try to get a reduced repro

Please create a reproducer; without one, it is not possible to debug and fix this. On a build without asserts, my original example gives the same error. This means there are cases where some lambda is not deserialized early enough.
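For readers unfamiliar with the diagnostic: in ordinary code it fires when a lambda body names a local variable that is not in its capture list and no capture-default is given. The sketch below shows well-formed code of the general shape discussed in this PR, a function template whose body contains a lambda that explicitly captures a local. The names `invoke` and `addOne` are hypothetical, and that the failing code looks like this is an assumption, since no reproducer has been posted at this point in the thread.

```cpp
#include <cassert>

// Hypothetical helper that simply calls the callable it is given.
template <typename F>
auto invoke(F f) -> decltype(f()) {
  return f();
}

// A function template containing a lambda that explicitly captures `param`.
// Compiled normally, the reference to `param` inside the lambda resolves to
// the parameter declared in this function.
template <typename T>
T addOne(T param) {
  return invoke([&param] { return param + 1; });
}
```

With a regular, non-modules compile this is accepted and `addOne(41)` returns 42. As the PR summary describes, the trouble arises only when the function and the lambda inside it are loaded from different modules, so the captured reference no longer matches the variable declaration in the function.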

@aeubanks
Contributor

aeubanks commented Oct 4, 2024

ah there's a stack trace in an asserts build of clang

F0000 00:00:1728077756.977871    3296 logging.cc:62] assert.h assertion failed at third_party/llvm/llvm-project/clang/lib/Sema/SemaTemplateInstantiate.cpp:4602 in llvm::PointerUnion<Decl *, LocalInstantiationScope::DeclArgumentPack *> *clang::LocalInstantiationScope::findInstantiationOf(const Decl *): isa<LabelDecl>(D) && "declaration not instantiated in this scope"
    @     0x563ef889e824  __assert_fail
    @     0x563ef47a3bf4  clang::LocalInstantiationScope::findInstantiationOf()
    @     0x563ef4836942  clang::Sema::FindInstantiatedDecl()
    @     0x563ef4810491  clang::TreeTransform<>::TransformLambdaExpr()
    @     0x563ef47ffb91  (anonymous namespace)::TemplateInstantiator::TransformLambdaExpr()
    @     0x563ef47a1dc2  clang::TreeTransform<>::TransformExprs()
    @     0x563ef480293d  clang::TreeTransform<>::TransformCallExpr()
    @     0x563ef479f7fa  clang::TreeTransform<>::TransformStmt()
    @     0x563ef4809f34  clang::TreeTransform<>::TransformCompoundStmt()
    @     0x563ef479f788  clang::Sema::SubstStmt()
    @     0x563ef484a01d  clang::Sema::InstantiateFunctionDefinition()
    @     0x563ef484d0bf  clang::Sema::PerformPendingInstantiations()
    @     0x563ef484a108  clang::Sema::InstantiateFunctionDefinition()
    @     0x563ef484d0bf  clang::Sema::PerformPendingInstantiations()
    @     0x563ef484a108  clang::Sema::InstantiateFunctionDefinition()
    @     0x563ef484d0bf  clang::Sema::PerformPendingInstantiations()
    @     0x563ef484a108  clang::Sema::InstantiateFunctionDefinition()
    @     0x563ef484d0bf  clang::Sema::PerformPendingInstantiations()
    @     0x563ef3dbb8fb  clang::Sema::ActOnEndOfTranslationUnitFragment()
    @     0x563ef3dbc021  clang::Sema::ActOnEndOfTranslationUnit()
    @     0x563ef3ae663a  clang::Parser::ParseTopLevelDecl()

still trying to reduce the reproducer...

@dmpolukhin
Contributor Author

> still trying to reduce the reproducer...

Please create a reproducer; the stack trace doesn't have enough information to understand what is causing the issue.

@ilya-biryukov
Contributor

Here is the reproducer that crashes at head.
lambdas.tgz

dmpolukhin added a commit to dmpolukhin/llvm-project that referenced this pull request Oct 11, 2024
…ical decl

Summary:
Fix crash from reproducer provided in llvm#109167 (comment)

Test Plan: TBD
@dmpolukhin
Contributor Author

> Here is the reproducer that crashes at head. lambdas.tgz

I reproduced this issue on my machine and confirm that it is indeed due to my changes. #111992 has a fix that seems to resolve the reproducer and, as far as I know, doesn't introduce new issues.
@ilya-biryukov could you please try it on your full test case?

@ilya-biryukov
Contributor

> > Here is the reproducer that crashes at head. lambdas.tgz
>
> I reproduced this issue on my machine and confirm that it is indeed due to my changes. #111992 has a fix that seems to resolve the reproducer and, as far as I know, doesn't introduce new issues. @ilya-biryukov could you please try it on your full test case?

I've checked #111992 and it does fix the problem, so let's land it?

@dmpolukhin
Contributor Author

> I've checked #111992 and it does fix the problem, so let's land it?

Yes, I would like to create a test case so that we don't regress this in the future. I need to reduce the libc++ functional header to something smaller.

@ilya-biryukov
Contributor

> > I've checked #111992 and it does fix the problem, so let's land it?
>
> Yes, I would like to create a test case so that we don't regress this in the future. I need to reduce the libc++ functional header to something smaller.

Right, definitely +1 to adding a test case. I decided to hold off that comment until the PR moves away from the draft state, because I thought that's something that's on your mind anyway.

dmpolukhin added a commit that referenced this pull request Dec 16, 2024
…cal decl (#111992)

Summary:
Fix crash from reproducer provided in
#109167 (comment)
Also fix issues with inline friend functions merged during deserialization.

Test Plan: check-clang
Labels
clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category