Skip to content

Reland: [clang] Diagnose dangling issues for the "Container<GSLPointer>" case. #107213 #108344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Sep 25, 2024

Conversation

hokein
Copy link
Collaborator

@hokein hokein commented Sep 12, 2024

This relands #107213, with with fixes to address false positives (make_optional(nullptr)).

@hokein hokein requested review from Xazax-hun and usx95 September 12, 2024 08:21
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 12, 2024

@llvm/pr-subscribers-clang

Author: Haojian Wu (hokein)

Changes

This relands #107213, with a fix to the "make_optional(nullptr)" false positive (please see the second commit in this PR).


Full diff: https://github.com/llvm/llvm-project/pull/108344.diff

4 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/include/clang/Basic/AttrDocs.td (+14)
  • (modified) clang/lib/Sema/CheckExprLifetime.cpp (+34-10)
  • (modified) clang/test/Sema/warn-lifetime-analysis-nocfg.cpp (+82)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 22749e96a7e3d3..9860b25f2e7fa6 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -301,6 +301,8 @@ Improvements to Clang's diagnostics
 
 - Clang now warns for u8 character literals used in C23 with ``-Wpre-c23-compat`` instead of ``-Wpre-c++17-compat``.
 
+- Clang now diagnoses cases where a dangling ``GSLOwner<GSLPointer>`` object is constructed, e.g. ``std::vector<string_view> v = {std::string()};`` (#GH100526).
+
 Improvements to Clang's time-trace
 ----------------------------------
 
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 546e5100b79dd9..9f72456d2da678 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -6690,6 +6690,20 @@ When the Owner's lifetime ends, it will consider the Pointer to be dangling.
     P.getInt(); // P is dangling
   }
 
+If a template class is annotated with ``[[gsl::Owner]]``, and the first
+instantiated template argument is a pointer type (raw pointer, or ``[[gsl::Pointer]]``),
+the analysis will consider the instantiated class as a container of the pointer.
+When constructing such an object from a GSL owner object, the analysis will
+assume that the container holds a pointer to the owner object. Consequently,
+when the owner object is destroyed, the pointer will be considered dangling.
+
+.. code-block:: c++
+
+   int f() {
+     std::vector<std::string_view> v = {std::string()}; // v holds a dangling pointer.
+     std::optional<std::string_view> o = std::string(); // o holds a dangling pointer.
+   }
+
 }];
 }
 
diff --git a/clang/lib/Sema/CheckExprLifetime.cpp b/clang/lib/Sema/CheckExprLifetime.cpp
index f62e18543851c1..c3e203d56c9280 100644
--- a/clang/lib/Sema/CheckExprLifetime.cpp
+++ b/clang/lib/Sema/CheckExprLifetime.cpp
@@ -267,6 +267,27 @@ static bool isInStlNamespace(const Decl *D) {
   return DC->isStdNamespace();
 }
 
+// Returns true if the given Record decl is a form of `GSLOwner<Pointer>`
+// type, e.g. std::vector<string_view>, std::optional<string_view>.
+static bool isContainerOfPointer(const RecordDecl *Container) {
+  if (const auto *CTSD =
+          dyn_cast_if_present<ClassTemplateSpecializationDecl>(Container)) {
+    if (!CTSD->hasAttr<OwnerAttr>()) // Container must be a GSL owner type.
+      return false;
+    const auto &TAs = CTSD->getTemplateArgs();
+    return TAs.size() > 0 && TAs[0].getKind() == TemplateArgument::Type &&
+           (isRecordWithAttr<PointerAttr>(TAs[0].getAsType()) ||
+            TAs[0].getAsType()->isPointerType() ||
+            TAs[0].getAsType()->isNullPtrType());
+  }
+  return false;
+}
+
+static bool isGSLOwner(QualType T) {
+  return isRecordWithAttr<OwnerAttr>(T) &&
+         !isContainerOfPointer(T->getAsRecordDecl());
+}
+
 static bool shouldTrackImplicitObjectArg(const CXXMethodDecl *Callee) {
   if (auto *Conv = dyn_cast_or_null<CXXConversionDecl>(Callee))
     if (isRecordWithAttr<PointerAttr>(Conv->getConversionType()))
@@ -275,7 +296,7 @@ static bool shouldTrackImplicitObjectArg(const CXXMethodDecl *Callee) {
     return false;
   if (!isRecordWithAttr<PointerAttr>(
           Callee->getFunctionObjectParameterType()) &&
-      !isRecordWithAttr<OwnerAttr>(Callee->getFunctionObjectParameterType()))
+      !isGSLOwner(Callee->getFunctionObjectParameterType()))
     return false;
   if (Callee->getReturnType()->isPointerType() ||
       isRecordWithAttr<PointerAttr>(Callee->getReturnType())) {
@@ -413,7 +434,7 @@ static void visitFunctionCallArguments(IndirectLocalPath &Path, Expr *Call,
     // Once we initialized a value with a non gsl-owner reference, it can no
     // longer dangle.
     if (ReturnType->isReferenceType() &&
-        !isRecordWithAttr<OwnerAttr>(ReturnType->getPointeeType())) {
+        !isGSLOwner(ReturnType->getPointeeType())) {
       for (const IndirectLocalPathEntry &PE : llvm::reverse(Path)) {
         if (PE.Kind == IndirectLocalPathEntry::GslReferenceInit ||
             PE.Kind == IndirectLocalPathEntry::LifetimeBoundCall)
@@ -468,12 +489,17 @@ static void visitFunctionCallArguments(IndirectLocalPath &Path, Expr *Call,
     if (CheckCoroCall || Callee->getParamDecl(I)->hasAttr<LifetimeBoundAttr>())
       VisitLifetimeBoundArg(Callee->getParamDecl(I), Args[I]);
     else if (EnableGSLAnalysis && I == 0) {
+      // Perform GSL analysis for the first argument
       if (shouldTrackFirstArgument(Callee)) {
         VisitGSLPointerArg(Callee, Args[0]);
-      } else if (auto *CCE = dyn_cast<CXXConstructExpr>(Call);
-                 CCE &&
-                 CCE->getConstructor()->getParent()->hasAttr<PointerAttr>()) {
-        VisitGSLPointerArg(CCE->getConstructor(), Args[0]);
+      } else if (auto *Ctor = dyn_cast<CXXConstructExpr>(Call)) {
+        const auto *ClassD = Ctor->getConstructor()->getParent();
+        // Two cases:
+        //  a GSL pointer, e.g. std::string_view
+        //  a container of GSL pointer, e.g. std::vector<string_view>
+        if (ClassD->hasAttr<PointerAttr>() ||
+            (isContainerOfPointer(ClassD) && Callee->getNumParams() == 1))
+          VisitGSLPointerArg(Ctor->getConstructor(), Args[0]);
       }
     }
   }
@@ -975,7 +1001,6 @@ static void checkExprLifetimeImpl(Sema &SemaRef,
     SourceLocation DiagLoc = DiagRange.getBegin();
 
     auto *MTE = dyn_cast<MaterializeTemporaryExpr>(L);
-
     bool IsGslPtrValueFromGslTempOwner = false;
     bool IsLocalGslOwner = false;
     if (pathOnlyHandlesGslPointer(Path)) {
@@ -985,13 +1010,12 @@ static void checkExprLifetimeImpl(Sema &SemaRef,
         //   int &p = *localUniquePtr;
         //   someContainer.add(std::move(localUniquePtr));
         //   return p;
-        IsLocalGslOwner = isRecordWithAttr<OwnerAttr>(L->getType());
+        IsLocalGslOwner = isGSLOwner(L->getType());
         if (pathContainsInit(Path) || !IsLocalGslOwner)
           return false;
       } else {
         IsGslPtrValueFromGslTempOwner =
-            MTE && !MTE->getExtendingDecl() &&
-            isRecordWithAttr<OwnerAttr>(MTE->getType());
+            MTE && !MTE->getExtendingDecl() && isGSLOwner(MTE->getType());
         // Skipping a chain of initializing gsl::Pointer annotated objects.
         // We are looking only for the final source to find out if it was
         // a local or temporary owner or the address of a local variable/param.
diff --git a/clang/test/Sema/warn-lifetime-analysis-nocfg.cpp b/clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
index 59357d0730a7d9..b7f7fb6044839d 100644
--- a/clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+++ b/clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
@@ -158,17 +158,30 @@ auto begin(C &c) -> decltype(c.begin());
 template<typename T, int N>
 T *begin(T (&array)[N]);
 
+using size_t = decltype(sizeof(0));
+
+template<typename T>
+struct initializer_list {
+  const T* ptr; size_t sz;
+};
 template <typename T>
 struct vector {
   typedef __gnu_cxx::basic_iterator<T> iterator;
   iterator begin();
   iterator end();
   const T *data() const;
+  vector();
+  vector(initializer_list<T> __l);
+  
+  template<typename InputIterator>
+	vector(InputIterator first, InputIterator __last);
+
   T &at(int n);
 };
 
 template<typename T>
 struct basic_string_view {
+  basic_string_view();
   basic_string_view(const T *);
   const T *begin() const;
 };
@@ -203,11 +216,21 @@ template<typename T>
 struct optional {
   optional();
   optional(const T&);
+
+  template<typename U = T>
+	optional(U&& t);
+
+  template<typename U>
+	optional(optional<U>&& __t);
+
   T &operator*() &;
   T &&operator*() &&;
   T &value() &;
   T &&value() &&;
 };
+template<typename T>
+optional<__decay(T)> make_optional(T&&);
+
 
 template<typename T>
 struct stack {
@@ -553,3 +576,62 @@ void test() {
   std::string_view svjkk1 = ReturnStringView(StrCat("bar", "x")); // expected-warning {{object backing the pointer will be destroyed at the end of the full-expression}}
 }
 } // namespace GH100549
+
+namespace GH100526 {
+void test() {
+  std::vector<std::string_view> v1({std::string()}); // expected-warning {{object backing the pointer will be destroyed at the end}}
+  std::vector<std::string_view> v2({
+    std::string(), // expected-warning {{object backing the pointer will be destroyed at the end}}
+    std::string_view()
+  });
+  std::vector<std::string_view> v3({
+    std::string_view(),
+    std::string()  // expected-warning {{object backing the pointer will be destroyed at the end}}
+  });
+
+  std::optional<std::string_view> o1 = std::string(); // expected-warning {{object backing the pointer}}
+
+  std::string s;
+  // This is a tricky use-after-free case, what it does:
+  //   1. make_optional creates a temporary "optional<string>"" object
+  //   2. the temporary object owns the underlying string which is copied from s.
+  //   3. the t3 object holds the view to the underlying string of the temporary object.
+  std::optional<std::string_view> o2 = std::make_optional(s); // expected-warning {{object backing the pointer}}
+  std::optional<std::string_view> o3 = std::optional<std::string>(s); // expected-warning {{object backing the pointer}}
+  std::optional<std::string_view> o4 = std::optional<std::string_view>(s); 
+
+  // FIXME: should work for assignment cases
+  v1 = {std::string()};
+  o1 = std::string();
+
+  // no warning on copying pointers.
+  std::vector<std::string_view> n1 = {std::string_view()};
+  std::optional<std::string_view> n2 = {std::string_view()};
+  std::optional<std::string_view> n3 = std::string_view();
+  std::optional<std::string_view> n4 = std::make_optional(std::string_view());
+  const char* b = "";
+  std::optional<std::string_view> n5 = std::make_optional(b);
+  std::optional<std::string_view> n6 = std::make_optional("test");
+}
+
+std::vector<std::string_view> test2(int i) {
+  std::vector<std::string_view> t;
+  if (i)
+    return t; // this is fine, no dangling
+  return std::vector<std::string_view>(t.begin(), t.end());
+}
+
+std::optional<std::string_view> test3(int i) {
+  std::string s;
+  std::string_view sv;
+  if (i)
+   return s; // expected-warning {{address of stack memory associated}}
+  return sv; // fine
+}
+
+std::optional<int*> test4(int a) {
+  if (a)
+   return std::make_optional(nullptr); // fine
+}
+
+} // namespace GH100526

@hokein hokein force-pushed the reland-container-pointer branch from 62a8f8d to edea044 Compare September 12, 2024 08:21
Copy link
Collaborator

@Xazax-hun Xazax-hun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@hokein hokein force-pushed the reland-container-pointer branch from edea044 to 69d669a Compare September 12, 2024 09:48
Copy link
Contributor

@usx95 usx95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to be good at detecting new false positives but this is natural due to the visible compiler diagnositc. Unfortunately, same is not true for new false-negatives. More tests in our test-suite is the only way to detect those and we should be extensively adding more tests. This is something I feel particularly missing for lifetime analysis given the complexity of the code. This is obviously not actionable and non-blocking and is more of a meta comment.

See my comment for ideas for more tests.

I also found a recently introduced false-negative. This is not particularly introduced by this PR but is very related:
#108463

if (ClassD->hasAttr<PointerAttr>())
return true;

// case 2: construct a container of pointer (std::vector<std::string_view>)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/case/Case for consistency

shouldTrackFirstArgumentForConstructor(const CXXConstructExpr *Ctor) {
const auto *ClassD = Ctor->getConstructor()->getParent();

auto FirstArgType = Ctor->getArg(0)->getType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move below closer to its first use near return isGSLOwner(FirstArgType)

Copy link

github-actions bot commented Sep 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@hokein
Copy link
Collaborator Author

hokein commented Sep 13, 2024

We seem to be good at detecting new false positives but this is natural due to the visible compiler diagnositc. Unfortunately, same is not true for new false-negatives. More tests in our test-suite is the only way to detect those and we should be extensively adding more tests. This is something I feel particularly missing for lifetime analysis given the complexity of the code. This is obviously not actionable and non-blocking and is more of a meta comment.

+1, I fully agree with your point. I’ve felt that the current test lacks sufficient coverage. On the other side, I think the situation has been getter better with our recent improvements to lifetime analysis. Hopefully, with continued effort, we’ll reach a point where the test cover the majority of key cases.

See my comment for ideas for more tests.

I also found a recently introduced false-negative. This is not particularly introduced by this PR but is very related: #108463

Thanks, please see my replied comment.

Copy link
Contributor

@usx95 usx95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

@hokein hokein force-pushed the reland-container-pointer branch from ca0b4ba to 954a428 Compare September 23, 2024 09:46
We should also consider isNullPtrType as the pointer type.
And add more tests per the comment. The new false negative was caused by
replacing the instances of "!isRecordWithAttr<OwnerAttr>" in `TemporaryVisitor`
with "isGSLOwner".
@hokein hokein force-pushed the reland-container-pointer branch from 954a428 to 5790cca Compare September 23, 2024 20:30
@hokein hokein force-pushed the reland-container-pointer branch from 5790cca to 55f9462 Compare September 23, 2024 20:33
@hokein hokein merged commit fe06a6d into llvm:main Sep 25, 2024
9 checks passed
@hokein hokein deleted the reland-container-pointer branch September 25, 2024 12:12
hokein added a commit that referenced this pull request Oct 14, 2024
This is a follow-up to #108344.

The original bailout check was overly strict, causing it to miss cases
like the vector(initializer_list, allocator) constructor. This patch
relaxes the check to address that issue.

Fix #111680
DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this pull request Oct 16, 2024
This is a follow-up to llvm#108344.

The original bailout check was overly strict, causing it to miss cases
like the vector(initializer_list, allocator) constructor. This patch
relaxes the check to address that issue.

Fix llvm#111680
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants