-
Notifications
You must be signed in to change notification settings - Fork 13.7k
[analyzer] Allow overriding Unknown memspaces using a ProgramState trait #123003
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-static-analyzer-1 Author: Michael Flanders (Flandini) ChangesThis trait associates MemSpaceRegions with MemRegions to allow refining or changing information known about memory regions after they are created, since they are immutable. This commit is an intermediate step towards moving MemSpaceRegion out of the MemRegion class hierarchy and moving all notion of MemSpaceRegion to the trait. The intermediate step is that for now, only MemRegions with UnknownSpaceRegion are mapped in the trait and checked in checkers/core. Tagging @steakhal for review. Patch is 28.50 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/123003.diff 13 Files Affected:
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h
new file mode 100644
index 00000000000000..178a6b60c1cb1a
--- /dev/null
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h
@@ -0,0 +1,47 @@
+//===-- MemSpaces.h -----------------------------------------------*- C++ -*--//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// TODO
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_STATICANALYZER_CHECKERS_MEMSPACES_H
+#define LLVM_CLANG_STATICANALYZER_CHECKERS_MEMSPACES_H
+
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState_Fwd.h"
+
+namespace clang {
+namespace ento {
+
+class MemRegion;
+class MemSpaceRegion;
+
+namespace memspace {
+
+[[nodiscard]] ProgramStateRef setMemSpaceTrait(ProgramStateRef State,
+ const MemRegion *MR,
+ const MemSpaceRegion *MS);
+
+[[nodiscard]] const MemSpaceRegion *getMemSpaceTrait(ProgramStateRef State,
+ const MemRegion *MR);
+
+[[nodiscard]] bool hasMemSpaceTrait(ProgramStateRef State, const MemRegion *MR);
+
+template <typename FirstT, typename... RestT>
+[[nodiscard]] bool isMemSpaceOrTrait(ProgramStateRef State,
+ const MemRegion *MR) {
+ return isa<FirstT, RestT...>(MR->getMemorySpace()) ||
+ isa_and_nonnull<FirstT, RestT...>(getMemSpaceTrait(State, MR));
+}
+
+} // namespace memspace
+} // namespace ento
+} // namespace clang
+
+#endif // LLVM_CLANG_STATICANALYZER_CHECKERS_MEMSPACES_H
diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index 6422933c8828a9..e97be53fee4c7f 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -22,6 +22,7 @@
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "llvm/ADT/APSInt.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Support/FormatVariadic.h"
diff --git a/clang/lib/StaticAnalyzer/Checkers/MacOSXAPIChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MacOSXAPIChecker.cpp
index 754b167642961c..858b6a37551f5d 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MacOSXAPIChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MacOSXAPIChecker.cpp
@@ -22,6 +22,7 @@
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringSwitch.h"
@@ -75,10 +76,11 @@ void MacOSXAPIChecker::CheckDispatchOnce(CheckerContext &C, const CallExpr *CE,
if (!R)
return;
+ ProgramStateRef State = C.getState();
+
// Global variables are fine.
const MemRegion *RB = R->getBaseRegion();
- const MemSpaceRegion *RS = RB->getMemorySpace();
- if (isa<GlobalsSpaceRegion>(RS))
+ if (memspace::isMemSpaceOrTrait<GlobalsSpaceRegion>(State, RB))
return;
// Handle _dispatch_once. In some versions of the OS X SDK we have the case
@@ -117,9 +119,9 @@ void MacOSXAPIChecker::CheckDispatchOnce(CheckerContext &C, const CallExpr *CE,
if (IVR != R)
os << " memory within";
os << " the instance variable '" << IVR->getDecl()->getName() << '\'';
- } else if (isa<HeapSpaceRegion>(RS)) {
+ } else if (memspace::isMemSpaceOrTrait<HeapSpaceRegion>(State, RB)) {
os << " heap-allocated memory";
- } else if (isa<UnknownSpaceRegion>(RS)) {
+ } else if (isa<UnknownSpaceRegion>(RB->getMemorySpace())) {
// Presence of an IVar superregion has priority over this branch, because
// ObjC objects are on the heap even if the core doesn't realize this.
// Presence of a block variable base region has priority over this branch,
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 4166cf14391e2d..76fee2f1322e17 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -72,6 +72,7 @@
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState_Fwd.h"
@@ -784,7 +785,8 @@ class MallocChecker
bool IsALeakCheck = false) const;
///@}
static bool SummarizeValue(raw_ostream &os, SVal V);
- static bool SummarizeRegion(raw_ostream &os, const MemRegion *MR);
+ static bool SummarizeRegion(ProgramStateRef State, raw_ostream &os,
+ const MemRegion *MR);
void HandleNonHeapDealloc(CheckerContext &C, SVal ArgVal, SourceRange Range,
const Expr *DeallocExpr,
@@ -2206,13 +2208,21 @@ MallocChecker::FreeMemAux(CheckerContext &C, const Expr *ArgExpr,
// Parameters, locals, statics, globals, and memory returned by
// __builtin_alloca() shouldn't be freed.
- if (!isa<UnknownSpaceRegion, HeapSpaceRegion>(MS)) {
+ // Should skip this check if:
+ // - The region is in the heap
+ // - The region has unknown memspace and no trait for further clarification
+ // (i.e., just unknown)
+ // - The region has unknown memspace and a heap memspace trait
+ const MemSpaceRegion *MSTrait = memspace::getMemSpaceTrait(State, R);
+ bool HasHeapOrUnknownTrait = !MSTrait || isa<HeapSpaceRegion>(MSTrait);
+ if (!(isa<HeapSpaceRegion>(MS) ||
+ (isa<UnknownSpaceRegion>(MS) && HasHeapOrUnknownTrait))) {
// Regions returned by malloc() are represented by SymbolicRegion objects
// within HeapSpaceRegion. Of course, free() can work on memory allocated
// outside the current function, so UnknownSpaceRegion is also a
// possibility here.
- if (isa<AllocaRegion>(R))
+ if (isa<AllocaRegion>(R) || isa_and_nonnull<AllocaRegion>(MSTrait))
HandleFreeAlloca(C, ArgVal, ArgExpr->getSourceRange());
else
HandleNonHeapDealloc(C, ArgVal, ArgExpr->getSourceRange(), ParentExpr,
@@ -2384,7 +2394,7 @@ bool MallocChecker::SummarizeValue(raw_ostream &os, SVal V) {
return true;
}
-bool MallocChecker::SummarizeRegion(raw_ostream &os,
+bool MallocChecker::SummarizeRegion(ProgramStateRef State, raw_ostream &os,
const MemRegion *MR) {
switch (MR->getKind()) {
case MemRegion::FunctionCodeRegionKind: {
@@ -2404,8 +2414,10 @@ bool MallocChecker::SummarizeRegion(raw_ostream &os,
return true;
default: {
const MemSpaceRegion *MS = MR->getMemorySpace();
+ const MemSpaceRegion *MSTrait = memspace::getMemSpaceTrait(State, MR);
- if (isa<StackLocalsSpaceRegion>(MS)) {
+ if (isa<StackLocalsSpaceRegion>(MS) ||
+ isa_and_nonnull<StackLocalsSpaceRegion>(MSTrait)) {
const VarRegion *VR = dyn_cast<VarRegion>(MR);
const VarDecl *VD;
if (VR)
@@ -2420,7 +2432,8 @@ bool MallocChecker::SummarizeRegion(raw_ostream &os,
return true;
}
- if (isa<StackArgumentsSpaceRegion>(MS)) {
+ if (isa<StackArgumentsSpaceRegion>(MS) ||
+ isa_and_nonnull<StackArgumentsSpaceRegion>(MSTrait)) {
const VarRegion *VR = dyn_cast<VarRegion>(MR);
const VarDecl *VD;
if (VR)
@@ -2435,7 +2448,8 @@ bool MallocChecker::SummarizeRegion(raw_ostream &os,
return true;
}
- if (isa<GlobalsSpaceRegion>(MS)) {
+ if (isa<GlobalsSpaceRegion>(MS) ||
+ isa_and_nonnull<GlobalsSpaceRegion>(MSTrait)) {
const VarRegion *VR = dyn_cast<VarRegion>(MR);
const VarDecl *VD;
if (VR)
@@ -2489,8 +2503,8 @@ void MallocChecker::HandleNonHeapDealloc(CheckerContext &C, SVal ArgVal,
os << "deallocator";
os << " is ";
- bool Summarized = MR ? SummarizeRegion(os, MR)
- : SummarizeValue(os, ArgVal);
+ bool Summarized =
+ MR ? SummarizeRegion(C.getState(), os, MR) : SummarizeValue(os, ArgVal);
if (Summarized)
os << ", which is not memory allocated by ";
else
diff --git a/clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp
index 52416e21399147..d4ff8186ea69c9 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp
@@ -22,6 +22,7 @@
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "llvm/ADT/StringSet.h"
using namespace clang;
@@ -145,12 +146,14 @@ class MoveChecker
// Obtains ObjectKind of an object. Because class declaration cannot always
// be easily obtained from the memory region, it is supplied separately.
- ObjectKind classifyObject(const MemRegion *MR, const CXXRecordDecl *RD) const;
+ ObjectKind classifyObject(ProgramStateRef State, const MemRegion *MR,
+ const CXXRecordDecl *RD) const;
// Classifies the object and dumps a user-friendly description string to
// the stream.
- void explainObject(llvm::raw_ostream &OS, const MemRegion *MR,
- const CXXRecordDecl *RD, MisuseKind MK) const;
+ void explainObject(ProgramStateRef State, llvm::raw_ostream &OS,
+ const MemRegion *MR, const CXXRecordDecl *RD,
+ MisuseKind MK) const;
bool belongsTo(const CXXRecordDecl *RD, const llvm::StringSet<> &Set) const;
@@ -299,12 +302,12 @@ MoveChecker::MovedBugVisitor::VisitNode(const ExplodedNode *N,
SmallString<128> Str;
llvm::raw_svector_ostream OS(Str);
- ObjectKind OK = Chk.classifyObject(Region, RD);
+ ObjectKind OK = Chk.classifyObject(State, Region, RD);
switch (OK.StdKind) {
case SK_SmartPtr:
if (MK == MK_Dereference) {
OS << "Smart pointer";
- Chk.explainObject(OS, Region, RD, MK);
+ Chk.explainObject(State, OS, Region, RD, MK);
OS << " is reset to null when moved from";
break;
}
@@ -315,12 +318,12 @@ MoveChecker::MovedBugVisitor::VisitNode(const ExplodedNode *N,
case SK_NonStd:
case SK_Safe:
OS << "Object";
- Chk.explainObject(OS, Region, RD, MK);
+ Chk.explainObject(State, OS, Region, RD, MK);
OS << " is moved";
break;
case SK_Unsafe:
OS << "Object";
- Chk.explainObject(OS, Region, RD, MK);
+ Chk.explainObject(State, OS, Region, RD, MK);
OS << " is left in a valid but unspecified state after move";
break;
}
@@ -353,7 +356,7 @@ void MoveChecker::modelUse(ProgramStateRef State, const MemRegion *Region,
CheckerContext &C) const {
assert(!C.isDifferent() && "No transitions should have been made by now");
const RegionState *RS = State->get<TrackedRegionMap>(Region);
- ObjectKind OK = classifyObject(Region, RD);
+ ObjectKind OK = classifyObject(State, Region, RD);
// Just in case: if it's not a smart pointer but it does have operator *,
// we shouldn't call the bug a dereference.
@@ -406,24 +409,25 @@ ExplodedNode *MoveChecker::tryToReportBug(const MemRegion *Region,
// Creating the error message.
llvm::SmallString<128> Str;
llvm::raw_svector_ostream OS(Str);
+ ProgramStateRef ErrorNodeState = N->getState();
switch(MK) {
case MK_FunCall:
OS << "Method called on moved-from object";
- explainObject(OS, Region, RD, MK);
+ explainObject(ErrorNodeState, OS, Region, RD, MK);
break;
case MK_Copy:
OS << "Moved-from object";
- explainObject(OS, Region, RD, MK);
+ explainObject(ErrorNodeState, OS, Region, RD, MK);
OS << " is copied";
break;
case MK_Move:
OS << "Moved-from object";
- explainObject(OS, Region, RD, MK);
+ explainObject(ErrorNodeState, OS, Region, RD, MK);
OS << " is moved";
break;
case MK_Dereference:
OS << "Dereference of null smart pointer";
- explainObject(OS, Region, RD, MK);
+ explainObject(ErrorNodeState, OS, Region, RD, MK);
break;
}
@@ -482,7 +486,7 @@ void MoveChecker::checkPostCall(const CallEvent &Call,
return;
const CXXRecordDecl *RD = MethodDecl->getParent();
- ObjectKind OK = classifyObject(ArgRegion, RD);
+ ObjectKind OK = classifyObject(State, ArgRegion, RD);
if (shouldBeTracked(OK)) {
// Mark object as moved-from.
State = State->set<TrackedRegionMap>(ArgRegion, RegionState::getMoved());
@@ -549,7 +553,7 @@ bool MoveChecker::belongsTo(const CXXRecordDecl *RD,
}
MoveChecker::ObjectKind
-MoveChecker::classifyObject(const MemRegion *MR,
+MoveChecker::classifyObject(ProgramStateRef State, const MemRegion *MR,
const CXXRecordDecl *RD) const {
// Local variables and local rvalue references are classified as "Local".
// For the purposes of this checker, we classify move-safe STL types
@@ -557,7 +561,7 @@ MoveChecker::classifyObject(const MemRegion *MR,
MR = unwrapRValueReferenceIndirection(MR);
bool IsLocal =
isa_and_nonnull<VarRegion, CXXLifetimeExtendedObjectRegion>(MR) &&
- isa<StackSpaceRegion>(MR->getMemorySpace());
+ memspace::isMemSpaceOrTrait<StackSpaceRegion>(State, MR);
if (!RD || !RD->getDeclContext()->isStdNamespace())
return { IsLocal, SK_NonStd };
@@ -571,8 +575,9 @@ MoveChecker::classifyObject(const MemRegion *MR,
return { IsLocal, SK_Unsafe };
}
-void MoveChecker::explainObject(llvm::raw_ostream &OS, const MemRegion *MR,
- const CXXRecordDecl *RD, MisuseKind MK) const {
+void MoveChecker::explainObject(ProgramStateRef State, llvm::raw_ostream &OS,
+ const MemRegion *MR, const CXXRecordDecl *RD,
+ MisuseKind MK) const {
// We may need a leading space every time we actually explain anything,
// and we never know if we are to explain anything until we try.
if (const auto DR =
@@ -581,7 +586,7 @@ void MoveChecker::explainObject(llvm::raw_ostream &OS, const MemRegion *MR,
OS << " '" << RegionDecl->getDeclName() << "'";
}
- ObjectKind OK = classifyObject(MR, RD);
+ ObjectKind OK = classifyObject(State, MR, RD);
switch (OK.StdKind) {
case SK_NonStd:
case SK_Safe:
diff --git a/clang/lib/StaticAnalyzer/Checkers/PutenvStackArrayChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/PutenvStackArrayChecker.cpp
index bf81d57bf82fd3..4449cb4ae68eb9 100644
--- a/clang/lib/StaticAnalyzer/Checkers/PutenvStackArrayChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/PutenvStackArrayChecker.cpp
@@ -21,6 +21,7 @@
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
using namespace clang;
using namespace ento;
@@ -45,10 +46,15 @@ void PutenvStackArrayChecker::checkPostCall(const CallEvent &Call,
SVal ArgV = Call.getArgSVal(0);
const Expr *ArgExpr = Call.getArgExpr(0);
- const auto *SSR =
- dyn_cast<StackSpaceRegion>(ArgV.getAsRegion()->getMemorySpace());
+ const MemRegion *MR = ArgV.getAsRegion();
+
+ const auto *SSR = dyn_cast<StackSpaceRegion>(MR->getMemorySpace());
+ if (!SSR)
+ SSR = dyn_cast_if_present<StackSpaceRegion>(
+ memspace::getMemSpaceTrait(C.getState(), MR));
if (!SSR)
return;
+
const auto *StackFrameFuncD =
dyn_cast_or_null<FunctionDecl>(SSR->getStackFrame()->getDecl());
if (StackFrameFuncD && StackFrameFuncD->isMain())
diff --git a/clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountDiagnostics.cpp b/clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountDiagnostics.cpp
index 456132ef0a0a22..ecc7563ceb4bba 100644
--- a/clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountDiagnostics.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountDiagnostics.cpp
@@ -13,6 +13,7 @@
#include "RetainCountDiagnostics.h"
#include "RetainCountChecker.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include <optional>
@@ -690,9 +691,14 @@ static AllocationInfo GetAllocationSite(ProgramStateManager &StateMgr,
const MemRegion *R = FB.getRegion();
// Do not show local variables belonging to a function other than
// where the error is reported.
- if (auto MR = dyn_cast<StackSpaceRegion>(R->getMemorySpace()))
- if (MR->getStackFrame() == LeakContext->getStackFrame())
- FirstBinding = R;
+ const StackSpaceRegion *MR =
+ dyn_cast<StackSpaceRegion>(R->getMemorySpace());
+ if (!MR)
+ MR = dyn_cast_if_present<StackSpaceRegion>(
+ memspace::getMemSpaceTrait(St, R));
+
+ if (MR && MR->getStackFrame() == LeakContext->getStackFrame())
+ FirstBinding = R;
}
// AllocationNode is the last node in which the symbol was tracked.
diff --git a/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
index f4de3b500499c4..4d8c8eea14ea63 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
@@ -19,6 +19,7 @@
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Support/raw_ostream.h"
@@ -61,7 +62,7 @@ class StackAddrEscapeChecker
ASTContext &Ctx);
static SmallVector<const MemRegion *, 4>
getCapturedStackRegions(const BlockDataRegion &B, CheckerContext &C);
- static bool isNotInCurrentFrame(const MemRegion *R, CheckerContext &C);
+ static bool isNotInCurrentFrame(const MemSpaceRegion *MS, CheckerContext &C);
};
} // namespace
@@ -117,9 +118,9 @@ SourceRange StackAddrEscapeChecker::genName(raw_ostream &os, const MemRegion *R,
return range;
}
-bool StackAddrEscapeChecker::isNotInCurrentFrame(const MemRegion *R,
+bool StackAddrEscapeChecker::isNotInCurrentFrame(const MemSpaceRegion *MS,
CheckerContext &C) {
- const StackSpaceRegion *S = cast<StackSpaceRegion>(R->getMemorySpace());
+ const StackSpaceRegion *S = cast<StackSpaceRegion>(MS);
return S->getStackFrame() != C.getStackFrame();
}
@@ -138,10 +139,11 @@ SmallVector<const MemRegion *, 4>
StackAddrEscapeChecker::getCapturedStackRegions(const BlockDataRegion &B,
CheckerContext &C) {
SmallVector<const MemRegion *, 4> Regions;
+ ProgramStateRef State = C.getState();
for (auto Var : B.referenced_vars()) {
- SVal Val = C.getState()->getSVal(Var.getCapturedRegion());
+ SVal Val = State->getSVal(Var.getCapturedRegion());
const MemRegion *Region = Val.getA...
[truncated]
|
@llvm/pr-subscribers-clang Author: Michael Flanders (Flandini) ChangesThis trait associates MemSpaceRegions with MemRegions to allow refining or changing information known about memory regions after they are created, since they are immutable. This commit is an intermediate step towards moving MemSpaceRegion out of the MemRegion class hierarchy and moving all notion of MemSpaceRegion to the trait. The intermediate step is that for now, only MemRegions with UnknownSpaceRegion are mapped in the trait and checked in checkers/core. Tagging @steakhal for review. Patch is 28.50 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/123003.diff 13 Files Affected:
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h
new file mode 100644
index 00000000000000..178a6b60c1cb1a
--- /dev/null
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h
@@ -0,0 +1,47 @@
+//===-- MemSpaces.h -----------------------------------------------*- C++ -*--//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// TODO
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_STATICANALYZER_CHECKERS_MEMSPACES_H
+#define LLVM_CLANG_STATICANALYZER_CHECKERS_MEMSPACES_H
+
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState_Fwd.h"
+
+namespace clang {
+namespace ento {
+
+class MemRegion;
+class MemSpaceRegion;
+
+namespace memspace {
+
+[[nodiscard]] ProgramStateRef setMemSpaceTrait(ProgramStateRef State,
+ const MemRegion *MR,
+ const MemSpaceRegion *MS);
+
+[[nodiscard]] const MemSpaceRegion *getMemSpaceTrait(ProgramStateRef State,
+ const MemRegion *MR);
+
+[[nodiscard]] bool hasMemSpaceTrait(ProgramStateRef State, const MemRegion *MR);
+
+template <typename FirstT, typename... RestT>
+[[nodiscard]] bool isMemSpaceOrTrait(ProgramStateRef State,
+ const MemRegion *MR) {
+ return isa<FirstT, RestT...>(MR->getMemorySpace()) ||
+ isa_and_nonnull<FirstT, RestT...>(getMemSpaceTrait(State, MR));
+}
+
+} // namespace memspace
+} // namespace ento
+} // namespace clang
+
+#endif // LLVM_CLANG_STATICANALYZER_CHECKERS_MEMSPACES_H
diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index 6422933c8828a9..e97be53fee4c7f 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -22,6 +22,7 @@
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "llvm/ADT/APSInt.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Support/FormatVariadic.h"
diff --git a/clang/lib/StaticAnalyzer/Checkers/MacOSXAPIChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MacOSXAPIChecker.cpp
index 754b167642961c..858b6a37551f5d 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MacOSXAPIChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MacOSXAPIChecker.cpp
@@ -22,6 +22,7 @@
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringSwitch.h"
@@ -75,10 +76,11 @@ void MacOSXAPIChecker::CheckDispatchOnce(CheckerContext &C, const CallExpr *CE,
if (!R)
return;
+ ProgramStateRef State = C.getState();
+
// Global variables are fine.
const MemRegion *RB = R->getBaseRegion();
- const MemSpaceRegion *RS = RB->getMemorySpace();
- if (isa<GlobalsSpaceRegion>(RS))
+ if (memspace::isMemSpaceOrTrait<GlobalsSpaceRegion>(State, RB))
return;
// Handle _dispatch_once. In some versions of the OS X SDK we have the case
@@ -117,9 +119,9 @@ void MacOSXAPIChecker::CheckDispatchOnce(CheckerContext &C, const CallExpr *CE,
if (IVR != R)
os << " memory within";
os << " the instance variable '" << IVR->getDecl()->getName() << '\'';
- } else if (isa<HeapSpaceRegion>(RS)) {
+ } else if (memspace::isMemSpaceOrTrait<HeapSpaceRegion>(State, RB)) {
os << " heap-allocated memory";
- } else if (isa<UnknownSpaceRegion>(RS)) {
+ } else if (isa<UnknownSpaceRegion>(RB->getMemorySpace())) {
// Presence of an IVar superregion has priority over this branch, because
// ObjC objects are on the heap even if the core doesn't realize this.
// Presence of a block variable base region has priority over this branch,
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 4166cf14391e2d..76fee2f1322e17 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -72,6 +72,7 @@
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState_Fwd.h"
@@ -784,7 +785,8 @@ class MallocChecker
bool IsALeakCheck = false) const;
///@}
static bool SummarizeValue(raw_ostream &os, SVal V);
- static bool SummarizeRegion(raw_ostream &os, const MemRegion *MR);
+ static bool SummarizeRegion(ProgramStateRef State, raw_ostream &os,
+ const MemRegion *MR);
void HandleNonHeapDealloc(CheckerContext &C, SVal ArgVal, SourceRange Range,
const Expr *DeallocExpr,
@@ -2206,13 +2208,21 @@ MallocChecker::FreeMemAux(CheckerContext &C, const Expr *ArgExpr,
// Parameters, locals, statics, globals, and memory returned by
// __builtin_alloca() shouldn't be freed.
- if (!isa<UnknownSpaceRegion, HeapSpaceRegion>(MS)) {
+ // Should skip this check if:
+ // - The region is in the heap
+ // - The region has unknown memspace and no trait for further clarification
+ // (i.e., just unknown)
+ // - The region has unknown memspace and a heap memspace trait
+ const MemSpaceRegion *MSTrait = memspace::getMemSpaceTrait(State, R);
+ bool HasHeapOrUnknownTrait = !MSTrait || isa<HeapSpaceRegion>(MSTrait);
+ if (!(isa<HeapSpaceRegion>(MS) ||
+ (isa<UnknownSpaceRegion>(MS) && HasHeapOrUnknownTrait))) {
// Regions returned by malloc() are represented by SymbolicRegion objects
// within HeapSpaceRegion. Of course, free() can work on memory allocated
// outside the current function, so UnknownSpaceRegion is also a
// possibility here.
- if (isa<AllocaRegion>(R))
+ if (isa<AllocaRegion>(R) || isa_and_nonnull<AllocaRegion>(MSTrait))
HandleFreeAlloca(C, ArgVal, ArgExpr->getSourceRange());
else
HandleNonHeapDealloc(C, ArgVal, ArgExpr->getSourceRange(), ParentExpr,
@@ -2384,7 +2394,7 @@ bool MallocChecker::SummarizeValue(raw_ostream &os, SVal V) {
return true;
}
-bool MallocChecker::SummarizeRegion(raw_ostream &os,
+bool MallocChecker::SummarizeRegion(ProgramStateRef State, raw_ostream &os,
const MemRegion *MR) {
switch (MR->getKind()) {
case MemRegion::FunctionCodeRegionKind: {
@@ -2404,8 +2414,10 @@ bool MallocChecker::SummarizeRegion(raw_ostream &os,
return true;
default: {
const MemSpaceRegion *MS = MR->getMemorySpace();
+ const MemSpaceRegion *MSTrait = memspace::getMemSpaceTrait(State, MR);
- if (isa<StackLocalsSpaceRegion>(MS)) {
+ if (isa<StackLocalsSpaceRegion>(MS) ||
+ isa_and_nonnull<StackLocalsSpaceRegion>(MSTrait)) {
const VarRegion *VR = dyn_cast<VarRegion>(MR);
const VarDecl *VD;
if (VR)
@@ -2420,7 +2432,8 @@ bool MallocChecker::SummarizeRegion(raw_ostream &os,
return true;
}
- if (isa<StackArgumentsSpaceRegion>(MS)) {
+ if (isa<StackArgumentsSpaceRegion>(MS) ||
+ isa_and_nonnull<StackArgumentsSpaceRegion>(MSTrait)) {
const VarRegion *VR = dyn_cast<VarRegion>(MR);
const VarDecl *VD;
if (VR)
@@ -2435,7 +2448,8 @@ bool MallocChecker::SummarizeRegion(raw_ostream &os,
return true;
}
- if (isa<GlobalsSpaceRegion>(MS)) {
+ if (isa<GlobalsSpaceRegion>(MS) ||
+ isa_and_nonnull<GlobalsSpaceRegion>(MSTrait)) {
const VarRegion *VR = dyn_cast<VarRegion>(MR);
const VarDecl *VD;
if (VR)
@@ -2489,8 +2503,8 @@ void MallocChecker::HandleNonHeapDealloc(CheckerContext &C, SVal ArgVal,
os << "deallocator";
os << " is ";
- bool Summarized = MR ? SummarizeRegion(os, MR)
- : SummarizeValue(os, ArgVal);
+ bool Summarized =
+ MR ? SummarizeRegion(C.getState(), os, MR) : SummarizeValue(os, ArgVal);
if (Summarized)
os << ", which is not memory allocated by ";
else
diff --git a/clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp
index 52416e21399147..d4ff8186ea69c9 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp
@@ -22,6 +22,7 @@
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "llvm/ADT/StringSet.h"
using namespace clang;
@@ -145,12 +146,14 @@ class MoveChecker
// Obtains ObjectKind of an object. Because class declaration cannot always
// be easily obtained from the memory region, it is supplied separately.
- ObjectKind classifyObject(const MemRegion *MR, const CXXRecordDecl *RD) const;
+ ObjectKind classifyObject(ProgramStateRef State, const MemRegion *MR,
+ const CXXRecordDecl *RD) const;
// Classifies the object and dumps a user-friendly description string to
// the stream.
- void explainObject(llvm::raw_ostream &OS, const MemRegion *MR,
- const CXXRecordDecl *RD, MisuseKind MK) const;
+ void explainObject(ProgramStateRef State, llvm::raw_ostream &OS,
+ const MemRegion *MR, const CXXRecordDecl *RD,
+ MisuseKind MK) const;
bool belongsTo(const CXXRecordDecl *RD, const llvm::StringSet<> &Set) const;
@@ -299,12 +302,12 @@ MoveChecker::MovedBugVisitor::VisitNode(const ExplodedNode *N,
SmallString<128> Str;
llvm::raw_svector_ostream OS(Str);
- ObjectKind OK = Chk.classifyObject(Region, RD);
+ ObjectKind OK = Chk.classifyObject(State, Region, RD);
switch (OK.StdKind) {
case SK_SmartPtr:
if (MK == MK_Dereference) {
OS << "Smart pointer";
- Chk.explainObject(OS, Region, RD, MK);
+ Chk.explainObject(State, OS, Region, RD, MK);
OS << " is reset to null when moved from";
break;
}
@@ -315,12 +318,12 @@ MoveChecker::MovedBugVisitor::VisitNode(const ExplodedNode *N,
case SK_NonStd:
case SK_Safe:
OS << "Object";
- Chk.explainObject(OS, Region, RD, MK);
+ Chk.explainObject(State, OS, Region, RD, MK);
OS << " is moved";
break;
case SK_Unsafe:
OS << "Object";
- Chk.explainObject(OS, Region, RD, MK);
+ Chk.explainObject(State, OS, Region, RD, MK);
OS << " is left in a valid but unspecified state after move";
break;
}
@@ -353,7 +356,7 @@ void MoveChecker::modelUse(ProgramStateRef State, const MemRegion *Region,
CheckerContext &C) const {
assert(!C.isDifferent() && "No transitions should have been made by now");
const RegionState *RS = State->get<TrackedRegionMap>(Region);
- ObjectKind OK = classifyObject(Region, RD);
+ ObjectKind OK = classifyObject(State, Region, RD);
// Just in case: if it's not a smart pointer but it does have operator *,
// we shouldn't call the bug a dereference.
@@ -406,24 +409,25 @@ ExplodedNode *MoveChecker::tryToReportBug(const MemRegion *Region,
// Creating the error message.
llvm::SmallString<128> Str;
llvm::raw_svector_ostream OS(Str);
+ ProgramStateRef ErrorNodeState = N->getState();
switch(MK) {
case MK_FunCall:
OS << "Method called on moved-from object";
- explainObject(OS, Region, RD, MK);
+ explainObject(ErrorNodeState, OS, Region, RD, MK);
break;
case MK_Copy:
OS << "Moved-from object";
- explainObject(OS, Region, RD, MK);
+ explainObject(ErrorNodeState, OS, Region, RD, MK);
OS << " is copied";
break;
case MK_Move:
OS << "Moved-from object";
- explainObject(OS, Region, RD, MK);
+ explainObject(ErrorNodeState, OS, Region, RD, MK);
OS << " is moved";
break;
case MK_Dereference:
OS << "Dereference of null smart pointer";
- explainObject(OS, Region, RD, MK);
+ explainObject(ErrorNodeState, OS, Region, RD, MK);
break;
}
@@ -482,7 +486,7 @@ void MoveChecker::checkPostCall(const CallEvent &Call,
return;
const CXXRecordDecl *RD = MethodDecl->getParent();
- ObjectKind OK = classifyObject(ArgRegion, RD);
+ ObjectKind OK = classifyObject(State, ArgRegion, RD);
if (shouldBeTracked(OK)) {
// Mark object as moved-from.
State = State->set<TrackedRegionMap>(ArgRegion, RegionState::getMoved());
@@ -549,7 +553,7 @@ bool MoveChecker::belongsTo(const CXXRecordDecl *RD,
}
MoveChecker::ObjectKind
-MoveChecker::classifyObject(const MemRegion *MR,
+MoveChecker::classifyObject(ProgramStateRef State, const MemRegion *MR,
const CXXRecordDecl *RD) const {
// Local variables and local rvalue references are classified as "Local".
// For the purposes of this checker, we classify move-safe STL types
@@ -557,7 +561,7 @@ MoveChecker::classifyObject(const MemRegion *MR,
MR = unwrapRValueReferenceIndirection(MR);
bool IsLocal =
isa_and_nonnull<VarRegion, CXXLifetimeExtendedObjectRegion>(MR) &&
- isa<StackSpaceRegion>(MR->getMemorySpace());
+ memspace::isMemSpaceOrTrait<StackSpaceRegion>(State, MR);
if (!RD || !RD->getDeclContext()->isStdNamespace())
return { IsLocal, SK_NonStd };
@@ -571,8 +575,9 @@ MoveChecker::classifyObject(const MemRegion *MR,
return { IsLocal, SK_Unsafe };
}
-void MoveChecker::explainObject(llvm::raw_ostream &OS, const MemRegion *MR,
- const CXXRecordDecl *RD, MisuseKind MK) const {
+void MoveChecker::explainObject(ProgramStateRef State, llvm::raw_ostream &OS,
+ const MemRegion *MR, const CXXRecordDecl *RD,
+ MisuseKind MK) const {
// We may need a leading space every time we actually explain anything,
// and we never know if we are to explain anything until we try.
if (const auto DR =
@@ -581,7 +586,7 @@ void MoveChecker::explainObject(llvm::raw_ostream &OS, const MemRegion *MR,
OS << " '" << RegionDecl->getDeclName() << "'";
}
- ObjectKind OK = classifyObject(MR, RD);
+ ObjectKind OK = classifyObject(State, MR, RD);
switch (OK.StdKind) {
case SK_NonStd:
case SK_Safe:
diff --git a/clang/lib/StaticAnalyzer/Checkers/PutenvStackArrayChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/PutenvStackArrayChecker.cpp
index bf81d57bf82fd3..4449cb4ae68eb9 100644
--- a/clang/lib/StaticAnalyzer/Checkers/PutenvStackArrayChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/PutenvStackArrayChecker.cpp
@@ -21,6 +21,7 @@
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
using namespace clang;
using namespace ento;
@@ -45,10 +46,15 @@ void PutenvStackArrayChecker::checkPostCall(const CallEvent &Call,
SVal ArgV = Call.getArgSVal(0);
const Expr *ArgExpr = Call.getArgExpr(0);
- const auto *SSR =
- dyn_cast<StackSpaceRegion>(ArgV.getAsRegion()->getMemorySpace());
+ const MemRegion *MR = ArgV.getAsRegion();
+
+ const auto *SSR = dyn_cast<StackSpaceRegion>(MR->getMemorySpace());
+ if (!SSR)
+ SSR = dyn_cast_if_present<StackSpaceRegion>(
+ memspace::getMemSpaceTrait(C.getState(), MR));
if (!SSR)
return;
+
const auto *StackFrameFuncD =
dyn_cast_or_null<FunctionDecl>(SSR->getStackFrame()->getDecl());
if (StackFrameFuncD && StackFrameFuncD->isMain())
diff --git a/clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountDiagnostics.cpp b/clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountDiagnostics.cpp
index 456132ef0a0a22..ecc7563ceb4bba 100644
--- a/clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountDiagnostics.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountDiagnostics.cpp
@@ -13,6 +13,7 @@
#include "RetainCountDiagnostics.h"
#include "RetainCountChecker.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include <optional>
@@ -690,9 +691,14 @@ static AllocationInfo GetAllocationSite(ProgramStateManager &StateMgr,
const MemRegion *R = FB.getRegion();
// Do not show local variables belonging to a function other than
// where the error is reported.
- if (auto MR = dyn_cast<StackSpaceRegion>(R->getMemorySpace()))
- if (MR->getStackFrame() == LeakContext->getStackFrame())
- FirstBinding = R;
+ const StackSpaceRegion *MR =
+ dyn_cast<StackSpaceRegion>(R->getMemorySpace());
+ if (!MR)
+ MR = dyn_cast_if_present<StackSpaceRegion>(
+ memspace::getMemSpaceTrait(St, R));
+
+ if (MR && MR->getStackFrame() == LeakContext->getStackFrame())
+ FirstBinding = R;
}
// AllocationNode is the last node in which the symbol was tracked.
diff --git a/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
index f4de3b500499c4..4d8c8eea14ea63 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
@@ -19,6 +19,7 @@
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Support/raw_ostream.h"
@@ -61,7 +62,7 @@ class StackAddrEscapeChecker
ASTContext &Ctx);
static SmallVector<const MemRegion *, 4>
getCapturedStackRegions(const BlockDataRegion &B, CheckerContext &C);
- static bool isNotInCurrentFrame(const MemRegion *R, CheckerContext &C);
+ static bool isNotInCurrentFrame(const MemSpaceRegion *MS, CheckerContext &C);
};
} // namespace
@@ -117,9 +118,9 @@ SourceRange StackAddrEscapeChecker::genName(raw_ostream &os, const MemRegion *R,
return range;
}
-bool StackAddrEscapeChecker::isNotInCurrentFrame(const MemRegion *R,
+bool StackAddrEscapeChecker::isNotInCurrentFrame(const MemSpaceRegion *MS,
CheckerContext &C) {
- const StackSpaceRegion *S = cast<StackSpaceRegion>(R->getMemorySpace());
+ const StackSpaceRegion *S = cast<StackSpaceRegion>(MS);
return S->getStackFrame() != C.getStackFrame();
}
@@ -138,10 +139,11 @@ SmallVector<const MemRegion *, 4>
StackAddrEscapeChecker::getCapturedStackRegions(const BlockDataRegion &B,
CheckerContext &C) {
SmallVector<const MemRegion *, 4> Regions;
+ ProgramStateRef State = C.getState();
for (auto Var : B.referenced_vars()) {
- SVal Val = C.getState()->getSVal(Var.getCapturedRegion());
+ SVal Val = State->getSVal(Var.getCapturedRegion());
const MemRegion *Region = Val.getA...
[truncated]
|
I think I changed all usages of memory spaces in
The only spot I thought was questionable to change is this lower bound check in llvm-project/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp Lines 577 to 579 in 7e0758d
There is also currently no place where these traits are set, so everything should have the same behaviour as before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly support the idea that we should eventually represent the memory space of a memory region by a state trait (instead of the "fake" super-region) and I am grateful that you're working on this.
I added a few suggestions in inline comments that could simplify the logic and ensure better behavior w.r.t. element and field subregions. (Also there are a few trivial nitpicks about comments.)
In my opinion it would be nice to have some code path that sets and uses these MemSpace traits (even in this initial commit), because that would let us test the code and verify that it's working correctly.
However, I'm not totally opposed to accepting this as an NFC (no functional change) commit, but in that case, please add [NFC]
to the review/commit title (before merging the commit) as that helps developers who want to quickly find a commit which is responsible for a certain issue.
clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemSpaces.h
Outdated
Show resolved
Hide resolved
✅ With the latest revision this PR passed the C/C++ code formatter. |
@NagyDonat, I made the requested changes, lmk if this looks better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the direction of this PR but at the same time it makes memspaces a bit more error prone to use. Do you think we could find a way to prevent using isa
on memspaces?
Do you mean moving away from a class hierarchy definition for memspaces and towards something else, maybe a enum Kind definition, or maybe just try to hide all the casting behind a different API? IIUC, from previous chat with @steakhal, the eventual goal is to get rid of the MemSpaceRegion part of the MemRegion class hierarchy and replace it with something else. |
I think it would be nice to do the latter in this patch so using |
@Xazax-hun I think and hope that this error-prone situation will be resolved soon in a follow-up commit that completely transfers the handling of memory spaces to the state trait, because after that point |
My bad! You are right, it is not the
I think it would be nice if we never had the error prone APIs to begin with since people can lose interest, or might get hit by a bus and we are left in an unfortunate state. That being said if there is no easy way to make this less error prone I am OK with delaying that to a follow up PR. But might worth exploring if this is something that can be done in a couple hours, why not. |
I forgot to answer this earlier, but this location in ArrayBoundCheckerV2 is indeed a place where it's better to leave the code unchanged. In fact, if we consider a somewhat contrived example like int globalArray[64];
void foo(int *p) {
if (p == globalArray + 32) {
p[-16] = 42
}
} then (1) If the planned follow-up changes erase the distinction between "region constructed in a non-unknown space" and "initially unknown region whose memory space was deduced later", then we'll get regions (i.e. pointer values) that appear to be top-level regions (i.e. not an element or a field of something else) but point into the middle of an allocation. This would complicate the diagnosis of underflow errors in theory -- but in practice I think false positives from this source would be rare. Still, it would be nice to have a better solution than "just eat a few false positives". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a quick look at the PR, and over all it looks as expected. Good job!
Have you considered changing MemRegion::getMemorySpace()
into MemRegion::getMemorySpace(ProgramStateRef)
?
This should lessen the confusion of which APIs should a dev use to get the memspace of a region (aka. the getter). Speaking of the setter counter part, I think that's is special-purpose enough that most people shouldn't know it exist, so I don't think that would bring much confusion.
So the question is, would changing it into MemRegion::getMemorySpace(ProgramStateRef)
be too intrusive to consider? (Maybe in some contexts where we currently call getMemorySpace
we just don't have a State, and can't easily get one - and that would kill this approach.) Alternatively, we could say, just introduce this as an overload, and highlight that people should really really use this one. That would bring consistency while being pragmatic and accepting past design choices.
I thought about this, but I decided against it since I am thinking that MemSpaces will eventually be their own separate thing, not part of the MemRegion hierarchy, i.e., MemSpaces will eventually just be a trait thing, maybe with its own checker depending on how much functionality can be added with this different implementation? All this being said, I am happy to change the API of how we get memory spaces, I am not tied to the current approach/API, just that we can move towards better results for #115410.
This has been the main thing I'm working through on this PR. This PR doesn't totally remove all This is also related to @Xazax-hun's above question about whether it would be possible to make all the changes here at once, do you have thoughts on that @steakhal, I think you tried some stuff here? |
Working on some benchmarking. Posted some problems in the discord channel, but working through it now, and benchmarking with the CodeChecker tool using only clangsa. Is there an already established set of benchmarks y'all use somewhere? IIRC, there was some github repo with some benchmarks that I saw linked earlier. |
Let express that I don't think we will ever drop memspaces from memregions, even if we strive for. The return of investment is not going to be there once we cover the unknownspace case - which you plan to fix the issue you referred to. I think having a single API (or overload set) is important for this. This is why I'd push for patching the
Yea, in those cases I think we should just do what we currently do: don't care about dynamically deduced memspaces, aka. let them use the
I think the best experience for everyone is if we make the smallest consistent step fixing #115410. Once we have that, we can discuss how to move forward if we really want to move anywhere. I've spent about one bit more than a day to completely get rid of memspaces from the memregions hierarchy. I should have chunk it up, but difficult to make incremental refactors with this so I gave up on that. I figured that I should spend my time on other things. |
Benchmarking is taking a while for me, and I can only really run benchmarks with low run-time variance overnight. I will post more over time as I get more benchmark projects added. Here is the result from benchmarking one C project, libsodium, so far. These results are for 13 runs after some warmup runs. This is with a debug build and default checkers enabled using CodeChecker: base/main branch of clang:
This branch:
edit: This is with the change that would not be merged: that there is a lookup in the trait for each memregion rather than only for those with UnknownSpaceRegion. |
This is a sad story. We don't really have the tooling. I mean, at Sonar, we have our internal tooling. Ericsson also has quite a bit of internal tooling built on top of https://github.com/Xazax-hun/csa-testbench. Like in most cases, we don't have the docs xD Probably the SATest.py is the best you can reach for once you dust it off a bit. |
I see no risk in this change to regress analysis time. Consequently, if I were you, I wouldn't run benchmarks. Unless my primary objective would be to dust off the SATest.py and get on hands experience using that tool. |
I would strongly support changing However, I would be very happy if this refactoring effort continued by dropping the memspaces from the memregions in follow-up commits -- I don't think that it would be terribly difficult and it would make it easier to reason about memregions. (If nobody else does it, I will eventually do it when I get fed up on some confusing |
clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
Outdated
Show resolved
Hide resolved
Made the above changes. I moved it all into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates!
I added some inline remarks, but overall I'm satisfied with the state of this PR and I hope that it can be merged soon.
However, I became a bit uncertain about the plans for follow-up changes, because it seems that there are several locations where the difference between "inherently in <memory space X>
" and "originally in UnknownSpace, but we deduced that it's in <memory space X>
" is significant. (I'm not entirely certain about this, because e.g. the schematics and applications of "deducing a memory space" are not yet fixed. This might deserve some additional thinking.)
Previously I strongly supported the idea that in the future we should store the MemSpace
in a single location, as a state trait -- but if the inherent/acquired difference turns out to be relevant, then perhaps we should "stop after this PR" and stabilize this "hybrid representation" of memory spaces.
@@ -353,7 +352,7 @@ static std::string getRegionName(const SubRegion *Region) { | |||
return "the memory returned by 'alloca'"; | |||
|
|||
if (isa<SymbolicRegion>(Region) && | |||
isa<HeapSpaceRegion>(Region->getMemorySpace())) | |||
isa<HeapSpaceRegion>(Region->getRawMemorySpace())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should use getMemorySpace(State)
if it's not too cumbersome to pass a state to this function. (However, this is not an important change, just marginal prettification of a report message.)
@@ -1544,7 +1544,7 @@ SVal RegionStoreManager::getBinding(RegionBindingsConstRef B, Loc L, QualType T) | |||
// The location does not have a bound value. This means that it has | |||
// the value it had upon its creation and/or entry to the analyzed | |||
// function/method. These are either symbolic values or 'undefined'. | |||
if (R->hasStackNonParametersStorage()) { | |||
if (R->hasRawStackNonParametersStorage()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remark: This is another example (in addition to e.g. the ArrayBoundV2 lower bound check) where it seems to be important to distinguish the "raw" (inherent) memory space and the "acquired" memory space: this logic applies a default to "real" local stack variables, that should not apply to initially unknown regions where we later conclude that they're on the stack (but perhaps don't bind anything as their content).
No action expected in this commit, but this could be another reason for keeping the separate "inherent" memory space field in the future.
const MemSpaceRegion *LeftMS = LeftBase->getRawMemorySpace(); | ||
const MemSpaceRegion *RightMS = RightBase->getRawMemorySpace(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to use the non-raw memory space here, if possible.
clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
Outdated
Show resolved
Hide resolved
I've pushed a couple of commits to sync this PR with what I had in mind. Please anyone do a thorough review. I've considered uplifiting a couple of the |
I've checked just now that once this lands, we could add a checker callback like this to start modeling the memoryspaces transfers to unknown spaced regions: void checkPreStmt(const BinaryOperator *BO, CheckerContext &C) const {
if (CanQualType Ty = BO->getType()->getCanonicalTypeUnqualified();
Ty != C.getSValBuilder().getConditionType())
return;
if (auto Op = BO->getOpcode(); Op != BO_EQ && Op != BO_NE)
return;
const Expr *LExp = BO->getLHS();
const Expr *RExp = BO->getRHS();
if (!LExp->getType()->isAnyPointerType() ||
!RExp->getType()->isAnyPointerType())
return;
const auto *L = C.getSVal(LExp).getAsRegion();
const auto *R = C.getSVal(RExp).getAsRegion();
if (!L || !R)
return;
auto State = C.getState();
const auto *LSpace = L->getMemorySpace(State);
const auto *RSpace = R->getMemorySpace(State);
if (isa<UnknownSpaceRegion>(LSpace) == isa<UnknownSpaceRegion>(RSpace))
return;
C.addTransition(isa<UnknownSpaceRegion>(LSpace)
? L->setMemorySpace(State, RSpace)
: R->setMemorySpace(State, LSpace));
} I can confirm that it would work (kindof). There are problems with it of course, as it should be part of a post callback. But the ExprEngineC.cpp:VisitBinaryOperator would sink the aliasing case because it would call the |
Ping. I wonder if there is anything missing for this to be merged. Ah, yea, sure a rebase xD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine going ahead with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the improvements :)
In the follow-up commits let's pay attention to the ArrayBound
lower bound check where the code says that "A symbolic region in unknown space represents an unknown pointer that may point into the middle of an array, so we don't look for underflows." and assumes that e.g. a symbolic region in a known space necessarily points to the beginning of a memory area (i.e. negative indices are erroneous).
In general, if we see an allocation, we associate the immutable memory space with the constructed memory region. This works fine if we see the allocation. However, with symbolic regions it's not great because there we don't know anything about their memory spaces, thus put them into the Unknown space. The unfortunate consequence is that once we learn about some aliasing with this Symbolic Region, we can't change the memory space to the deduced one. In this patch, we open up the memory spaces as a trait, basically allowing associating a better memory space with a memregion that was created with the Unknown memory space. As a side effect, this means that now queriing the memory space of a region depends on the State, but many places in the analyzer, such as the Store, doesn't have (and cannot have) access to the State by design. This means that some uses must solely rely on the memspaces of the region, but any other users should use the getter taking a State.
0c900fa
to
b59b7c7
Compare
Yeey, could you please revert this for me? @vitalybuka |
Should be fixed with #128372 |
In general, if we see an allocation, we associate the immutable memory
space with the constructed memory region.
This works fine if we see the allocation.
However, with symbolic regions it's not great because there we don't
know anything about their memory spaces, thus put them into the Unknown
space.
The unfortunate consequence is that once we learn about some aliasing
with this Symbolic Region, we can't change the memory space to the
deduced one.
In this patch, we open up the memory spaces as a trait, basically
allowing associating a better memory space with a memregion that
was created with the Unknown memory space.
As a side effect, this means that now queriing the memory space of a
region depends on the State, but many places in the analyzer, such as
the Store, doesn't have (and cannot have) access to the State by design.
This means that some uses must solely rely on the memspaces of the
region, but any other users should use the getter taking a State.