Skip to content

[clang] Add optional pass to remove UBSAN traps using PGO #84214

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

vitalybuka
Copy link
Collaborator

@vitalybuka vitalybuka commented Mar 6, 2024

With #83471 it reduces UBSAN overhead from 44% to 6%.
Measured as "Geomean difference" on "test-suite/MultiSource/Benchmarks"
with PGO+ThinLTO build.

On real large server binary we see 95% of code is still instrumented,
with 10% -> 1.5% UBSAN overhead improvements. We can pass this test only
with subset of UBSAN, so base overhead is smaller.

We have followup patches to improve it even further.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Mar 6, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 6, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Vitaly Buka (vitalybuka)

Changes

With #83470 it reduces UBSAN overhead from 52.5% to 12.6%.
Measured as "Geomean difference" on "test-suite/MultiSource/Benchmarks"
with PGO+ThinLTO build.

On real server apps we see 95% of code instrumented, with 10% -> 1.5%
UBSAN overhead improvements. We can pass this test only with subset of
UBSAN, so base overhead is smaller.

We have followup patches to improve it even further.


Full diff: https://github.com/llvm/llvm-project/pull/84214.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+17)
  • (added) clang/test/CodeGen/remote-traps.cpp (+15)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 056f790d41853d..6f9aa262f3d51e 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -76,6 +76,7 @@
 #include "llvm/Transforms/Instrumentation/MemProfiler.h"
 #include "llvm/Transforms/Instrumentation/MemorySanitizer.h"
 #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
+#include "llvm/Transforms/Instrumentation/RemoveTrapsPass.h"
 #include "llvm/Transforms/Instrumentation/SanitizerBinaryMetadata.h"
 #include "llvm/Transforms/Instrumentation/SanitizerCoverage.h"
 #include "llvm/Transforms/Instrumentation/ThreadSanitizer.h"
@@ -83,6 +84,7 @@
 #include "llvm/Transforms/Scalar/EarlyCSE.h"
 #include "llvm/Transforms/Scalar/GVN.h"
 #include "llvm/Transforms/Scalar/JumpThreading.h"
+#include "llvm/Transforms/Scalar/SimplifyCFG.h"
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
 #include "llvm/Transforms/Utils/ModuleUtils.h"
@@ -98,6 +100,10 @@ using namespace llvm;
 namespace llvm {
 extern cl::opt<bool> PrintPipelinePasses;
 
+cl::opt<bool> ClRemoveTraps("clang-remove-traps", cl::Optional,
+                            cl::desc("Insert remove-traps pass."),
+                            cl::init(false));
+
 // Experiment to move sanitizers earlier.
 static cl::opt<bool> ClSanitizeOnOptimizerEarlyEP(
     "sanitizer-early-opt-ep", cl::Optional,
@@ -743,6 +749,17 @@ static void addSanitizers(const Triple &TargetTriple,
     // LastEP does not need GlobalsAA.
     PB.registerOptimizerLastEPCallback(SanitizersCallback);
   }
+
+  if (ClRemoveTraps) {
+    PB.registerOptimizerEarlyEPCallback([&](ModulePassManager &MPM,
+                                            OptimizationLevel Level) {
+      FunctionPassManager FPM;
+      FPM.addPass(RemoveTrapsPass());
+      FPM.addPass(
+          SimplifyCFGPass(SimplifyCFGOptions().convertSwitchRangeToICmp(true)));
+      MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
+    });
+  }
 }
 
 void EmitAssemblyHelper::RunOptimizationPipeline(
diff --git a/clang/test/CodeGen/remote-traps.cpp b/clang/test/CodeGen/remote-traps.cpp
new file mode 100644
index 00000000000000..7d4eb76a7d81e9
--- /dev/null
+++ b/clang/test/CodeGen/remote-traps.cpp
@@ -0,0 +1,15 @@
+// RUN: %clang_cc1 -O1 -emit-llvm -fsanitize=signed-integer-overflow -fsanitize-trap=signed-integer-overflow %s -o - | FileCheck %s 
+// RUN: %clang_cc1 -O1 -emit-llvm -fsanitize=signed-integer-overflow -fsanitize-trap=signed-integer-overflow -mllvm -clang-remove-traps -mllvm -remove-traps-random-rate=1 %s -o - | FileCheck %s --implicit-check-not="call void @llvm.ubsantrap" --check-prefixes=REMOVE
+
+int f(int x) {
+  return x + 123;
+}
+
+// CHECK-LABEL: define dso_local noundef i32 @_Z1fi(
+// CHECK: call { i32, i1 } @llvm.sadd.with.overflow.i32(
+// CHECK: trap:
+// CHECK-NEXT: call void @llvm.ubsantrap(i8 0)
+// CHECK-NEXT: unreachable, !nosanitize !2 
+
+// REMOVE-LABEL: define dso_local noundef i32 @_Z1fi(
+// REMOVE: call { i32, i1 } @llvm.sadd.with.overflow.i32(

@vitalybuka
Copy link
Collaborator Author

Please ignore buildkite for now, as the patch depends on #83471 which I uploaded without stacked review.

@aeubanks
Copy link
Contributor

aeubanks commented Mar 7, 2024

is there a long term plan to add a driver flag for this?

Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
@vitalybuka vitalybuka force-pushed the users/vitalybuka/spr/clang-add-optional-pass-to-remove-ubsan-traps-using-pgo branch from 73a4120 to 000854e Compare March 7, 2024 18:53
@vitalybuka
Copy link
Collaborator Author

is there a long term plan to add a driver flag for this?

yes, but I'd like to that after we collect feedback from first users

Created using spr 1.3.4
@vitalybuka vitalybuka requested a review from aeubanks March 7, 2024 22:52
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
@vitalybuka
Copy link
Collaborator Author

yes, but I'd like to that after we collect feedback from first users

They are introduced by earlier transformations

Note: I'd like to have special intrinsic for this optimization. When we have it, we likely don't need this SimplifyCFG.

@aeubanks
Copy link
Contributor

yes, but I'd like to that after we collect feedback from first users

They are introduced by earlier transformations

Note: I'd like to have special intrinsic for this optimization. When we have it, we likely don't need this SimplifyCFG.

lgtm with a comment added on why we're adding the extra SimplifyCFG

@aeubanks
Copy link
Contributor

yes, but I'd like to that after we collect feedback from first users

They are introduced by earlier transformations
Note: I'd like to have special intrinsic for this optimization. When we have it, we likely don't need this SimplifyCFG.

lgtm with a comment added on why we're adding the extra SimplifyCFG

oh I missed that you'd added one. but maybe a TODO with your comment about an intrinsic?

Created using spr 1.3.4
@vitalybuka vitalybuka merged commit eaa71a9 into main Mar 11, 2024
@vitalybuka vitalybuka deleted the users/vitalybuka/spr/clang-add-optional-pass-to-remove-ubsan-traps-using-pgo branch March 11, 2024 19:07
@dyung
Copy link
Collaborator

dyung commented Mar 11, 2024

@vitalybuka the test you added remote-traps.c seems to be failing on a few buildbots, can you take a look?

@vitalybuka
Copy link
Collaborator Author

@vitalybuka the test you added remote-traps.c seems to be failing on a few buildbots, can you take a look?

looking

vitalybuka added a commit that referenced this pull request Mar 11, 2024
vitalybuka added a commit that referenced this pull request Apr 1, 2024
The goal is to have ability to change logic compile time based on PGO
data.

Our primary application is removing UBSAN checks from hot code.
Then we'd like to use this for libc++ hardening and regular debug
asserts.
Previous attempt is #84214.

Benefits from special intrinsic vs #84214:
1. Resulting binary is 3% faster than removing traps (on
"test-suite/MultiSource/Benchmarks" with PGO+ThinLTO)
2. Intrinsic can be used from source code to change behavior from C/C++
program. E.g. enabling asserts in cold code.
3. Easier to match basic blocks.

RFC:
https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641

---------

Co-authored-by: Nikita Popov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants