Skip to content

Commit bac6cd5

Browse files
committed
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907
1 parent eb2131b commit bac6cd5

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

53 files changed

+2197
-3
lines changed

clang/docs/MisExpect.rst

Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
===================
2+
Misexpect
3+
===================
4+
.. contents::
5+
6+
.. toctree::
7+
:maxdepth: 1
8+
9+
When developers use ``llvm.expect`` intrinsics, i.e., through use of
10+
``__builtin_expect(...)``, they are trying to communicate how their code is
11+
expected to behave at runtime to the optimizer. These annotations, however, can
12+
be incorrect for a variety of reasons: changes to the code base invalidate them
13+
silently, the developer mis-annotated them (e.g., using ``LIKELY`` instead of
14+
``UNLIKELY``), or perhaps they assumed something incorrectly when they wrote
15+
the annotation. Regardless of why, it is useful to detect these situations so
16+
that the optimizer can make more useful decisions about the code.
17+
18+
MisExpect diagnostics are intended to help developers identify and address
19+
these situations, by comparing the branch weights added by the ``llvm.expect``
20+
intrinsic to those collected through profiling. Whenever these values are
21+
mismatched, a diagnostic is surfaced to the user. Details on how the checks
22+
operate in the LLVM backed can be found in LLVM's documentation.
23+
24+
By default MisExpect checking is quite strict, because the use of the
25+
``llvm.expect`` intrinsic is designed for specialized cases, where the outcome
26+
of a condition is severely skewed. As a result, the optimizer can be extremely
27+
aggressive, which can result in performance degradation if the outcome is less
28+
predictable than the annotation suggests. Even when the annotation is correct
29+
90% of the time, it may be beneficial to either remove the annotation or to use
30+
a different intrinsic that can communicate the probability more directly.
31+
32+
Because this may be too strict, MisExpect diagnostics are not enabled by
33+
default, and support an additional flag to tolerate some deviation from the
34+
exact thresholds. The ``-fdiagnostic-misexpect-tolerance=N`` accepts
35+
deviations when comparing branch weights within ``N%`` of the expected values.
36+
So passing ``-fdiagnostic-misexpect-tolerance=5`` will not report diagnostic messages
37+
if the branch weight from the profile is within 5% of the weight added by
38+
the ``llvm.expect`` intrinsic.
39+
40+
MisExpect diagnostics are also available in the form of optimization remarks,
41+
which can be serialized and processed through the ``opt-viewer.py``
42+
scripts in LLVM.
43+
44+
.. option:: -Rpass=misexpect
45+
46+
Enables optimization remarks for misexpect when profiling data conflicts with
47+
use of ``llvm.expect`` intrinsics.
48+
49+
50+
.. option:: -Wmisexpect
51+
52+
Enables misexpect warnings when profiling data conflicts with use of
53+
``llvm.expect`` intrinsics.
54+
55+
.. option:: -fdiagnostic-misexpect-tolerance=N
56+
57+
Relaxes misexpect checking to tolerate profiling values within N% of the
58+
expected branch weight. e.g., a value of ``N=5`` allows misexpect to check against
59+
``0.95 * Threshold``
60+
61+
LLVM supports 4 types of profile formats: Frontend, IR, CS-IR, and
62+
Sampling. MisExpect Diagnostics are compatible with all Profiling formats.
63+
64+
+----------------+--------------------------------------------------------------------------------------+
65+
| Profile Type | Description |
66+
+================+======================================================================================+
67+
| Frontend | Profiling instrumentation added during compilation by the frontend, i.e. ``clang`` |
68+
+----------------+--------------------------------------------------------------------------------------+
69+
| IR | Profiling instrumentation added during by the LLVM backend |
70+
+----------------+--------------------------------------------------------------------------------------+
71+
| CS-IR | Context Sensitive IR based profiles |
72+
+----------------+--------------------------------------------------------------------------------------+
73+
| Sampling | Profiles collected through sampling with external tools, such as ``perf`` on Linux |
74+
+----------------+--------------------------------------------------------------------------------------+
75+

clang/docs/ReleaseNotes.rst

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -157,6 +157,9 @@ Improvements to Clang's diagnostics
157157
function with a prototype. e.g., ``void f(int); void f() {}`` is now properly
158158
diagnosed.
159159

160+
- ``-Wmisexpect`` warns when the branch weights collected during profiling
161+
conflict with those added by ``llvm.expect``.
162+
160163
Non-comprehensive list of changes in this release
161164
-------------------------------------------------
162165
- Improve __builtin_dump_struct:

clang/docs/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,7 @@ Using Clang as a Compiler
4242
SourceBasedCodeCoverage
4343
Modules
4444
MSVCCompatibility
45+
MisExpect
4546
OpenCLSupport
4647
OpenMPSupport
4748
SYCLSupport

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -174,6 +174,7 @@ CODEGENOPT(NoExecStack , 1, 0) ///< Set when -Wa,--noexecstack is enabled.
174174
CODEGENOPT(FatalWarnings , 1, 0) ///< Set when -Wa,--fatal-warnings is
175175
///< enabled.
176176
CODEGENOPT(NoWarn , 1, 0) ///< Set when -Wa,--no-warn is enabled.
177+
CODEGENOPT(MisExpect , 1, 0) ///< Set when -Wmisexpect is enabled
177178
CODEGENOPT(EnableSegmentedStacks , 1, 0) ///< Set when -fsplit-stack is enabled.
178179
CODEGENOPT(NoInlineLineTables, 1, 0) ///< Whether debug info should contain
179180
///< inline line tables.

clang/include/clang/Basic/CodeGenOptions.h

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -423,6 +423,10 @@ class CodeGenOptions : public CodeGenOptionsBase {
423423
/// If threshold option is not specified, it is disabled by default.
424424
Optional<uint64_t> DiagnosticsHotnessThreshold = 0;
425425

426+
/// The maximum percentage profiling weights can deviate from the expected
427+
/// values in order to be included in misexpect diagnostics.
428+
Optional<uint64_t> DiagnosticsMisExpectTolerance = 0;
429+
426430
public:
427431
// Define accessors/mutators for code generation options of enumeration type.
428432
#define CODEGENOPT(Name, Bits, Default)

clang/include/clang/Basic/DiagnosticDriverKinds.td

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -149,6 +149,8 @@ def err_drv_invalid_darwin_version : Error<
149149
"invalid Darwin version number: %0">;
150150
def err_drv_invalid_diagnotics_hotness_threshold : Error<
151151
"invalid argument in '%0', only integer or 'auto' is supported">;
152+
def err_drv_invalid_diagnotics_misexpect_tolerance : Error<
153+
"invalid argument in '%0', only integers are supported">;
152154
def err_drv_missing_argument : Error<
153155
"argument to '%0' is missing (expected %1 value%s1)">;
154156
def err_drv_invalid_Xarch_argument_with_args : Error<
@@ -378,6 +380,9 @@ def warn_drv_empty_joined_argument : Warning<
378380
def warn_drv_diagnostics_hotness_requires_pgo : Warning<
379381
"argument '%0' requires profile-guided optimization information">,
380382
InGroup<UnusedCommandLineArgument>;
383+
def warn_drv_diagnostics_misexpect_requires_pgo : Warning<
384+
"argument '%0' requires profile-guided optimization information">,
385+
InGroup<UnusedCommandLineArgument>;
381386
def warn_drv_clang_unsupported : Warning<
382387
"the clang compiler does not support '%0'">;
383388
def warn_drv_deprecated_arg : Warning<

clang/include/clang/Basic/DiagnosticFrontendKinds.td

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -315,6 +315,11 @@ def warn_profile_data_missing : Warning<
315315
def warn_profile_data_unprofiled : Warning<
316316
"no profile data available for file \"%0\"">,
317317
InGroup<ProfileInstrUnprofiled>;
318+
def warn_profile_data_misexpect : Warning<
319+
"Potential performance regression from use of __builtin_expect(): "
320+
"Annotation was correct on %0 of profiled executions.">,
321+
BackendInfo,
322+
InGroup<MisExpect>;
318323
} // end of instrumentation issue category
319324

320325
}

clang/include/clang/Basic/DiagnosticGroups.td

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1256,6 +1256,7 @@ def BackendWarningAttributes : DiagGroup<"attribute-warning">;
12561256
def ProfileInstrMissing : DiagGroup<"profile-instr-missing">;
12571257
def ProfileInstrOutOfDate : DiagGroup<"profile-instr-out-of-date">;
12581258
def ProfileInstrUnprofiled : DiagGroup<"profile-instr-unprofiled">;
1259+
def MisExpect : DiagGroup<"misexpect">;
12591260

12601261
// AddressSanitizer frontend instrumentation remarks.
12611262
def SanitizeAddressRemarks : DiagGroup<"sanitize-address">;

clang/include/clang/Driver/Options.td

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1430,6 +1430,9 @@ def fdiagnostics_hotness_threshold_EQ : Joined<["-"], "fdiagnostics-hotness-thre
14301430
Group<f_Group>, Flags<[CC1Option]>, MetaVarName<"<value>">,
14311431
HelpText<"Prevent optimization remarks from being output if they do not have at least this profile count. "
14321432
"Use 'auto' to apply the threshold from profile summary">;
1433+
def fdiagnostics_misexpect_tolerance_EQ : Joined<["-"], "fdiagnostics-misexpect-tolerance=">,
1434+
Group<f_Group>, Flags<[CC1Option]>, MetaVarName<"<value>">,
1435+
HelpText<"Prevent misexpect diagnostics from being output if the profile counts are within N% of the expected. ">;
14331436
defm diagnostics_show_option : BoolFOption<"diagnostics-show-option",
14341437
DiagnosticOpts<"ShowOptionNames">, DefaultTrue,
14351438
NegFlag<SetFalse, [CC1Option]>, PosFlag<SetTrue, [], "Print option name with mappable diagnostics">>;

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -474,6 +474,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
474474
Entry.IgnoreSysRoot ? Entry.Path : HSOpts.Sysroot + Entry.Path);
475475
Options.MCOptions.Argv0 = CodeGenOpts.Argv0;
476476
Options.MCOptions.CommandLineArgs = CodeGenOpts.CommandLineArgs;
477+
Options.MisExpect = CodeGenOpts.MisExpect;
477478

478479
return true;
479480
}

clang/lib/CodeGen/CodeGenAction.cpp

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -340,6 +340,15 @@ namespace clang {
340340
CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone)
341341
Ctx.setDiagnosticsHotnessRequested(true);
342342

343+
if (CodeGenOpts.MisExpect) {
344+
Ctx.setMisExpectWarningRequested(true);
345+
}
346+
347+
if (CodeGenOpts.DiagnosticsMisExpectTolerance) {
348+
Ctx.setDiagnosticsMisExpectTolerance(
349+
CodeGenOpts.DiagnosticsMisExpectTolerance);
350+
}
351+
343352
// Link each LinkModule into our module.
344353
if (LinkInModules())
345354
return;
@@ -440,6 +449,9 @@ namespace clang {
440449
void OptimizationFailureHandler(
441450
const llvm::DiagnosticInfoOptimizationFailure &D);
442451
void DontCallDiagHandler(const DiagnosticInfoDontCall &D);
452+
/// Specialized handler for misexpect warnings.
453+
/// Note that misexpect remarks are emitted through ORE
454+
void MisExpectDiagHandler(const llvm::DiagnosticInfoMisExpect &D);
443455
};
444456

445457
void BackendConsumer::anchor() {}
@@ -821,6 +833,25 @@ void BackendConsumer::DontCallDiagHandler(const DiagnosticInfoDontCall &D) {
821833
<< llvm::demangle(D.getFunctionName().str()) << D.getNote();
822834
}
823835

836+
void BackendConsumer::MisExpectDiagHandler(
837+
const llvm::DiagnosticInfoMisExpect &D) {
838+
StringRef Filename;
839+
unsigned Line, Column;
840+
bool BadDebugInfo = false;
841+
FullSourceLoc Loc =
842+
getBestLocationFromDebugLoc(D, BadDebugInfo, Filename, Line, Column);
843+
844+
Diags.Report(Loc, diag::warn_profile_data_misexpect) << D.getMsg().str();
845+
846+
if (BadDebugInfo)
847+
// If we were not able to translate the file:line:col information
848+
// back to a SourceLocation, at least emit a note stating that
849+
// we could not translate this location. This can happen in the
850+
// case of #line directives.
851+
Diags.Report(Loc, diag::note_fe_backend_invalid_loc)
852+
<< Filename << Line << Column;
853+
}
854+
824855
/// This function is invoked when the backend needs
825856
/// to report something to the user.
826857
void BackendConsumer::DiagnosticHandlerImpl(const DiagnosticInfo &DI) {
@@ -895,6 +926,9 @@ void BackendConsumer::DiagnosticHandlerImpl(const DiagnosticInfo &DI) {
895926
case llvm::DK_DontCall:
896927
DontCallDiagHandler(cast<DiagnosticInfoDontCall>(DI));
897928
return;
929+
case llvm::DK_MisExpect:
930+
MisExpectDiagHandler(cast<DiagnosticInfoMisExpect>(DI));
931+
return;
898932
default:
899933
// Plugin IDs are not bound to any value as they are set dynamically.
900934
ComputeDiagRemarkID(Severity, backend_plugin, DiagID);

clang/lib/Frontend/CompilerInvocation.cpp

Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -107,6 +107,20 @@ using namespace driver;
107107
using namespace options;
108108
using namespace llvm::opt;
109109

110+
//===----------------------------------------------------------------------===//
111+
// Helpers.
112+
//===----------------------------------------------------------------------===//
113+
114+
// Parse misexpect tolerance argument value.
115+
// Valid option values are integers in the range [0, 100)
116+
inline Expected<Optional<uint64_t>> parseToleranceOption(StringRef Arg) {
117+
int64_t Val;
118+
if (Arg.getAsInteger(10, Val))
119+
return llvm::createStringError(llvm::inconvertibleErrorCode(),
120+
"Not an integer: %s", Arg.data());
121+
return Val;
122+
}
123+
110124
//===----------------------------------------------------------------------===//
111125
// Initialization.
112126
//===----------------------------------------------------------------------===//
@@ -1547,6 +1561,9 @@ void CompilerInvocation::GenerateCodeGenArgs(
15471561
: "auto",
15481562
SA);
15491563

1564+
GenerateArg(Args, OPT_fdiagnostics_misexpect_tolerance_EQ,
1565+
Twine(*Opts.DiagnosticsMisExpectTolerance), SA);
1566+
15501567
for (StringRef Sanitizer : serializeSanitizerKinds(Opts.SanitizeRecover))
15511568
GenerateArg(Args, OPT_fsanitize_recover_EQ, Sanitizer, SA);
15521569

@@ -1959,6 +1976,23 @@ bool CompilerInvocation::ParseCodeGenArgs(CodeGenOptions &Opts, ArgList &Args,
19591976
}
19601977
}
19611978

1979+
if (auto *arg =
1980+
Args.getLastArg(options::OPT_fdiagnostics_misexpect_tolerance_EQ)) {
1981+
auto ResultOrErr = parseToleranceOption(arg->getValue());
1982+
1983+
if (!ResultOrErr) {
1984+
Diags.Report(diag::err_drv_invalid_diagnotics_misexpect_tolerance)
1985+
<< "-fdiagnostics-misexpect-tolerance=";
1986+
} else {
1987+
Opts.DiagnosticsMisExpectTolerance = *ResultOrErr;
1988+
if ((!Opts.DiagnosticsMisExpectTolerance.hasValue() ||
1989+
Opts.DiagnosticsMisExpectTolerance.getValue() > 0) &&
1990+
!UsingProfile)
1991+
Diags.Report(diag::warn_drv_diagnostics_misexpect_requires_pgo)
1992+
<< "-fdiagnostics-misexpect-tolerance=";
1993+
}
1994+
}
1995+
19621996
// If the user requested to use a sample profile for PGO, then the
19631997
// backend will need to track source location information so the profile
19641998
// can be incorporated into the IR.
@@ -4432,6 +4466,13 @@ bool CompilerInvocation::CreateFromArgsImpl(
44324466
if (Res.getFrontendOpts().ProgramAction == frontend::RewriteObjC)
44334467
LangOpts.ObjCExceptions = 1;
44344468

4469+
for (auto Warning : Res.getDiagnosticOpts().Warnings) {
4470+
if (Warning == "misexpect" &&
4471+
!Diags.isIgnored(diag::warn_profile_data_misexpect, SourceLocation())) {
4472+
Res.getCodeGenOpts().MisExpect = true;
4473+
}
4474+
}
4475+
44354476
if (LangOpts.CUDA) {
44364477
// During CUDA device-side compilation, the aux triple is the
44374478
// triple used for host compilation.
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
bar
2+
# Func Hash:
3+
11262309464
4+
# Num Counters:
5+
2
6+
# Counter Values:
7+
200000
8+
2
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
bar
2+
# Func Hash:
3+
45795613684824
4+
# Num Counters:
5+
2
6+
# Counter Values:
7+
200000
8+
180000
9+
10+
fizz
11+
# Func Hash:
12+
45795613684824
13+
# Num Counters:
14+
2
15+
# Counter Values:
16+
200000
17+
18000
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
main
2+
# Func Hash:
3+
79676873694057560
4+
# Num Counters:
5+
5
6+
# Counter Values:
7+
1
8+
20
9+
20000
10+
20000
11+
20000
12+
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
main
2+
# Func Hash:
3+
8734802134600123338
4+
# Num Counters:
5+
9
6+
# Counter Values:
7+
1
8+
20000
9+
20000
10+
4066
11+
11889
12+
0
13+
0
14+
4045
15+
0
16+
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
main
2+
# Func Hash:
3+
3721743393642630379
4+
# Num Counters:
5+
10
6+
# Counter Values:
7+
1
8+
20
9+
20000
10+
20000
11+
1
12+
0
13+
0
14+
0
15+
19999
16+
0

0 commit comments

Comments
 (0)