Commit 2615131

Merge branch 'main' into branch-protection-pauthabi
2 parents a37bd92 + 769952d commit 2615131

1,585 files changed

+161731
-115604
lines changed

.github/CODEOWNERS

Lines changed: 7 additions & 7 deletions
@@ -67,11 +67,11 @@ clang/test/AST/Interp/ @tbaederr
 /mlir/include/mlir/Dialect/Linalg @dcaballe @nicolasvasilache @rengolin
 /mlir/lib/Dialect/Linalg @dcaballe @nicolasvasilache @rengolin
 /mlir/lib/Dialect/Linalg/Transforms/DecomposeLinalgOps.cpp @MaheshRavishankar @nicolasvasilache
-/mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp @MaheshRavishankar @nicolasvasilache
+/mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp @dcaballe @MaheshRavishankar @nicolasvasilache
 /mlir/lib/Dialect/Linalg/Transforms/ElementwiseOpFusion.cpp @MaheshRavishankar @nicolasvasilache
 /mlir/lib/Dialect/Linalg/Transforms/DataLayoutPropagation.cpp @hanhanW @nicolasvasilache
-/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp @hanhanW @nicolasvasilache
-/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp @hanhanW @nicolasvasilache
+/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp @dcaballe @hanhanW @nicolasvasilache
+/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp @banach-space @dcaballe @hanhanW @nicolasvasilache

 # MemRef Dialect in MLIR.
 /mlir/lib/Dialect/MemRef/Transforms/EmulateNarrowType.cpp @MaheshRavishankar @nicolasvasilache
@@ -85,10 +85,10 @@ clang/test/AST/Interp/ @tbaederr
 /mlir/**/*VectorToSCF* @banach-space @dcaballe @matthias-springer @nicolasvasilache
 /mlir/**/*VectorToLLVM* @banach-space @dcaballe @nicolasvasilache
 /mlir/**/*X86Vector* @aartbik @dcaballe @nicolasvasilache
-/mlir/include/mlir/Dialect/Vector @dcaballe @nicolasvasilache
-/mlir/lib/Dialect/Vector @dcaballe @nicolasvasilache
-/mlir/lib/Dialect/Vector/Transforms/* @hanhanW @nicolasvasilache
-/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp @MaheshRavishankar @nicolasvasilache
+/mlir/include/mlir/Dialect/Vector @banach-space @dcaballe @nicolasvasilache
+/mlir/lib/Dialect/Vector @banach-space @dcaballe @nicolasvasilache
+/mlir/lib/Dialect/Vector/Transforms/* @banach-space @dcaballe @hanhanW @nicolasvasilache
+/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp @banach-space @dcaballe @MaheshRavishankar @nicolasvasilache
 /mlir/**/*EmulateNarrowType* @dcaballe @hanhanW

 # Presburger library in MLIR

bolt/docs/CommandLineArgumentReference.md

Lines changed: 7 additions & 1 deletion
@@ -283,6 +283,12 @@

   List of functions to pad with amount of bytes

+- `--print-mappings`
+
+  Print mappings in the legend, between characters/blocks and text sections
+  (default false).
+
+
 - `--profile-format=<value>`

   Format to dump profile output in aggregation mode, default is fdata
@@ -1240,4 +1246,4 @@

 - `--print-options`

-  Print non-default options after command line parsing
+  Print non-default options after command line parsing

bolt/docs/HeatmapHeader.png

75 KB

bolt/docs/Heatmaps.md

Lines changed: 56 additions & 12 deletions
@@ -1,9 +1,9 @@
 # Code Heatmaps

 BOLT has gained the ability to print code heatmaps based on
-sampling-based LBR profiles generated by `perf`. The output is produced
-in colored ASCII to be displayed in a color-capable terminal. It looks
-something like this:
+sampling-based profiles generated by `perf`, either with `LBR` data or not.
+The output is produced in colored ASCII to be displayed in a color-capable
+terminal. It looks something like this:

 ![](./Heatmap.png)

@@ -32,20 +32,64 @@ $ llvm-bolt-heatmap -p perf.data <executable>
 ```

 By default the heatmap will be dumped to *stdout*. You can change it
-with `-o <heatmapfile>` option. Each character/block in the heatmap
-shows the execution data accumulated for corresponding 64 bytes of
-code. You can change this granularity with a `-block-size` option.
-E.g. set it to 4096 to see code usage grouped by 4K pages.
-Other useful options are:
+with `-o <heatmapfile>` option.

-```bash
--line-size=<uint> - number of entries per line (default 256)
--max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
-```

 If you prefer to look at the data in a browser (or would like to share
 it that way), then you can use an HTML conversion tool. E.g.:

 ```bash
 $ aha -b -f <heatmapfile> > <heatmapfile>.html
 ```
+
+---
+
+## Background on heatmaps:
+A heatmap is effectively a histogram that is rendered into a grid for better
+visualization.
+In theory we can generate a heatmap using any binary and a perf profile.
+
+Each block/character in the heatmap shows the execution data accumulated for
+corresponding 64 bytes of code. You can change this granularity with a
+`-block-size` option.
+E.g. set it to 4096 to see code usage grouped by 4K pages.
+
+
+When a block is shown as a dot, it means that no samples were found for that
+address.
+When it is shown as a letter, it indicates a captured sample on a particular
+text section of the binary.
+To show a mapping between letters and text sections in the legend, use
+`-print-mappings`.
+When a sampled address does not belong to any of the text sections, the
+characters 'o' or 'O' will be shown.
+
+The legend shows by default the ranges in the heatmap according to the number
+of samples per block.
+A color is assigned per range, except the first two ranges that distinguished by
+lower and upper case letters.
+
+On the Y axis, each row/line starts with an actual address of the binary.
+Consecutive lines in the heatmap advance by the same amount, with the binary
+size covered by a line dependent on the block size and the line size.
+An empty new line is inserted for larger gaps between samples.
+
+On the X axis, the horizontally emitted hex numbers can help *estimate* where
+in the line the samples lie, but they cannot be combined to provide a full
+address, as they are relative to both the bucket and line sizes.
+
+In the example below, the highlighted `0x100` column is not an offset to each
+row's address, but instead, it points to the middle of the line.
+For the generation, the default bucket size was used with a line size of 128.
+
+
+![](./HeatmapHeader.png)
+
+
+Some useful options are:
+
+```
+-line-size=<uint> - number of entries per line (default 256)
+-max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
+-print-mappings - print mappings in the legend, between characters/blocks and text sections (default false)
+```
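
The Heatmaps.md text says that the range covered by one heatmap row depends on the block size and the line size. A minimal sketch of that relationship follows, assuming a row simply holds "line size" blocks of "block size" bytes each; the documentation implies this but does not state it outright. `bytesPerLine` and `blockIndexInLine` are hypothetical helpers for illustration, not BOLT functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes of the binary covered by one heatmap row,
// assuming a row holds LineSize blocks of BlockSize bytes each.
constexpr uint64_t bytesPerLine(uint64_t BlockSize, uint64_t LineSize) {
  return BlockSize * LineSize;
}

// Hypothetical helper: index of the block (column) within its row that holds
// Address, given the address at which that row starts.
inline uint64_t blockIndexInLine(uint64_t Address, uint64_t RowStart,
                                 uint64_t BlockSize, uint64_t LineSize) {
  assert(BlockSize != 0 && "block size must be non-zero");
  assert(Address >= RowStart &&
         Address - RowStart < bytesPerLine(BlockSize, LineSize) &&
         "address must lie within the row");
  return (Address - RowStart) / BlockSize;
}
```

Under that assumption, the documented defaults (64-byte blocks, 256 entries per line) give rows of 16384 bytes, and `-block-size=4096` with the default line size gives rows of 1 MiB.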

bolt/include/bolt/Core/DebugData.h

Lines changed: 44 additions & 20 deletions
@@ -256,7 +256,7 @@ class DebugRangeListsSectionWriter : public DebugRangesSectionWriter {
   };
   virtual ~DebugRangeListsSectionWriter(){};

-  static void setAddressWriter(DebugAddrWriter *AddrW) { AddrWriter = AddrW; }
+  void setAddressWriter(DebugAddrWriter *AddrW) { AddrWriter = AddrW; }

   /// Add ranges with caching.
   uint64_t addRanges(
@@ -284,7 +284,7 @@ class DebugRangeListsSectionWriter : public DebugRangesSectionWriter {
   }

 private:
-  static DebugAddrWriter *AddrWriter;
+  DebugAddrWriter *AddrWriter = nullptr;
   /// Used to find unique CU ID.
   DWARFUnit *CU;
   /// Current relative offset of range list entry within this CUs rangelist
@@ -336,21 +336,36 @@ using AddressSectionBuffer = SmallVector<char, 4>;
 class DebugAddrWriter {
 public:
   DebugAddrWriter() = delete;
-  DebugAddrWriter(BinaryContext *BC_);
+  DebugAddrWriter(BinaryContext *BC_) : DebugAddrWriter(BC_, UCHAR_MAX) {};
+  DebugAddrWriter(BinaryContext *BC_, uint8_t AddressByteSize);
   virtual ~DebugAddrWriter(){};
   /// Given an address returns an index in .debug_addr.
   /// Adds Address to map.
   uint32_t getIndexFromAddress(uint64_t Address, DWARFUnit &CU);

   /// Write out entries in to .debug_addr section for CUs.
-  virtual void update(DIEBuilder &DIEBlder, DWARFUnit &CUs);
+  virtual std::optional<uint64_t> finalize(const size_t BufferSize);

   /// Return buffer with all the entries in .debug_addr already writen out using
   /// update(...).
-  virtual AddressSectionBuffer &finalize() { return *Buffer; }
+  virtual std::unique_ptr<AddressSectionBuffer> releaseBuffer() {
+    return std::move(Buffer);
+  }
+
+  /// Returns buffer size.
+  virtual size_t getBufferSize() const { return Buffer->size(); }
+
+  /// Returns True if Buffer is not empty.
+  bool isInitialized() const { return !Buffer->empty(); }

-  /// Returns False if .debug_addr section was created..
-  bool isInitialized() const { return !AddressMaps.empty(); }
+  /// Updates address base with the given Offset.
+  virtual void updateAddrBase(DIEBuilder &DIEBlder, DWARFUnit &CU,
+                              const uint64_t Offset);
+
+  /// Appends an AddressSectionBuffer to the address writer's buffer.
+  void appendToAddressBuffer(const AddressSectionBuffer &Buffer) {
+    *AddressStream << Buffer;
+  }

 protected:
   class AddressForDWOCU {
@@ -407,23 +422,32 @@ class DebugAddrWriter {
   }

   BinaryContext *BC;
-  /// Maps DWOID to AddressForDWOCU.
-  std::unordered_map<uint64_t, AddressForDWOCU> AddressMaps;
+  /// Address for the DWO CU associated with the address writer.
+  AddressForDWOCU Map;
+  uint8_t AddressByteSize;
   /// Mutex used for parallel processing of debug info.
   std::mutex WriterMutex;
   std::unique_ptr<AddressSectionBuffer> Buffer;
   std::unique_ptr<raw_svector_ostream> AddressStream;
   /// Used to track sections that were not modified so that they can be re-used.
-  DenseMap<uint64_t, uint64_t> UnmodifiedAddressOffsets;
+  static DenseMap<uint64_t, uint64_t> UnmodifiedAddressOffsets;
 };

 class DebugAddrWriterDwarf5 : public DebugAddrWriter {
 public:
   DebugAddrWriterDwarf5() = delete;
   DebugAddrWriterDwarf5(BinaryContext *BC) : DebugAddrWriter(BC) {}
+  DebugAddrWriterDwarf5(BinaryContext *BC, uint8_t AddressByteSize,
+                        std::optional<uint64_t> AddrOffsetSectionBase)
+      : DebugAddrWriter(BC, AddressByteSize),
+        AddrOffsetSectionBase(AddrOffsetSectionBase) {}

   /// Write out entries in to .debug_addr section for CUs.
-  virtual void update(DIEBuilder &DIEBlder, DWARFUnit &CUs) override;
+  virtual std::optional<uint64_t> finalize(const size_t BufferSize) override;
+
+  /// Updates address base with the given Offset.
+  virtual void updateAddrBase(DIEBuilder &DIEBlder, DWARFUnit &CU,
+                              const uint64_t Offset) override;

 protected:
   /// Given DWARFUnit \p Unit returns either DWO ID or it's offset within
@@ -435,6 +459,10 @@ class DebugAddrWriterDwarf5 : public DebugAddrWriter {
     }
     return Unit.getOffset();
   }
+
+private:
+  std::optional<uint64_t> AddrOffsetSectionBase = std::nullopt;
+  static constexpr uint32_t HeaderSize = 8;
 };

 /// This class is NOT thread safe.
@@ -583,12 +611,10 @@ class DebugLoclistWriter : public DebugLocWriter {
 public:
   ~DebugLoclistWriter() {}
   DebugLoclistWriter() = delete;
-  DebugLoclistWriter(DWARFUnit &Unit, uint8_t DV, bool SD)
-      : DebugLocWriter(DV, LocWriterKind::DebugLoclistWriter), CU(Unit),
-        IsSplitDwarf(SD) {
-    assert(DebugLoclistWriter::AddrWriter &&
-           "Please use SetAddressWriter to initialize "
-           "DebugAddrWriter before instantiation.");
+  DebugLoclistWriter(DWARFUnit &Unit, uint8_t DV, bool SD,
+                     DebugAddrWriter &AddrW)
+      : DebugLocWriter(DV, LocWriterKind::DebugLoclistWriter),
+        AddrWriter(AddrW), CU(Unit), IsSplitDwarf(SD) {
     if (DwarfVersion >= 5) {
       LocBodyBuffer = std::make_unique<DebugBufferVector>();
       LocBodyStream = std::make_unique<raw_svector_ostream>(*LocBodyBuffer);
@@ -600,8 +626,6 @@ class DebugLoclistWriter : public DebugLocWriter {
     }
   }

-  static void setAddressWriter(DebugAddrWriter *AddrW) { AddrWriter = AddrW; }
-
   /// Stores location lists internally to be written out during finalize phase.
   virtual void addList(DIEBuilder &DIEBldr, DIE &Die, DIEValue &AttrInfo,
                        DebugLocationsVector &LocList) override;
@@ -630,7 +654,7 @@ class DebugLoclistWriter : public DebugLocWriter {
   /// Writes out locations in to a local buffer and applies debug info patches.
   void finalizeDWARF5(DIEBuilder &DIEBldr, DIE &Die);

-  static DebugAddrWriter *AddrWriter;
+  DebugAddrWriter &AddrWriter;
   DWARFUnit &CU;
   bool IsSplitDwarf{false};
   // Used for DWARF5 to store location lists before being finalized.
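
The DebugData.h hunks replace the process-wide `static` address writer with one that is passed to each location-list writer through its constructor, and they hand the finished buffer to the caller by moving a `std::unique_ptr`. The following is a rough sketch of those two ideas using toy classes. `ToyAddrWriter`, `ToyLocWriter`, `ToyWritersByCU` and `ownsBuffer` are hypothetical names for illustration and are not part of BOLT. One consequence worth noting, and an inference from the diff rather than something the commit states: after a move-out such as `releaseBuffer()`, the writer's own pointer is null, so size queries are only safe before the release.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for an address writer; it only models buffer ownership.
class ToyAddrWriter {
public:
  ToyAddrWriter() : Buffer(std::make_unique<std::vector<char>>()) {}

  void append(char Byte) { Buffer->push_back(Byte); }

  // Only valid while the writer still owns its buffer.
  std::size_t getBufferSize() const {
    assert(Buffer && "buffer was already released");
    return Buffer->size();
  }

  // Hands ownership to the caller; the writer's Buffer is null afterwards.
  std::unique_ptr<std::vector<char>> releaseBuffer() {
    return std::move(Buffer);
  }

  bool ownsBuffer() const { return Buffer != nullptr; }

private:
  std::unique_ptr<std::vector<char>> Buffer;
};

// Toy stand-in for a location-list writer that receives its address writer
// through the constructor instead of a static setter.
class ToyLocWriter {
public:
  explicit ToyLocWriter(ToyAddrWriter &AddrW) : AddrWriter(AddrW) {}
  void emitAddress() { AddrWriter.append(0x2a); }

private:
  ToyAddrWriter &AddrWriter;
};

// One writer per compile unit, keyed by a CU identifier.
using ToyWritersByCU =
    std::unordered_map<uint64_t, std::unique_ptr<ToyAddrWriter>>;
```

Because each `ToyLocWriter` holds a reference to its own writer, two compile units can no longer write into each other's buffers by accident. That appears to be the motivation for the per-CU `AddressWritersByCU` map added in DWARFRewriter.h.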

bolt/include/bolt/Core/MCPlusBuilder.h

Lines changed: 8 additions & 3 deletions
@@ -2041,9 +2041,14 @@ class MCPlusBuilder {
     return InstructionListType();
   }

-  virtual InstructionListType createDummyReturnFunction(MCContext *Ctx) const {
-    llvm_unreachable("not implemented");
-    return InstructionListType();
+  /// Returns a function body that contains only a return instruction. An
+  /// example usage is a workaround for the '__bolt_fini_trampoline' of
+  // Instrumentation.
+  virtual InstructionListType
+  createReturnInstructionList(MCContext *Ctx) const {
+    InstructionListType Insts(1);
+    createReturn(Insts[0]);
+    return Insts;
   }

   /// This method takes an indirect call instruction and splits it up into an
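
The MCPlusBuilder.h hunk swaps an "unreachable" stub for a default implementation built on the existing `createReturn` hook, so a target that already implements `createReturn` gets the one-instruction list without extra code. Below is a minimal sketch of that pattern with toy types. `ToyInst`, `ToyBuilder` and `ToyX86Builder` are hypothetical, and making the hook pure virtual is an assumption of this sketch, not a statement about how `MCPlusBuilder::createReturn` is declared.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an machine instruction.
struct ToyInst {
  std::string Mnemonic;
};
using ToyInstList = std::vector<ToyInst>;

class ToyBuilder {
public:
  virtual ~ToyBuilder() = default;

  // Target hook: fill Inst with the target's return instruction.
  virtual void createReturn(ToyInst &Inst) const = 0;

  // Generic default: a body consisting of a single return instruction,
  // produced by delegating to the target hook.
  virtual ToyInstList createReturnInstructionList() const {
    ToyInstList Insts(1);
    createReturn(Insts[0]);
    assert(!Insts[0].Mnemonic.empty() && "target hook must fill the instruction");
    return Insts;
  }
};

// A target only has to provide the hook.
class ToyX86Builder : public ToyBuilder {
public:
  void createReturn(ToyInst &Inst) const override { Inst.Mnemonic = "ret"; }
};
```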

bolt/include/bolt/Rewrite/DWARFRewriter.h

Lines changed: 9 additions & 6 deletions
@@ -66,10 +66,6 @@ class DWARFRewriter {
   /// .debug_aranges DWARF section.
   std::unique_ptr<DebugARangesSectionWriter> ARangesSectionWriter;

-  /// Stores and serializes information that will be put into the
-  /// .debug_addr DWARF section.
-  std::unique_ptr<DebugAddrWriter> AddrWriter;
-
   /// Stores and serializes information that will be put in to the
   /// .debug_addr DWARF section.
   /// Does not do de-duplication.
@@ -93,6 +89,10 @@ class DWARFRewriter {
   std::unordered_map<uint64_t, std::unique_ptr<DebugRangesSectionWriter>>
       LegacyRangesWritersByCU;

+  /// Stores address writer for each CU.
+  std::unordered_map<uint64_t, std::unique_ptr<DebugAddrWriter>>
+      AddressWritersByCU;
+
   std::mutex LocListDebugInfoPatchesMutex;

   /// Dwo id specific its RangesBase.
@@ -115,6 +115,7 @@ class DWARFRewriter {
   void updateUnitDebugInfo(DWARFUnit &Unit, DIEBuilder &DIEBldr,
                            DebugLocWriter &DebugLocWriter,
                            DebugRangesSectionWriter &RangesSectionWriter,
+                           DebugAddrWriter &AddressWriter,
                            std::optional<uint64_t> RangesBase = std::nullopt);

   /// Patches the binary for an object's address ranges to be updated.
@@ -141,13 +142,15 @@ class DWARFRewriter {
   /// Process and write out CUs that are passsed in.
   void finalizeCompileUnits(DIEBuilder &DIEBlder, DIEStreamer &Streamer,
                             CUOffsetMap &CUMap,
-                            const std::list<DWARFUnit *> &CUs);
+                            const std::list<DWARFUnit *> &CUs,
+                            DebugAddrWriter &FinalAddrWriter);

   /// Finalize debug sections in the main binary.
   void finalizeDebugSections(DIEBuilder &DIEBlder,
                              DWARF5AcceleratorTable &DebugNamesTable,
                              DIEStreamer &Streamer, raw_svector_ostream &ObjOS,
-                             CUOffsetMap &CUMap);
+                             CUOffsetMap &CUMap,
+                             DebugAddrWriter &FinalAddrWriter);

   /// Patches the binary for DWARF address ranges (e.g. in functions and lexical
   /// blocks) to be updated.

bolt/include/bolt/Utils/CommandLineOpts.h

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ extern llvm::cl::opt<unsigned> ExecutionCountThreshold;
 extern llvm::cl::opt<unsigned> HeatmapBlock;
 extern llvm::cl::opt<unsigned long long> HeatmapMaxAddress;
 extern llvm::cl::opt<unsigned long long> HeatmapMinAddress;
+extern llvm::cl::opt<bool> HeatmapPrintMappings;
 extern llvm::cl::opt<bool> HotData;
 extern llvm::cl::opt<bool> HotFunctionsAtEnd;
 extern llvm::cl::opt<bool> HotText;

bolt/lib/Core/DIEBuilder.cpp

Lines changed: 18 additions & 2 deletions
@@ -556,7 +556,17 @@ DWARFDie DIEBuilder::resolveDIEReference(
     const DWARFAbbreviationDeclaration::AttributeSpec AttrSpec,
     DWARFUnit *&RefCU, DWARFDebugInfoEntry &DwarfDebugInfoEntry) {
   assert(RefValue.isFormClass(DWARFFormValue::FC_Reference));
-  uint64_t RefOffset = *RefValue.getAsReference();
+  uint64_t RefOffset;
+  if (std::optional<uint64_t> Off = RefValue.getAsRelativeReference()) {
+    RefOffset = RefValue.getUnit()->getOffset() + *Off;
+  } else if (Off = RefValue.getAsDebugInfoReference(); Off) {
+    RefOffset = *Off;
+  } else {
+    BC.errs()
+        << "BOLT-WARNING: [internal-dwarf-error]: unsupported reference type: "
+        << FormEncodingString(RefValue.getForm()) << ".\n";
+    return DWARFDie();
+  }
   return resolveDIEReference(AttrSpec, RefOffset, RefCU, DwarfDebugInfoEntry);
 }

@@ -607,7 +617,13 @@ void DIEBuilder::cloneDieReferenceAttribute(
     DIE &Die, const DWARFUnit &U, const DWARFDie &InputDIE,
     const DWARFAbbreviationDeclaration::AttributeSpec AttrSpec,
     const DWARFFormValue &Val) {
-  const uint64_t Ref = *Val.getAsReference();
+  uint64_t Ref;
+  if (std::optional<uint64_t> Off = Val.getAsRelativeReference())
+    Ref = Val.getUnit()->getOffset() + *Off;
+  else if (Off = Val.getAsDebugInfoReference(); Off)
+    Ref = *Off;
+  else
+    return;

   DIE *NewRefDie = nullptr;
   DWARFUnit *RefUnit = nullptr;
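
Both DIEBuilder.cpp hunks follow the same rule: a unit-relative reference is turned into an absolute offset by adding the owning unit's offset, an absolute `.debug_info` reference is used as is, and any other form is rejected. A self-contained sketch of that rule, outside of LLVM, might look like this. `ToyRef` and `resolveOffset` are hypothetical stand-ins for `DWARFFormValue` and the inlined logic above, and the code needs C++17 for the `if` statement with an initializer.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy stand-in for a DWARF reference form value. At most one of the two
// optionals is expected to be set; both are empty for unsupported forms.
struct ToyRef {
  uint64_t UnitOffset;                  // offset of the owning unit
  std::optional<uint64_t> RelativeRef;  // offset relative to the unit
  std::optional<uint64_t> DebugInfoRef; // absolute offset in .debug_info
};

// Returns the absolute offset, or std::nullopt for an unsupported form.
inline std::optional<uint64_t> resolveOffset(const ToyRef &Ref) {
  if (std::optional<uint64_t> Off = Ref.RelativeRef)
    return Ref.UnitOffset + *Off; // unit-relative: rebase onto the unit
  else if (Off = Ref.DebugInfoRef; Off)
    return *Off; // already absolute
  return std::nullopt; // caller decides how to report the unsupported form
}
```

Returning `std::nullopt` here plays the role of the early `return DWARFDie();` and `return;` exits in the diff; the variable declared in the first condition stays in scope for the `else if`, which is what lets the second branch reuse `Off`.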
