Skip to content

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Jun 27, 2024

https://reviews.llvm.org/D70157 (for Intel Jump Conditional Code
Erratum) introduced two virtual function calls in the generic
MCObjectStreamer::emitInstruction, which added some overhead.

This patch removes the virtual function overhead:

  • Define llvm::X86_MC::emitInstruction that calls emitInstruction{Begin,End}.
  • Define {X86ELFStreamer,X86WinCOFFStreamer}::emitInstruction to call llvm::X86_MC::emitInstruction

Created using spr 1.3.5-bogner
@MaskRay MaskRay requested a review from KanRobert June 27, 2024 00:32
@llvmbot llvmbot added backend:X86 llvm:mc Machine (object) code labels Jun 27, 2024
@MaskRay MaskRay requested review from aengelke and preames June 27, 2024 00:32
@llvmbot
Copy link
Member

llvmbot commented Jun 27, 2024

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-x86

Author: Fangrui Song (MaskRay)

Changes

https://reviews.llvm.org/D70157 (for Intel Jump Conditional Code
Erratum) introduced two virtual function calls in the generic
MCObjectStreamer::emitInstruction, which added some overhead.

Define X86ELFStreamer and move the emitInstruction{Begin,End} hooks to
X86AsmBackend. For now, keep X86AsmBackend and X86ELFStreamer and in the
same file to avoid virtual function calls.


Full diff: https://github.com/llvm/llvm-project/pull/96835.diff

5 Files Affected:

  • (modified) llvm/include/llvm/MC/MCAsmBackend.h (-7)
  • (modified) llvm/lib/MC/MCObjectStreamer.cpp (-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (+33-3)
  • (added) llvm/lib/Target/X86/MCTargetDesc/X86ELFStreamer.h (+22)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp (+2)
diff --git a/llvm/include/llvm/MC/MCAsmBackend.h b/llvm/include/llvm/MC/MCAsmBackend.h
index 01a64fb425a94..1f36b7e98274f 100644
--- a/llvm/include/llvm/MC/MCAsmBackend.h
+++ b/llvm/include/llvm/MC/MCAsmBackend.h
@@ -62,13 +62,6 @@ class MCAsmBackend {
   /// tricky way for optimization.
   virtual bool allowEnhancedRelaxation() const { return false; }
 
-  /// Give the target a chance to manipulate state related to instruction
-  /// alignment (e.g. padding for optimization), instruction relaxablility, etc.
-  /// before and after actually emitting the instruction.
-  virtual void emitInstructionBegin(MCObjectStreamer &OS, const MCInst &Inst,
-                                    const MCSubtargetInfo &STI) {}
-  virtual void emitInstructionEnd(MCObjectStreamer &OS, const MCInst &Inst) {}
-
   /// lifetime management
   virtual void reset() {}
 
diff --git a/llvm/lib/MC/MCObjectStreamer.cpp b/llvm/lib/MC/MCObjectStreamer.cpp
index 5dc73c5b7887a..fec1ccee6ff84 100644
--- a/llvm/lib/MC/MCObjectStreamer.cpp
+++ b/llvm/lib/MC/MCObjectStreamer.cpp
@@ -330,9 +330,7 @@ void MCObjectStreamer::emitInstruction(const MCInst &Inst,
                                                 "' cannot have instructions");
     return;
   }
-  getAssembler().getBackend().emitInstructionBegin(*this, Inst, STI);
   emitInstructionImpl(Inst, STI);
-  getAssembler().getBackend().emitInstructionEnd(*this, Inst);
 }
 
 void MCObjectStreamer::emitInstructionImpl(const MCInst &Inst,
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
index c2188d206b5f6..fa3674830b728 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
@@ -7,8 +7,9 @@
 //===----------------------------------------------------------------------===//
 
 #include "MCTargetDesc/X86BaseInfo.h"
-#include "MCTargetDesc/X86FixupKinds.h"
+#include "MCTargetDesc/X86ELFStreamer.h"
 #include "MCTargetDesc/X86EncodingOptimization.h"
+#include "MCTargetDesc/X86FixupKinds.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/BinaryFormat/ELF.h"
 #include "llvm/BinaryFormat/MachO.h"
@@ -162,8 +163,8 @@ class X86AsmBackend : public MCAsmBackend {
   bool allowAutoPadding() const override;
   bool allowEnhancedRelaxation() const override;
   void emitInstructionBegin(MCObjectStreamer &OS, const MCInst &Inst,
-                            const MCSubtargetInfo &STI) override;
-  void emitInstructionEnd(MCObjectStreamer &OS, const MCInst &Inst) override;
+                            const MCSubtargetInfo &STI);
+  void emitInstructionEnd(MCObjectStreamer &OS, const MCInst &Inst);
 
   unsigned getNumFixupKinds() const override {
     return X86::NumTargetFixupKinds;
@@ -1546,3 +1547,32 @@ MCAsmBackend *llvm::createX86_64AsmBackend(const Target &T,
     return new ELFX86_X32AsmBackend(T, OSABI, STI);
   return new ELFX86_64AsmBackend(T, OSABI, STI);
 }
+
+namespace {
+class X86ELFStreamer : public MCELFStreamer {
+public:
+  X86ELFStreamer(MCContext &Context, std::unique_ptr<MCAsmBackend> TAB,
+                 std::unique_ptr<MCObjectWriter> OW,
+                 std::unique_ptr<MCCodeEmitter> Emitter)
+      : MCELFStreamer(Context, std::move(TAB), std::move(OW),
+                      std::move(Emitter)) {}
+
+  void emitInstruction(const MCInst &Inst, const MCSubtargetInfo &STI) override;
+};
+} // end anonymous namespace
+
+void X86ELFStreamer::emitInstruction(const MCInst &Inst,
+                                     const MCSubtargetInfo &STI) {
+  auto &Backend = static_cast<X86AsmBackend &>(getAssembler().getBackend());
+  Backend.emitInstructionBegin(*this, Inst, STI);
+  MCObjectStreamer::emitInstruction(Inst, STI);
+  Backend.emitInstructionEnd(*this, Inst);
+}
+
+MCStreamer *llvm::createX86ELFStreamer(const Triple &T, MCContext &Context,
+                                       std::unique_ptr<MCAsmBackend> &&MAB,
+                                       std::unique_ptr<MCObjectWriter> &&MOW,
+                                       std::unique_ptr<MCCodeEmitter> &&MCE) {
+  return new X86ELFStreamer(Context, std::move(MAB), std::move(MOW),
+                            std::move(MCE));
+}
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ELFStreamer.h b/llvm/lib/Target/X86/MCTargetDesc/X86ELFStreamer.h
new file mode 100644
index 0000000000000..e57c45d97b37a
--- /dev/null
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ELFStreamer.h
@@ -0,0 +1,22 @@
+//===-- X86ELFStreamer.h - ELF Streamer for X86 -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIB_TARGET_X86_MCTARGETDESC_X86ELFSTREAMER_H
+#define LLVM_LIB_TARGET_X86_MCTARGETDESC_X86ELFSTREAMER_H
+
+#include "llvm/MC/MCELFStreamer.h"
+
+namespace llvm {
+
+MCStreamer *createX86ELFStreamer(const Triple &T, MCContext &Context,
+                                 std::unique_ptr<MCAsmBackend> &&MAB,
+                                 std::unique_ptr<MCObjectWriter> &&MOW,
+                                 std::unique_ptr<MCCodeEmitter> &&MCE);
+}
+
+#endif
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
index ed4d0a45bd8f2..d862e7427489c 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
@@ -14,6 +14,7 @@
 #include "TargetInfo/X86TargetInfo.h"
 #include "X86ATTInstPrinter.h"
 #include "X86BaseInfo.h"
+#include "X86ELFStreamer.h"
 #include "X86IntelInstPrinter.h"
 #include "X86MCAsmInfo.h"
 #include "X86TargetStreamer.h"
@@ -741,6 +742,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeX86TargetMC() {
     // Register the null streamer.
     TargetRegistry::RegisterNullTargetStreamer(*T, createX86NullTargetStreamer);
 
+    TargetRegistry::RegisterELFStreamer(*T, createX86ELFStreamer);
     TargetRegistry::RegisterCOFFStreamer(*T, createX86WinCOFFStreamer);
 
     // Register the MCInstPrinter.

namespace {
class X86ELFStreamer : public MCELFStreamer {
public:
X86ELFStreamer(MCContext &Context, std::unique_ptr<MCAsmBackend> TAB,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about COFF streamer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFStreamer.cpp, which doesn't support Intel JCC Erratum.

There is some way to split X86AsmBackend.cpp into X86ELFStreamer.cpp, but that would be a quite large refactoring and otherwise do not yield a noticeable gain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, Intel JC Erratum supports COFF now

llvm-mc -filetype=obj -triple  x86_64-pc-win32 --x86-align-branch-boundary=32 --x86-align-branch=call+jmp+indirect+ret+jcc ./llvm/test/MC/X86/align-branch-single.s | llvm-objdump -d --no-show-raw-insn -

though I didn't add test for it. Would this patch drop the support for COFF?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added X86_MC::emitInstruction to retain the support for COFF.

@MaskRay
Copy link
Member Author

MaskRay commented Jun 27, 2024

https://llvm-compile-time-tracker.com/compare.php?from=4558e45e7e33d1cfc1a54af761085e358dbab64b&to=1a702c28cba578bbc594d06e40dce096500fe08d&stat=instructions:u

stage2-O3:

Benchmark Old New
kimwitu++ 38987M 38942M (-0.12%)
sqlite3 35149M 35134M (-0.04%)
consumer-typeset 31961M 31951M (-0.03%)
Bullet 93293M 93203M (-0.10%)
tramp3d-v4 78572M 78510M (-0.08%)
mafft 32958M 32946M (-0.03%)
ClamAV 50424M 50404M (-0.04%)
lencod 61201M 61193M (-0.01%)
SPASS 43001M 42968M (-0.08%)
7zip 191976M 191913M (-0.03%)
geomean 55343M 55312M (-0.06%)

Created using spr 1.3.5-bogner
Copy link
Contributor

@KanRobert KanRobert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, could you add x86_64-pc-win32 for ./llvm/test/MC/X86/align-branch-single.s

@MaskRay
Copy link
Member Author

MaskRay commented Jun 27, 2024

./llvm/test/MC/X86/align-branch-single.s

Thanks for the review! Added the test in a separate commit before this PR is merged.

MaskRay added a commit that referenced this pull request Jun 27, 2024
Increase test coverage exposed by #96835.
@MaskRay MaskRay merged commit aa3589f into main Jun 27, 2024
@MaskRay MaskRay deleted the users/MaskRay/spr/mcx86-emitinstruction-remove-virtual-function-calls-due-to-intel-jcc-erratum branch June 27, 2024 16:43
cpiaseque pushed a commit to cpiaseque/llvm-project that referenced this pull request Jul 3, 2024
cpiaseque pushed a commit to cpiaseque/llvm-project that referenced this pull request Jul 3, 2024
…JCC Erratum

https://reviews.llvm.org/D70157 (for Intel Jump Conditional Code
Erratum) introduced two virtual function calls in the generic
MCObjectStreamer::emitInstruction, which added some overhead.

This patch removes the virtual function overhead:

* Define `llvm::X86_MC::emitInstruction` that calls `emitInstruction{Begin,End}`.
* Define {X86ELFStreamer,X86WinCOFFStreamer}::emitInstruction to call `llvm::X86_MC::emitInstruction`

Pull Request: llvm#96835
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
…JCC Erratum

https://reviews.llvm.org/D70157 (for Intel Jump Conditional Code
Erratum) introduced two virtual function calls in the generic
MCObjectStreamer::emitInstruction, which added some overhead.

This patch removes the virtual function overhead:

* Define `llvm::X86_MC::emitInstruction` that calls `emitInstruction{Begin,End}`.
* Define {X86ELFStreamer,X86WinCOFFStreamer}::emitInstruction to call `llvm::X86_MC::emitInstruction`

Pull Request: llvm#96835
@MaskRay
Copy link
Member Author

MaskRay commented Jul 20, 2025

The overhead introduced by these fragment-related features are much higher than I expected.

https://llvm-compile-time-tracker.com/compare.php?from=ff0cbecb68bd28f6131894fbb037e063e8da6bab&to=d77ac81e93e5e2df5275b687b53049d9acfe1357&stat=instructions:u claims another 0.10% saving from stage2-O0-g: instructions:u.

@MaskRay
Copy link
Member Author

MaskRay commented Jul 20, 2025

To further optimize the X86MCCodeEmitter::encodeInstruction (based on the changes in https://reviews.llvm.org/D145792 , where SmallVectorImpl replaced raw_ostream), we can leverage the fixed maximum instruction+prefix length and simplify buffer management. Assume the caller provides a sufficiently large buffer (trailing data of MCFragment, using a special bump allocator) and only begin and size are needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 llvm:mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants