[PowerPC] Support local-dynamic TLS relocation on AIX #66316

Merged
merged 31 commits into from
Mar 1, 2024
Commits
0e519e0
[PowerPC] Support local-dynamic TLS relocation on AIX
Sep 14, 2023
685619e
address comments
Sep 14, 2023
6ff54ca
[NFC] address comments.
Sep 18, 2023
dc6fbec
[NFC] address comments
Sep 20, 2023
d979c44
[NFC] address comment
Sep 21, 2023
09cf488
Attempt to address comment: use r4 for LoadOffsetToc
Sep 22, 2023
a6085eb
Remove TLS local-dynamic mode guards for AIX.
Sep 22, 2023
b5f60f7
Fixed issues raised by comments and incorporated suggested changes.
Sep 24, 2023
fd68b57
(1) Use GPR3 directly in LoadModuleHandle.
Sep 25, 2023
9755a4a
Fix obj mode var access.
Sep 25, 2023
92732a5
[NFC] Incorporate comments.
Sep 27, 2023
fe45c89
[NFC] Add FIXME to highlight existing issue:
Sep 28, 2023
1d08e72
Address the following FIXME:
Oct 26, 2023
5c42342
Address comment: use == to compare StringRef names
Oct 26, 2023
30b26ae
Correct SMC setting for the _$TLSML symbol
Oct 27, 2023
62ce4ad
Simplify and add check
Oct 27, 2023
955fe59
misc
Oct 27, 2023
4f51e52
Simplify logic by move the XMC_TC setting for the _$TLSML symbol into…
Oct 30, 2023
8f82459
Add check to make sure GV's name is defined before access
Oct 30, 2023
c34895f
update
Oct 31, 2023
51ec134
Apply target flag change && update according to comments
Dec 7, 2023
10822d3
Address comment: Add flag check before string compare.
Dec 11, 2023
0119567
Update comment.
Dec 12, 2023
1f8edff
[NFC] Add comments
Jan 2, 2024
9fbb330
Remove dummy logic:
Jan 17, 2024
f75ff08
[NFC] Update comments.
Feb 7, 2024
595ef30
rebase test case
Feb 20, 2024
17ead44
[NFC] Update comments.
Feb 20, 2024
e400223
[NFC] Update comment.
Feb 20, 2024
3dde56f
[NFC] update comments
Feb 27, 2024
1a95e93
[NFC] update comments
Feb 29, 2024
1 change: 0 additions & 1 deletion clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -693,7 +693,6 @@ def err_drv_cannot_mix_options : Error<"cannot specify '%1' along with '%0'">;
def err_drv_invalid_object_mode : Error<
"OBJECT_MODE setting %0 is not recognized and is not a valid setting">;

def err_aix_unsupported_tls_model : Error<"TLS model '%0' is not yet supported on AIX">;
def err_roptr_requires_data_sections: Error<"-mxcoff-roptr is supported only with -fdata-sections">;
def err_roptr_cannot_build_shared: Error<"-mxcoff-roptr is not supported with -shared">;

8 changes: 0 additions & 8 deletions clang/lib/Frontend/CompilerInvocation.cpp
@@ -1975,14 +1975,6 @@ bool CompilerInvocation::ParseCodeGenArgs(CodeGenOptions &Opts, ArgList &Args,
Opts.LinkBitcodeFiles.push_back(F);
}

if (Arg *A = Args.getLastArg(OPT_ftlsmodel_EQ)) {
if (T.isOSAIX()) {
StringRef Name = A->getValue();
if (Name == "local-dynamic")
Diags.Report(diag::err_aix_unsupported_tls_model) << Name;
}
}

if (Arg *A = Args.getLastArg(OPT_fdenormal_fp_math_EQ)) {
StringRef Val = A->getValue();
Opts.FPDenormalMode = llvm::parseDenormalFPAttribute(Val);
6 changes: 0 additions & 6 deletions clang/lib/Sema/SemaDeclAttr.cpp
@@ -2053,12 +2053,6 @@ static void handleTLSModelAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
return;
}

if (S.Context.getTargetInfo().getTriple().isOSAIX() &&
Model == "local-dynamic") {
S.Diag(LiteralLoc, diag::err_aix_attr_unsupported_tls_model) << Model;
return;
}

D->addAttr(::new (S.Context) TLSModelAttr(S.Context, AL, Model));
}

9 changes: 6 additions & 3 deletions clang/test/CodeGen/PowerPC/aix-tls-model.cpp
@@ -1,11 +1,11 @@
// RUN: %clang_cc1 %s -triple powerpc-unknown-aix -target-cpu pwr8 -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-GD
// RUN: %clang_cc1 %s -triple powerpc-unknown-aix -target-cpu pwr8 -ftls-model=global-dynamic -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-GD
// RUN: not %clang_cc1 %s -triple powerpc-unknown-aix -target-cpu pwr8 -ftls-model=local-dynamic -emit-llvm 2>&1 | FileCheck %s -check-prefix=CHECK-LD-ERROR
// RUN: %clang_cc1 %s -triple powerpc-unknown-aix -target-cpu pwr8 -ftls-model=local-dynamic -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-LD
// RUN: %clang_cc1 %s -triple powerpc-unknown-aix -target-cpu pwr8 -ftls-model=initial-exec -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-IE
// RUN: %clang_cc1 %s -triple powerpc-unknown-aix -target-cpu pwr8 -ftls-model=local-exec -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-LE
// RUN: %clang_cc1 %s -triple powerpc64-unknown-aix -target-cpu pwr8 -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-GD
// RUN: %clang_cc1 %s -triple powerpc64-unknown-aix -target-cpu pwr8 -ftls-model=global-dynamic -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-GD
// RUN: not %clang_cc1 %s -triple powerpc64-unknown-aix -target-cpu pwr8 -ftls-model=local-dynamic -emit-llvm 2>&1 | FileCheck %s -check-prefix=CHECK-LD-ERROR
// RUN: %clang_cc1 %s -triple powerpc64-unknown-aix -target-cpu pwr8 -ftls-model=local-dynamic -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-LD
// RUN: %clang_cc1 %s -triple powerpc64-unknown-aix -target-cpu pwr8 -ftls-model=initial-exec -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-IE
// RUN: %clang_cc1 %s -triple powerpc64-unknown-aix -target-cpu pwr8 -ftls-model=local-exec -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-LE

@@ -21,7 +21,10 @@ int f() {
// CHECK-GD: @z2 ={{.*}} global i32 0
// CHECK-GD: @x ={{.*}} thread_local global i32 0
// CHECK-GD: @_ZZ1fvE1y = internal thread_local global i32 0
// CHECK-LD-ERROR: error: TLS model 'local-dynamic' is not yet supported on AIX
// CHECK-LD: @z1 ={{.*}} global i32 0
// CHECK-LD: @z2 ={{.*}} global i32 0
// CHECK-LD: @x ={{.*}} thread_local(localdynamic) global i32 0
// CHECK-LD: @_ZZ1fvE1y = internal thread_local(localdynamic) global i32 0
// CHECK-IE: @z1 ={{.*}} global i32 0
// CHECK-IE: @z2 ={{.*}} global i32 0
// CHECK-IE: @x ={{.*}} thread_local(initialexec) global i32 0
2 changes: 1 addition & 1 deletion clang/test/Sema/aix-attr-tls_model.c
@@ -6,6 +6,6 @@
#endif

static __thread int y __attribute((tls_model("global-dynamic"))); // no-warning
static __thread int y __attribute((tls_model("local-dynamic"))); // expected-error {{TLS model 'local-dynamic' is not yet supported on AIX}}
static __thread int y __attribute((tls_model("local-dynamic"))); // expected-no-diagnostics
static __thread int y __attribute((tls_model("initial-exec"))); // no-warning
static __thread int y __attribute((tls_model("local-exec"))); // no-warning
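With the two front-end guards removed, a per-variable `tls_model("local-dynamic")` attribute is accepted on AIX in the same way as the `-ftls-model=local-dynamic` option exercised by the RUN lines above. The following is a minimal sketch of source that requests the model per variable. The variable and function names are hypothetical and do not come from the test files, and the attribute spelling is the GNU-style one that Clang and GCC accept.

```cpp
#include <cassert>

// Hypothetical variables for illustration; they are not the ones in the tests.
// The attribute requests the local-dynamic access model for this variable
// instead of whatever default the -ftls-model option selects.
__attribute__((tls_model("local-dynamic"))) thread_local int counter = 0;

// A file-local TLS variable that requests the same model.
__attribute__((tls_model("local-dynamic"))) static thread_local int scratch = 0;

// Resets the calling thread's counter, then increments it `times` times.
int resetAndBump(int times) {
  counter = 0;
  for (int i = 0; i < times; ++i) {
    scratch = counter + 1; // both accesses use the calling thread's copies
    counter = scratch;
  }
  return counter;
}

// Usage: the calling thread only ever observes its own copies.
void usageExample() {
  assert(resetAndBump(3) == 3);
  assert(resetAndBump(0) == 0);
}
```

Compiling a file like this with `-emit-llvm` should, assuming the behavior checked by the CHECK-LD lines, show the variables as `thread_local(localdynamic)` globals.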
2 changes: 2 additions & 0 deletions llvm/include/llvm/MC/MCExpr.h
@@ -307,6 +307,8 @@ class MCSymbolRefExpr : public MCExpr {
VK_PPC_AIX_TLSGDM, // symbol@m
VK_PPC_AIX_TLSIE, // symbol@ie
VK_PPC_AIX_TLSLE, // symbol@le
VK_PPC_AIX_TLSLD, // symbol@ld
VK_PPC_AIX_TLSML, // symbol@ml
VK_PPC_GOT_TLSLD, // symbol@got@tlsld
VK_PPC_GOT_TLSLD_LO, // symbol@got@tlsld@l
VK_PPC_GOT_TLSLD_HI, // symbol@got@tlsld@h
23 changes: 18 additions & 5 deletions llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -2418,6 +2418,15 @@ MCSection *TargetLoweringObjectFileXCOFF::getSectionForExternalReference(
SmallString<128> Name;
getNameWithPrefix(Name, GO, TM);

// AIX TLS local-dynamic does not need the external reference for the
// "_$TLSML" symbol.
if (GO->getThreadLocalMode() == GlobalVariable::LocalDynamicTLSModel &&
GO->hasName() && GO->getName() == "_$TLSML") {
return getContext().getXCOFFSection(
Name, SectionKind::getData(),
XCOFF::CsectProperties(XCOFF::XMC_TC, XCOFF::XTY_SD));
}

XCOFF::StorageMappingClass SMC =
isa<Function>(GO) ? XCOFF::XMC_DS : XCOFF::XMC_UA;
if (GO->isThreadLocal())
@@ -2675,13 +2684,17 @@ MCSection *TargetLoweringObjectFileXCOFF::getSectionForTOCEntry(
// the chance of needing -bbigtoc is decreased. Also, the toc-entry for
// EH info is never referenced directly using instructions so it can be
// allocated with TE storage-mapping class.
// The "_$TLSML" symbol for TLS local-dynamic mode requires XMC_TC, otherwise
// the AIX assembler will complain.
return getContext().getXCOFFSection(
cast<MCSymbolXCOFF>(Sym)->getSymbolTableName(), SectionKind::getData(),
XCOFF::CsectProperties((TM.getCodeModel() == CodeModel::Large ||
cast<MCSymbolXCOFF>(Sym)->isEHInfo())
? XCOFF::XMC_TE
: XCOFF::XMC_TC,
XCOFF::XTY_SD));
XCOFF::CsectProperties(
((TM.getCodeModel() == CodeModel::Large &&
cast<MCSymbolXCOFF>(Sym)->getSymbolTableName() != "_$TLSML") ||
cast<MCSymbolXCOFF>(Sym)->isEHInfo())
? XCOFF::XMC_TE
: XCOFF::XMC_TC,
XCOFF::XTY_SD));
}

MCSection *TargetLoweringObjectFileXCOFF::getSectionForLSDA(
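The storage-mapping-class choice in the `getSectionForTOCEntry` hunk above reduces to a small predicate. The sketch below restates it outside of LLVM so the `_$TLSML` exception is easier to see. `TocClass` and `chooseTocClass` are hypothetical names standing in for `XCOFF::XMC_TE`/`XCOFF::XMC_TC` and the inline conditional; only the boolean logic is taken from the diff.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for XCOFF::XMC_TE and XCOFF::XMC_TC.
enum class TocClass { TE, TC };

// Mirrors the conditional in the hunk: the large code model and EH info
// entries select TE, except that the "_$TLSML" module-handle symbol is not
// moved to TE by the large code model.
TocClass chooseTocClass(bool largeCodeModel, const std::string &symbolTableName,
                        bool isEHInfo) {
  const bool isModuleHandle = symbolTableName == "_$TLSML";
  if ((largeCodeModel && !isModuleHandle) || isEHInfo)
    return TocClass::TE;
  return TocClass::TC;
}

// Usage: only the module handle keeps TC under the large code model.
void usageExample() {
  assert(chooseTocClass(true, "x", false) == TocClass::TE);
  assert(chooseTocClass(true, "_$TLSML", false) == TocClass::TC);
}
```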
4 changes: 4 additions & 0 deletions llvm/lib/MC/MCExpr.cpp
@@ -338,6 +338,10 @@ StringRef MCSymbolRefExpr::getVariantKindName(VariantKind Kind) {
return "ie";
case VK_PPC_AIX_TLSLE:
return "le";
case VK_PPC_AIX_TLSLD:
return "ld";
case VK_PPC_AIX_TLSML:
return "ml";
case VK_PPC_GOT_TLSLD: return "got@tlsld";
case VK_PPC_GOT_TLSLD_LO: return "got@tlsld@l";
case VK_PPC_GOT_TLSLD_HI: return "got@tlsld@h";
3 changes: 2 additions & 1 deletion llvm/lib/MC/XCOFFObjectWriter.cpp
@@ -715,7 +715,8 @@ void XCOFFObjectWriter::recordRelocation(MCAssembler &Asm,
if (Type == XCOFF::RelocationType::R_POS ||
Type == XCOFF::RelocationType::R_TLS ||
Type == XCOFF::RelocationType::R_TLS_LE ||
Type == XCOFF::RelocationType::R_TLS_IE)
Type == XCOFF::RelocationType::R_TLS_IE ||
Type == XCOFF::RelocationType::R_TLS_LD)
// The FixedValue should be symbol's virtual address in this object file
// plus any constant value that we might get.
FixedValue = getVirtualAddress(SymA, SymASec) + Target.getConstant();
13 changes: 10 additions & 3 deletions llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp
@@ -231,12 +231,19 @@ class PPCTargetAsmStreamer : public PPCTargetStreamer {
MCSymbolXCOFF *TCSym =
cast<MCSectionXCOFF>(Streamer.getCurrentSectionOnly())
->getQualNameSymbol();
// On AIX, we have a region handle (symbol@m) and the variable offset
// (symbol@{gd|ie|le}) for TLS variables, depending on the TLS model.
// On AIX, we have TLS variable offsets (symbol@({gd|ie|le|ld}) depending
// on the TLS access method (or model). For the general-dynamic access
// method, we also have region handle (symbol@m) for each variable. For
// local-dynamic, there is a module handle (_$TLSML[TC]@ml) for all
// variables. Finally for local-exec and initial-exec, we have a thread
// pointer, in r13 for 64-bit mode and returned by .__get_tpointer for
// 32-bit mode.
if (Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSGD ||
Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSGDM ||
Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSIE ||
Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSLE)
Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSLE ||
Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSLD ||
Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSML)
OS << "\t.tc " << TCSym->getName() << "," << XSym->getName() << "@"
<< MCSymbolRefExpr::getVariantKindName(Kind) << '\n';
else
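The streamer statement in the hunk above builds a `.tc` directive from the TOC symbol name, the target symbol name, and the variant suffix returned by `getVariantKindName` (now including `ld` and `ml`). A rough standalone sketch of that string assembly follows. `formatTocEntry` is a hypothetical helper, the symbol names in the usage are made up, and the no-suffix branch is an assumption about the truncated `else` arm rather than something shown in the diff.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper that mirrors the `OS << "\t.tc " << ...` statement.
// `variant` is the suffix such as "gd", "m", "ie", "le", "ld" or "ml".
std::string formatTocEntry(const std::string &tocSymbol,
                           const std::string &target,
                           const std::string &variant) {
  std::ostringstream os;
  os << "\t.tc " << tocSymbol << "," << target;
  // Assumption: without a TLS variant the entry is emitted with no suffix.
  if (!variant.empty())
    os << "@" << variant;
  os << '\n';
  return os.str();
}

// Usage with made-up symbol names.
void usageExample() {
  assert(formatTocEntry("x[TC]", "x[TL]", "ld") == "\t.tc x[TC],x[TL]@ld\n");
}
```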
4 changes: 4 additions & 0 deletions llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp
@@ -116,6 +116,10 @@ std::pair<uint8_t, uint8_t> PPCXCOFFObjectWriter::getRelocTypeAndSignSize(
return {XCOFF::RelocationType::R_TLS_IE, SignAndSizeForFKData};
case MCSymbolRefExpr::VK_PPC_AIX_TLSLE:
return {XCOFF::RelocationType::R_TLS_LE, SignAndSizeForFKData};
case MCSymbolRefExpr::VK_PPC_AIX_TLSLD:
Author

I think there could be a concern regarding this setting: I observed that the obj output and the asm-as output differ slightly in the "IsSigned" setting on the relocation for the symbol ".__tls_get_mod". For example:

obj mode output processed by "llvm-readobj --relocs --expand-relocs"

  Section (index: 1) .text {
    Relocation {
      Virtual Address: 0xE
      Symbol: _$TLSML (11)
      IsSigned: No
      FixupBitValue: 0
      Length: 16
      Type: R_TOC (0x3)
    }
    Relocation {
      Virtual Address: 0x10
      Symbol: .__tls_get_mod (1)
      IsSigned: No
      FixupBitValue: 0
      Length: 26
      Type: R_RBA (0x18)
    }
    Relocation {
      Virtual Address: 0x16
      Symbol: a (13)
      IsSigned: No
      FixupBitValue: 0
      Length: 16
      Type: R_TOC (0x3)
    }
...

asm mode output assembled by "as -a64 -many ", and then processed by "llvm-readobj --relocs --expand-relocs"

  Section (index: 1) .text {
    Relocation {
      Virtual Address: 0x16
      Symbol: a (13)
      IsSigned: No
      FixupBitValue: 0
      Length: 16
      Type: R_TOC (0x3)
    }
    Relocation {
      Virtual Address: 0x1A
      Symbol: _$TLSML (15)
      IsSigned: No
      FixupBitValue: 0
      Length: 16
      Type: R_TOC (0x3)
    }
    Relocation {
      Virtual Address: 0x1C
      Symbol: .__tls_get_mod (3)
      IsSigned: Yes
      FixupBitValue: 0
      Length: 26
      Type: R_RBA (0x18)
    }
...

Notice that the two outputs have different "IsSigned" settings on the relocation for the ".__tls_get_mod" symbol.

I took another look into the behavior of general-dynamic, and then I saw the same difference there:

obj mode

    Relocation {
      Virtual Address: 0x1C
      Symbol: .__tls_get_addr (1)
      IsSigned: No
      FixupBitValue: 0
      Length: 26
      Type: R_RBA (0x18)
    }

asm mode

    Relocation {
      Virtual Address: 0x1C
      Symbol: .__tls_get_addr (3)
      IsSigned: Yes
      FixupBitValue: 0
      Length: 26
      Type: R_RBA (0x18)
    }

Looks like LD is aligned with GD in this particular behavior, so this may not be an issue.

Contributor

I am wondering whether IsSigned simply does not matter in this situation, given the existing comment in the code:

  // People from AIX OS team says AIX link editor does not care about
  // the sign bit in the relocation entry "most" of the time.
  // The system assembler seems to set the sign bit on relocation entry
  // based on similar property of IsPCRel. So we will do the same here.
  // TODO: More investigation on how assembler decides to set the sign
  // bit, and we might want to match that.
  const uint8_t EncodedSignednessIndicator = IsPCRel ? SignBitMask : 0u;
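To make the "IsSigned" and "Length" fields in the llvm-readobj output above more concrete, here is a small sketch of how a sign-and-size byte could be packed and unpacked. It assumes the usual XCOFF layout for that byte (bit 0x80 for the sign, bit 0x40 for the fixup indicator, low six bits holding the bit length minus one). The mask values and the helper names are assumptions for illustration, not code from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Assumed field layout of the relocation sign-and-size byte.
constexpr std::uint8_t SignBit = 0x80;    // relocation is signed
constexpr std::uint8_t FixupBit = 0x40;   // fixup indicator
constexpr std::uint8_t LengthMask = 0x3F; // bit length minus one

struct RelocInfo {
  bool isSigned;
  bool fixupBit;
  unsigned bitLength;
};

// Unpacks the byte into the fields that llvm-readobj prints.
RelocInfo decodeRelocInfo(std::uint8_t signAndSize) {
  return {(signAndSize & SignBit) != 0, (signAndSize & FixupBit) != 0,
          static_cast<unsigned>(signAndSize & LengthMask) + 1u};
}

// Packs the byte the way the quoted comment describes: the sign bit follows
// IsPCRel, and the length is stored biased by one.
std::uint8_t encodeRelocInfo(bool isPCRel, unsigned bitLength) {
  assert(bitLength >= 1 && bitLength <= 64);
  const unsigned sign = isPCRel ? SignBit : 0u;
  return static_cast<std::uint8_t>(sign | ((bitLength - 1u) & LengthMask));
}
```

Under these assumptions, a 26-bit R_RBA entry differs between the two outputs only in the top bit of this byte, which is consistent with the observation that the rest of the relocation matches.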

return {XCOFF::RelocationType::R_TLS_LD, SignAndSizeForFKData};
case MCSymbolRefExpr::VK_PPC_AIX_TLSML:
return {XCOFF::RelocationType::R_TLSML, SignAndSizeForFKData};
case MCSymbolRefExpr::VK_None:
return {XCOFF::RelocationType::R_POS, SignAndSizeForFKData};
}
6 changes: 6 additions & 0 deletions llvm/lib/Target/PowerPC/PPC.h
@@ -139,6 +139,12 @@ class ModulePass;
/// and Local Exec models.
MO_TPREL_FLAG,

/// MO_TLSLDM_FLAG - on AIX the ML relocation type is only valid for a
/// reference to a TOC symbol from the symbol itself, and right now its only
/// user is the symbol "_$TLSML". The symbol name is used to decide that
/// the R_TLSML relocation is expected.
MO_TLSLDM_FLAG,

/// MO_TLSLD_FLAG - If this bit is set the symbol reference is relative to
/// TLS Local Dynamic model.
MO_TLSLD_FLAG,
58 changes: 45 additions & 13 deletions llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
@@ -621,12 +621,23 @@ void PPCAsmPrinter::LowerPATCHPOINT(StackMaps &SM, const MachineInstr &MI) {
EmitToStreamer(*OutStreamer, MCInstBuilder(PPC::NOP));
}

/// This helper function creates the TlsGetAddr MCSymbol for AIX. We will
/// create the csect and use the qual-name symbol instead of creating just the
/// external symbol.
/// This helper function creates the TlsGetAddr/TlsGetMod MCSymbol for AIX. We
/// will create the csect and use the qual-name symbol instead of creating just
/// the external symbol.
static MCSymbol *createMCSymbolForTlsGetAddr(MCContext &Ctx, unsigned MIOpc) {
StringRef SymName =
MIOpc == PPC::GETtlsTpointer32AIX ? ".__get_tpointer" : ".__tls_get_addr";
StringRef SymName;
switch (MIOpc) {
default:
SymName = ".__tls_get_addr";
break;
case PPC::GETtlsTpointer32AIX:
SymName = ".__get_tpointer";
break;
case PPC::GETtlsMOD32AIX:
case PPC::GETtlsMOD64AIX:
SymName = ".__tls_get_mod";
break;
}
return Ctx
.getXCOFFSection(SymName, SectionKind::getText(),
XCOFF::CsectProperties(XCOFF::XMC_PR, XCOFF::XTY_ER))
@@ -668,14 +679,16 @@ void PPCAsmPrinter::EmitTlsCall(const MachineInstr *MI,
"GETtls[ld]ADDR[32] must read GPR3");

if (Subtarget->isAIXABI()) {
// On AIX, the variable offset should already be in R4 and the region handle
// should already be in R3.
// For TLSGD, which currently is the only supported access model, we only
// need to generate an absolute branch to .__tls_get_addr.
// For TLSGD, the variable offset should already be in R4 and the region
// handle should already be in R3. We generate an absolute branch to
// .__tls_get_addr. For TLSLD, the module handle should already be in R3.
// We generate an absolute branch to .__tls_get_mod.
Register VarOffsetReg = Subtarget->isPPC64() ? PPC::X4 : PPC::R4;
(void)VarOffsetReg;
assert(MI->getOperand(2).isReg() &&
MI->getOperand(2).getReg() == VarOffsetReg &&
assert((MI->getOpcode() == PPC::GETtlsMOD32AIX ||
MI->getOpcode() == PPC::GETtlsMOD64AIX ||
(MI->getOperand(2).isReg() &&
MI->getOperand(2).getReg() == VarOffsetReg)) &&
"GETtls[ld]ADDR[32] must read GPR4");
EmitAIXTlsCallHelper(MI);
Collaborator

@hubert-reinterpretcast Oct 3, 2023

[edit: I misread the assembly below. The floating-point argument was not being reloaded.]

The helper functions have special calling convention properties. For example, they do not use the FP registers. The IBM XL compiler was able to take advantage of that.

For:

__attribute__((tls_model("local-dynamic"))) __thread int x;
double g(int, double);
void f() {
  double gg = g(0, 1.);
  g(x, gg);
}

The IBM XL compilers were able to make use of the returned double staying in the register:

      28: 48 00 00 03   bla 0
                        00000028:  R_RBA        (idx: 36) .__tls_get_mod[PR]
      2c: 7c 66 18 2e   lwzx 3, 6, 3
      30: 4b ff ff d1   bl 0x0 <.f>
                        00000030:  R_RBR        (idx: 34) .g[PR]

Clang/LLVM loads the value from the stack:

      24: 48 00 00 03   bla 0
                0000000000000024:  R_RBA        (idx: 3) .__tls_get_mod[PR]
      28: e8 82 00 08   ld 4, 8(2)
                000000000000002a:  R_TOC        (idx: 17) x[TC]
      2c: 7c 63 22 aa   lwax 3, 3, 4
      30: 4b ff ff d1   bl 0x0 <.f>
                0000000000000030:  R_RBR        (idx: 1) .g[PR]

Contributor

From what I've observed, Clang/LLVM uses the special GPR usage rules for the call to __tls_get_mod. But as you noted, FPRs are not handled as a special case.

Author

Thank you for looking into this!

I tried the example, and it seems that turning on optimization removes the FP load. We declared "Defs = [X0,X4,X5,X11,LR8,CR0]" for GETtlsMOD64AIX, and "Defs = [R0,R4,R5,R11,LR,CR0]" for GETtlsMOD32AIX, so I think the FP registers should be treated as not touched by the call to __tls_get_mod.

Collaborator

Thanks @orcguru. I misread the assembly 🤦‍♂️ (the extra instruction is not an lfd).

return;
@@ -844,6 +857,13 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
return MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSGDM;
if (Flag == PPCII::MO_TLSGD_FLAG || Flag == PPCII::MO_GOT_TLSGD_PCREL_FLAG)
return MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSGD;
// For local-dynamic TLS access on AIX, we have one TOC entry for the symbol
// (the variable offset) and one shared TOC entry for the module handle.
// They are differentiated by MO_TLSLD_FLAG and MO_TLSLDM_FLAG.
if (Flag == PPCII::MO_TLSLD_FLAG && IsAIX)
return MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSLD;
if (Flag == PPCII::MO_TLSLDM_FLAG && IsAIX)
return MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSML;
return MCSymbolRefExpr::VariantKind::VK_None;
};

@@ -1354,6 +1374,11 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
.addExpr(SymGotTlsGD));
return;
}
case PPC::GETtlsMOD32AIX:
case PPC::GETtlsMOD64AIX:
// Transform: %r3 = GETtlsMODNNAIX %r3 (for NN == 32/64).
// Into: BLA .__tls_get_mod()
// Input parameter is a module handle (_$TLSML[TC]@ml) for all variables.
case PPC::GETtlsADDR:
// Transform: %x3 = GETtlsADDR %x3, @sym
// Into: BL8_NOP_TLS __tls_get_addr(sym at tlsgd)
@@ -2167,6 +2192,11 @@ void PPCAIXAsmPrinter::emitLinkage(const GlobalValue *GV,
}
}

// Do not emit the _$TLSML symbol.
if (GV->getThreadLocalMode() == GlobalVariable::LocalDynamicTLSModel &&
GV->hasName() && GV->getName() == "_$TLSML")
return;

OutStreamer->emitXCOFFSymbolLinkageWithVisibility(GVSym, LinkageAttr,
VisibilityAttr);
}
@@ -2981,11 +3011,13 @@ void PPCAIXAsmPrinter::emitInstruction(const MachineInstr *MI) {
MMI->hasDebugInfo());
break;
}
case PPC::GETtlsMOD32AIX:
case PPC::GETtlsMOD64AIX:
case PPC::GETtlsTpointer32AIX:
case PPC::GETtlsADDR64AIX:
case PPC::GETtlsADDR32AIX: {
// A reference to .__tls_get_addr/.__get_tpointer is unknown to the
// assembler so we need to emit an external symbol reference.
// A reference to .__tls_get_mod/.__tls_get_addr/.__get_tpointer is unknown
// to the assembler so we need to emit an external symbol reference.
MCSymbol *TlsGetAddr =
createMCSymbolForTlsGetAddr(OutContext, MI->getOpcode());
ExtSymSDNodeSymbols.insert(TlsGetAddr);
39 changes: 32 additions & 7 deletions llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -1774,9 +1774,11 @@ const char *PPCTargetLowering::getTargetNodeName(unsigned Opcode) const {
case PPCISD::ADDIS_TLSGD_HA: return "PPCISD::ADDIS_TLSGD_HA";
case PPCISD::ADDI_TLSGD_L: return "PPCISD::ADDI_TLSGD_L";
case PPCISD::GET_TLS_ADDR: return "PPCISD::GET_TLS_ADDR";
case PPCISD::GET_TLS_MOD_AIX: return "PPCISD::GET_TLS_MOD_AIX";
case PPCISD::GET_TPOINTER: return "PPCISD::GET_TPOINTER";
case PPCISD::ADDI_TLSGD_L_ADDR: return "PPCISD::ADDI_TLSGD_L_ADDR";
case PPCISD::TLSGD_AIX: return "PPCISD::TLSGD_AIX";
case PPCISD::TLSLD_AIX: return "PPCISD::TLSLD_AIX";
case PPCISD::ADDIS_TLSLD_HA: return "PPCISD::ADDIS_TLSLD_HA";
case PPCISD::ADDI_TLSLD_L: return "PPCISD::ADDI_TLSLD_L";
case PPCISD::GET_TLSLD_ADDR: return "PPCISD::GET_TLSLD_ADDR";
@@ -3415,13 +3417,36 @@ SDValue PPCTargetLowering::LowerGlobalTLSAddressAIX(SDValue Op,
return DAG.getNode(PPCISD::ADD_TLS, dl, PtrVT, TLSReg, VariableOffset);
}

// Only Local-Exec, Initial-Exec and General-Dynamic TLS models are currently
// supported models. If Local- or Initial-exec are not possible or specified,
// all GlobalTLSAddress nodes are lowered using the general-dynamic model.
// We need to generate two TOC entries, one for the variable offset, one for
// the region handle. The global address for the TOC entry of the region
// handle is created with the MO_TLSGDM_FLAG flag and the global address
// for the TOC entry of the variable offset is created with MO_TLSGD_FLAG.
if (Model == TLSModel::LocalDynamic) {
// For local-dynamic on AIX, we need to generate one TOC entry for each
// variable offset, and a single module-handle TOC entry for the entire
// file.

SDValue VariableOffsetTGA =
DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0, PPCII::MO_TLSLD_FLAG);
SDValue VariableOffset = getTOCEntry(DAG, dl, VariableOffsetTGA);

Module *M = DAG.getMachineFunction().getFunction().getParent();
GlobalVariable *TLSGV =
dyn_cast_or_null<GlobalVariable>(M->getOrInsertGlobal(
StringRef("_$TLSML"), PointerType::getUnqual(*DAG.getContext())));
TLSGV->setThreadLocalMode(GlobalVariable::LocalDynamicTLSModel);
assert(TLSGV && "Not able to create GV for _$TLSML.");
SDValue ModuleHandleTGA =
DAG.getTargetGlobalAddress(TLSGV, dl, PtrVT, 0, PPCII::MO_TLSLDM_FLAG);
SDValue ModuleHandleTOC = getTOCEntry(DAG, dl, ModuleHandleTGA);
SDValue ModuleHandle =
DAG.getNode(PPCISD::TLSLD_AIX, dl, PtrVT, ModuleHandleTOC);

return DAG.getNode(ISD::ADD, dl, PtrVT, ModuleHandle, VariableOffset);
}

// If Local- or Initial-exec or Local-dynamic is not possible or specified,
// all GlobalTLSAddress nodes are lowered using the general-dynamic model. We
// need to generate two TOC entries, one for the variable offset, one for the
// region handle. The global address for the TOC entry of the region handle is
// created with the MO_TLSGDM_FLAG flag and the global address for the TOC
// entry of the variable offset is created with MO_TLSGD_FLAG.
SDValue VariableOffsetTGA =
DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0, PPCII::MO_TLSGD_FLAG);
SDValue RegionHandleTGA =
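The local-dynamic lowering above ends in a plain `ISD::ADD` of two values: the result of the module-handle call and the per-variable offset loaded from its TOC entry. The sketch below models only that arithmetic in ordinary C++. `ModuleTlsBlock`, `fakeTlsGetMod` and `localDynamicAddress` are hypothetical stand-ins invented for illustration, and the "TLS block" here is an ordinary static object rather than real thread-local storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one thread's TLS block of the current module.
struct ModuleTlsBlock {
  int a;
  double b;
};

static ModuleTlsBlock block{1, 2.0};

// Plays the role of the single call made with the module handle: it yields
// one base address that every local-dynamic variable in the file can share.
std::uintptr_t fakeTlsGetMod() {
  return reinterpret_cast<std::uintptr_t>(&block);
}

// Plays the role of the final ADD: base address plus the variable's offset.
template <typename T>
T *localDynamicAddress(std::uintptr_t moduleBase, std::size_t varOffset) {
  return reinterpret_cast<T *>(moduleBase + varOffset);
}

// Usage: one base lookup serves both variables.
void usageExample() {
  const std::uintptr_t base = fakeTlsGetMod();
  assert(*localDynamicAddress<int>(base, offsetof(ModuleTlsBlock, a)) == 1);
  assert(*localDynamicAddress<double>(base, offsetof(ModuleTlsBlock, b)) == 2.0);
}
```

The point of the model is the sharing: general-dynamic needs a region handle and an offset per variable, whereas here each additional variable costs only another offset.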
13 changes: 12 additions & 1 deletion llvm/lib/Target/PowerPC/PPCISelLowering.h
@@ -370,11 +370,22 @@ namespace llvm {
/// G8RC = TLSGD_AIX, TOC_ENTRY, TOC_ENTRY
/// Op that combines two register copies of TOC entries
/// (region handle into R3 and variable offset into R4) followed by a
/// GET_TLS_ADDR node which will be expanded to a call to __get_tls_addr.
/// GET_TLS_ADDR node which will be expanded to a call to .__tls_get_addr.
/// This node is used in 64-bit mode as well (in which case the result is
/// G8RC and inputs are X3/X4).
TLSGD_AIX,

/// %x3 = GET_TLS_MOD_AIX _$TLSML - For the AIX local-dynamic TLS model,
/// produces a call to .__tls_get_mod(_$TLSML\@ml).
GET_TLS_MOD_AIX,

/// [GP|G8]RC = TLSLD_AIX, TOC_ENTRY(module handle)
/// Op that requires a single input of the module handle TOC entry in R3,
/// and generates a GET_TLS_MOD_AIX node which will be expanded into a call
/// to .__tls_get_mod. This node is used in both 32-bit and 64-bit modes.
/// The only difference is the register class.
TLSLD_AIX,

/// G8RC = ADDIS_TLSLD_HA %x2, Symbol - For the local-dynamic TLS
/// model, produces an ADDIS8 instruction that adds the GOT base
/// register to sym\@got\@tlsld\@ha.