bsdjhb commented Oct 6, 2025

This extracts the PT_CHERI_PCC bits from the c18n branch. It includes changes to rewrite the bounds of code capability relocations to use the bounds of the containing PT_CHERI_PCC instead of the bounds of the function.

This is not enabled for CHERI-MIPS.

bsdjhb added 4 commits October 6, 2025 11:21
If the address of base+offset resolves to a target symbol, print that
symbol with an offset of 0 as the symbol for a capreloc in the compact
output.  If the base+offset does not resolve to a target symbol, fall
back to using any symbol associated with the base address.
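
A minimal sketch of the lookup order described in this commit message. The helper name `describeCapRelocTarget` and the plain `std::map` standing in for the symbol table are hypothetical and do not come from the patch; the `+N` suffix format and the `<unknown symbol>` placeholder are assumptions modelled on the test output quoted later in this review.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper, not part of the patch. symbolNames maps a symbol's
// address to its name.
std::string describeCapRelocTarget(
    const std::map<uint64_t, std::string> &symbolNames, uint64_t base,
    uint64_t offset) {
  // Prefer a symbol located exactly at base + offset and print it with no
  // offset suffix.
  auto it = symbolNames.find(base + offset);
  if (it != symbolNames.end())
    return it->second;

  // Otherwise fall back to any symbol associated with the base address.
  it = symbolNames.find(base);
  std::string name =
      it != symbolNames.end() ? it->second : "<unknown symbol>";

  // Assumption: a "+N" suffix is printed only for a nonzero offset.
  if (offset > 0)
    name += "+" + std::to_string(offset);
  return name;
}
```

With symbols at 0x11000 (`_start`) and 0x11030 (`foo`), a capreloc with base 0x11000 and offset 0x30 would be labelled `foo` under this sketch, while an offset that lands on no symbol keeps the base symbol plus the offset.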
Static linkers can emit this ELF program header to describe the ranges
of address space that should share PCC bounds.  Dynamic linkers should
constrain code pointers that resolve to addresses in these ranges to
the bounds of the program header.
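
The behaviour this commit message expects from a dynamic linker can be sketched roughly as follows. `PccRange`, `Bounds`, and `pccBoundsFor` are hypothetical names for illustration only; a real runtime linker would read the ranges from the PT_CHERI_PCC program headers and derive an actual capability rather than returning a base/length pair.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one PT_CHERI_PCC program header.
struct PccRange {
  uint64_t vaddr; // start address of the range
  uint64_t memsz; // size of the range in memory
};

// Hypothetical result type: the bounds a code pointer should be given.
struct Bounds {
  uint64_t base;
  uint64_t length;
};

// Return the bounds of the PT_CHERI_PCC range containing addr, if any.
std::optional<Bounds> pccBoundsFor(const std::vector<PccRange> &ranges,
                                   uint64_t addr) {
  for (const PccRange &r : ranges) {
    assert(r.vaddr + r.memsz >= r.vaddr && "range must not wrap");
    // addr - r.vaddr cannot underflow because of the first comparison.
    if (addr >= r.vaddr && addr - r.vaddr < r.memsz)
      return Bounds{r.vaddr, r.memsz};
  }
  // Not inside any PT_CHERI_PCC range: the caller keeps its default bounds.
  return std::nullopt;
}
```

Returning `std::nullopt` for addresses outside every range leaves the choice of fallback bounds to the caller, which seems closest to the "code pointers that resolve to addresses in these ranges" wording.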
bsdjhb requested a review from jrtc27 October 6, 2025 19:03
BaseSymbol = it->second;
// errs() << "BaseSymbol = SymbolNames[" << Base << "] = " << it->second << "\n";
}
if (Offset != 0 && !opts::ExpandRelocs) {
Contributor Author

This becomes more useful once code capabilities use PT_CHERI_PCC bounds, as the symbol name then remains the name of the function instead of becoming the name of the first symbol in the PT_CHERI_PCC segment plus an offset.

OS << format(" Base: 0x%lx (", static_cast<unsigned long>(Base))
<< BaseSymbol;
- if (SymbolOffset >= 0)
+ if (SymbolOffset > 0)
Contributor Author

It is perhaps debatable whether this is a useful change. I think it is a bit more readable, and I wanted to avoid implying a cap reloc offset of 0, since in many cases an offset != 0 can now result in a +0 output after the previous change. I should perhaps include more of the rationale in the commit log.

}

// Determine the sections PCC should cover for each compartment.
if (config->isCheriAbi) {
Contributor Author

Maybe this should be disabled for CHERI MIPS? Currently I only disable the caprelocs changes, but perhaps PT_CHERI_PCC should also be disabled for CHERI MIPS? The experimental cap-table stuff isn't really compatible with it.

Member

Hm, the per-file/function stuff is still ultimately one section accessed relative to the magic PC-on-function-entry register, right? It's just the PLT ABI (and function descriptor, but that was never implemented outside of my initial out-of-tree undergrad prototype) that is weird with PCC bounds, because $cgp is known to point to the captable on entry, and presumably it indirects everything through that, even things that sensible architectures do PC-relatively, given the lack of PC-relative addressing. So I think you just want isCheriBoundsSection to bail after checking SHF_EXECINSTR for the MIPS PLT ABI? Something like:

if (config->emachine == EM_MIPS && in.mipsAbiFlags) {
  std::optional<unsigned> abi;
  invokeELFT(getMipsCheriAbiVariant, abi, *in.mipsAbiFlags);
  if (abi == DF_MIPS_CHERI_ABI_PLT || abi == DF_MIPS_CHERI_ABI_FNDESC)
    return false;
}

// OutputSection. Returns true if the alignment of any OutputSections were
// modified.
static bool alignPCCBounds(PhdrEntry *p) {
// Determine the required alignment for a single PT_CHERI_PCC segment. Apply
Contributor Author

I could squash this commit down into the previous one if the detailed history here is not useful.

bsdjhb added 6 commits October 6, 2025 15:14
This just uses cheri-compressed-cap.h directly.  Probably the existing
helper methods in llvm/lib/Target/RISCV/MCTargetDesc should be moved
out to llvm/Support so they can be reused here.
Similar to CHERI RISC-V, this uses cheri-compressed-cap.h directly for
CHERI 128.
This segment bounds all of the executable and rodata sections as well
as any GOTs and PLTs accessed via PCC.
Use an architecture-specific method to compute the required alignment
for the PT_CHERI_PCC segment.  Align the first section as well as the
next section after the last section to this alignment.  Round up the
memory size of the PT_CHERI_PCC to cover this padding.

Note that this does not handle the edge case that the file might need
to be padded at the end to ensure that the PT_CHERI_PCC segment does
not exceed the bounds of the mapping of the entire file.
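
A rough sketch of the padding arithmetic described in this commit message. `Segment` and `layoutPcc` are hypothetical names, and the alignment is passed in as a parameter here; in the patch it comes from an architecture-specific method whose result depends on the capability encoding, which this example does not attempt to model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of the resulting PT_CHERI_PCC segment.
struct Segment {
  uint64_t vaddr;
  uint64_t memsz;
};

// Round v up to a multiple of a, where a is a power of two.
constexpr uint64_t alignUp(uint64_t v, uint64_t a) {
  return (v + a - 1) & ~(a - 1);
}

// Place the first covered section at an aligned address, then round the end
// up so that the section following the segment also starts aligned. The
// memory size grows to cover the trailing padding.
Segment layoutPcc(uint64_t nextFreeAddr, uint64_t contentSize,
                  uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 &&
         "alignment must be a power of two");
  uint64_t start = alignUp(nextFreeAddr, align);
  uint64_t end = alignUp(start + contentSize, align);
  return Segment{start, end - start};
}
```

For example, with a next free address of 0x10234, 0x2100 bytes of covered sections, and an assumed alignment of 0x1000, the segment would start at 0x11000 with a memory size of 0x3000.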
- The padding section's flags match the previous section so that it is
  merged into the same segment.

- Treat the padding section as relro if the previous section was relro
  to avoid forcing a new PT_LOAD due to relro vs non-relro.

- Ignore empty text input sections when determining if padding is
  needed.

  llvm-mc includes an empty .text section in object files even if
  there is no .text input.
bsdjhb commented Oct 6, 2025

The clang-format errors are all about excessive whitespace in existing tests that I had to update.

TargetInfo::~TargetInfo() {}

uint64_t TargetInfo::getCheriRequiredAlignment(uint64_t len) const {
error("current target does not provide required Cheri alignment");
Member

Hm, llvm_unreachable like adjustPrologueForCrossSplitStack?

return true;

// .got.plt is accessed relative to PCC.
if (sec == in.gotPlt->getParent())
Member

Also igotPlt for completeness? It is redundant in reality, as it's always in the same OutputSection as gotPlt (except on 32-bit Arm, which puts it with GotSection), but checking it avoids needing to think about it.

if (sec == in.gotPlt->getParent())
return true;

// CHERI capability table is accessed relative to PCC.
Member

Comment should say MIPS

return true;

// .rodata symbols are accessed relative to PCC.
if (sec->name.startswith(".rodata"))
Member

String matching isn't ideal, but we already have to do that elsewhere in Cheri.cpp. I guess in an ideal world we'd figure this out based on the relocations (and maybe even split .rodata into the things that are accessed PC-relatively and the things that are just const variables).


# RUN: ld.lld --shared %t/plt.o -o %t/plt.so
# RUN: ld.lld --shared %t/plt_notext.o -o %t/plt_notext.so

# RUN: llvm-readelf -t %t/small_text.so | FileCheck --check-prefix=SMALL_TEXT %s
Member

Use -W so they fit on one line, and kebab-case the prefixes

# PLT_NOTEXT-NEXT: PROGBITS 00000000000053d0 0023d0 000010 00 0 0 1
# PLT_NOTEXT-NEXT: [0000000000000003]: WRITE, ALLOC

#--- small_text.s
Member

kebab-case

Comment on lines +106 to +128
#--- text_rodata.s

.global foo
.type foo, @function
foo:
ret
.size foo, . - foo

.rodata
.global bar
.type bar, @object
bar:
.space 8192
.size bar, . - bar

#--- rodata_only.s

.rodata
.global bar
.type bar, @object
bar:
.space 8196
.size bar, . - bar
Member

Why not have rodata.s and link with small-text for one of the tests?

}

// For function relocs, use PCC bounds from the PT_CHERI_PCC segment.
if (config->emachine != EM_MIPS && (isFunc || isGnuIFunc)) {
Member

Why does this one get turned off for MIPS?

Member

Also could fold into above if?

# RELOC32: CHERI __cap_relocs [
# RELOC32-NEXT: 0x013088 Base: 0x11030 (<unknown symbol>) Length: 64 Perms: Code
# RELOC32-NEXT: 0x013090 Base: 0x11030 (<unknown symbol>) Length: 64 Perms: Code
# RELOC32-NEXT: 0x013088 Base: 0x11000 (_start+48) Length: 9216 Perms: Code
Member

Base being after _start is odd?
