Support fno-plt like gcc #78275

Closed
hstk30-hw opened this issue Jan 16, 2024 · 9 comments
Comments

@hstk30-hw
Contributor

Hi, I see that https://reviews.llvm.org/D39079 added support for -fno-plt in the frontend, but it has no effect on the generated asm code.

https://discourse.llvm.org/t/bug-fno-plt-has-no-effect/57986

If I want to support -fno-plt in Clang the way GCC does, where should I start?

@hstk30-hw hstk30-hw added the question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead! label Jan 16, 2024
@hstk30-hw
Contributor Author

CC @tmsri @MaskRay

@tmsri
Member

tmsri commented Jan 16, 2024

Sorry, it's been a while. I'm trying to understand what you want to achieve here. Is it that you have a PLT call in inline asm that you want to avoid with -fno-plt?

@hstk30-hw
Contributor Author

I want to achieve what you proposed for GCC in https://gcc.gnu.org/legacy-ml/gcc-patches/2015-05/msg00001.html @tmsri

https://godbolt.org/z/Yeor6TTeE

For now, -fno-plt in Clang seems to just add the NonLazyBind attribute to the function, and that does not seem to have any effect downstream.

@MaskRay
Member

MaskRay commented Jan 18, 2024

MIPS has -mno-plt. Some architectures such as x86/AArch64 have -fno-plt. In Clang, -fno-plt is only implemented for x86. I can add the AArch64 support.

I have some notes on https://maskray.me/blog/2021-09-19-all-about-procedure-linkage-table#fno-plt . -fno-plt is significantly worse on RISC architectures due to a longer code sequence.

@wzssyqa
Contributor

wzssyqa commented Jan 21, 2024

GCC supports -mplt only for TARGET_ABSOLUTE_ABICALLS.

#define TARGET_ABSOLUTE_ABICALLS \
  (TARGET_ABICALLS \
   && !TARGET_SHARED \
   && TARGET_EXPLICIT_RELOCS \
   && !ABI_HAS_64BIT_SYMBOLS)

For normal PIC ELFs, all of them are no-plt.

I think that it makes no sense to support it on LLVM.

@MaskRay
Member

MaskRay commented Jan 21, 2024

> GCC supports -mplt only for TARGET_ABSOLUTE_ABICALLS.
>
> #define TARGET_ABSOLUTE_ABICALLS (TARGET_ABICALLS && !TARGET_SHARED && TARGET_EXPLICIT_RELOCS && !ABI_HAS_64BIT_SYMBOLS)
>
> For normal PIC ELFs, all of them are no-plt.
>
> I think that it makes no sense to support it on LLVM.

The MIPS situation is different. -mno-plt is generally the default except in certain -fno-pic cases that default to -mplt (most o32/n32 and n64 -msym32).
GCC's MIPS -mplt implementation confuses problematic copy relocations with PLT, which is a flaw but can be fixed.

For most other architectures (probably all supported by LLVM), -fplt is the default and having -fno-plt could be interesting in some cases. The calls will be similar to __declspec(dllimport) and can have an interesting trade-off for C library functions when dynamically linked.

MaskRay added a commit that referenced this issue Mar 5, 2024
Clang sets the nonlazybind attribute for certain ObjC features. The
AArch64 SelectionDAG implementation for non-intrinsic calls
(46e36f0) is behind a cl option.

GCC implements -fno-plt for a few ELF targets. In Clang, -fno-plt also
sets the nonlazybind attribute. For SelectionDAG, make the cl option not
affect ELF so that non-intrinsic calls to a dso_preemptable function use
GOT. Adjust AArch64TargetLowering::LowerCall to handle intrinsic calls.

For FastISel, change `fastLowerCall` to bail out when a call is due to
-fno-plt.

For GlobalISel, handle non-intrinsic calls in CallLowering::lowerCall
and intrinsic calls in AArch64CallLowering::lowerCall (where the
target-independent CallLowering::lowerCall is not called).
The GlobalISel test in `call-rv-marker.ll` is therefore updated.

Note: the current -fno-plt -fpic implementation does not use GOT for a
preemptable function.

Link: #78275

Pull Request: #78890
@Kojoley
Contributor

Kojoley commented Mar 14, 2024

> Some arches such as x86/aarch64 have -fno-plt. In Clang, -fno-plt is only implemented for x86.

It doesn't seem to be implemented for 32-bit x86 or for x86 GlobalISel.

https://godbolt.org/z/aa6jqraPY

void bar(void);

void foo(void)
{
    bar();
}
; clang -m32 -O3 -fpic -fno-plt
foo():                                # @foo()
        push    ebx
        sub     esp, 8
        call    .L0$pb
.L0$pb:
        pop     ebx
.Ltmp0:
        add     ebx, offset _GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb)
        call    bar()@PLT
        add     esp, 8
        pop     ebx
        ret

; gcc -m32 -O3 -fpic -fno-plt
foo():
        call    __x86.get_pc_thunk.ax
        add     eax, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_
        jmp     [DWORD PTR _Z3barv@GOT[eax]]
__x86.get_pc_thunk.ax:
        mov     eax, DWORD PTR [esp]
        ret
clang -O3 -fpic -fno-plt -mllvm -global-isel
fatal error: error in backend: cannot select: %0:gr64(p0) = G_GLOBAL_VALUE @_Z3barv (in function: _Z3foov)
clang -m32 -O3 -fpic -fno-plt -mllvm -global-isel
fatal error: error in backend: cannot select: %0:gr32(p0) = G_GLOBAL_VALUE @_Z3barv (in function: _Z3foov)

@EugeneZelenko EugeneZelenko added backend:AArch64 llvm:codegen and removed question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead! labels Mar 28, 2024
@llvmbot
Member

llvmbot commented Mar 28, 2024

@llvm/issue-subscribers-backend-aarch64

@hstk30-hw hstk30-hw reopened this Apr 1, 2024
@hstk30-hw
Contributor Author

> -fno-plt is significantly worse on RISC architectures due to a longer code sequence.

I tested it on AArch64 and got a 1.x% improvement, and branch misses went down.
AArch64 uses 3 instructions, compared with 1 instruction on x86.
Branch prediction is still the key :)

Thx @MaskRay
