forked from llvm/llvm-project
[libc][math] Disable FEnvSafeTest.cpp if AArch64 target has no FP support (#166370)
#1
Closed
Stylie777
wants to merge
39
commits into
main
from
11-05-_flang_openmp_add_lowering_support_for_taskloop_reductions_
…upport (llvm#166370) The `FEnvSafeTest.cpp` test fails on AArch64 soft nofp configurations because LLVM libc does not provide a floating-point environment in these configurations. This patch adds another preprocessor guard on `__ARM_FP` to disable the test on those.
…llvm#166174) Dummy variables have an entry in `Program::Globals`, but they are not added to `GlobalIndices`. When registering redeclarations, we used to only patch up the global indices, but that left the dummy variables alone. Update the dummy variables of all redeclarations as well. Fixes llvm#165952
…m#166266) explain more about use-after-free in llvm-twine-local add note about manually adjusting code after applying fix-it. fixed: llvm#154810
…der (llvm#166292) By default, the dialect conversion driver processes operations in pre-order: the initial worklist is populated pre-order. (New/modified operations are immediately legalized recursively.) This commit adds a new API for selective post-order legalization. Patterns can request an operation / region legalization via `ConversionPatternRewriter::legalize`. They can call these helper functions on nested regions before rewriting the operation itself. Note: In rollback mode, a failed recursive legalization typically leads to a conversion failure. Since recursive legalization is performed by separate pattern applications, there is no way for the original pattern to recover from such a failure.
…rn` (llvm#166513) Fixes a bug in `VectorConvertToLLVMPattern`, which converted operations with unsupported FP types. E.g., `arith.addf ... : f4E2M1FN` was lowered to `llvm.fadd ... : i4`, which does not verify. There are a few more patterns that have the same bug. Those will be fixed in follow-up PRs. This commit is in preparation of adding an `APFloat`-based lowering for `arith` operations with unsupported floating-point types.
…s.py. (llvm#164965) This change enables update_llc_test_checks.py to automatically generate MIR checks for RUN lines that use `-stop-before` or `-stop-after` flags, allowing tests to verify intermediate compilation stages (e.g., after instruction selection but before peephole optimizations) alongside the final assembly output. If the `-debug-only` flag is present in the RUN line, it is considered the main point of interest for testing and the stop flags above are ignored (that is, no MIR checks are generated). This resulted from a scenario where I needed to test two instruction matching patterns where the later pattern in the peepholer reverts the earlier pattern in the instruction selector, and to distinguish it from the case where the earlier pattern didn't work at all. Initially created by Claude Sonnet 4.5, it was improved later to handle conflicts in MIR <-> ASM prefixes and formatting.
…157818) This patch introduces the LASX and LSX conversion intrinsics:
- <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float>)
- <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double>)
- <4 x i64> @llvm.loongarch.lasx.cast.128(<2 x i64>)
- <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.concat.128(<2 x i64>, <2 x i64>)
- <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float>)
- <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double>)
- <2 x i64> @llvm.loongarch.lasx.extract.128.lo(<4 x i64>)
- <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float>)
- <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double>)
- <2 x i64> @llvm.loongarch.lasx.extract.128.hi(<4 x i64>)
- <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.insert.128.lo(<4 x i64>, <2 x i64>)
- <8 x float> @llvm.loongarch.lasx.insert.128.hi.s(<8 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.insert.128.hi.d(<4 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.insert.128.hi(<4 x i64>, <2 x i64>)
…n-overaligned-type` (llvm#166546)
…ster (llvm#164479) Technically, it is possible that a callee-saved register is saved in different locations. CFIInstrInserter should handle this, but currently it does not.
Use cannotHoistOrSinkRecipe to forbid sinking allocas.
When ISL encounters an internal error, it sets the error flag, but the error is not the isl_error_quota that was already checked. Check for general errors and abort the schedule optimization if that happens, instead of continuing on the good path. The error occurred when compiling llvm-test-suite's MultiSource/Applications/JM/lencod/leaky_bucket.c with Polly enabled. Not adding a test case because it depends on ISL internals. We do not want a test case to depend on which version of ISL is used.
…st_checks.py." (llvm#166549) Reverts llvm#164965
Currently, the ARM backend incorrectly parses every `arm`-prefixed arch as non-Thumb, but `armv6m` is Thumb-only and doesn't have ARM ops, causing the test to fail when compiling to assembly rather than LLVM IR: `error: Function 'foo' uses ARM instructions, but the target does not support ARM mode execution.` This only happens when invoking cc1 directly, not the Clang driver. As a quick triage, this patch changes the tests to use `thumb`. Uncovered by llvm#151404
…lvm#160536) As discussed in llvm#153402, we have inefficiencies in handling constant pool access that are difficult to address. Using an IR pass to promote double constants to a global allows a higher degree of control of code generation for these accesses, resulting in improved performance on benchmarks that might otherwise have high register pressure due to accessing constant pool values separately rather than via a common base.

Directly promoting double constants to separate global values and relying on the global merger to do a sensible thing would be one potential avenue to explore, but it is _not_ done in this version of the patch because:
* The global merger pass needs fixes. For instance it claims to be a function pass, yet all of the work is done in initialisation. This means that attempts by backends to schedule it after a given module pass don't actually work as expected.
* The heuristics used can impact codegen unexpectedly, so I worry that tweaking it to get the behaviour desired for promoted constants may lead to other issues. This may be completely tractable though.

Now that llvm#159352 has landed, the impact in terms of dynamically executed instructions is slightly smaller (as we are starting from a better baseline), but still worthwhile in lbm and nab from SPEC. Results below are for rva22u64:
```
Benchmark          Baseline         This PR          Diff (%)
============================================================
500.perlbench_r    180668945687     180666122417     -0.00%
502.gcc_r          221274522161     221277565086      0.00%
505.mcf_r          134656204033     134656204066      0.00%
508.namd_r         217646645332     216699783858     -0.44%
510.parest_r       291731988950     291916190776      0.06%
511.povray_r        30983594866      31107718817      0.40%
519.lbm_r           91217999812      87405361395     -4.18%
520.omnetpp_r      137699867177     137674535853     -0.02%
523.xalancbmk_r    284730719514     284734023366      0.00%
525.x264_r         379107521547     379100250568     -0.00%
526.blender_r      659391437610     659447919505      0.01%
531.deepsjeng_r    350038121654     350038121656      0.00%
538.imagick_r      238568674979     238560772162     -0.00%
541.leela_r        405660852855     405654701346     -0.00%
544.nab_r          398215801848     391352111262     -1.72%
557.xz_r           129832192047     129832192055      0.00%
```
---
Notes for reviewers:
* As discussed at the sync-up meeting, the suggestion is to try to land an incremental improvement to the status quo even if there is more work to be done around the general issue of constant pool handling. We can discuss here if that is actually the best next step or not, but I just wanted to clarify that's why this is being posted with a somewhat narrow scope.
* I've disabled transformations both for RV32 and on systems without D as both cases saw some regressions.
Fixes llvm#165346 This patch renames stale variable names where `TypeSourceInfo` objects were still using the old `DI` (`DeclaratorInfo`) naming convention. Specifically, variables of type `TypeSourceInfo` have been updated from `DI` to `TSI` to improve code clarity and maintain consistency with the current naming.
The debug info attached to the BUNDLE is that of the first instruction in the BUNDLE, even if better debug info (line:column) is present in later instructions of the bundle. The patch tries to get the better debug info first. If none is found, worse debug info without a line number is chosen. --------- Co-authored-by: Vladislav Dzhidzhoev <[email protected]> Co-authored-by: Orlando Cazalet-Hyams <[email protected]>
We need to allow BitCasts between pointer types to different prim types, but that means we need to catch the problem at a later stage, i.e. when loading the values. Fixes llvm#158527 Fixes llvm#163778
Move to the libcall impl based functions.
Update tests to contain auto generated checks.
PR llvm#165993 accidentally broke the lowering of the `test.wait` Op. This patch fixes the issue and adds tests to verify the lowering to intrinsics for all mbarrier Ops, ensuring similar regressions are caught in the future. Additionally, the `cp-async-mbarrier` test is moved to the `mbarriers.mlir` test file to keep all related tests together. Signed-off-by: Durgadoss R <[email protected]>
llvm#166536) …e size is unknown Keep _negative suffix only for test cases when the size is negative
… checks (llvm#148810) This PR adds support for the NOTIFY specifier in the image selector as described in the 2023 standard, and adds checks for the NOTIFY_TYPE type.
…ional (llvm#166032) This picks up from llvm#166028, making the `Function` argument optional: most cases don't need to provide it, but in e.g. InstCombine's case, where the instruction (select, branch) is not attached to a function yet, the function needs to be passed explicitly. Co-authored-by: Florian Hahn <[email protected]>
…lvm#166078) In the following example, `Functor::method()` inappropriately triggers a diagnostic that `outer()` is blocking by allocating memory.
```
void outer() [[clang::nonblocking]]
{
  struct Functor {
    int* ptr;
    void method() { ptr = new int; }
  };
}
```
---------
Co-authored-by: Doug Wyatt <[email protected]>
…m#155630) When there's a deep inheritance hierarchy of multiple C++ classes (see below), then the mangled name of a VFTable can include multiple key nodes in the target name. For example, in the following code, MSVC will generate mangled names for the VFTables that have up to three key classes in the context.
<details><summary>Code</summary>

```cpp
class Base1 {
  virtual void a() {};
};
class Base2 {
  virtual void b() {}
};
class Ind1 : public Base1 {};
class Ind2 : public Base1 {};
class A : public Ind1, public Ind2 {};
class Ind3 : public A {};
class Ind4 : public A {};
class B : public Ind3, public Ind4 {};
class Ind5 : public B {};
class Ind6 : public B {};
class C : public Ind5, public Ind6 {};

int main() { auto i = new C; }
```
</details>

This will include `??_7C@@6BInd1@@Ind4@@Ind5@@@` (and every other combination). Microsoft's undname will demangle this to "const C::\`vftable'{for \`Ind1's \`Ind4's \`Ind5'}". Previously, LLVM would demangle this to "const C::\`vftable'{for \`Ind1'}". With this PR, the output of LLVM's undname will be identical to Microsoft's version. This changes `SpecialTableSymbolNode::TargetName` to a node array which contains each key from the name. Unlike namespaces, these keys are not in reverse order - they are in the same order as in the mangled name.
Support for lowering the Reduction clause already exists, so we can extend the support for taskloop to include reduction. As support for Reduction in taskloop was only added in OpenMP 5.0, the use of the clause has been restricted to that version of OpenMP or greater.
This stack of pull requests is managed by Graphite.
Stylie777
pushed a commit
that referenced
this pull request
Nov 12, 2025
## Summary
Fix `FindProcesses` to respect Android's `hidepid=2` security model and
enable name matching for Android apps.
## Problem
1. Called `adb shell pidof` or `adb shell ps` directly, bypassing
Android's process visibility restrictions
2. Name matching failed for Android apps - searched for
`com.example.myapp` but GDB Remote Protocol reports `app_process64`
Android apps fork from Zygote, so `/proc/PID/exe` points to
`app_process64` for all apps. The actual package name is only in
`/proc/PID/cmdline`. The previous implementation applied name filters
without supplementing with cmdline, so searches failed.
## Fix
- Delegate to lldb-server via GDB Remote Protocol (respects `hidepid=2`)
- Get all visible processes, supplement zygote/app_process entries with
cmdline, then apply name matching
- Only fetch cmdline for zygote apps (performance), parallelize with
`xargs -P 8`
- Remove redundant code (GDB Remote Protocol already provides GID/arch)
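The supplement-then-match step described in the Fix list can be sketched roughly as follows. This is only an illustration, not the code from the patch: `ToyProcess`, `isZygoteFork`, and `findByName` are hypothetical names, and the cmdline is assumed to have been fetched already.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical process record; field names are illustrative only.
struct ToyProcess {
  int pid;
  std::string exeName; // e.g. what /proc/PID/exe resolves to
  std::string cmdline; // e.g. first entry of /proc/PID/cmdline, may be empty
};

// Apps forked from Zygote all report an app_process* executable.
bool isZygoteFork(const ToyProcess &p) {
  return p.exeName.rfind("app_process", 0) == 0;
}

// Supplement zygote-forked entries with their cmdline first, then apply the
// name filter, so a package name such as com.example.myapp can match.
std::vector<ToyProcess> findByName(std::vector<ToyProcess> procs,
                                   const std::string &wanted) {
  std::vector<ToyProcess> matches;
  for (ToyProcess &p : procs) {
    if (isZygoteFork(p) && !p.cmdline.empty())
      p.exeName = p.cmdline;
    if (p.exeName == wanted)
      matches.push_back(p);
  }
  return matches;
}
```

The point of the ordering is that filtering on the executable name before the cmdline is consulted can never match a package name, which is the failure described under Problem.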
## Test Results
### Before this fix:
```
(lldb) platform process list
error: no processes were found on the "remote-android" platform
(lldb) platform process list -n com.example.hellojni
1 matching process was found on "remote-android"
PID PARENT USER TRIPLE NAME
====== ====== ========== ============================== ============================
5276 359 u0_a192 com.example.hellojni
^^^^^^^^ Missing triple!
```
### After this fix:
```
(lldb) platform process list
PID PARENT USER TRIPLE NAME
====== ====== ========== ============================== ============================
1 0 root aarch64-unknown-linux-android init
2 0 root [kthreadd]
359 1 system aarch64-unknown-linux-android app_process64
5276 359 u0_a192 aarch64-unknown-linux-android com.example.hellojni
5357 5355 u0_a192 aarch64-unknown-linux-android sh
5377 5370 u0_a192 aarch64-unknown-linux-android lldb-server
^^^^^^^^ User-space processes now have triples!
(lldb) platform process list -n com.example.hellojni
1 matching process was found on "remote-android"
PID PARENT USER TRIPLE NAME
====== ====== ========== ============================== ============================
5276 359 u0_a192 aarch64-unknown-linux-android com.example.hellojni
(lldb) process attach -n com.example.hellojni
Process 5276 stopped
* thread #1, name = 'example.hellojni', stop reason = signal SIGSTOP
```
## Test Plan
With an Android device/emulator connected:
1. Start lldb-server on device:
```bash
adb push lldb-server /data/local/tmp/
adb shell chmod +x /data/local/tmp/lldb-server
adb shell /data/local/tmp/lldb-server platform --listen 127.0.0.1:9500 --server
```
2. Connect from LLDB:
```
(lldb) platform select remote-android
(lldb) platform connect connect://127.0.0.1:9500
(lldb) platform process list
```
3. Verify:
- `platform process list` returns all processes with triple information
- `platform process list -n com.example.app` finds Android apps by
package name
- `process attach -n com.example.app` successfully attaches to Android
apps
## Impact
Restores `platform process list` on Android with architecture
information and package name lookup. All name matching modes now work
correctly.
Fixes llvm#164192
Stylie777
pushed a commit
that referenced
this pull request
Nov 26, 2025
…am (llvm#167724) This got exposed by `09262656f32ab3f2e1d82e5342ba37eecac52522`. The underlying stream of `m_os` is referenced by the `TextDiagnostic` member of `TextDiagnosticPrinter`. It got turned into a `llvm::formatted_raw_ostream` in the commit above. When `~TextDiagnosticPrinter` (and thus `~TextDiagnostic`) is invoked, we now call `~formatted_raw_ostream`, which tries to access the underlying stream. But `m_os` was already deleted because it is earlier in the order of destruction in `TextDiagnosticPrinter`. Move the `m_os` member before the `TextDiagnosticPrinter` to avoid a use-after-free.

Drive-by:
* Also move the `m_output` member which the `m_os` holds a reference to. The fact it's a reference indicates the expectation is most likely that the string outlives the stream.

The ASAN macOS bot is currently failing with this:
```
08:15:39 =================================================================
08:15:39 ==61103==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600012cf40 at pc 0x00012140d304 bp 0x00016eecc850 sp 0x00016eecc848
08:15:39 READ of size 8 at 0x60600012cf40 thread T0
08:15:39     #0 0x00012140d300 in llvm::formatted_raw_ostream::releaseStream() FormattedStream.h:205
08:15:39     #1 0x00012140d3a4 in llvm::formatted_raw_ostream::~formatted_raw_ostream() FormattedStream.h:145
08:15:39     #2 0x00012604abf8 in clang::TextDiagnostic::~TextDiagnostic() TextDiagnostic.cpp:721
08:15:39     #3 0x00012605dc80 in clang::TextDiagnosticPrinter::~TextDiagnosticPrinter() TextDiagnosticPrinter.cpp:30
08:15:39     #4 0x00012605dd5c in clang::TextDiagnosticPrinter::~TextDiagnosticPrinter() TextDiagnosticPrinter.cpp:27
08:15:39     #5 0x0001231fb210 in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47
08:15:39     #6 0x0001231fb3bc in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47
08:15:39     #7 0x000129aa9d70 in clang::DiagnosticsEngine::~DiagnosticsEngine() Diagnostic.cpp:91
08:15:39     #8 0x0001230436b8 in llvm::RefCountedBase<clang::DiagnosticsEngine>::Release() const IntrusiveRefCntPtr.h:103
08:15:39     #9 0x0001231fe6c8 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93
08:15:39     #10 0x0001231fe858 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93
...
08:15:39
08:15:39 0x60600012cf40 is located 32 bytes inside of 56-byte region [0x60600012cf20,0x60600012cf58)
08:15:39 freed by thread T0 here:
08:15:39     #0 0x0001018abb88 in _ZdlPv+0x74 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bb88)
08:15:39     #1 0x0001231fb1c0 in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47
08:15:39     #2 0x0001231fb3bc in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47
08:15:39     #3 0x000129aa9d70 in clang::DiagnosticsEngine::~DiagnosticsEngine() Diagnostic.cpp:91
08:15:39     #4 0x0001230436b8 in llvm::RefCountedBase<clang::DiagnosticsEngine>::Release() const IntrusiveRefCntPtr.h:103
08:15:39     #5 0x0001231fe6c8 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93
08:15:39     #6 0x0001231fe858 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93
...
08:15:39
08:15:39 previously allocated by thread T0 here:
08:15:39     #0 0x0001018ab760 in _Znwm+0x74 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4b760)
08:15:39     #1 0x0001231f8dec in lldb_private::ClangModulesDeclVendor::Create(lldb_private::Target&) ClangModulesDeclVendor.cpp:732
08:15:39     #2 0x00012320af58 in lldb_private::ClangPersistentVariables::GetClangModulesDeclVendor() ClangPersistentVariables.cpp:124
08:15:39     #3 0x0001232111f0 in lldb_private::ClangUserExpression::PrepareForParsing(lldb_private::DiagnosticManager&, lldb_private::ExecutionContext&, bool) ClangUserExpression.cpp:536
08:15:39     #4 0x000123213790 in lldb_private::ClangUserExpression::Parse(lldb_private::DiagnosticManager&, lldb_private::ExecutionContext&, lldb_private::ExecutionPolicy, bool, bool) ClangUserExpression.cpp:647
08:15:39     #5 0x00012032b258 in lldb_private::UserExpression::Evaluate(lldb_private::ExecutionContext&, lldb_private::EvaluateExpressionOptions const&, llvm::StringRef, llvm::StringRef, std::__1::shared_ptr<lldb_private::ValueObject>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, lldb_private::ValueObject*) UserExpression.cpp:280
08:15:39     #6 0x000120724010 in lldb_private::Target::EvaluateExpression(llvm::StringRef, lldb_private::ExecutionContextScope*, std::__1::shared_ptr<lldb_private::ValueObject>&, lldb_private::EvaluateExpressionOptions const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, lldb_private::ValueObject*) Target.cpp:2905
08:15:39     #7 0x00011fc7bde0 in lldb::SBTarget::EvaluateExpression(char const*, lldb::SBExpressionOptions const&) SBTarget.cpp:2305
08:15:39 ==61103==ABORTING
...
```
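The fix relies on the C++ rule that non-static data members are destroyed in reverse order of declaration. A minimal sketch of that rule, using hypothetical toy types (`ToyStream`, `ToyPrinter`, `ToyConsumer`) rather than the actual LLDB and Clang classes:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records destruction order so the effect of member ordering is observable.
std::vector<std::string> &destructionLog() {
  static std::vector<std::string> log;
  return log;
}

struct ToyStream {
  ~ToyStream() { destructionLog().push_back("stream"); }
};

// Toy printer that refers to a stream it does not own.
struct ToyPrinter {
  explicit ToyPrinter(ToyStream &s) : stream(s) {}
  ~ToyPrinter() { destructionLog().push_back("printer"); }
  ToyStream &stream;
};

// Members are destroyed in reverse declaration order, so declaring the
// stream first means it is still alive while the printer is destroyed.
struct ToyConsumer {
  ToyStream os;
  ToyPrinter printer{os};
};
```

With the two members declared in the opposite order, the stream would be destroyed first and any access from the printer's destructor would touch a dead object, which is the shape of the failure in the ASAN report.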
Stylie777
pushed a commit
that referenced
this pull request
Nov 26, 2025
llvm#168105) …63019)" This reverts commit 92e5608.
Stylie777
pushed a commit
that referenced
this pull request
Nov 26, 2025
llvm#168619) I've been working on some scripts that evaluate the parent and child frame. It's been very annoying that the parent frame has a property but the child does not, so I've added this to the extensions. I would've preferred to return None, but the existing impl returns an invalid SBFrame, so I'm conforming to that API.
```
(lldb) script
Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D.
>>> lldb.frame
frame #0: 0x0000555555555200 fib.out`main
>>> lldb.frame.parent
frame #1: 0x00007ffff782a610 libc.so.6`__libc_start_call_main + 128
>>> lldb.frame.parent.child
frame #0: 0x0000555555555200 fib.out`main
```

[libc][math] Disable `FEnvSafeTest.cpp` if AArch64 target has no FP support (llvm#166370)
The `FEnvSafeTest.cpp` test fails on AArch64 soft nofp configurations
because LLVM libc does not provide a floating-point environment in these
configurations.
This patch adds another preprocessor guard on `__ARM_FP` to disable the
test on those.
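A minimal sketch of this kind of guard, assuming the ACLE convention that `__ARM_FP` is defined only when the target has hardware floating-point support. The macro `FENV_TESTS_ENABLED` and the function `fenv_tests_enabled` are hypothetical names for illustration; the guard in the actual patch may be written differently.

```cpp
#include <cassert>

// Hypothetical macro: 1 when floating-point environment tests make sense.
// On AArch64, __ARM_FP is left undefined for targets without FP support.
#if defined(__aarch64__) && !defined(__ARM_FP)
#define FENV_TESTS_ENABLED 0
#else
#define FENV_TESTS_ENABLED 1
#endif

// Returns true when the fenv-dependent tests would be compiled in.
constexpr bool fenv_tests_enabled() { return FENV_TESTS_ENABLED != 0; }
```

A test file could wrap its fenv-dependent cases in `#if FENV_TESTS_ENABLED` so they are compiled out entirely on no-FP targets instead of failing at run time.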
[clang][bytecode] Remove dummy variables once they are proper globals (llvm#166174)
Dummy variables have an entry in `Program::Globals`, but they are not
added to `GlobalIndices`. When registering redeclarations, we used to
only patch up the global indices, but that left the dummy variables
alone. Update the dummy variables of all redeclarations as well.
Fixes llvm#165952
[clang-tidy][doc] add more information in twine-local's document (llvm#166266)
explain more about use-after-free in llvm-twine-local
add note about manually adjusting code after applying fix-it.
fixed: llvm#154810
[CIR] Add support for storing into _Atomic variables (llvm#165872)
[mlir] Dialect Conversion: Add support for post-order legalization order (llvm#166292)
By default, the dialect conversion driver processes operations in
pre-order: the initial worklist is populated pre-order. (New/modified
operations are immediately legalized recursively.)
This commit adds a new API for selective post-order legalization.
Patterns can request an operation / region legalization via
`ConversionPatternRewriter::legalize`. They can call these helper
functions on nested regions before rewriting the operation itself.
Note: In rollback mode, a failed recursive legalization typically leads
to a conversion failure. Since recursive legalization is performed by
separate pattern applications, there is no way for the original pattern
to recover from such a failure.
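The difference between the two visiting orders can be illustrated on a toy tree. This sketch does not use the MLIR API; `ToyOp`, `visitPreOrder`, and `visitPostOrder` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an operation with nested operations (not the MLIR API).
struct ToyOp {
  std::string name;
  std::vector<ToyOp> nested;
};

// Pre-order: visit the operation first, then its nested operations.
void visitPreOrder(const ToyOp &op, std::vector<std::string> &out) {
  out.push_back(op.name);
  for (const ToyOp &child : op.nested)
    visitPreOrder(child, out);
}

// Post-order: visit nested operations first, then the operation itself.
void visitPostOrder(const ToyOp &op, std::vector<std::string> &out) {
  for (const ToyOp &child : op.nested)
    visitPostOrder(child, out);
  out.push_back(op.name);
}
```

In post-order, an enclosing operation is only visited once everything nested inside it has been handled, which mirrors a pattern legalizing its nested regions before rewriting the operation itself.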
[mlir][LLVM] Fix unsupported FP lowering in `VectorConvertToLLVMPattern` (llvm#166513)
Fixes a bug in `VectorConvertToLLVMPattern`, which converted operations
with unsupported FP types. E.g., `arith.addf ... : f4E2M1FN` was lowered
to `llvm.fadd ... : i4`, which does not verify. There are a few more
patterns that have the same bug. Those will be fixed in follow-up PRs.
This commit is in preparation of adding an `APFloat`-based lowering for
`arith` operations with unsupported floating-point types.
[utils][UpdateLLCTestChecks] Add MIR support to update_llc_test_checks.py. (llvm#164965)
This change enables update_llc_test_checks.py to automatically generate
MIR checks for RUN lines that use `-stop-before` or `-stop-after` flags,
allowing tests to verify intermediate compilation stages (e.g., after
instruction selection but before peephole optimizations) alongside the
final assembly output. If the `-debug-only` flag is present in the RUN
line, it is considered the main point of interest for testing and the stop
flags above are ignored (that is, no MIR checks are generated).
This resulted from a scenario where I needed to test two instruction
matching patterns where the later pattern in the peepholer reverts the
earlier pattern in the instruction selector, and to distinguish it from
the case where the earlier pattern didn't work at all.
Initially created by Claude Sonnet 4.5, it was improved later to handle
conflicts in MIR <-> ASM prefixes and formatting.
[llvm][LoongArch] Introduce LASX and LSX conversion intrinsics (llvm#157818)
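This patch introduces the LASX and LSX conversion intrinsics:
- <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float>)
- <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double>)
- <4 x i64> @llvm.loongarch.lasx.cast.128(<2 x i64>)
- <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.concat.128(<2 x i64>, <2 x i64>)
- <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float>)
- <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double>)
- <2 x i64> @llvm.loongarch.lasx.extract.128.lo(<4 x i64>)
- <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float>)
- <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double>)
- <2 x i64> @llvm.loongarch.lasx.extract.128.hi(<4 x i64>)
- <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.insert.128.lo(<4 x i64>, <2 x i64>)
- <8 x float> @llvm.loongarch.lasx.insert.128.hi.s(<8 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.insert.128.hi.d(<4 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.insert.128.hi(<4 x i64>, <2 x i64>)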
[gn] port 0c73009
[libc][math] Refactor exp2m1f16 implementation to header-only in src/__support/math folder. (llvm#162019)
Part of llvm#147386
in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
[clang-tidy][NFC] Fix broken link in `bugprone-default-operator-new-on-overaligned-type` (llvm#166546)
[Clang][NFC] Refactor SemaCXX/dllexport.cpp to use -verify= instead of macros (llvm#165855)
[RISCV] Add a test for multiple save locations of a callee-saved register (llvm#164479)
Technically, it is possible that a callee-saved register is saved in
different locations. CFIInstrInserter should handle this, but currently
it does not.
[VPlan] Avoid sinking allocas in sinkScalarOperands (llvm#166135)
Use cannotHoistOrSinkRecipe to forbid sinking allocas.
[Polly] Check for ISL errors after schedule optimization (llvm#166551)
When ISL encounters an internal error, it sets the error flag, but the
error is not the isl_error_quota that was already checked. Check for
general errors and abort the schedule optimization if that happens,
instead of continuing on the good path.
The error occurred when compiling llvm-test-suite's
MultiSource/Applications/JM/lencod/leaky_bucket.c with Polly enabled.
Not adding a test case because it depends on ISL internals. We do not
want a test case to depend on which version of ISL is used.
Revert "[utils][UpdateLLCTestChecks] Add MIR support to update_llc_test_checks.py." (llvm#166549)
Reverts llvm#164965
[X86] Add test coverage for llvm#166534 (llvm#166552)
[Clang][ARM] Fix tests using thumb instead arm arch on cc1 (llvm#166416)
Currently, the ARM backend incorrectly parses every `arm`-prefixed arch
as non-Thumb, but `armv6m` is Thumb-only and doesn't have ARM ops, causing
the test to fail when compiling to assembly rather than LLVM IR:
`error: Function 'foo' uses ARM instructions, but the target does not support ARM mode execution.`
This only happens when invoking cc1 directly, not the Clang driver.
As a quick triage, this patch changes the tests to use `thumb`.
Uncovered by llvm#151404
[RISCV] Introduce pass to promote double constants to a global array (llvm#160536)
As discussed in llvm#153402, we have inefficiencies in handling constant pool
access that are difficult to address. Using an IR pass to promote double
constants to a global allows a higher degree of control of code
generation for these accesses, resulting in improved performance on
benchmarks that might otherwise have high register pressure due to
accessing constant pool values separately rather than via a common base.
Directly promoting double constants to separate global values and
relying on the global merger to do a sensible thing would be one
potential avenue to explore, but it is not done in this version of the
patch because:
* The global merger pass needs fixes. For instance it claims to be a
function pass, yet all of the work is done in initialisation. This means
that attempts by backends to schedule it after a given module pass don't
actually work as expected.
* The heuristics used can impact codegen unexpectedly, so I worry that
tweaking it to get the behaviour desired for promoted constants may lead
to other issues. This may be completely tractable though.
Now that llvm#159352 has landed, the impact in terms of dynamically executed
instructions is slightly smaller (as we are starting from a better
baseline), but still worthwhile in lbm and nab from SPEC. Results below
are for rva22u64:
Notes for reviewers:
* As discussed at the sync-up meeting, the suggestion is to try to land
an incremental improvement to the status quo even if there is more work
to be done around the general issue of constant pool handling. We can
discuss here if that is actually the best next step or not, but I just
wanted to clarify that's why this is being posted with a somewhat narrow
scope.
* I've disabled transformations both for RV32 and on systems without D
as both cases saw some regressions.
[clang][NFC] Rename stale TypeSourceInfo DI variables (llvm#166082)
Fixes llvm#165346
This patch renames stale variable names where `TypeSourceInfo` objects
were still using the old `DI` (`DeclaratorInfo`) naming convention.
Specifically, variables of type `TypeSourceInfo` have been updated from
`DI` to `TSI` to improve code clarity and maintain consistency with the
current naming.
[DebugInfo] Assign best possible debugloc to bundle (llvm#164573)
The debug info attached to the BUNDLE is that of the first instruction in
the BUNDLE, even if better debug info (line:column) is present in later
instructions of the bundle. The patch tries to get the better debug info
first. If none is found, worse debug info without a line number is
chosen.
Co-authored-by: Vladislav Dzhidzhoev [email protected]
Co-authored-by: Orlando Cazalet-Hyams [email protected]
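The selection rule in the bundle debugloc commit above can be sketched as follows. This is an illustration under stated assumptions, not the patch's implementation: `ToyDebugLoc` and `pickBundleDebugLoc` are hypothetical, a line of 0 is assumed to mean "no line number", and picking the first location that has a line is an assumption about what "better" means.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified debug location: line 0 means "no line number".
struct ToyDebugLoc {
  unsigned line = 0;
  unsigned column = 0;
  bool hasLine() const { return line != 0; }
};

// Picks the first location with a line number; otherwise falls back to the
// first location of any kind; returns nullopt when no instruction has one.
std::optional<ToyDebugLoc>
pickBundleDebugLoc(const std::vector<std::optional<ToyDebugLoc>> &instrLocs) {
  std::optional<ToyDebugLoc> fallback;
  for (const auto &loc : instrLocs) {
    if (!loc)
      continue;
    if (loc->hasLine())
      return loc;
    if (!fallback)
      fallback = loc;
  }
  return fallback;
}
```

The fallback keeps the previous behaviour for bundles in which no instruction carries a line number.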
[clang][bytecode] Check types when loading values (llvm#165385)
We need to allow BitCasts between pointer types to different prim types,
but that means we need to catch the problem at a later stage, i.e. when
loading the values.
Fixes llvm#158527
Fixes llvm#163778
DAG: Avoid some libcall string name comparisons (llvm#166321)
Move to the libcall impl based functions.
[PowerPC][NFC] auto gen checks vec rounding tests (llvm#166435)
Update tests to contain auto generated checks.
[MLIR][NVVM] Fix the lowering of mbarrier.test.wait (llvm#166555)
PR llvm#165993 accidentally broke the lowering of the `test.wait` Op.
This patch fixes the issue and adds tests to verify the lowering to
intrinsics for all mbarrier Ops, ensuring similar regressions are caught in the
future.
Additionally, the `cp-async-mbarrier` test is moved to the
`mbarriers.mlir` test file to keep all related tests together.
Signed-off-by: Durgadoss R [email protected]
[BOLT][NFC] Rename funtions with _negative suffix to _unknown when th… (llvm#166536)
…e size is unknown
Keep _negative suffix only for test cases when the size is negative
[flang] Adding NOTIFY specifier in image selector and add notify type checks (llvm#148810)
This PR adds support for the NOTIFY specifier in the image selector as
described in the 2023 standard, and adds checks for the NOTIFY_TYPE type.
[ProfCheck][NFC] Make Function argument from branch weight setter optional (llvm#166032)
This picks up from llvm#166028, making the `Function` argument optional:
most cases don't need to provide it, but in e.g. InstCombine's case,
where the instruction (select, branch) is not attached to a function
yet, the function needs to be passed explicitly.
Co-authored-by: Florian Hahn [email protected]
[gn] port bb4ed55
[Clang] FunctionEffects: ignore (methods of) local CXXRecordDecls. (llvm#166078)
In the following example, `Functor::method()` inappropriately triggers a
diagnostic that `outer()` is blocking by allocating memory.
Co-authored-by: Doug Wyatt [email protected]
[NFC][TableGen] Adopt NamespaceEmitter in DirectiveEmitter (llvm#165600)
[MsDemangle] Read entire chain of target names in special tables (llvm#155630)
When there's a deep inheritance hierarchy of multiple C++ classes (see
below), then the mangled name of a VFTable can include multiple key
nodes in the target name.
For example, in the following code, MSVC will generate mangled names for
the VFTables that have up to three key classes in the context.
Code
This will include `??_7C@@6BInd1@@Ind4@@Ind5@@@` (and every other
combination). Microsoft's undname will demangle this to "const
C::\`vftable'{for \`Ind1's \`Ind4's \`Ind5'}". Previously, LLVM would
demangle this to "const C::\`vftable'{for \`Ind1'}".
With this PR, the output of LLVM's undname will be identical to
Microsoft's version. This changes `SpecialTableSymbolNode::TargetName`
to a node array which contains each key from the name. Unlike
namespaces, these keys are not in reverse order - they are in the same
order as in the mangled name.
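A simplified sketch of reading such a chain of keys in mangled order. It is not the demangler's code: `readTargetKeys` and `formatTargets` are hypothetical helpers, the input is assumed to be only the part after the storage-class code (`Ind1@@Ind4@@Ind5@@@` in the example), and each key is assumed to be a plain class name terminated by `@@`.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Reads keys of the form "Name@@" until the closing '@'. Nested names and
// back-references, which real mangled names can contain, are not handled.
std::vector<std::string> readTargetKeys(std::string_view mangled) {
  std::vector<std::string> keys;
  while (!mangled.empty() && mangled.front() != '@') {
    std::size_t end = mangled.find("@@");
    if (end == std::string_view::npos)
      break; // Malformed input: stop rather than guess.
    keys.emplace_back(mangled.substr(0, end));
    mangled.remove_prefix(end + 2);
  }
  return keys;
}

// Formats the keys in the same order as they appear in the mangled name.
std::string formatTargets(const std::vector<std::string> &keys) {
  std::string out = "{for ";
  for (std::size_t i = 0; i < keys.size(); ++i) {
    if (i != 0)
      out += "s ";
    out += "`" + keys[i] + "'";
  }
  out += "}";
  return out;
}
```

Keeping the keys in a vector in source order is what allows all of them to be printed, rather than only the first.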
Fix failures introduced in llvm#166032 (llvm#166574)
[gn build] Port 3700587
[gn build] Port 3ebed51
[gn build] Port 51d0f6d
[gn build] Port 718a3b2
[gn build] Port dd14eb8
[Flang][OpenMP] Add Lowering support for taskloop reductions
Support for lowering the Reduction clause already exists,
so we can extend the support for taskloop to include reduction.
As support for Reduction in taskloop was only added in OpenMP 5.0,
the use of the clause has been restricted to that version of OpenMP
or greater.