Feature/merge upstream 20210517 #57

kaz7 · 2021-07-15T02:20:30Z

Merge upstream up to 2021/5/17.
The latest llvm-dev script is needed to compile omptarget for VE.

Passes regression tests.

…nsts.ll. Remove (unneeded) '-asan-use-after-return' from hoist-argument-init-insts.ll. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D102448

`-fno-exceptions -fno-asynchronous-unwind-tables` compiled programs don't produce .eh_frame on Linux and other ELF platforms, so the slow unwinder cannot print stack traces. Just fall back to the fast unwinder: this allows -fno-asynchronous-unwind-tables without requiring the sanitizer option `fast_unwind_on_fatal=1` Reviewed By: #sanitizers, vitalybuka Differential Revision: https://reviews.llvm.org/D102046

This is not expected to have any practical compile-time effect, as the alias() calls inside callCapturesBefore() are rare. This should still be supported for API completeness, and might be useful for reachability caching.

Support for Darwin's libsystem_m's vector functions has been added to LLVM in 93a9a8a. This patch adds support for -fveclib=Darwin_libsystem_m to Clang. Reviewed By: arphaman Differential Revision: https://reviews.llvm.org/D102489

EOM. Reviewed By: cryptoad Differential Revision: https://reviews.llvm.org/D102529

It's easy to hit 2**16 limit with i686 GNU toolchains these days. Clang does it automagically, so it's not needed there, and the option causes warnings about being unused when linking. Differential Revision: https://reviews.llvm.org/D102419

AFAIK this is the default behaviour when this flag is not passed. Differential Revision: https://reviews.llvm.org/D102516

This patch adds the abstract class SystemZCallingConventionRegisters which is a SystemZ-specific class detailing special registers used by calling conventions on the target. SystemZELFRegisters and SystemZXPLINK64Registers implement this class for ELF and XPLINK64 respectively. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D102370

Previously, we already used BatchAA for individual simple pointer dependency queries. This extends BatchAA usage for the non-local case, so that only one BatchAA instance is used for all blocks, instead of one instance per block. Use of BatchAA is safe as IR cannot be modified during a MemDep query.

Multiple return types are not currently supported, so as a workaround we use error_code to represent two cases: 1) a dangling probe, and 2) a non-probe instruction whose weight should be ignored. When the instructions' weights are merged for the whole BB, the error codes are filtered out. But if all instructions of the BB give error_code, the outside logic will mark it as a BB requiring the inference algorithm to infer its weight. This is different from the zero value, which will be treated as a cold block. Fix one place: if we can't find the FunctionSamples in the profile data, which indicates the BB is cold, we now return zero. Also refine the comments. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D102007
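The difference between "no weight available" and "weight is zero" can be sketched as follows. This is only an illustration under stated assumptions: `Weight`, `blockWeight` and `weightWithoutSamples` are hypothetical names, the `known` flag stands in for the error_code, and taking the maximum as the merge rule is an assumption rather than something the commit message states.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an ErrorOr-style result:
// known == false plays the role of the error_code.
struct Weight {
  bool known;
  std::uint64_t value;
};

// Merge per-instruction weights for one basic block, skipping unknown
// entries. The merge rule (maximum) is an assumption for illustration.
Weight blockWeight(const std::vector<Weight> &instWeights) {
  Weight result{false, 0};
  for (const Weight &w : instWeights) {
    if (!w.known)
      continue; // dangling probe or non-probe instruction: filtered out
    result.known = true;
    result.value = std::max(result.value, w.value);
  }
  // If every instruction was unknown, result.known stays false, meaning
  // "let the inference algorithm decide", which is not the same as cold.
  return result;
}

// No samples for the function at all: report a known zero (cold block).
Weight weightWithoutSamples() { return Weight{true, 0}; }
```

A block whose instructions are all unknown comes back with `known == false`, while a function with no samples yields a known weight of zero.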

Thanks to Leonard Chan for reporting.

…ingMDNodeRef copies. NFCI." This reverts commit 5ed56a8. Reason: Broke the MSan buildbots. See Phabricator for more info (https://reviews.llvm.org/rG5ed56a821c0622869739a3ae752eea97a1ee1f48).

…zation (3/n) Differential revision: https://reviews.llvm.org/D102417

The FixSGPRCopies pass converts instructions to VALU when removing illegal VGPR to SGPR copies. Instructions that use SCC are changed to use VCC instead. When that happens, the pass must also change instructions that define SCC to define VCC. The pass was not changing the SCC definition when an ADDC is converted due to an input that is a VGPR to SGPR copy: the initial ADD instruction, which defines SCC, is not converted. This causes a compilation failure due to a use of an undefined physical register. This patch adds code that inserts the SCC definition in the MoveToVALU worklist when an SCC use is converted to a VCC use. Differential Revision: https://reviews.llvm.org/D102111

…ng comprehensive bufferization (4/n) Differential revision: https://reviews.llvm.org/D102420

Differential Revision: https://reviews.llvm.org/D101396

GlobalVariables are Constants, yet should not unconditionally be considered true for __builtin_constant_p. Via the LangRef https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic: This intrinsic generates no code. If its argument is known to be a manifest compile-time constant value, then the intrinsic will be converted to a constant true value. Otherwise, it will be converted to a constant false value. In particular, note that if the argument is a constant expression which refers to a global (the address of which _is_ a constant, but not manifest during the compile), then the intrinsic evaluates to false. Move isManifestConstant from ConstantFolding to be a method of Constant so that we can reuse the same logic in LowerConstantIntrinsics. pr/41459 Reviewed By: rsmith, george.burgess.iv Differential Revision: https://reviews.llvm.org/D102367
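The "manifest constant" rule quoted above can be illustrated with a toy model. This is a sketch only: `ToyConstant`, `make` and `isManifest` are hypothetical names and do not reflect LLVM's actual class hierarchy or the body of `isManifestConstant`. The assumption is that plain data is manifest, a global is not, and an expression is manifest only if all of its operands are.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy constant model (hypothetical; not LLVM's Constant classes).
struct ToyConstant {
  enum Kind { Data, Global, Expr } kind;
  std::vector<std::shared_ptr<ToyConstant>> operands; // used by Expr only
};

// Convenience constructor for the toy model.
std::shared_ptr<ToyConstant>
make(ToyConstant::Kind kind,
     std::vector<std::shared_ptr<ToyConstant>> operands = {}) {
  return std::make_shared<ToyConstant>(ToyConstant{kind, std::move(operands)});
}

// A value is "manifest" when it is known at compile time. The address of a
// global is constant, but only known after linking, so it does not count.
bool isManifest(const ToyConstant &c) {
  switch (c.kind) {
  case ToyConstant::Data:
    return true;
  case ToyConstant::Global:
    return false;
  case ToyConstant::Expr:
    for (const auto &op : c.operands) {
      assert(op && "operands must be non-null");
      if (!isManifest(*op))
        return false;
    }
    return true;
  }
  return false;
}
```

In this model an expression built from two literals is manifest, while an expression that refers to a global is not, which mirrors the LangRef wording for llvm.is.constant.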

I missed this when merging gtest 1.10.0, breaking all asan tests :|

This method was removed in https://reviews.llvm.org/D102265 but the declaration was missed.

…lars For column-major this is: A * B^t whereas for row-major: A^t * B Differential Revision: https://reviews.llvm.org/D101762

…tils.cpp Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102533

…obal For opaque pointers, to avoid PointerType::getElementType(). Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102638

Only supported with -polly-position=early. Unfortunately, the extension point callback for VectorizerStart only passes a FunctionPassManager, making it impossible to add a module pass.

When inlining a call site whose callee contains a deoptimize intrinsic, we lose the attributes set on that call site. As a result, attributes like deopt-lowering disappear, resulting in inefficient behavior of the register allocator in codegen. Just copy the attributes for the deoptimize call like we do for other calls. Reviewers: reames, apilipenko Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D102602

For the same reason as with -polly-dump-before, it is only supported with -polly-position=early.

This reverts commit a6d3987.

…f the scalar loop must execute (try 3)" This reverts commit 6d3e3ae. Still seeing PPC build bot failures, and one arm self host bot failing. I'm officially stumped, and need help from a bot owner to reduce.

The main motivation for this refactor is to remove the subclass relationship between the InputSegment and MergeInputSegment and SyntenticMergedInputSegment so that we can use the merging classes for debug sections which are not data segments. In the process of refactoring I also remove all the virtual functions from the class hierarchy and try to reuse techniques used in the ELF linker (see `lld/ELF/InputSections.h`). Differential Revision: https://reviews.llvm.org/D102546

Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D102596

This change tries to handle multiple dominating users of the pointer operand by choosing the most immediately dominating one, if possible. While making this change I also found that the previous implementation had a missing break statement, making all loads with an odd number of dominating users emit an OtherAccess value, so that has also been fixed. Patch by Henrik G Olsson! Differential Revision: https://reviews.llvm.org/D79097

This diff changes the type of the argument of isCodeSection to const InputSection *. NFC. Test plan: make check-lld-macho Differential revision: https://reviews.llvm.org/D102664

…-spill-base.

This initial patch removes some unused variables from the global namespace. There will be more incoming patches for moving global variables to classes or static members. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D102598

This patch is the Part-1 (FE Clang) implementation of HW Exception handling. This new feature adds support for Hardware Exceptions for Microsoft Windows SEH (Structured Exception Handling). This is the first step of this project; only the X86_64 target is enabled in this patch.

Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE: Without -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code: For C code, one way (the MSVC approach) to achieve SEH -EHa semantics is to follow three rules:
* First, no exception can move in or out of a _try region, i.e., no potentially faulty instruction can be moved across a _try boundary.
* Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees).
* Finally, global states (local/global/heap variables) that can be read outside of the _try region must be updated in memory (not just in a register) before the subsequent exception occurs.

The impact on C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on the C++ side. When a C++ function (in the same compilation unit with option -EHa) is called by a SEH C function, a hardware exception that occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as for a C++ synchronous exception during the unwinding process.

Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge to be added on memory/computation instructions (the previous iload/istore idea) so that the exception path is modeled precisely in the flow graph. However, tracking every single memory instruction and potentially faulty instruction can create many Invokes, complicate the flow graph, and possibly have a negative performance impact on downstream optimization and code generation. Making every optimization aware of the new semantics is also a substantial effort. This design does not intend to model the exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK level to reduce the complexity of the flow graph and minimize the performance impact on CPP code under the -EHa option.

One key element of this design is the ability to compute the State number at block level. Our algorithm is based on the following rationale: A _try scope is always a SEME (Single Entry Multiple Exits) region, as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control flow, the state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to the parent state based on the existing SEHUnwindMap[]. Note that side exits can ONLY jump into parent scopes (lower state number). Thus, when a block inherits different states from its predecessors, the lowest State wins. If some exits flow to unreachable, propagation on those paths terminates, not affecting the remaining blocks.

For CPP code, an object lifetime region is usually a SEME, like a SEH _try. However, there is one rare exception: jumping into a lifetime that has a Dtor but no Ctor is warned about, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (Multiple Entry Multiple Exits). Our solution is to inject an eha_scope_begin() invoke in the side entry block to ensure a correct State.

Implementation:

Part-1: Clang implementation described below. Two intrinsics are created to track CPP object scopes: eha_scope_begin() and eha_scope_end(). _scope_begin() is added immediately after ctor() is called and EHStack is pushed, so it must be an invoke, not a call. With that, it is also guaranteed that an EH-cleanup-pad is created regardless of whether there is a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in the downstream code gen pass, even in the presence of ctor/dtor inlining. Two intrinsics, seh_try_begin() and seh_try_end(), are added for C code to mark the _try boundary and to prevent exceptions from being moved across the _try boundary. All memory instructions inside a _try are considered 'volatile' to ensure the 2nd and 3rd rules for C code above. This is slightly suboptimal, but it is acceptable as the amount of code directly under _try is very small.

Part-2 (will be in the Part-2 patch): LLVM implementation described below. For both C++ and C code, the state of each block is computed at the same place in the BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin and _scope_end, the computation of block state also relies on the existing State tracking code (UnwindMap and InvokeStateMap). For both C++ and C code, the state of each block with a potential trap instruction is marked and reported in the DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate the LLVM Windows EH implementation, in which the address in the IPToState table is offset by +1. (Note that the purpose of that offset is to ensure the return address of a call is in the same scope as the call address.) The handler for catch(...) for -EHa must handle HW exceptions, so its 'adjective' flag is reset (it cannot be IsStdDotDot (0x40), which only catches C++ exceptions). Suppress the push/popTerminate() scope (from noexcept/nothrow) so that HW exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Differential Revision: https://reviews.llvm.org/D80344/new/
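The block-level state propagation described in the design section can be sketched as a small worklist algorithm. This is a toy model under stated assumptions, not the WinEHPreparing implementation: `Block`, `endsTry`, `parentOf` and `propagateStates` are hypothetical names, a block flagged `endsTry` stands for one containing seh_try_end(), and `parentOf` plays the role of the unwind map (state -1 meaning "outside every _try").

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Toy CFG node (hypothetical; not an LLVM type).
struct Block {
  std::vector<std::size_t> successors; // indices of successor blocks
  bool endsTry = false;                // stands for "contains seh_try_end()"
};

const int kNoState = -2; // block not reached yet

// Propagate EH state numbers from the single entry through the CFG.
// When several predecessors disagree, the lowest state wins.
std::vector<int> propagateStates(const std::vector<Block> &cfg,
                                 std::size_t entry, int entryState,
                                 const std::vector<int> &parentOf) {
  std::vector<int> state(cfg.size(), kNoState);
  std::deque<std::size_t> worklist;
  state[entry] = entryState;
  worklist.push_back(entry);
  while (!worklist.empty()) {
    std::size_t b = worklist.front();
    worklist.pop_front();
    int out = state[b];
    // A side exit unwinds to the parent scope before leaving the block.
    if (cfg[b].endsTry && out >= 0)
      out = parentOf[static_cast<std::size_t>(out)];
    for (std::size_t s : cfg[b].successors) {
      // States only ever decrease, so the loop terminates.
      if (state[s] == kNoState || out < state[s]) {
        state[s] = out;
        worklist.push_back(s);
      }
    }
  }
  return state;
}
```

For a diamond-shaped graph where one arm is a side exit, the join block ends up with the parent state, because the lower of the two incoming states is kept. Blocks that are never reached keep `kNoState`.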

This patch moves g_executables into the Runtime class as a private member and renames it to HSAExecutables, following the LLVM naming convention. This move required making Runtime::Initialize and Runtime::Finalize non-static. Verified the correctness of this change by running libomptarget tests on gfx906. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D102600

…ents for ArgListEntry"" This reverts commit 1f06f50.

This reverts commit 6351eb0.

…wering"" This reverts commit 757a9ce.

This reverts commit 950347a.

This reverts commit 113a664.

This reverts commit af9c013. First, f2f88f3 breaks omptarget for VE, so I initially reverted that patch in order to compile omptarget for VE. However, reverting it only on our development branch causes a lot of merge conflicts, so I'm reverting the revert of f2f88f3. That said, f2f88f3 does not take cross-compile environments into account at all, so its mechanism cannot be used for VE. This time, I will hack our llvm-dev script to replace the generated cmake/ninja files so that they work for VE.

…-20210517 Reverting several cherry-picks in order to merge them as part of upstream. I tried to merge while leaving those cherry-picked patches in place, but that causes several errors. So, I'm going to revert them by hand and merge them again as part of upstream.

…merge-upstream-20210517

…callback The `TypeSystemMap::m_mutex` guards against concurrent modifications of members of `TypeSystemMap`, in particular `m_map`. `TypeSystemMap::ForEach` iterates through the entire `m_map`, calling a user-specified callback for each entry. This is all done while `m_mutex` is locked. However, nothing guarantees that the callback itself won't call back into `TypeSystemMap` APIs on the same thread. This leads to double-locking `m_mutex`, which is undefined behaviour. We've seen this cause a deadlock in the swift plugin with the following reproducer and backtrace:

```
int main() {
    std::unique_ptr<int> up = std::make_unique<int>(5);
    volatile int val = *up;
    return val;
}

clang++ -std=c++2a -g -O1 main.cpp
./bin/lldb -o "br se -p return" -o run -o "v *up" -o "expr *up" -b
```

```
frame #4: std::lock_guard<std::mutex>::lock_guard
frame #5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2
frame #6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage
frame #7: lldb_private::Target::GetScratchTypeSystemForLanguage
...
frame #26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths
frame #27: lldb_private::SwiftASTContext::LoadModule
frame #30: swift::ModuleDecl::collectLinkLibraries
frame #31: lldb_private::SwiftASTContext::LoadModule
frame #34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl
frame #35: lldb_private::SwiftASTContext::PerformCompileUnitImports
frame #36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext
frame #37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState
frame #38: lldb_private::Target::GetPersistentSymbol
frame #41: lldb_private::TypeSystemMap::ForEach <<<< Lock #1
frame #42: lldb_private::Target::GetPersistentSymbol
frame #43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols
frame #44: lldb_private::IRExecutionUnit::FindSymbol
frame #45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence
frame #46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #48: llvm::LinkingSymbolResolver::findSymbol
frame #49: llvm::LegacyJITSymbolResolver::lookup
frame #50: llvm::RuntimeDyldImpl::resolveExternalSymbols
frame #51: llvm::RuntimeDyldImpl::resolveRelocations
frame #52: llvm::MCJIT::finalizeLoadedModules
frame #53: llvm::MCJIT::finalizeObject
frame #54: lldb_private::IRExecutionUnit::ReportAllocations
frame #55: lldb_private::IRExecutionUnit::GetRunnableInfo
frame #56: lldb_private::ClangExpressionParser::PrepareForExecution
frame #57: lldb_private::ClangUserExpression::TryParse
frame #58: lldb_private::ClangUserExpression::Parse
```

Our solution is to simply iterate over a local copy of `m_map`.

**Testing**
* Confirmed on manual reproducer (would reproduce 100% of the time before the patch)

Differential Revision: https://reviews.llvm.org/D149949
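The "iterate over a local copy" approach can be sketched as below. This is a minimal illustration, not the lldb code: `GuardedMap` and its members are hypothetical stand-ins for `TypeSystemMap`, and the key and value types are chosen only to keep the example short.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a mutex-guarded map; not lldb's TypeSystemMap.
class GuardedMap {
public:
  void Insert(int key, std::string value) {
    std::lock_guard<std::mutex> guard(m_mutex);
    m_map[key] = std::move(value);
  }

  std::size_t Size() const {
    std::lock_guard<std::mutex> guard(m_mutex);
    return m_map.size();
  }

  // Copy the map while holding the lock, then release the lock before
  // running any callback. A callback that re-enters Insert() or Size()
  // on the same thread therefore never locks m_mutex twice.
  void ForEach(
      const std::function<bool(int, const std::string &)> &callback) const {
    std::map<int, std::string> snapshot;
    {
      std::lock_guard<std::mutex> guard(m_mutex);
      snapshot = m_map;
    }
    for (const auto &entry : snapshot) {
      if (!callback(entry.first, entry.second))
        break; // callback asked to stop iterating
    }
  }

private:
  mutable std::mutex m_mutex;
  std::map<int, std::string> m_map;
};
```

The trade-off is that callbacks see a snapshot: entries added or removed during iteration are not reflected until the next `ForEach` call, and each call pays for one copy of the map.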

TNorthover and others added 30 commits May 14, 2021 19:21

SwiftAsync: remove duplicate instance in array. NFC.

709f2c7

Add another -Wdeprecated-copy hack for gtest

09499ef

[flang] s/TYPED_TEST_CASE/TYPED_TEST_SUITE/ as the former is deprecated

17ef101

Remove (unneeded) '-asan-use-after-return' from hoist-argument-init-i…

1b9972d

…nsts.ll. Remove (unneeded) '-asan-use-after-return' from hoist-argument-init-insts.ll. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D102448

[ProfData] Address a unit test FIXME

9c88fb4

GTEST_HAS_TR1_TUPLE is gone, stop defining it.

4901199

[sanitizer] Commit a missing change in BufferedStackTrace::Unwind

deb2b20

[SLP][NFC]Add a test for non-consecutive inserts, NFC.

20e2b4f

[AA] Support callCapturesBefore() on BatchAA (NFCI)

5e289cc

This is not expected to have any practical compile-time effect, as the alias() calls inside callCapturesBefore() are rare. This should still be supported for API completeness, and might be useful for reachability caching.

[LV] Add another more complex first-order recurrence sinking test.

68d52f0

[Scudo] Delete unused flag 'rss_limit_mb'.

6c913b2

EOM. Reviewed By: cryptoad Differential Revision: https://reviews.llvm.org/D102529

[LLD][MinGW] Ignore --no-undefined flag

f84a4cb

AFAIK this is the default behaviour when this flag is not passed. Differential Revision: https://reviews.llvm.org/D102516

[Polly] Run polly-update-format. NFC.

fb01b14

Thanks to Leonard Chan for reporting.

[NFC] Directly get GV type

e8448a5

Revert "[X86] Try to pass DebugLoc by const-ref to avoid costly Track…

7aa89c4

…ingMDNodeRef copies. NFCI." This reverts commit 5ed56a8. Reason: Broke the MSan buildbots. See Phabricator for more info (https://reviews.llvm.org/rG5ed56a821c0622869739a3ae752eea97a1ee1f48).

[mlir][Linalg] Add support for subtensor_insert comprehensive bufferi…

6f90955

…zation (3/n) Differential revision: https://reviews.llvm.org/D102417

[msan] [NFC] Add newline to EOF in test.

597ecf9

[mlir][Linalg] NFC - More gracefully degrade lookup into failure duri…

dd65f42

…ng comprehensive bufferization (4/n) Differential revision: https://reviews.llvm.org/D102420

[compiler-rt] Fix deprecation warnings on INSTANTIATE_TEST_CASE_P

cb84665

[libcxx][ranges] Add contiguous_iterator.

bede752

Differential Revision: https://reviews.llvm.org/D101396

Reinstate gtest fix from 4f0b0bf

a558ebb

I missed this when merging gtest 1.10.0, breaking all asan tests :|

[lld][WebAssembly] Remove unused method declaration. NFC

119f61a

This method was removed in https://reviews.llvm.org/D102265 but the declaration was missed.

anemet and others added 25 commits May 17, 2021 17:40

[Matrix] Fold the transpose into the matmul operand used to fetch sca…

fcffd08

…lars For column-major this is: A * B^t whereas for row-major: A^t * B Differential Revision: https://reviews.llvm.org/D101762

[NFC][OpaquePtr] Avoid using PointerType::getElementType() in VectorU…

cc64ece

…tils.cpp Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102533

[NFC] Pass GV value type instead of pointer type to GetOrCreateLLVMGl…

9f7d552

…obal For opaque pointers, to avoid PointerType::getElementType(). Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102638

[Polly] Add support for -polly-dump-before(-file) with the NPM.

29bef8e

Only supported with -polly-position=early. Unfortunately, the extension point callback for VectorizerStart only passes a FunctionPassManager, making it impossible to add a module pass.

[Polly] Add support for -polly-dump-after(-file) with the NPM.

ad568f4

For the same reason as with -polly-dump-before, it is only supported with -polly-position=early.

Revert "[ADT] Add new type traits for type pack indexes"

2d1f2ba

This reverts commit a6d3987.

Revert "[LV] Unconditionally branch from middle to scalar preheader i…

ed9d707

…f the scalar loop must execute (try 3)" This reverts commit 6d3e3ae. Still seeing PPC build bot failures, and one arm self host bot failing. I'm officially stumped, and need help from a bot owner to reduce.

[PowerPC] only check the load instruction result number 0.

15d4ed6

Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D102596

[lld][MachO] Adjust isCodeSection signature

dc2c6cf

This diff changes the type of the argument of isCodeSection to const InputSection *. NFC. Test plan: make check-lld-macho Differential revision: https://reviews.llvm.org/D102664

[Statepoint Lowering] Cleanup: remove unused option statepoint-always…

57c660f

…-spill-base.

Merge commit 'b159987' into develop

b3113da

Revert "Revert "[TargetLowering] Only inspect attributes in the argum…

a5d46f4

…ents for ArgListEntry"" This reverts commit 1f06f50.

Revert "AMDGPU/GlobalISel: Implement tail calls"

6b241b2

This reverts commit 6351eb0.

Revert "Revert "[NFC] Use ArgListEntry indirect types more in ISel lo…

8ce9026

…wering"" This reverts commit 757a9ce.

Revert "IR+AArch64: add a "swiftasync" argument attribute."

237a3d8

This reverts commit 950347a.

Revert "AMDGPU/GlobalISel: Implement tail calls"

5b4cfc8

This reverts commit 113a664.

Merge commit 'd7503c3bce491e2672386906ec879c7df5ede7a5' into feature/…

c79cc30

…merge-upstream-20210517

kaz7 merged commit 3abc595 into develop Jul 15, 2021

kaz7 deleted the feature/merge-upstream-20210517 branch July 15, 2021 02:21

This was referenced Jul 15, 2021

Need to apply b159987 #55

Closed

Need to find a way to compile libomptarget for VE out-of-tree #56

Closed
