-
Notifications
You must be signed in to change notification settings - Fork 13.4k
attempt #1 #306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
attempt #1 #306
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
… are not shown The "undefined symbol" error message from lld-link displays up to 3 references to that symbol, and the number of extra references not shown. This patch removes the computation of the strings for those extra references. It fixes a freeze of lld-link we accidentally encountered when activating asan on a large project, without linking with the asan library. In that case, __asan_report_load8 was referenced more than 2 million times, causing the computation of that many display strings, of which only 3 were used. Differential Revision: https://reviews.llvm.org/D83510 (cherry picked from commit 3a108ab)
(cherry picked from commit 817767a)
…ogue Current powerpc backend generates wrong code sequence if stack pointer has to realign if -fstack-clash-protection enabled. When probing in prologue, backend should generate a subtraction instruction rather than a `stux` instruction to realign the stack pointer. This patch is part of fix of https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84218 (cherry picked from commit 8912252)
…ing dynalloc Current powerpc backend generates wrong code sequence if stack pointer has to realign if `-fstack-clash-protection` enabled. When probing dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes `NegSizeReg` as input and returns `FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated correctly, however code following `PREPARE_PROBED_ALLOCA` still uses value of `NegSizeReg`, which does not contain `ActualNegSize` if `MaxAlign > TargetAlign`, to calculate loop trip count and residual number of bytes. This patch is part of fix of https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84152 (cherry picked from commit c3f9697)
This way should be the same like with a.pcm for modules. An alternative way is 'clang++ -c empty.cpp -include-pch a.pch -o a.o -Xclang -building-pch-with-obj', which is what clang-cl's /Yc does internally. Differential Revision: https://reviews.llvm.org/D83716 (cherry picked from commit 3895466)
Using -fmodules-* options for PCHs is a bit confusing, so add -fpch-* variants. Having extra options also makes it simple to do a configure check for the feature. Also document the options in the release notes. Differential Revision: https://reviews.llvm.org/D83623 (cherry picked from commit 54eea61)
This assert was added to verify assumption that GEP's SCEV will be of pointer type, basing on fact that it should be a SCEVAddExpr with (at least) last operand being pointer. Two notes: - GEP's SCEV does not have to be a SCEVAddExpr after all simplifications; - In current state, GEP's SCEV does not have to have at least one pointer operands (all of them can become int during the transforms). However, we might want to be at a point where it is true. We are currently removing this assert and will try to enumerate the cases where "is pointer" notion might be lost during the transforms. When all of them are fixed, we can return it. Differential Revision: https://reviews.llvm.org/D84294 Reviewed By: lebedev.ri (cherry picked from commit b96114c)
since it's failing.
(cherry picked from commit 13ae440)
Fixes https://bugs.llvm.org/show_bug.cgi?id=46680. Just like insertions through IRBuilder, InsertNewInstBefore() should be using the deferred worklist mechanism, so that processing of newly added instructions is prioritized. There's one side-effect of the worklist order change which could be classified as a regression. An add op gets pushed through a select that at the time is not a umax. We could add a reverse transform that tries to push adds in the reverse direction to restore a min/max, but that seems like a sure way of getting infinite loops... Seems like something that should best wait on min/max intrinsics. Differential Revision: https://reviews.llvm.org/D84109 (cherry picked from commit d12ec0f)
…VECTOR(X,0)) patterns. getTargetShuffleMask is used by the various "SimplifyDemanded" folds so we can't assume that the bypassed extract_subvector can be safely simplified - getFauxShuffleMask performs a more general decode that allows us to more safely catch many of these cases so the impact is minimal. (cherry picked from commit 5b5dc24)
Differential Revision: https://reviews.llvm.org/D81385 (cherry picked from commit a41af6e)
Summary: We need to detect when certain TypoExprs are not being transformed due to invalid trees, otherwise we risk endlessly trying to fix it. Reviewers: rsmith Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D84067 (cherry picked from commit dde98c8)
…b asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the base subset (zbb subextension) of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79870 (cherry picked from commit e2692f0)
…p asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the permutation subset (zbp subextension) of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79871 (cherry picked from commit 31b52b4)
…bp asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions belonging to both the permutation and the base subsets of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79873 (cherry picked from commit 6144f0a)
…s asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the single-bit subset (zbs subextension) of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79874 (cherry picked from commit d4be333)
…t asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the ternary subset (zbt subextension) of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79875 (cherry picked from commit c9c955a)
For comdats (e.g. caused by -ffunction-sections), Section is already set here; make sure it's null, for the weak external symbol to be undefined. This fixes PR46779. Differential Revision: https://reviews.llvm.org/D84507 (cherry picked from commit 9e81d8b)
This fixes PR 42837. Differential Revision: https://reviews.llvm.org/D84465 (cherry picked from commit 4d09ed9)
…le loop counters. Summary: If the variable is constrcutible, its copy is created by calling a constructor. Such variables are duplicated and thus, must be captured. Reviewers: jdoerfert Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D83909 (cherry picked from commit 9840208)
…th undef if needed when concatenating small or loads to match a larger load In the included test case the align 16 allowed the v23f32 load to handled as load v16f32, load v4f32, and load v4f32(one element not used). These loads all need to be concatenated together into a final vector. In this case we tried to concatenate the two v4f32 loads to match the type of the v16f32 load so we could do a second concat_vectors, but those loads alone only add up to v8f32. So we need to two v4f32 undefs to pad it. It appears we've tried to hack around a similar issue in this code before by adding undef padding to loads in one of the earlier loops in this function. Originally in r147964 by padding all loads narrower than previous loads to the same size. Later modifed to only the last load in r293088. This patch removes that earlier code and just handles it on demand where we know we need it. Fixes PR46820 Differential Revision: https://reviews.llvm.org/D84463 (cherry picked from commit 8131e19)
…oads Unfortunately this is another regression from my canonicalization patch (1fed131). The patch contained two implicit assumptions: 1. That we would have a permuted load only if we are loading a partial vector 2. That a partial vector load would necessarily be as wide as the splat However, assumption 2 is not correct since it is possible to do a wider load and only splat a half of it. This patch corrects this assumption by simply checking if the load is permuted and adjusting the offset if it is. (cherry picked from commit 7d076e1)
I mixed up the precedence of operators in the assert and thought I had it right since there was no compiler warning. This just adds the parentheses in the expression as needed. (cherry picked from commit cdead4f)
…irect branch (PR46857) SplitBlockPredecessors() can not split blocks that have such terminators, and in two other places we already ensure that we don't end up calling SplitBlockPredecessors() on such blocks. Do so in one more place. Fixes https://bugs.llvm.org/show_bug.cgi?id=46857 (cherry picked from commit 1da9834)
… different name For a weak symbol func in a comdat, the actual leader symbol ends up named like .weak.func.default*. Likewise, for stdcall on i386, the symbol may be named _func@4, while the section suffix only is "func", which the previous implementation didn't handle. This fixes unwinding through weak functions when using -ffunction-sections in mingw environments. Differential Revision: https://reviews.llvm.org/D84607 (cherry picked from commit 343ffa7)
(cherry picked from commit 30fa576)
As shown in D82998, the basic-aa-recphi option can cause miscompiles for gep's with negative constants. The option checks for recursive phi, that recurse through a contant gep. If it finds one, it performs aliasing calculations using the other phi operands with an unknown size, to specify that an unknown number of elements after the initial value are potentially accessed. This works fine expect where the constant is negative, as the size is still considered to be positive. So this patch expands the check to make sure that the constant is also positive. Differential Revision: https://reviews.llvm.org/D83576 (cherry picked from commit 311fafd)
Adds a new GettingInvolved page to documentation which provides details about mailing list, chats and calls. Adds a sidebar page which provides common links on all documentation pages. The links include: - Getting Started - Getting Involved - Github Repository - Bug Reports - Code Review Depends on https://reviews.llvm.org/D87242 Reviewed By: richard.barton.arm Differential Revision: https://reviews.llvm.org/D87270 (cherry picked from commit fe395ae)
2508ef0 fixed a bug about constant removal in negation. But after sanitizing check I found there's still some issue about it so it's reverted. Temporary nodes will be removed if useless in negation. Before the removal, they'd be checked if any other nodes used it. So the removal was moved after getNode. However in rare cases the node to be removed is the same as result of getNode. We missed that and will be fixed by this patch. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D87614 (cherry picked from commit a2fb544)
Matches C++20 API addition. Differential Revision: https://reviews.llvm.org/D83449 (cherry picked from commit a0385bd)
…y inserted after an INLINEASM_BR. findPHICopyInsertPoint special cases placement in a block with a callbr or invoke in it. In that case, we must ensure that the copy is placed before the INLINEASM_BR or call instruction, if the register is defined prior to that instruction, because it may jump out of the block. Previously, the code placed it immediately after the last def _or use_. This is wrong, if the use is the instruction which may jump. We could correctly place it immediately after the last def (ignoring uses), but that is non-optimal for register pressure. Instead, place the copy after the last def, or before the call/inlineasm_br, whichever is later. Differential Revision: https://reviews.llvm.org/D87865 (cherry picked from commit f7a53d8)
…uilder SelectionDAGBuilder was inconsistently mangling values based on ABI Calling Conventions when getting them through copyFromRegs in SelectionDAGBuilder, causing duplicate value type convertions for function arguments. The checking for the mangling requirement was based on the value's originating instruction and was performed outside of, and inspite of, the regular Calling Convention Lowering. The issue could be observed in a scenario such as: ``` %arg1 = load half, half* %const, align 2 %arg2 = call fastcc half @someFunc() call fastcc void @otherFunc(half %arg1, half %arg2) ; Here, %arg2 was incorrectly mangled twice, as the CallConv data from ; the call to @someFunc() was taken into consideration for the check ; when getting the value for processing the call to @otherFunc(...), ; after the proper convertion had taken place when lowering the return ; value of the first call. ``` This patch fixes the issue by disregarding the Calling Convention information for such copyFromRegs, making sure the ABI mangling is properly contanined in the Calling Convention Lowering. This fixes Bugzilla llvm#47454. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87844 (cherry picked from commit 53d238a)
D79916 changed the behaviour from -O2 to -O1 but the documentation was not updated to reflect this. (cherry picked from commit 788c7d2)
This is the relevant portions of an assert fixed by b98f902.
This fixes a verifier error in the testcase from bug 47619. The stack passed s3 value was widened to 4-bytes, and producing a 4-byte memory access with a < 1 byte result type. We need to either widen the result type or narrow the access size. This copies the code directly from the AMDGPU handling, which narrows the load size. I don't like that every target has to handle this, but this is currently broken on the 11 release branch and this is the simplest fix. This reverts commit 42bfa7c. (cherry picked from commit 6cb0d23)
…res 8 bytes to store This is a fix for PR47630. The regression is caused by the D78011. After this change the code starts to call the `emitGlobalConstantLargeInt` even for constants which requires eight bytes to store. Differential revision: https://reviews.llvm.org/D88261 (cherry picked from commit c6c5629)
This commit fixes a regression (from LLVM 10 to LLVM 11 RC3) in the LLVM C API. Previously, commit 1ee6ec2 removed the mask operand from the ShuffleVector instruction, storing the mask data separately in the instruction instead; this reduced the number of operands of ShuffleVector from 3 to 2. AFAICT, this change unintentionally caused a regression in the LLVM C API. Specifically, it is no longer possible to get the mask of a ShuffleVector instruction through the C API. This patch introduces new functions which together allow a C API user to get the mask of a ShuffleVector instruction, restoring the functionality which was previously available through LLVMGetOperand(). This patch also adds tests for this change to the llvm-c-test executable, which involved adding support for InsertElement, ExtractElement, and ShuffleVector itself (as well as constant vectors) to echo.cpp. Previously, vector operations weren't tested at all in echo.ll. I also fixed some typos in comments and help-text nearby these changes, which I happened to spot while developing this patch. Since the typo fixes are technically unrelated other than being in the same files, I'm happy to take them out if you'd rather they not be included in the patch. Differential Revision: https://reviews.llvm.org/D88190 (cherry picked from commit 51cad04)
It is not a good idea to expose raw constants in the LLVM C API. Replace this with an explicit getter. Differential Revision: https://reviews.llvm.org/D88367 (cherry picked from commit 55f7273)
The test would fail in no-asserts release builds using MSVC for 64-bit Windows: Unexpected error message: TestBuffer:1:1: error: implicit format conflict between 'FOO' (%u) and '18\0' (%x), need an explicit format specifier Error message(s) not found: {implicit format conflict between 'FOO' (%u) and 'BAZ' (%x), need an explicit format specifier} It seems a string from a previous test case is finding its way into the latter one. This doesn't reproduce on master anymore after 998709b, so let's just hack around it here for the branch.
Differential Revision: https://reviews.llvm.org/D88479
…ating invalid MIR. During lowering of G_UMULO and friends, the previous code moved the builder's insertion point to be after the legalizing instruction. When that happened, if there happened to be a "G_CONSTANT i32 0" immediately after, the CSEMIRBuilder would try to find that constant during the buildConstant(zero) call, and since it dominates itself would return the iterator unchanged, even though the def of the constant was *after* the current insertion point. This resulted in the compare being generated *before* the constant which it was using. There's no need to modify the insertion point before building the mul-hi or constant. Delaying moving the insert point ensures those are built/CSEd before the G_ICMP is built. Fixes PR47679 Differential Revision: https://reviews.llvm.org/D88514 (cherry picked from commit 1d54e75)
We shift the significand right on a truncation, but that needs to be made NaN-safe: always set at least 1 bit in the significand. https://llvm.org/PR43907 See D88238 for the likely follow-up (but needs some plumbing fixes before it can proceed). Differential Revision: https://reviews.llvm.org/D87835 (cherry picked from commit e34bd1e)
By Lang Hames!
As suggested by Yvan.
This reverts partial of a2fb544 (actually, 2508ef0) about removing negated FP constant immediately if it has no uses. However, as discussed in bug 47517, there're cases when NegX is folded into constant from other places while NegY is removed by that line of code and NegX is equal to NegY. In these cases, NegX is deleted before used and crash happens. So revert the code and add necessary test case. (cherry picked from commit b326d4f)
A lot of our code building with clang-cl.exe using Clang 11 was failing with the following 2 type of errors: 1. explicit specialization of 'foo' after instantiation 2. no matching function for call to 'bar' Note that we also use -fdelayed-template-parsing in our builds. I tried pretty hard to get a small repro for these failures, but couldn't. So there is some subtle edge case in the -fpch-instantiate-templates feature introduced by this change: https://reviews.llvm.org/D69585 When I tried turning this off using -fno-pch-instantiate-templates, builds would silently fail with the same error without any indication that -fno-pch-instantiate-templates was being ignored by the compiler. Then I realized this "no" option wasn't actually working when I ran Clang under a debugger. Differential revision: https://reviews.llvm.org/D88680 (cherry picked from commit 66e4f07)
Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introduces a copy for that PHI node *after* the INLINEASM_BR, which is not okay. See: ClangBuiltLinux/linux#1125 Differential Revision: https://reviews.llvm.org/D88823 (cherry picked from commit d2c61d2)
This repository does not accept pull requests. Please follow http://llvm.org/docs/Contributing.html#how-to-submit-a-patch for contribution to LLVM. |
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.