Skip to content

attempt #1 #306

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 283 commits into from
Closed

attempt #1 #306

wants to merge 283 commits into from

Conversation

tjdevries
Copy link

No description provided.

sylvain-audi and others added 30 commits July 22, 2020 13:12
… are not shown

The "undefined symbol" error message from lld-link displays up to 3 references to that symbol, and the number of extra references not shown.

This patch removes the computation of the strings for those extra references.

It fixes a freeze of lld-link we accidentally encountered when activating asan on a large project, without linking with the asan library.
In that case, __asan_report_load8 was referenced more than 2 million times, causing the computation of that many display strings, of which only 3 were used.

Differential Revision: https://reviews.llvm.org/D83510

(cherry picked from commit 3a108ab)
…ogue

Current powerpc backend generates wrong code sequence if stack pointer
has to realign if -fstack-clash-protection enabled. When probing in
prologue, backend should generate a subtraction instruction rather
than a `stux` instruction to realign the stack pointer.

This patch is part of fix of
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84218

(cherry picked from commit 8912252)
…ing dynalloc

Current powerpc backend generates wrong code sequence if stack pointer
has to realign if `-fstack-clash-protection` enabled. When probing
dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes
`NegSizeReg` as input and returns
`FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated
correctly, however code following `PREPARE_PROBED_ALLOCA` still uses
value of `NegSizeReg`, which does not contain `ActualNegSize` if
`MaxAlign > TargetAlign`, to calculate loop trip count and residual
number of bytes.

This patch is part of fix of
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84152

(cherry picked from commit c3f9697)
This way should be the same like with a.pcm for modules.
An alternative way is 'clang++ -c empty.cpp -include-pch a.pch -o a.o
-Xclang -building-pch-with-obj', which is what clang-cl's /Yc does
internally.

Differential Revision: https://reviews.llvm.org/D83716

(cherry picked from commit 3895466)
Using -fmodules-* options for PCHs is a bit confusing, so add -fpch-*
variants. Having extra options also makes it simple to do a configure
check for the feature.
Also document the options in the release notes.

Differential Revision: https://reviews.llvm.org/D83623

(cherry picked from commit 54eea61)
This assert was added to verify assumption that GEP's SCEV will be of pointer type,
basing on fact that it should be a SCEVAddExpr with (at least) last operand being
pointer. Two notes:
- GEP's SCEV does not have to be a SCEVAddExpr after all simplifications;
- In current state, GEP's SCEV does not have to have at least one pointer operands
  (all of them can become int during the transforms).

However, we might want to be at a point where it is true. We are currently removing
this assert and will try to enumerate the cases where "is pointer" notion might be
lost during the transforms. When all of them are fixed, we can return it.

Differential Revision: https://reviews.llvm.org/D84294
Reviewed By: lebedev.ri

(cherry picked from commit b96114c)
Fixes https://bugs.llvm.org/show_bug.cgi?id=46680.

Just like insertions through IRBuilder, InsertNewInstBefore()
should be using the deferred worklist mechanism, so that processing
of newly added instructions is prioritized.

There's one side-effect of the worklist order change which could be
classified as a regression. An add op gets pushed through a select
that at the time is not a umax. We could add a reverse transform
that tries to push adds in the reverse direction to restore a min/max,
but that seems like a sure way of getting infinite loops... Seems
like something that should best wait on min/max intrinsics.

Differential Revision: https://reviews.llvm.org/D84109

(cherry picked from commit d12ec0f)
…VECTOR(X,0)) patterns.

getTargetShuffleMask is used by the various "SimplifyDemanded" folds so we can't assume that the bypassed extract_subvector can be safely simplified - getFauxShuffleMask performs a more general decode that allows us to more safely catch many of these cases so the impact is minimal.

(cherry picked from commit 5b5dc24)
Summary:
We need to detect when certain TypoExprs are not being transformed
due to invalid trees, otherwise we risk endlessly trying to fix it.

Reviewers: rsmith

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84067

(cherry picked from commit dde98c8)
…b asm instructions

This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the base subset (zbb subextension) of the
experimental B extension of RISC-V.
It adds also the correspondent codegen tests.

This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Differential Revision: https://reviews.llvm.org/D79870

(cherry picked from commit e2692f0)
…p asm instructions

This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the permutation subset (zbp subextension) of
the experimental B extension of RISC-V.
It adds also the correspondent codegen tests.

This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Differential Revision: https://reviews.llvm.org/D79871

(cherry picked from commit 31b52b4)
…bp asm instructions

This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions belonging to both the permutation and the base
subsets of the experimental B extension of RISC-V.
It adds also the correspondent codegen tests.

This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Differential Revision: https://reviews.llvm.org/D79873

(cherry picked from commit 6144f0a)
…s asm instructions

This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the single-bit subset (zbs subextension) of
the experimental B extension of RISC-V.
It adds also the correspondent codegen tests.

This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Differential Revision: https://reviews.llvm.org/D79874

(cherry picked from commit d4be333)
…t asm instructions

This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the ternary subset (zbt subextension) of the
experimental B extension of RISC-V.
It adds also the correspondent codegen tests.

This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Differential Revision: https://reviews.llvm.org/D79875

(cherry picked from commit c9c955a)
For comdats (e.g. caused by -ffunction-sections), Section is already
set here; make sure it's null, for the weak external symbol to be undefined.

This fixes PR46779.

Differential Revision: https://reviews.llvm.org/D84507

(cherry picked from commit 9e81d8b)
This fixes PR 42837.

Differential Revision: https://reviews.llvm.org/D84465

(cherry picked from commit 4d09ed9)
…le loop counters.

Summary:
If the variable is constrcutible, its copy is created by calling a
constructor. Such variables are duplicated and thus, must be captured.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83909

(cherry picked from commit 9840208)
…th undef if needed when concatenating small or loads to match a larger load

In the included test case the align 16 allowed the v23f32 load to handled as load v16f32, load v4f32, and load v4f32(one element not used). These loads all need to be concatenated together into a final vector. In this case we tried to concatenate the two v4f32 loads to match the type of the v16f32 load so we could do a second concat_vectors, but those loads alone only add up to v8f32. So we need to two v4f32 undefs to pad it.

It appears we've tried to hack around a similar issue in this code before by adding undef padding to loads in one of the earlier loops in this function. Originally in r147964 by padding all loads narrower than previous loads to the same size. Later modifed to only the last load in r293088. This patch removes that earlier code and just handles it on demand where we know we need it.

Fixes PR46820

Differential Revision: https://reviews.llvm.org/D84463

(cherry picked from commit 8131e19)
…oads

Unfortunately this is another regression from my canonicalization patch
(1fed131). The patch contained two implicit assumptions:
1. That we would have a permuted load only if we are loading a partial vector
2. That a partial vector load would necessarily be as wide as the splat

However, assumption 2 is not correct since it is possible to do a wider
load and only splat a half of it. This patch corrects this assumption by
simply checking if the load is permuted and adjusting the offset if it is.

(cherry picked from commit 7d076e1)
I mixed up the precedence of operators in the assert and thought I
had it right since there was no compiler warning. This just
adds the parentheses in the expression as needed.

(cherry picked from commit cdead4f)
…irect branch (PR46857)

SplitBlockPredecessors() can not split blocks that have such terminators,
and in two other places we already ensure that we don't end up calling
SplitBlockPredecessors() on such blocks. Do so in one more place.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46857

(cherry picked from commit 1da9834)
Previously, the test could pass with one part of c3b1d73 removed.

(cherry picked from commit 8dc8203)
… different name

For a weak symbol func in a comdat, the actual leader symbol ends up
named like .weak.func.default*. Likewise, for stdcall on i386, the symbol
may be named _func@4, while the section suffix only is "func", which the
previous implementation didn't handle.

This fixes unwinding through weak functions when using
-ffunction-sections in mingw environments.

Differential Revision: https://reviews.llvm.org/D84607

(cherry picked from commit 343ffa7)
As shown in D82998, the basic-aa-recphi option can cause miscompiles for
gep's with negative constants. The option checks for recursive phi, that
recurse through a contant gep. If it finds one, it performs aliasing
calculations using the other phi operands with an unknown size, to
specify that an unknown number of elements after the initial value are
potentially accessed. This works fine expect where the constant is
negative, as the size is still considered to be positive. So this patch
expands the check to make sure that the constant is also positive.

Differential Revision: https://reviews.llvm.org/D83576

(cherry picked from commit 311fafd)
sameeran joshi and others added 25 commits September 17, 2020 13:27
Adds a new GettingInvolved page to documentation which provides details about
mailing list, chats and calls.

Adds a sidebar page which provides common links on
all documentation pages.
The links include:
-  Getting Started
-  Getting Involved
-  Github Repository
-  Bug Reports
-  Code Review

Depends on https://reviews.llvm.org/D87242

Reviewed By: richard.barton.arm

Differential Revision: https://reviews.llvm.org/D87270

(cherry picked from commit fe395ae)
2508ef0 fixed a bug about constant removal in negation. But after
sanitizing check I found there's still some issue about it so it's
reverted.

Temporary nodes will be removed if useless in negation. Before the
removal, they'd be checked if any other nodes used it. So the removal
was moved after getNode. However in rare cases the node to be removed is
the same as result of getNode. We missed that and will be fixed by this
patch.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D87614

(cherry picked from commit a2fb544)
Matches C++20 API addition.

Differential Revision: https://reviews.llvm.org/D83449

(cherry picked from commit a0385bd)
…y inserted after an INLINEASM_BR.

findPHICopyInsertPoint special cases placement in a block with a
callbr or invoke in it. In that case, we must ensure that the copy is
placed before the INLINEASM_BR or call instruction, if the register is
defined prior to that instruction, because it may jump out of the
block.

Previously, the code placed it immediately after the last def _or
use_. This is wrong, if the use is the instruction which may jump.  We
could correctly place it immediately after the last def (ignoring
uses), but that is non-optimal for register pressure.

Instead, place the copy after the last def, or before the
call/inlineasm_br, whichever is later.

Differential Revision: https://reviews.llvm.org/D87865

(cherry picked from commit f7a53d8)
…uilder

SelectionDAGBuilder was inconsistently mangling values based on ABI
Calling Conventions when getting them through copyFromRegs in
SelectionDAGBuilder, causing duplicate value type convertions for
function arguments. The checking for the mangling requirement was based
on the value's originating instruction and was performed outside of, and
inspite of, the regular Calling Convention Lowering.

The issue could be observed in a scenario such as:

```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper convertion had taken place when lowering the return
; value of the first call.
```

This patch fixes the issue by disregarding the Calling Convention
information for such copyFromRegs, making sure the ABI mangling is
properly contanined in the Calling Convention Lowering.

This fixes Bugzilla llvm#47454.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87844

(cherry picked from commit 53d238a)
D79916 changed the behaviour from -O2 to -O1 but the documentation was
not updated to reflect this.

(cherry picked from commit 788c7d2)
This is the relevant portions of an assert fixed by
b98f902.
This fixes a verifier error in the testcase from bug 47619.

The stack passed s3 value was widened to 4-bytes, and producing a
4-byte memory access with a < 1 byte result type. We need to either
widen the result type or narrow the access size. This copies the code
directly from the AMDGPU handling, which narrows the load size. I
don't like that every target has to handle this, but this is currently
broken on the 11 release branch and this is the simplest fix.

This reverts commit 42bfa7c.

(cherry picked from commit 6cb0d23)
…res 8 bytes to store

This is a fix for PR47630. The regression is caused by the D78011. After
this change the code starts to call the `emitGlobalConstantLargeInt` even
for constants which requires eight bytes to store.

Differential revision: https://reviews.llvm.org/D88261

(cherry picked from commit c6c5629)
This commit fixes a regression (from LLVM 10 to LLVM 11 RC3) in the LLVM
C API.

Previously, commit 1ee6ec2 removed the mask operand from the
ShuffleVector instruction, storing the mask data separately in the
instruction instead; this reduced the number of operands of
ShuffleVector from 3 to 2. AFAICT, this change unintentionally caused
a regression in the LLVM C API. Specifically, it is no longer possible
to get the mask of a ShuffleVector instruction through the C API. This
patch introduces new functions which together allow a C API user to get
the mask of a ShuffleVector instruction, restoring the functionality
which was previously available through LLVMGetOperand().

This patch also adds tests for this change to the llvm-c-test
executable, which involved adding support for InsertElement,
ExtractElement, and ShuffleVector itself (as well as constant vectors)
to echo.cpp. Previously, vector operations weren't tested at all in
echo.ll.

I also fixed some typos in comments and help-text nearby these changes,
which I happened to spot while developing this patch. Since the typo
fixes are technically unrelated other than being in the same files, I'm
happy to take them out if you'd rather they not be included in the patch.

Differential Revision: https://reviews.llvm.org/D88190

(cherry picked from commit 51cad04)
It is not a good idea to expose raw constants in the LLVM C API. Replace this with an explicit getter.

Differential Revision: https://reviews.llvm.org/D88367

(cherry picked from commit 55f7273)
The test would fail in no-asserts release builds using MSVC
for 64-bit Windows:

Unexpected error message:
TestBuffer:1:1: error: implicit format conflict between 'FOO' (%u) and '18\0' (%x), need an explicit format specifier

Error message(s) not found:
{implicit format conflict between 'FOO' (%u) and 'BAZ' (%x), need an explicit format specifier}

It seems a string from a previous test case is finding its way
into the latter one.

This doesn't reproduce on master anymore after 998709b, so let's
just hack around it here for the branch.
…ating invalid MIR.

During lowering of G_UMULO and friends, the previous code moved the builder's
insertion point to be after the legalizing instruction. When that happened, if
there happened to be a "G_CONSTANT i32 0" immediately after, the CSEMIRBuilder
would try to find that constant during the buildConstant(zero) call, and since
it dominates itself would return the iterator unchanged, even though the def
of the constant was *after* the current insertion point. This resulted in the
compare being generated *before* the constant which it was using.

There's no need to modify the insertion point before building the mul-hi or
constant. Delaying moving the insert point ensures those are built/CSEd before
the G_ICMP is built.

Fixes PR47679

Differential Revision: https://reviews.llvm.org/D88514

(cherry picked from commit 1d54e75)
We shift the significand right on a truncation, but that needs to be made NaN-safe:
always set at least 1 bit in the significand.
https://llvm.org/PR43907

See D88238 for the likely follow-up (but needs some plumbing fixes before it can proceed).

Differential Revision: https://reviews.llvm.org/D87835

(cherry picked from commit e34bd1e)
This reverts partial of a2fb544 (actually, 2508ef0) about removing
negated FP constant immediately if it has no uses. However, as discussed
in bug 47517, there're cases when NegX is folded into constant from
other places while NegY is removed by that line of code and NegX is
equal to NegY. In these cases, NegX is deleted before used and crash
happens. So revert the code and add necessary test case.

(cherry picked from commit b326d4f)
A lot of our code building with clang-cl.exe using Clang 11 was failing with
the following 2 type of errors:

1. explicit specialization of 'foo' after instantiation
2. no matching function for call to 'bar'

Note that we also use -fdelayed-template-parsing in our builds.

I tried pretty hard to get a small repro for these failures, but couldn't. So
there is some subtle edge case in the -fpch-instantiate-templates feature
introduced by this change: https://reviews.llvm.org/D69585

When I tried turning this off using -fno-pch-instantiate-templates, builds
would silently fail with the same error without any indication that
-fno-pch-instantiate-templates was being ignored by the compiler. Then I
realized this "no" option wasn't actually working when I ran Clang under a
debugger.

Differential revision: https://reviews.llvm.org/D88680

(cherry picked from commit 66e4f07)
Tail duplication of a block with an INLINEASM_BR may result in a PHI
node on the indirect branch. This is okay, but it also introduces a copy
for that PHI node *after* the INLINEASM_BR, which is not okay.

See: ClangBuiltLinux/linux#1125

Differential Revision: https://reviews.llvm.org/D88823

(cherry picked from commit d2c61d2)
@repo-lockdown
Copy link

repo-lockdown bot commented Apr 22, 2021

This repository does not accept pull requests. Please follow http://llvm.org/docs/Contributing.html#how-to-submit-a-patch for contribution to LLVM.

@repo-lockdown repo-lockdown bot closed this Apr 22, 2021
@repo-lockdown repo-lockdown bot locked and limited conversation to collaborators Apr 22, 2021
@tjdevries tjdevries deleted the sg branch April 22, 2021 19:57
@tjdevries tjdevries restored the sg branch April 22, 2021 19:58
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.