Skip to content

Feature/merge upstream 20211214 #120

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 826 commits into from
Dec 14, 2021
Merged

Feature/merge upstream 20211214 #120

merged 826 commits into from
Dec 14, 2021

Conversation

kaz7
Copy link
Collaborator

@kaz7 kaz7 commented Dec 14, 2021

Merge upstream/main to 2021/12/14.

Pass internal regression tests.

CJ-Johnson and others added 30 commits December 9, 2021 17:41
This change applies two fixes to the abseil-cleanup-ctad check. It uses hasSingleDecl() to ensure only declStmt()s with one varDecl() are matched (leaving compount declStmt()s unchanged). It also addresses a bug in the handling of comments that surround the absl::MakeCleanup() calls by switching to the callArgs() combinator from Clang Transformer.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D115452
Transform
```
(~a & b & c) | ~(a | b | c) -> ~(a | (b ^ c))
```
And swapped case:
```
(~a | b | c) & ~(a & b & c) -> ~a | (b ^ c)
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %or1 = or i4 %b, %a
  %or2 = or i4 %or1, %c
  %not1 = xor i4 %or2, 15
  %not2 = xor i4 %a, 15
  %and1 = and i4 %b, %not2
  %and2 = and i4 %and1, %c
  %or3 = or i4 %and2, %not1
  ret i4 %or3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %1 = xor i4 %c, %b
  %2 = or i4 %1, %a
  %or3 = xor i4 %2, 15
  ret i4 %or3
}
Transformation seems to be correct!
```
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %a
  %and2 = and i4 %and1, %c
  %not1 = xor i4 %and2, 15
  %not2 = xor i4 %a, 15
  %or1 = or i4 %not2, %b
  %or2 = or i4 %or1, %c
  %and3 = and i4 %or2, %not1
  ret i4 %and3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor = xor i4 %b, %c
  %not = xor i4 %a, 15
  %or = or i4 %xor, %not
  ret i4 %or
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D112966
This change allows us to estimate trip count from profile metadata for all multiple exit loops. We still do the estimate only from the latch, but that's fine as it causes us to over estimate the trip count at worst.

Reviewing the uses of the API, all but one are cases where we restrict a loop transformation (unroll, and vectorize respectively) when we know the trip count is short enough. So, as a result, the change makes these passes strictly less aggressive. The test change illustrates a case where we'd previously have runtime unrolled a loop which ran fewer iterations than the unroll factor. This is definitely unprofitable.

The one case where an upper bound on estimate trip count could drive a more aggressive transform is peeling, and I duplicated the logic being removed from the generic estimation there to keep it the same. The resulting heuristic makes no sense and should probably be immediately removed, but we can do that in a separate change.

This was noticed when analyzing regressions on D113939.

I plan to come back and incorporate estimated trip counts from other exits, but that's a minor improvement which can follow separately.

Differential Revision: https://reviews.llvm.org/D115362
This reverts commit e5c2a46 as this
change introduced a linker error when building sanitizer runtimes:

  ld.lld: error: undefined symbol: __sanitizer::internal_start_thread(void* (*)(void*), void*)
  >>> referenced by sanitizer_stackdepot.cpp:133 (compiler-rt/lib/sanitizer_common/sanitizer_stackdepot.cpp:133)
  >>>               compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonSymbolizer.x86_64.dir/sanitizer_stackdepot.cpp.obj:(__sanitizer::(anonymous namespace)::CompressThread::NewWorkNotify())
This reorders existing transforms to put demanded elements last. The reasoning here is that when we have an example which can be scalarized or handled via demanded bits, we should prefer scalarization as that doesn't require dropping flags on arithmetic instructions.

This doesn't show major changes in the tests today, but once I add support for fast math flags to dropPoisonGeneratingFlags this becomes glaringly obvious.

Differential Revision: https://reviews.llvm.org/D115394
Depends on D114495.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D114498
As in D114934, or lsan crashes on the same bot.
The recurrence lowering code has handling which claims to be about flag intersection, but all the callers pass empty arrays to the arguments.  The sole exception is a caller of a method which has the argument, but no implementation.

I don't know what the intent was here, but it certaintly doesn't actually do anything today.
Just a simple typo fix that allows me to test landing a commit now that
I have commit access.

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D115414
The code claimed to handle nsw/nuw, but those aren't passed via builder state and the explicit IR construction just above never sets them.

The only case this bit of code is actually relevant for is FMF flags.  However, dropPoisonGeneratingFlags currently doesn't know about FMF at all, so this was a noop.  It's also unneeded, as the caller explicitly configures the flags on the builder before this call, and the flags on the individual ops should be controled by the intrinsic flags anyways.  If any of the flags aren't safe to propagate, the caller needs to make that change.
The comparator for the sort functions should provide strict weak
ordering relation between parameters. Current solution causes compiler
crash with some standard c++ library implementations, because it does
not meet this criteria. Tried to fix it + it improves the iverall
vectorization result.

Differential Revision: https://reviews.llvm.org/D115268
…cxx with compiler implied -lunwind

This does mostly the same as D112126, but for the runtimes cmake files.
Most of that is straightforward, but the interdependency between
libcxx and libunwind is tricky:

Libunwind is built at the same time as libcxx, but libunwind is not
installed yet. LIBCXXABI_USE_LLVM_UNWINDER makes libcxx link directly
against the just-built libunwind, but the compiler implicit -lunwind
isn't found. This patch avoids that by adding --unwindlib=none if
supported, if we are going to link explicitly against a newly built
unwinder anyway.

Since the previous attempt, this no longer uses
llvm_enable_language_nolink (and thus doesn't set
CMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY during the compiler
sanity checks). Setting CMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY
during compiler sanity checks makes cmake not learn about some
aspects of the compiler, which can make further find_library or
find_package fail. This caused OpenMP to not detect libelf and libffi,
disabling some OpenMP target plugins.

Instead, require the caller to set CMAKE_{C,CXX}_COMPILER_WORKS=YES
when building in a configuration with an incomplete toolchain.

Differential Revision: https://reviews.llvm.org/D113253
This reapplies a fix from 948ce4e,
whichn't originally submitted upstream. I has now been merged upstream
though, in google/benchmark#1302.

When benchmarks were unified in
5dda2ef, it lost this change,
but it also lost another local modification, where benchmark's
CMakeLists.txt was modified to comment out adding -Werror.
(This change was part of the original import in
0addd17.)

As the benchmark library is built automatically by default, when
building all of LLVM (contrary to the copy in libcxx, which wasn't
built by default), building it with -Werror by default is very brittle.

This fixes building LLVM with MinGW. (It wasn't broken in MSVC
mode, as the benchmark library doesn't add -Werror or anything
equivalent in MSVC mode, and it's unclear if this warning is
enabled in that mode at all.)

Differential Revision: https://reviews.llvm.org/D115434
StackDepot locks some stuff. As is there is small probability to
deadlock if we stop thread which locked the Depot.

We need either Lock/Unlock StackDepot for StopTheWorld, or don't
interact with StackDepot from there.

This patch does not run LeakReport under StopTheWorld. LeakReport
contains most of StackDepot access.

As a bonus this patch will help to resolve kMaxLeaksConsidered FIXME.

Depends on D114498.

Reviewed By: morehouse, kstoimenov

Differential Revision: https://reviews.llvm.org/D115284
Necessary for implementing some combines on floating point selects.

Differential Revision: https://reviews.llvm.org/D115372
val can be of any type accepted by Compare.
This removes the last use of StackDepot from StopTheWorld.

Depends on D115284.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D115319
The test has been flaky for years, and I think we should remove it to
eliminate noise on the buildbot.

Neither me nor dokyungs have been able to fully deflake the test, and it
tests a non-default Entropic flag.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D115453
This change adds options to llvm-ifs to allow it to generate multiple
types of stub files at a single invocation.

Differential Revision: https://reviews.llvm.org/D115024
G_PTR_ADD takes arguments of two different types, so it probably shouldn't be
considered commutative just on that basis. A recent G_PTR_ADD reassociation
optimization (https://reviews.llvm.org/D109528) can emit erroneous code if the
pattern matcher commutes the arguments; this can happen when the base pointer
was created by G_INTTOPTR of a G_CONSTANT and the offset register is variable.

This was discovered on the llvm-mos fork, but I added a failing test case that
should apply to AArch64 (and more generally).

Differential Revision: https://reviews.llvm.org/D114655
Wrong type was used for the result type in the tosa.conv_2d canonicalization.
The type should match the result element type should match the result type
not the input element type.

Differential Revision: https://reviews.llvm.org/D115463
…d from PPC specific code.

There are two signatures of setSpecialOperandAttr in TargetInstrInfo.
One of them is only called from PPCInstrInfo which has an override
of it.

Remove it from TargetInstrInfo and make it a non-virtual method in
PPCInstrInfo.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D115404
kazutakahirata and others added 28 commits December 12, 2021 08:34
The `not` program is used to test executions prefixed with `%libomptarget-run-fail-`. Currently `not` is not used for libomp tests, but might be used in the future and its dependency does not add any additional burden over the already established `FileCheck` dependency.

Required to add libomptarget testing to the Phabricator pre-merge check (see google/llvm-premerge-checks#368)

Reviewed By: jdenny, JonChesterfield

Differential Revision: https://reviews.llvm.org/D115454
llvm/llvm-project#48642

clang-format does not respect raw string literals when sorting includes

```
const char *RawStr = R"(
)";
```

Running clang-format over with SortIncludes enabled transforms this code to:

```
const char *RawStr = R"(
)";
```
The following code tries to minimize this impact during IncludeSorting, by treating R"( and )" as equivalent of // clang-format off/on

Reviewed By: HazardyKnusperkeks, curdeius

Differential Revision: https://reviews.llvm.org/D115168

Fixes #48642
(This reverts commit 7d9f11b,
to reland the Ryu code: ae53d02 relanded in abb5dd6).
This is a very old copy+paste typo - none of these binops have an immediate operand.

Noticed while trying to merge MMX instructions into some existing SSE instruction scheduler instregex patterns.
This is a very old copy+paste typo - none of these cvt ops have an immediate operand.

Noticed while trying to merge MMX instructions into some existing SSE instruction scheduler instregex patterns.
I stupidly lost these in a temp git stash :(
Currently the superalign option only increases the alignment of
variables that are moved into the module.lds block. Change that to all LDS
variables. Also only increase the alignment once, instead of once per function.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D115488
This updates several helper functions to use information provided by
VPTransformState instead of ILV directly, to help with the transition
out of ILV.
Apparently "help wanted" has some additional special meaning
Specify the integer width to ensure we're testing the correct instruction
Fix overrides to use both ports. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner reports as well.
Also delete some cross-linux.c tests which are covered by linux-cross.cpp
…e caller. NFC

Avoid a function call in the majority of cases and make the output smaller.
… SHF_EXCLUDE && !relocatable. NFC

Avoid a comparison in the majority of cases.
An unstable sort suffices. In a large link (11.06s), this decreases .rela.dyn
writeTo time from 1.52s to 0.81s, resulting in 6% total time speedup (the
benefit will greatly dilute if --pack-dyn-relocs=relr becomes prevailing).

Encoding the dynamic relocations then sorting raw Elf_Rel/Elf_Rela doesn't seem
to improve much (doing that would require code duplicate because of
Elf_Rel/Elf_Rela plus unfortunate mips64le), so don't do that.
For the simple copy loop (see test case) vectorizer selects VF equal to 32 while the loop is known to have 17 iterations only. Such behavior makes no sense to me since such vector loop will never be executed. The only case we may want to select VF large than TC is masked vectoriztion. So I haven't touched that case.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D114528
They were accidentally removed in a previous change.
…merge-upstream-20211214

This merge simply discard RISCV changes in
llvm/include/llvm/IR/VPIntrinsics.def.
…merge-upstream-20211214

TODO: Need to update hasActiveVectorLength()
@kaz7 kaz7 merged commit 68aa85f into develop Dec 14, 2021
@kaz7 kaz7 deleted the feature/merge-upstream-20211214 branch December 14, 2021 00:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.