[EraVM] Do not set Uses=[Flags] unconditionally in JCl #558

atrosinenko · 2024-05-21T16:53:25Z

No description provided.

This pass tries to replace an stdlib call with either a call to a more efficient algorithm or an optimized sequence of instructions. Currently, it handles only __exp(2^m, exp) calls and replaces them with calls to __exp_pow2(), which just computes the value 1 << (m * exp).
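The identity behind this rewrite can be checked numerically. The following is a minimal sketch, assuming 256-bit wrap-around arithmetic; `exp_generic` and `exp_pow2` are hypothetical stand-ins for `__exp` and `__exp_pow2`, not the actual stdlib code.

```python
WORD_BITS = 256                 # assumed machine word width
MASK = (1 << WORD_BITS) - 1


def exp_generic(base, exponent):
    """Square-and-multiply modulo 2**256 (stand-in for a generic __exp)."""
    result = 1
    base &= MASK
    while exponent:
        if exponent & 1:
            result = (result * base) & MASK
        base = (base * base) & MASK
        exponent >>= 1
    return result


def exp_pow2(m, exponent):
    """(2**m)**exponent as one shift; zero once the bit leaves the word."""
    shift = m * exponent
    return (1 << shift) if shift < WORD_BITS else 0


# Both routines agree on a range of small inputs, including overflowing ones.
for m in range(9):
    for e in range(70):
        assert exp_generic(1 << m, e) == exp_pow2(m, e)
```

The shift version does a single multiplication and one shift instead of a loop over the exponent bits, which is presumably why the specialised callee is cheaper.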

Make R1 register be understood as the default operand of `RETrl` and `REVERTrl` instructions (but make this alias non-default for printing until switched to the new asm syntax). Change panic instruction to either accept jump target operand or none at all.

In preparation for renaming of assembler mnemonics, split `EraVMOpcode` definitions for ret, revert and panic instructions into "without label" and "with label" variants, so their mnemonics can be specified independently.

After folding, the select instructions use the overflow condition code, which is actually LT; this normally gives a shorter code sequence. Note that GE can only be used as the reversal code for an overflow LT caused by usubo, not for one caused by uaddo or umulo. Co-authored-by: Alan Li <[email protected]> PR: #521, Issue: #491.
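One way to see why GE inverts LT only for usubo is to model the flags directly. The sketch below rests on an assumed flag model that this thread does not state: LT is set on overflow (add, mul) or borrow (sub), EQ is set when the wrapped result is zero, GT is set when neither holds, and GE means "GT or EQ". The helper names are hypothetical.

```python
WORD = 1 << 256  # assumed word size


def flags(kind, a, b):
    """Return (lt, eq, gt) under the assumed flag model."""
    if kind == "add":
        full = a + b
        lt = full >= WORD        # unsigned overflow
    elif kind == "sub":
        full = a - b
        lt = full < 0            # borrow
    else:                        # "mul"
        full = a * b
        lt = full >= WORD        # unsigned overflow
    eq = (full % WORD) == 0      # wrapped result is zero
    gt = not lt and not eq
    return lt, eq, gt


def ge(kind, a, b):
    """GE condition code, assumed to mean GT or EQ."""
    lt, eq, gt = flags(kind, a, b)
    return gt or eq


# sub: LT and EQ can never be set together, so GE is exactly "not LT".
for a, b in [(3, 7), (7, 3), (5, 5), (0, 1)]:
    assert ge("sub", a, b) == (not flags("sub", a, b)[0])

# add and mul: an overflowing result can wrap to zero, so LT and GE both hold.
assert flags("add", WORD - 1, 1)[0] and ge("add", WORD - 1, 1)
assert flags("mul", 1 << 128, 1 << 128)[0] and ge("mul", 1 << 128, 1 << 128)
```

Under this model, an overflowing add or mul whose wrapped result is zero sets both LT and EQ, so GE would report "no overflow" incorrectly.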

Make NOP support the remaining combinations of operands. Reorganize the tests: add `nop.(s|txt)` test files similar to those for arithmetic instructions, put the test cases for `nop`, `incsp` and `decsp` aliases to `sp-changes.(s|txt)` files.

Simplify the syntax for specifying both "static" and "shard" modifiers of a far call: either `mnemonic.st.sh` or `mnemonic.sh.st`. Please note that ".st" and ".sh" are technically just parts of the mnemonic's name, so they should go before a predicate, if any.

Clarify the `@fat.ptr.call` test case before renaming the mnemonics. The instructions actually generated for `@fat.ptr.call` function are

    ptr.add r2, r0, r1
    near_call r0, @fat.ptr.arg, @DEFAULT_UNWIND
    ret

Before renaming `ptr.add` to `addp`, the first line was accidentally matched by the `add r2, r0, r1` substring.
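The accidental match is easy to reproduce with plain substring search, which is how an unanchored check pattern behaves. This is a small sketch; the renamed output line assumes the operands stay in the same order, which the commit message does not confirm.

```python
old_output = "ptr.add r2, r0, r1"   # instruction text before the rename
new_output = "addp r2, r0, r1"      # assumed text after the rename
check_pattern = "add r2, r0, r1"    # pattern the test used


def check_matches(pattern, line):
    """Unanchored match: the pattern may appear anywhere in the line."""
    return pattern in line


# The old mnemonic ends in "add", so the pattern matches by accident.
assert check_matches(check_pattern, old_output)
# After the rename, "addp" breaks the accidental match.
assert not check_matches(check_pattern, new_output)
```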

Rename RETURN and REVERT pseudos to make their purpose easier to understand. Remove InstAlias for panic instruction. Since the register operand of the panic instruction was removed, it virtually became a MnemonicAlias, so update the definition of OpPanic instead, as no alias is needed in the new syntax.

Reuse the same `parseMCOperandsCode` and `parseMCOperandsStack` functions that are used by code emitter to analyze operands of MCInst. While this makes it easier to improve the formatting of printed assembly code, this commit strives to keep the existing formatting as much as possible, so as not to combine refactoring with massive changes to tests.

Rename load/store as well as several special instructions to reflect their current names according to new syntax. Remove trailing "r" from several instructions that only have a fixed set of supported operands and would need "rr" or "rrr" suffix anyway.

Update `(EraVMchange_sp GR256:$sp_change)` pattern to select NOPrrs instead of NOPSPr (used by `llvm.restorestack` intrinsic). Remove `(EraVMchange_sp imm16:$sp_change)` pattern as it is currently unused and untested. Change the `EraVMchange_sp` name to more descriptive `EraVMadd_to_sp`.

If looking for a miscompile revert candidate, look here!

The transform being enabled prefers comparing to a loop invariant exit value for a secondary IV over using an otherwise dead primary IV. This increases register pressure (by requiring the exit value to be live through the loop), but reduces the number of instructions within the loop by one.

On RISC-V, which has a large number of scalar registers, this is generally a profitable transform. We lose the ability to use a beqz on what is typically a count-down IV, and pay the cost of computing the exit value on the secondary IV in the loop preheader, but save an add or sub in the loop body. For anything except an extremely short running loop, or one with extreme register pressure, this is profitable. On spec2017, we see a 0.42% geomean improvement in dynamic icount, with no individual workload regressing by more than 0.25%.

Code size wise, we trade a (possibly compressible) beqz and a (possibly compressible) addi for an uncompressible beq. We also add instructions in the preheader. Net result is a slight regression overall, but neutral or better inside the loop.

Previous versions of this transform had numerous corner-case correctness bugs. All of the ones I can spot by inspection have been fixed, and I have run this through all of spec2017, but there may be further issues lurking. Adding uses to an IV is a fraught thing to do given poison semantics, so this transform is somewhat inherently risky.

This patch is a reworked version of D134893 by @eop. That patch has been abandoned since May, so I picked it up, reworked it a bit, and am landing it.

This patch makes `shouldFoldTerminatingConditionAfterLSR` return `true` for the EraVM target, so LSR will try to replace the main induction variable of a loop with one of its secondary IVs. This reduces the number of instructions in the loop by eliminating the main IV increment. The transformation increases register pressure, but that is rarely a problem for EraVM. PR: #599, Issue: #580.
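The effect of the transform can be pictured at source level. The following is an illustrative sketch only: the function names and the memory-summing loop are invented for the example, and the real transform operates on LLVM IR induction variables rather than on Python loops.

```python
def sum_before(memory, base, n):
    """Primary IV `remaining` counts down; secondary IV `addr` walks memory."""
    total, addr, remaining = 0, base, n
    while remaining != 0:      # exit test on the primary IV
        total += memory[addr]
        addr += 1              # secondary IV update
        remaining -= 1         # primary IV update, used only by the exit test
    return total


def sum_after(memory, base, n):
    """Exit test rewritten against a loop-invariant end address."""
    total, addr = 0, base
    end = base + n             # computed once before the loop, live throughout
    while addr != end:         # exit test on the secondary IV
        total += memory[addr]
        addr += 1              # the only IV update left in the body
    return total


memory = list(range(10))
assert sum_before(memory, 2, 5) == sum_after(memory, 2, 5)
```

The second form has one fewer update per iteration but keeps `end` alive for the whole loop, which is the register-pressure trade-off described above.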

Unlike mapping between different input operand kinds, the three other `InstrMapping`s only have two options each that are flipped by the mapper function. Remove the synthetic Value field from `DestAddrModeValue` (renamed to `DestOperandMode`) class. Keep Value fields in `mod_set_flags` and `mod_swap` as these fields are meaningful for opcode computation. Remove the `ToReg` vs. `ToRegReg` and `ToStack` vs. `ToStackReg` separation as it is not really used.

Perform a few preparations before removing repeated definitions of operands in `EraVMInstrInfo.td`:
* move the definitions of EraVM-specific `AsmOperandClass`es and `Operand`s, so that they can be used in `EraVMInstrFormats.td`
* define tablegen classes to be mixed into `Ixx_x` instruction classes

akiramenai and others added 30 commits April 8, 2024 00:05

[EraVM] Disable sp + reg + imm addressing

eaf56af

This patch is a joint work by: Dmitry Borisenkov <[email protected]> Vladimir Radosavljevic <[email protected]>

[EraVM] Precommit alias analysis tests

191d328

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Add simple implementation of AliasAnalysis

6ab7a3d

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Support modeling storage operations as memory

5207ac6

Signed-off-by: Vladimir Radosavljevic <[email protected]>

NFC Create tickets for all untracked TODO and FIXME

b8e3624

Also remove TODO and FIXME that are no longer needed.

[EraVM] Remove sstore and sload intrinsics

0d916c5

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Tweak Loop Unroll parameters

0516e60

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Change the __sha3 offset argument to a pointer.

8189acd

[EraVM] Always inline functions with one call site

f005d77

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Adding __sha3 constant folding.

9e429f1

[EraVM] Fix lowering of SDIV and SREM nodes

1976bb2

When lowering SDIV and SREM nodes, we are generating UDIV and UREM nodes with two outputs, instead of one.
Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Fix constant DAG nodes for MVT::fatptr

cbbaafa

Substitute MVT::fatptr with MVT::i256 when creating a constant DAG node. To be squashed with ed7d384 when updating LLVM, to reduce local change noise.

[EraVM] Register existing backend passes with LLVM pass manager

c48231d

This allows -print-after-all/-print-before-all to dump MIR around these passes.
Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Precommit test for MemCpy optimization

a41025e

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Don't generate memmove in MemCpy dependence optimization

96fb0be

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Add InaccessibleMemOnly attribute to some of the context intr…

45d565c

…insics Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Enable AliasAnalysis for all location sizes

19c2f13

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM][Codegen] Optimize select instruction coalescing

7df5d1e

* PostRA pass to fold conditional move with operand def instruction
* added IR tests and MIR tests

[EraVM] Add a pass to prepare the indexed memops

86ab8a3

This pass intends to recognize memops that can be optimized into indexed memops and then rewrites them to favor the subsequent indexed memops combine pass.

[EraVM] Implement isConstantPhysReg

68f4703

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Allow backend passes to optimize LOADCONST

f38fc13

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Optimize select sequence by tying src to dst regs

1502906

To emit an expanded SELECT instruction efficiently, it is best to coalesce a source register with the destination to avoid a redundant copy. This is done by tying the operands so that the register allocator assigns them the same register.

[EraVM] Implement __system_request

e7d73e5

NFC Cleanup EraVM code generator

7e96fb4

[EraVM] Fix EraVMIndexedMemOpsPrepare pass

38bd2a3

The pass only worked correctly if the loop body was a latch; otherwise a phi with a wrong incoming BB was created. The patch changes the incoming BB to the actual latch. It also ensures that a loop has only one latch.

[LLVM] Suppress linter in APInt destructor

685173a

[EraVM] Support ISA 1.5 instructions

af4b63c

Support log.decommit, tload, tstore.

[EraVM] Hoist flag setting instructions

f18ed4a

Signed-off-by: Vladimir Radosavljevic <[email protected]>

[EraVM] Add a pass to fold SELECT with input value zero

b911b11

When one of SELECT's input operands is zero, we can fold it into its sole user to improve performance. The tie is used to ensure the two register operands in the user instruction are allocated to the same register.
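As a rough illustration of the idea, consider a SELECT whose false value is zero and whose sole user is an addition. This is an assumed example: the commit message does not say which user instructions are handled, and `unfolded` and `folded` are hypothetical names.

```python
MASK = (1 << 256) - 1  # assumed 256-bit wrap-around arithmetic


def unfolded(cond, x, y):
    """SELECT with a zero input, followed by its sole user (an add)."""
    t = x if cond else 0
    return (y + t) & MASK


def folded(cond, x, y):
    """Predicated add: the result starts out as y and is updated only if cond."""
    r = y                      # result shares a register with the y operand
    if cond:
        r = (r + x) & MASK     # the user instruction executes conditionally
    return r


for cond in (False, True):
    for x, y in [(1, 2), (MASK, 1), (0, 0)]:
        assert unfolded(cond, x, y) == folded(cond, x, y)
```

In the folded form the result must start out equal to `y`, which suggests why the two register operands need to be tied to the same register.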

Enna1 and others added 28 commits May 27, 2024 00:07

[GVN] lazily update dominator tree when merge unconditional branches …

5e05559

…in function (#88477) With this change, the wall time of the GVN pass decreased from 873,745.492 ms to 367,375.304 ms in one of our internal test cases.

[EraVM][MC] Misc test fixes

80a9664

[EraVM] Add pre-commit tests for Fold select with overflow intrinsics

5237ccc

Add tests for folding select with overflow intrinsics where we shouldn't attempt to invert the overflow LT caused by uaddo and umulo. Co-authored-by: Alan Li <[email protected]> PR: #521, Issue: #491.

[EraVM][LLD] Add test on misaligned relocations

9593d53

ci: fix issues with forks and add build binaries workflow

aeeec90

[EraVM][MC] Clarify usage of MemOperandKind and EncodedOperandMode enums

7542703

Explicitly specify EncodedOperandMode type and only convert it to int when necessary. Introduce MemOperandKind::OperandInvalid to not implicitly use OperandCode as invalid stack operand kind.

[EraVM][MC] Reuse analyzeMCOperandsStack for adjusting opcode on enco…

3322c22

…ding

ci: disable build job completely if required is false

10a60db

[skip ci]

ci: use per-commit difference for formatting and clang-tidy

c06812c

[skip ci]

[EraVM] Refactor mapping between different input operand kinds

b4606e5

* Collapse OperandAddrMode and OperandAM fields of IBinary class
* Use named constants instead of numbers
* Drop synthetic Value field from OperandAddrModeValue class (renamed to SrcOperandMode)

[EraVM][MC] Remove dead code here and there

2d06889

[EraVM] Remove Value field from OpcodeEncoding

a5289f2

This field's value has no particular meaning in either TableGen or C++ and can be autogenerated if GenericEnum is ever desired.

[EraVM] Remove unused OpcodeEncodings from EraVMOpcodes.td

ad59863

[EraVM] Do not set Uses=[Flags] unconditionally in JCl

c359618

atrosinenko force-pushed the eravm-mc/refine-uses-flags branch from 9d80fd6 to c359618 Compare June 12, 2024 15:55

akiramenai force-pushed the main branch from 4663b02 to 9ec5f52 Compare May 6, 2025 08:54
