[EraVM] Do not set Uses=[Flags] unconditionally in JCl #558
Draft
atrosinenko wants to merge 786 commits into main from eravm-mc/refine-uses-flags
+96,611
−16,497
This patch is a joint work by: Dmitry Borisenkov <[email protected]> Vladimir Radosavljevic <[email protected]>
This pass tries to replace a stdlib call with either a call to a more efficient algorithm or an optimized sequence of instructions. Currently, it handles only __exp(2^m, exp) calls and replaces them with calls to __exp_pow2(), which just computes the value 1 << (m * exp).
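The replacement relies on the identity (2^m)^exp = 2^(m * exp). A minimal Python sketch of that equivalence follows, assuming 256-bit wrapping arithmetic. The names `exp_generic` and `exp_pow2` are hypothetical stand-ins and not the actual stdlib `__exp` and `__exp_pow2` implementations; returning zero once the shift reaches the word width is also an assumption of this sketch.

```python
WORD_BITS = 256
MASK = (1 << WORD_BITS) - 1


def exp_generic(base, exponent):
    """Square-and-multiply exponentiation modulo 2**256 (stand-in for a generic exp)."""
    result = 1
    base &= MASK
    while exponent > 0:
        if exponent & 1:
            result = (result * base) & MASK
        base = (base * base) & MASK
        exponent >>= 1
    return result


def exp_pow2(m, exponent):
    """Compute (2**m)**exponent as a single shift, wrapping to zero on overflow."""
    shift = m * exponent
    if shift >= WORD_BITS:
        return 0  # every set bit is shifted out of the 256-bit word
    return 1 << shift
```

For example, `exp_pow2(8, 3)` and `exp_generic(1 << 8, 3)` both yield `1 << 24`, but the former needs one multiplication and one shift instead of a loop.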
Signed-off-by: Vladimir Radosavljevic <[email protected]>
Signed-off-by: Vladimir Radosavljevic <[email protected]>
Signed-off-by: Vladimir Radosavljevic <[email protected]>
Also remove TODO and FIXME that are no longer needed.
Signed-off-by: Vladimir Radosavljevic <[email protected]>
Signed-off-by: Vladimir Radosavljevic <[email protected]>
Signed-off-by: Vladimir Radosavljevic <[email protected]>
When lowering SDIV and SREM nodes, we are generating UDIV and UREM nodes with two outputs, instead of one. Signed-off-by: Vladimir Radosavljevic <[email protected]>
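One common way to express signed division through an unsigned divide that yields both quotient and remainder is to divide the magnitudes and fix the signs afterwards. The Python sketch below illustrates that idea, assuming 256-bit two's complement words, truncation toward zero, and a nonzero divisor. The helper names are hypothetical, and this is not the actual DAG lowering code.

```python
WORD_BITS = 256
MASK = (1 << WORD_BITS) - 1
SIGN_BIT = 1 << (WORD_BITS - 1)


def udivrem(a, b):
    """Unsigned division producing both results at once: (quotient, remainder)."""
    return a // b, a % b


def negate(x):
    """Two's complement negation within a 256-bit word."""
    return (-x) & MASK


def sdivrem(a, b):
    """Signed division built on udivrem; a and b are 256-bit two's complement words."""
    a_neg = bool(a & SIGN_BIT)
    b_neg = bool(b & SIGN_BIT)
    q, r = udivrem(negate(a) if a_neg else a, negate(b) if b_neg else b)
    if a_neg != b_neg:
        q = negate(q)  # quotient is negative when the operand signs differ
    if a_neg:
        r = negate(r)  # remainder takes the sign of the dividend
    return q, r
```

Here both the signed quotient and the signed remainder come from a single two-output unsigned operation.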
Substitute MVT::fatptr with MVT::i256 when creating a constant DAG node. To be squashed with ed7d384 when updating LLVM, to reduce local change noise.
This allows -print-after-all/-print-before-all to dump MIR around these passes. Signed-off-by: Vladimir Radosavljevic <[email protected]>
Signed-off-by: Vladimir Radosavljevic <[email protected]>
Signed-off-by: Vladimir Radosavljevic <[email protected]>
…insics Signed-off-by: Vladimir Radosavljevic <[email protected]>
Signed-off-by: Vladimir Radosavljevic <[email protected]>
* PostRA pass to fold conditional move with operand def instruction
* added IR tests and MIR tests
This pass intends to recognize memops that can be optimized to indexed memops and then rewrite them to favor the subsequent indexed memops combine pass.
Signed-off-by: Vladimir Radosavljevic <[email protected]>
Signed-off-by: Vladimir Radosavljevic <[email protected]>
To emit the expanded SELECT instruction efficiently, it is best to coalesce a source register with the destination to avoid a redundant copy. This is done by tying the two operands so that the register allocator assigns them the same register.
The pass only worked correctly if the loop body was also the latch; otherwise a phi with the wrong incoming BB was created. The patch changes the incoming BB to the actual latch. It also ensures that the loop has only one latch.
Support log.decommit, tload, tstore.
Signed-off-by: Vladimir Radosavljevic <[email protected]>
When one of SELECT's input operands is zero, we can fold it with its sole user to improve performance. The tie is used to ensure that the two register operands in the user instruction are allocated to the same register.
…in function (#88477) With this change, the wall time of the GVN pass decreased from 873,745.492 ms to 367,375.304 ms in our internal test case.
Make R1 register be understood as the default operand of `RETrl` and `REVERTrl` instructions (but make this alias non-default for printing until switched to the new asm syntax). Change panic instruction to either accept jump target operand or none at all.
In preparation for renaming of assembler mnemonics, split `EraVMOpcode` definitions for ret, revert and panic instructions into "without label" and "with label" variants, so their mnemonics can be specified independently.
Add tests for folding select with overflow intrinsics where we shouldn't attempt to invert the overflow LT condition caused by uaddo and umulo. Co-authored-by: Alan Li <[email protected]> PR: #521, Issue: #491.
After folding, the select instructions use the overflow condition code, which is actually LT; this normally gives a shorter code sequence. Note that GE can be used as the reversed condition code for overflow LT only when the overflow is caused by usubo, not by uaddo or umulo. Co-authored-by: Alan Li <[email protected]> PR: #521, Issue: #491.
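The asymmetry between usubo and uaddo/umulo can be illustrated with a small Python model. It assumes a flag model that the commit message does not spell out: LT is set on overflow, EQ is set when the result is zero, GT is set when neither holds, and GE is read as "GT or EQ". All function names here are hypothetical.

```python
WORD_BITS = 256
MASK = (1 << WORD_BITS) - 1


def flags(result, overflow):
    """Assumed flag model: LT on overflow, EQ on zero result, GT when neither holds."""
    lt = overflow
    eq = result == 0
    gt = not lt and not eq
    return lt, eq, gt


def uaddo_flags(a, b):
    """Flags after an unsigned add that wraps at 256 bits."""
    total = a + b
    return flags(total & MASK, total > MASK)


def usubo_flags(a, b):
    """Flags after an unsigned subtract that wraps at 256 bits."""
    return flags((a - b) & MASK, a < b)


def ge(lt, eq, gt):
    """GE condition read as 'GT or EQ'."""
    return gt or eq
```

Under this model, a subtraction that underflows (`a < b`) can never produce a zero result, so LT and EQ are mutually exclusive and GE is exactly "not LT". An addition such as `uaddo_flags(MASK, 1)` overflows and also produces zero, so LT and GE hold at the same time and GE cannot serve as the reversal of LT. The same can happen for a multiplication whose wrapped result is zero.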
Make NOP support the remaining combinations of operands. Reorganize the tests: add `nop.(s|txt)` test files similar to those for arithmetic instructions, put the test cases for `nop`, `incsp` and `decsp` aliases to `sp-changes.(s|txt)` files.
Simplify the syntax for specifying both "static" and "shard" modifiers of a far call: either `mnemonic.st.sh` or `mnemonic.sh.st`. Please note that ".st" and ".sh" are technically just parts of the mnemonic's name, so they should go before a predicate, if any.
Clarify the `@fat.ptr.call` test case before renaming the mnemonics. The instructions actually generated for the `@fat.ptr.call` function are
    ptr.add r2, r0, r1
    near_call r0, @fat.ptr.arg, @DEFAULT_UNWIND
    ret
Before renaming `ptr.add` to `addp`, the first line was accidentally matched by the `add r2, r0, r1` substring.
Rename RETURN and REVERT pseudos to make their purpose easier to understand. Remove InstAlias for the panic instruction. Since the register operand of the panic instruction was removed, it virtually became a MnemonicAlias, so update the definition of OpPanic instead, as no alias is needed in the new syntax.
Reuse the same `parseMCOperandsCode` and `parseMCOperandsStack` functions that are used by the code emitter to analyze operands of MCInst. While this makes it easier to improve the formatting of printed assembly code, this commit strives to keep the existing formatting as much as possible, so as not to combine refactoring with massive changes to tests.
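The accidental match can be reproduced with a short Python sketch, assuming the test check behaves like a plain substring search over each output line. The helper names are hypothetical and do not correspond to any real test tool API.

```python
import re

# Instructions generated for @fat.ptr.call, as quoted in the commit message.
generated = [
    "ptr.add r2, r0, r1",
    "near_call r0, @fat.ptr.arg, @DEFAULT_UNWIND",
    "ret",
]


def matches_substring(pattern, line):
    """Substring-style check: the pattern may start anywhere in the line."""
    return pattern in line


def matches_whole_mnemonic(pattern, line):
    """Stricter check: the pattern must not continue a longer mnemonic to its left."""
    return re.search(r"(?<![\w.])" + re.escape(pattern), line) is not None
```

With the old mnemonic, `matches_substring("add r2, r0, r1", generated[0])` is true even though the instruction is `ptr.add`, while the stricter check rejects it. Once the instruction is printed as `addp r2, r0, r1`, the substring `add r2, r0, r1` no longer occurs, which is why the test had to be clarified before the rename.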
Explicitly specify EncodedOperandMode type and only convert it to int when necessary. Introduce MemOperandKind::OperandInvalid to not implicitly use OperandCode as invalid stack operand kind.
Rename load/store as well as several special instructions to reflect their current names according to new syntax. Remove trailing "r" from several instructions that only have a fixed set of supported operands and would need "rr" or "rrr" suffix anyway.
Update `(EraVMchange_sp GR256:$sp_change)` pattern to select NOPrrs instead of NOPSPr (used by `llvm.restorestack` intrinsic). Remove `(EraVMchange_sp imm16:$sp_change)` pattern as it is currently unused and untested. Change the `EraVMchange_sp` name to more descriptive `EraVMadd_to_sp`.
If looking for a miscompile revert candidate, look here! The transform being enabled prefers comparing to a loop invariant exit value for a secondary IV over using an otherwise dead primary IV. This increases register pressure (by requiring the exit value to be live through the loop), but reduces the number of instructions within the loop by one. On RISC-V, which has a large number of scalar registers, this is generally a profitable transform. We lose the ability to use a beqz on what is typically a count-down IV, and pay the cost of computing the exit value on the secondary IV in the loop preheader, but save an add or sub in the loop body. For anything except an extremely short-running loop, or one with extreme register pressure, this is profitable. On spec2017, we see a 0.42% geomean improvement in dynamic icount, with no individual workload regressing by more than 0.25%. Code-size-wise, we trade a (possibly compressible) beqz and a (possibly compressible) addi for an uncompressible beq. We also add instructions in the preheader. The net result is a slight regression overall, but neutral or better inside the loop. Previous versions of this transform had numerous corner-case correctness bugs. All of the ones I could spot by inspection have been fixed, and I have run this through all of spec2017, but there may be further issues lurking. Adding uses to an IV is a fraught thing to do given poison semantics, so this transform is somewhat inherently risky. This patch is a reworked version of D134893 by @eop. That patch has been abandoned since May, so I picked it up, reworked it a bit, and am landing it.
This patch makes `shouldFoldTerminatingConditionAfterLSR` return `true` for the EraVM target. As a result, LSR will try to replace the main induction variable of a loop with one of the secondary IVs, which reduces the number of instructions in the loop by eliminating the main IV increment. This transformation increases register pressure, but that is rarely a problem for EraVM. PR: #599, Issue: #580.
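The effect of folding the terminating condition can be sketched as a pair of equivalent loops. This Python illustration is only a model of the idea, assuming a nonzero `stride` and a non-negative `count`; the function and variable names are hypothetical.

```python
def sum_before(data, start, count, stride):
    """Primary IV `remaining` counts down; secondary IV `pos` walks the data."""
    total = 0
    pos = start
    remaining = count
    while remaining != 0:
        total += data[pos]
        pos += stride
        remaining -= 1  # primary IV update: one extra operation per iteration
    return total


def sum_after(data, start, count, stride):
    """Primary IV removed; the loop exits when `pos` reaches a loop-invariant value."""
    total = 0
    pos = start
    end = start + count * stride  # computed once before the loop, live throughout it
    while pos != end:
        total += data[pos]
        pos += stride
    return total
```

The second form keeps `end` alive across the loop, which is the register pressure cost mentioned above, but drops the counter update from the loop body.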
* Collapse OperandAddrMode and OperandAM fields of IBinary class
* Use named constants instead of numbers
* Drop synthetic Value field from OperandAddrModeValue class (renamed to SrcOperandMode)
This field's value has no particular meaning in either TableGen or C++ and can be autogenerated if GenericEnum is ever desired.
Unlike mapping between different input operand kinds, the three other `InstrMapping`s only have two options each that are flipped by the mapper function. Remove the synthetic Value field from `DestAddrModeValue` (renamed to `DestOperandMode`) class. Keep Value fields in `mod_set_flags` and `mod_swap` as these fields are meaningful for opcode computation. Remove the `ToReg` vs. `ToRegReg` and `ToStack` vs. `ToStackReg` separation as it is not really used.
Perform a few preparations before removing repeated definitions of operands in `EraVMInstrInfo.td`:
* move the definitions of EraVM-specific `AsmOperandClass`es and `Operand`s, so that they can be used in `EraVMInstrFormats.td`
* define tablegen classes to be mixed into `Ixx_x` instruction classes
9d80fd6 to c359618
No description provided.