
Reusing the register or variable that acted as an index prevents array indexing simplification. #4749

@lwerdna

Version and Platform (required):

Latest

Bug Description:

Array indexing doesn't work in certain circumstances.

Steps To Reproduce:

Open the included binary: hello.zip

Go to symbol array at 0x90e8 and set its type to uint32_t array[0x10].

Now visit works0(), fails0(), works1(), and fails1() and in their code, set the reference to 0x90e8 to array type (hotkey "o"). It should now change to symbol "array".

In the works*() functions, it should show up as array[arg1].

But in the fails*() functions, it shows up as *(&array + <reg>).

Expected Behavior:

They should all show the nice array[arg1].

Screenshots:

image

Additional Information:

My attempts at simplifying revealed two separate causes of the array index deref failing:

  1. the array index gets written somewhere
  2. the array value doesn't get used

Peter figured out that both cases are subsumed by a more general one: the array index
is used more than once. My test code for cause #2 didn't return anything explicitly, so r0 was returned. As the first argument, r0 held the index, so the return counted as a second use.

Why is it so important that it not get used again? Because if it's used exactly once,
the expression can be "folded" (inlined). For example:

exprA = arg1#0 << 2
exprB = *(0x91a4 + exprA)
...can become...
exprB = *(0x91a4 + (arg1#0 << 2))

And that right-hand side, a const pointer plus an amount that is a multiple
of the array member size, fits an array index pattern, so it can become 0x91a4[arg1].
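
To make the folding step concrete, here is a minimal sketch in Python. It models expressions as nested tuples, which is purely an illustrative assumption and not Binary Ninja's HLIL representation; `substitute` and `match_array_index` are hypothetical helper names, and the element size of 4 comes from the uint32_t type used in the reproduction steps.

```python
# Toy expression trees as nested tuples (NOT Binary Ninja's real HLIL classes):
#   ("const", value), ("var", name), ("shl", lhs, rhs), ("add", lhs, rhs), ("deref", expr)

ELEM_SIZE = 4  # sizeof(uint32_t), the element type from the report


def substitute(expr, name, replacement):
    """Return expr with every ("var", name) node replaced by replacement."""
    if expr[0] == "var":
        return replacement if expr[1] == name else expr
    if expr[0] == "const":
        return expr
    return (expr[0],) + tuple(substitute(e, name, replacement) for e in expr[1:])


def match_array_index(expr, elem_size=ELEM_SIZE):
    """Match *(const + (index << k)) where (1 << k) == elem_size.

    Returns (base_address, index_expr) on a match, otherwise None.
    """
    if expr[0] != "deref" or expr[1][0] != "add":
        return None
    _, base, offset = expr[1]
    if base[0] != "const" or offset[0] != "shl":
        return None
    _, index, shift = offset
    if shift[0] != "const" or (1 << shift[1]) != elem_size:
        return None
    return base[1], index


# exprA = arg1 << 2
expr_a = ("shl", ("var", "arg1"), ("const", 2))
# exprB = *(0x91a4 + exprA)
expr_b = ("deref", ("add", ("const", 0x91A4), ("var", "exprA")))

# Without folding, the offset is an opaque variable and the pattern fails.
unfolded = match_array_index(expr_b)
# After folding exprA into exprB, the shift is visible and the pattern matches.
folded = match_array_index(substitute(expr_b, "exprA", expr_a))
```

In this toy model `unfolded` is None, while `folded` carries the base address 0x91a4 and the index variable arg1, which corresponds to the 0x91a4[arg1] rendering.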

Enforcement of the one use occurs in highlevelilastoptimizer.cpp:

  		if (uses.size() == 1)
  		{
  			// Destination of the assignment is used in exactly one place, this assignment
  			// can be folded into the instruction that uses it
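
The effect of that single-use check can be sketched with a toy use counter. This is only an illustration under assumed names (`count_uses`, `foldable`, and the tuple encoding are hypothetical, and the "ret" node stands in for the implicit return of r0); it does not reproduce the logic in highlevelilastoptimizer.cpp.

```python
# Toy expression trees as nested tuples (NOT Binary Ninja's real HLIL classes):
#   ("const", value), ("var", name), ("add", lhs, rhs), ("deref", expr), ("ret", expr)


def count_uses(expr, name):
    """Count how many times the variable `name` is read inside expr."""
    if expr[0] == "var":
        return 1 if expr[1] == name else 0
    if expr[0] == "const":
        return 0
    return sum(count_uses(e, name) for e in expr[1:])


def foldable(name, statements):
    """Mirror the single-use rule: fold only if `name` is read exactly once."""
    return sum(count_uses(s, name) for s in statements) == 1


load = ("deref", ("add", ("const", 0x91A4), ("var", "exprA")))

# works*(): the scaled index is read only by the load.
works_body = [load]
# fails*(): the same variable is read again, here modeled as a return.
fails_body = [load, ("ret", ("var", "exprA"))]

works_ok = foldable("exprA", works_body)
fails_ok = foldable("exprA", fails_body)
```

Under this model the second read in `fails_body` pushes the use count to two, so the assignment is not folded and the array index pattern never gets a chance to match.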

I think the conversion to array index syntax happens in highlevelilsimplifier.cpp:

 HighLevelILSimplifier::Simplify
 HighLevelILSimplifier::SimplifyAdd

There are at least two ways to solve this:

  1. Have another analysis phase that, instead of folding/inlining only expressions that are
    used once, also does it for expressions that are used more than once. Peter called this
    duplication, and it has the obvious follow-on problem of deciding what a reasonable
    threshold is.
  2. Insert another array index pattern, one that permits a const pointer with addition
    of a variable, where that variable must resolve to an array index pattern, recursively.

Approach #2 is being pursued now.
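
A rough sketch of what approach #2 could look like, again on a toy tuple encoding rather than Binary Ninja's real HLIL. The names `scaled_index` and `match_array_index`, and the `defs` map from variable name to its defining expression, are assumptions made for illustration. The sketch also assumes each variable has a single, acyclic definition (as in SSA form); a cyclic `defs` map would recurse forever.

```python
# Toy expression trees as nested tuples (NOT Binary Ninja's real HLIL classes):
#   ("const", value), ("var", name), ("shl", lhs, rhs), ("add", lhs, rhs), ("deref", expr)

ELEM_SIZE = 4  # sizeof(uint32_t), the element type from the report


def scaled_index(expr, defs, elem_size=ELEM_SIZE):
    """Return the index expression if expr resolves to (index << k) with
    (1 << k) == elem_size, following variable definitions recursively."""
    if expr[0] == "shl" and expr[2][0] == "const" and (1 << expr[2][1]) == elem_size:
        return expr[1]
    if expr[0] == "var" and expr[1] in defs:
        # The offset is a variable: look through its definition instead of
        # requiring that it was folded into the expression beforehand.
        return scaled_index(defs[expr[1]], defs, elem_size)
    return None


def match_array_index(expr, defs, elem_size=ELEM_SIZE):
    """Match *(const + offset) where offset resolves to a scaled index."""
    if expr[0] != "deref" or expr[1][0] != "add":
        return None
    _, base, offset = expr[1]
    if base[0] != "const":
        return None
    index = scaled_index(offset, defs, elem_size)
    return None if index is None else (base[1], index)


# exprA = arg1 << 2; tmp = exprA (one extra hop to exercise the recursion)
defs = {
    "exprA": ("shl", ("var", "arg1"), ("const", 2)),
    "tmp": ("var", "exprA"),
}

direct = match_array_index(("deref", ("add", ("const", 0x91A4), ("var", "exprA"))), defs)
via_tmp = match_array_index(("deref", ("add", ("const", 0x91A4), ("var", "tmp"))), defs)
unscaled = match_array_index(("deref", ("add", ("const", 0x91A4), ("var", "arg1"))), defs)
```

Because the matcher looks through definitions rather than rewriting them, the number of other uses of exprA no longer matters here. A plain `arg1` offset, which has no scaled definition, still fails to match.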

Labels

Core: HLIL (Issue involves High Level IL)
Effort: Low (Issues require < 1 week of work)
Impact: High (Issue adds or blocks important functionality)
