Description
Version and Platform (required):
Latest
Bug Description:
Array indexing doesn't work in certain circumstances.
Steps To Reproduce:
Open the included binary: hello.zip
Go to symbol array at 0x90e8 and set its type to uint32_t array[0x10].
Now visit works0(), fails0(), works1(), and fails1() and in their code, set the reference to 0x90e8 to array type (hotkey "o"). It should now change to symbol "array".
In the works*() functions, it should show up as array[arg1].
But in the fails*() functions, it shows up as *(&array + <reg>).
Expected Behavior:
They should all show the nice array[arg1].
Screenshots:
Additional Information:
My attempts at simplifying revealed two separate causes of the array index deref failing:
- the array index gets written somewhere
- the array value doesn't get used
Peter figured out that these cases are subsumed by a more general case: the array index
is getting used more than once. My test code for cause #2 didn't return anything explicitly, so r0 was returned. Since r0 held the index (by virtue of it being the first argument), that return counted as a second use.
Why is it so important that it not get used again? Because if it's used exactly once,
the expression can be "folded" (inlined). For example:
exprA = arg1#0 << 2
exprB = *(0x91a4 + exprA)
...can become...
exprB = *(0x91a4 + (arg1#0 << 2))
And that right-hand side (a const pointer plus an amount that is a multiple
of the array member size) fits an array index pattern, so it can become 0x91a4[arg1].
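To make the single-use rule concrete, here is a minimal Python sketch of the folding step on a toy expression tree. It is not Binary Ninja's implementation: the tuple encoding and the helper names (count_uses, substitute, fold_single_use) are hypothetical and exist only for illustration.

```python
# Toy model only; these are NOT Binary Ninja's real HLIL data structures.
# Expressions are nested tuples:
#   ("var", name), ("const", value), ("shl", a, b), ("add", a, b), ("deref", a)

def count_uses(expr, name):
    """Count how many times variable `name` appears inside `expr`."""
    if expr[0] == "var":
        return 1 if expr[1] == name else 0
    if expr[0] == "const":
        return 0
    return sum(count_uses(sub, name) for sub in expr[1:])

def substitute(expr, name, replacement):
    """Return a copy of `expr` with every use of `name` replaced."""
    if expr[0] == "var":
        return replacement if expr[1] == name else expr
    if expr[0] == "const":
        return expr
    return (expr[0],) + tuple(substitute(sub, name, replacement) for sub in expr[1:])

def fold_single_use(defs, name):
    """Inline defs[name] into its user, but only if it is used exactly once."""
    total = sum(count_uses(e, name) for n, e in defs.items() if n != name)
    if total != 1:
        return defs  # zero or multiple uses: leave the assignment alone
    replacement = defs[name]
    return {n: substitute(e, name, replacement) for n, e in defs.items() if n != name}

defs = {
    "exprA": ("shl", ("var", "arg1"), ("const", 2)),
    "exprB": ("deref", ("add", ("const", 0x91A4), ("var", "exprA"))),
}
folded = fold_single_use(defs, "exprA")
```

With one use, exprA disappears and exprB becomes the dereference of a constant plus a shifted argument, which is the shape the array index pattern expects. Adding a second user of exprA makes fold_single_use return the definitions unchanged, which mirrors the failing cases.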
Enforcement of the one use occurs in highlevelilastoptimizer.cpp:
if (uses.size() == 1)
{
// Destination of the assignment is used in exactly one place, this assignment
// can be folded into the instruction that uses it
The transfer to the array index syntax, I think, happens in highlevelilsimplifier.cpp:
HighLevelILSimplifier::Simplify
HighLevelILSimplifier::SimplifyAdd
There are at least two ways to solve this:
- Have another analysis phase that, instead of folding/inlining only expressions that are
used once, also does it for expressions that are used more than once. Peter called this
duplication, and it has the obvious follow-on problem of deciding what a reasonable
threshold is.
- Insert another array index pattern, one that permits a const pointer with addition
of a variable, where that variable must resolve to an array index pattern, recursively.
Approach #2 is being pursued now.
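As a rough illustration of the second approach, the sketch below matches a dereference of (const pointer + variable) by looking through the variable's definition instead of requiring it to be folded first. It is a toy model under stated assumptions, not the actual HighLevelILSimplifier code: the tuple encoding, the names scaled_index and match_array_index, and the acyclic defs mapping are all hypothetical.

```python
# Toy model only; NOT Binary Ninja's real simplifier.
# Expressions are nested tuples:
#   ("var", name), ("const", value), ("shl", a, b), ("mul", a, b),
#   ("add", a, b), ("deref", a)

ELEM_SIZE = 4  # sizeof(uint32_t), matching the uint32_t array in this report

def scaled_index(expr, defs, elem_size=ELEM_SIZE):
    """Return the index expression if `expr` is index * elem_size, else None.

    Assumes `defs` (variable name -> defining expression) is acyclic.
    Only an exact scale by elem_size is handled in this sketch.
    """
    kind = expr[0]
    if kind == "shl" and expr[2][0] == "const" and (1 << expr[2][1]) == elem_size:
        return expr[1]
    if kind == "mul" and expr[2][0] == "const" and expr[2][1] == elem_size:
        return expr[1]
    if kind == "var" and expr[1] in defs:
        # Recursive step: resolve the variable through its definition,
        # regardless of how many other places use it.
        return scaled_index(defs[expr[1]], defs, elem_size)
    return None

def match_array_index(expr, defs):
    """Match *(const + x) where x resolves to index * elem_size.

    Returns (base_address, index_expr) on success, None otherwise.
    """
    if expr[0] != "deref" or expr[1][0] != "add":
        return None
    base, offset = expr[1][1], expr[1][2]
    if base[0] != "const":
        return None
    index = scaled_index(offset, defs)
    return (base[1], index) if index is not None else None

defs = {"exprA": ("shl", ("var", "arg1"), ("const", 2))}
expr_b = ("deref", ("add", ("const", 0x91A4), ("var", "exprA")))
match = match_array_index(expr_b, defs)
```

Here exprA is never inlined, so it does not matter how many times it is used elsewhere; the matcher still recovers the base address and arg1 as the index. An offset that does not resolve to a scaled index is rejected.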
