Skip to content

Pointer_offset misuse in value_set_dereference.cpp? #188

Closed
@smowton

Description

@smowton

It appears the logic in value_set_dereferencet::build_reference_to can produce incorrect pointer_offset expressions, leading to wrong offsets in array accesses.

Example:

struct java_style_array {
  int length;
  void** data;
}

data will sometimes be treated generically (as void**) and sometimes with a specific type (e.g. struct java.lang.String**.

Array accesses are currently generated using expressions like:

(java.lang.String*)*(array->data + 2) (pretty-printed as array->data[2])

Suppose the symex phase identifies the actual allocation stored in data as dynamic_1_array and the struct as a whole dynamic_object_2. Then in response to this access it will notice (a) the actual-allocation and reference types don't match, so it must use the byte-extract path at value_set_dereference.cpp:585, and (b) the actual referent is dynamic_1_array.

So far so good: we'd like to see it emit something like (java.lang.String*)dynamic_1_array[2] or less optimistically byte_extract(java.lang.String*, 2 * sizeof(java.lang.String*), dynamic_1_array).

Instead of that offset however we get pointer_offset(dynamic_object_2.data + 2). This is incorrect, as it evaluates to pointer_offset(dynamic_object_2.data) + (2 * sizeof(void*)) = 24, producing an off-by-one error indexing the array.

In its most general terms, I think the problem is that the value-set analysis can walk across object boundaries, thus mapping array.data[2] to the underlying allocation for data not array, but the pointer_offset expression built at value_set_dereference.cpp:590 does not take this into account. In this case it needed to follow the recursive walk over the expression performed by value_sett::get_value_set_rec et al and note that since we had:

array -> dynamic_object_2
array->data -> dynamic_1_array
array->data[2] -> dynamic_1_array

...then we need to output dynamic_1_array + 2 rather than what amounts to dynamic_1_array + (&array->data - array) + 2.

My test cases for this are based on my work in progress on Java arrays, so can't be run against master, but I will follow up with a C test-case if I can reproduce the same behaviour there.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions