Skip to content

Conversation

@romainbrenguier
Copy link
Contributor

If a byte_extract operation is added to the lhs, then the invert
operation must be performed on the skeleton so that
new_skeleton[new_lhs] is equivalent to skeleton[lhs].

This requires adding a method revert_byte_extract on skeletont.
Unit tests are added for this method and for convert assign symbol.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • [na] The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • [na] My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

The function only has an effect if the byte_extract expression can be
simplified.
So there is no point for the caller to call it with do_simplify=false.
Since the type of expression to simplify is known, we avoid several
intermediary calls by calling directly the function that would
eventually be called.
If the expression e itself is nil then it would be incorrect to return a
pointer to that, so we special case that and return nullptr to denote
failure.
This makes it clear from the start of the function that this will only
modify byte_update_expressions.
The name of the variable does not add any information to the reader.
Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I please ask for a an explanation why we shouldn't instead just revert #4841? If a bugfix requires +584/-94 changes then we've got a serious problem in our architecture. This doesn't scale.

@romainbrenguier romainbrenguier force-pushed the bugfix/shift-indexed-access-to-lhs branch from a7a7542 to a5971b2 Compare July 12, 2019 12:42
This allows checking we are replacing the missing part by an appropriate
one.
This will also make some manipulation easier, like reverting a byte
extract operation on a member_expr skeleton.
This is meant to revert the effect of a byte extract on a skeleton.
This can be used in symex when assignments are transformed, by switching
byte_update/extract operations from one side of the assignment to the other.
Function defined in the header should be marked inline (except for
template which are inline by default).
Tests we function works as expected on skeleton containing member and
index access.
If a byte_extract operation is added to the lhs, then the revert
operation must be performed on the skeleton so that
new_skeleton[new_lhs] is equivalent to skeleton[lhs].
This tests the case where a byte_extract expression is assigned.
Since we change the lhs it does not make sense to keep the same skeleton
and we have to start from a fresh one.
@romainbrenguier romainbrenguier force-pushed the bugfix/shift-indexed-access-to-lhs branch from a5971b2 to 504262b Compare July 12, 2019 13:31
@romainbrenguier
Copy link
Contributor Author

@tautschnig

Could I please ask for a an explanation why we shouldn't instead just revert #4841? If a bugfix requires +584/-94 changes then we've got a serious problem in our architecture. This doesn't scale.

The problem is not with #4841, this was a refactoring to makes things clearer. The bug was already there before. It was introduced when shift_indexed_access_to_lhs was added, but there are other places with similar problems.

@tautschnig
Copy link
Collaborator

The problem is not with #4841, this was a refactoring to makes things clearer. The bug was already there before. It was introduced when shift_indexed_access_to_lhs was added, but there are other places with similar problems.

So which bug does it fix? I cannot seem to see a single regression test. I think this is the third PR on the expr_skeletont stuff, which tells me that we are not testing this properly.

byte_update.offset(),
byte_update.value().type()},
ns);
exprt byte_extract = simplify_exprt{ns}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not -- simplify can decide how to do whatever simplification seems appropriate; we shouldn't lift the lid in one place.

@smowton
Copy link
Contributor

smowton commented Jul 15, 2019

After staring at this for a while, thoughts:

  1. shift_indexed_access_to_lhs's documentation is not quite right: it seems designed to usually turn x = byte_update y offset z <- w into x = x with .field = (x.field with x.field[1] = w) or similar. That's assuming the while(byte_extract.id() == ID_index || byte_extract.id() == ID_member) loop is expected to terminate when it hits the original x symbol again, and it can't generate something like (byte_extract struct A from x offset (z-2)).field. I'm quite surprised there are situations where the byte_extract operation transferred to the LHS will simplify into array-cell or field accesses but the byte_update operation on the RHS won't simplify into with expressions! @tautschnig ?
  2. Regardless of what shift_indexed_access_to_lhs is doing, it doesn't affect the real lhs (the one recorded in the GOTO program), which is I think what full_lhs is supposed to be tracking. Therefore to my mind it should not change the skeleton.
  3. The same line of reasoning applies to rewrite_with_to_field_symbols -- yes, that should turn x = x with .field = value into x..field = value, but again the actual operation happening at GOTO level is unchanged, only the way symex is encoding it has changed. Therefore I think this function should also not touch the skeleton.
  4. On the other hand, line const exprt l2_full_lhs = state.rename(assignment.original_lhs_skeleton.apply(l2_lhs), ns).get(); in symex_assign.cpp seems highly suspicious -- during the symex_assign_rec loop we might have recorded various index and member operations which were undone by rewrite_with_to_field_symbols, then we proceed the blithely stick the ssa expressions together regardless. I think all the dancing around with the skeleton in rewrite_with_to_field_symbols and shift_indexed_access_to_lhs is just trying to make that step go as we hope it might, and instead we should take l2_lhs's root symbol, which should be the same as that of the lhs passed into assign_non_struct_symbol, then rename as at present.

To sum up: l2_full_lhs should be tracking the GOTO-level operation, not how symex chooses to represent it, therefore it should take no part in the field-sensitivity dance in symex_assignt::assign_non_struct_symbol. I'll put together an alternative PR tomorrow morning to test this solution.

@smowton
Copy link
Contributor

smowton commented Jul 17, 2019

Here's my rival PR: #4917

Long story short: do all the LHS rvalue renaming and simplifying right at the start of symex_assign, that way we can delete shift_indexed_access_to_lhs and rewrite_with_to_field_symbols entirely and thus the skeleton manipulations get much simpler.

@romainbrenguier
Copy link
Contributor Author

#4917 makes this irrelevant

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants