Skip to content

Conversation

@smowton
Copy link
Contributor

@smowton smowton commented Jul 17, 2019

The rough idea: instead of recursing over the structure of an assignment LHS, then performing post-hoc adjustments to permit field-sensitivity to have maximum effect, perform all expression simplification up to LHS field-sensitivity application right at the start of symex_assign. This both simplifies the symex_assign workflow, as shift_indexed_access_to_lhs and rewrite_with_to_field_symbols are no longer needed, and makes maintaining l2_full_lhs simpler, as the expression skeleton no longer needs to undergo any complicated transformations.

There is one down-side: with expressions not introduced by goto-symex are no longer handled. If they are only ever really produced within goto-symex then this is no loss -- the simplification and field-sensitivity application ahead of symex_assign achieves the same thing. If we expect to handle GOTO programs with with_exprt already present then we should do something a little cleverer.

This of course needs tests (or existing tests should be reviewed) to ensure we still have field-sensitivity in all the cases we intend.

Test PR for this + array-cell sensitivity: #4918


Long version: Rename LHS rvalues up front

This was already happening in most cases (e.g. symex_dereference would apply field-sensitivity
before symex_assign_rec was entered), but the case of a statically non-constant but dynamically
constant array index or byte-extract offset would only be handled after symex_assign_rec, leading
to an asymmetry in which operations were combined into the ssa_exprt and which ones accumulated in
the expr_skeletont.

With this change the LHS expression is maximally renamed and simplified before symex_assign_rec is
entered, which means that any field-sensitive symbols on the LHS are fully constructed as early as
possible, and expr_skeletont only accumulates those member, index and byte-extract operations which
cannot be associated with an ssa_exprt. This means there is no need to undo with-operations or
try to simplify byte-extract operations in goto_symext::assign_from_non_struct_symbol, as this has
been taken care of already, and the l2_full_lhs can be constructed from the expression skeleton
trivially.

@codecov-io
Copy link

Codecov Report

Merging #4917 into develop will decrease coverage by 0.02%.
The diff coverage is 91.25%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4917      +/-   ##
===========================================
- Coverage    69.21%   69.18%   -0.03%     
===========================================
  Files         1307     1306       -1     
  Lines       108019   108024       +5     
===========================================
- Hits         74763    74735      -28     
- Misses       33256    33289      +33
Impacted Files Coverage Δ
src/goto-symex/symex_assign.cpp 81.73% <100%> (-1.83%) ⬇️
src/goto-symex/goto_symex.cpp 96.77% <100%> (+0.34%) ⬆️
src/goto-symex/goto_symex_state.h 100% <100%> (ø) ⬆️
unit/goto-symex/symex_assign.cpp 100% <100%> (ø) ⬆️
src/goto-symex/goto_symex_state.cpp 92.59% <80.55%> (-1.55%) ⬇️
src/goto-symex/expr_skeleton.cpp 51.85% <0%> (-48.15%) ⬇️
src/util/simplify_expr.cpp 85.36% <0%> (-1.93%) ⬇️
src/solvers/lowering/byte_operators.cpp 83.84% <0%> (-0.14%) ⬇️
src/util/optional.h

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f6ea83c...167f562. Read the comment docs.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 167f562).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/119511318

@kroening
Copy link
Collaborator

I agree with the approach -- should this be rename_rhs, since 'rvalue' has a slightly different meaning since C++ has rvalue references? For symmetry, have rename_lhs?

@tautschnig ?

@smowton
Copy link
Contributor Author

smowton commented Jul 24, 2019

Added tests as #4944, which this PR passes. I regard this as ready to go once those tests are in.

@smowton smowton force-pushed the smowton/cleanup/rename-lhs-rvalues-up-front-on-develop branch from 167f562 to 86aacf6 Compare July 24, 2019 13:39
smowton added 6 commits July 25, 2019 09:18
In the process it is moved (without change) to renaming_level.h/cpp,
alongside the similarly-structured get_original_name functions.
This renames the rvalue components of an lvalue expression to L2, such as the index of an array
expression or the offset involved in byte-extract operation. Lvalue components (e.g. symbols) are
left alone.
This was already happening in most cases (e.g. symex_dereference would apply field-sensitivity
before symex_assign_rec was entered), but the case of a statically non-constant but dynamically
constant array index or byte-extract offset would only be handled *after* symex_assign_rec, leading
to an asymmetry in which operations were combined into the ssa_exprt and which ones accumulated in
the expr_skeletont.

With this change the LHS expression is maximally renamed and simplified before symex_assign_rec is
entered, which means that any field-sensitive symbols on the LHS are fully constructed as early as
possible, and expr_skeletont only accumulates those member, index and byte-extract operations which
*cannot* be associated with an ssa_exprt. This means there is no need to undo with-operations or
try to simplify byte-extract operations in goto_symext::assign_from_non_struct_symbol, as this has
been taken care of already, and the l2_full_lhs can be constructed from the expression skeleton
trivially.
These are unused now that we don't have shift_indexed_access_to_lhs or rewrite_with_to_field_symbols
@smowton smowton force-pushed the smowton/cleanup/rename-lhs-rvalues-up-front-on-develop branch from 86aacf6 to c4bc800 Compare July 25, 2019 08:18
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: c4bc800).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/120656593

@smowton
Copy link
Contributor Author

smowton commented Jul 26, 2019

@kroening re: naming, it's definitely not rename_rhs, since this acts specifically on an lvalue to rename components of it which are read (e.g. array and byte-extract indices, and ternary-conditional condition operands). Not wedded to rvalues as a term for these though.

@smowton
Copy link
Contributor Author

smowton commented Jul 26, 2019

Perhaps just l2_rename_lvalue, which can call regular old rename on things which are read and l2_rename_lvalue on sub-lvalues?

@smowton
Copy link
Contributor Author

smowton commented Jul 26, 2019

@andreast271 I hear from email relayed via @kroening that the out-of-bounds issue was not introduced by that PR, is that accurate? If so are you happy for this to go in?

@andreast271
Copy link
Contributor

@smowton That is accurate and I posted about it on 4933. I would be very happy for this PR to go in. It makes a huge positive difference for my use case.

@smowton
Copy link
Contributor Author

smowton commented Jul 26, 2019

So you did, I'm not sure why I didn't get email about that. Polling @kroening for a final approval here.

@kroening kroening merged commit 1aefd22 into diffblue:develop Jul 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants