Skip to content

Cleanup find_symbols API #6727

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 17, 2022
Merged

Conversation

tautschnig
Copy link
Collaborator

The various functions declared in find_symbols.h might sound alike, but actually have different semantics. I wasn't really sure what to use anymore, so I went for a cleanup instead. Requires #6725 and #6726. Please review commit-by-commit.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@tautschnig tautschnig self-assigned this Mar 14, 2022
@tautschnig tautschnig mentioned this pull request Mar 14, 2022
4 tasks
@tautschnig tautschnig changed the title Cleanup find_symbols API [depends-on: #6725, #6726] Cleanup find_symbols API [depends-on: #6766] Mar 25, 2022
@tautschnig tautschnig force-pushed the cleanup/find_symbols branch from 3809a49 to 3f6ecba Compare May 10, 2022 12:12
@kroening
Copy link
Member

This function is trying to do too much.

  1. Type-symbols and 'normal ones' are really very different, and the use-cases won't overlap. I'd make these separate functions.
  2. I don't think next_symbol is used anywhere in this repo. It can go away.
  3. This leaves bound vs. non-bound symbols. How about passing that as a category to a functor?

@codecov
Copy link

codecov bot commented May 10, 2022

Codecov Report

Merging #6727 (5bf57ef) into develop (5d74a03) will decrease coverage by 0.00%.
The diff coverage is 74.24%.

@@             Coverage Diff             @@
##           develop    #6727      +/-   ##
===========================================
- Coverage    77.78%   77.78%   -0.01%     
===========================================
  Files         1567     1567              
  Lines       179701   179769      +68     
===========================================
+ Hits        139789   139840      +51     
- Misses       39912    39929      +17     
Impacted Files Coverage Δ
src/analyses/interval_analysis.cpp 0.00% <0.00%> (ø)
src/ansi-c/expr2c.cpp 67.63% <ø> (+0.03%) ⬆️
src/goto-instrument/accelerate/accelerate.cpp 32.28% <0.00%> (+0.33%) ⬆️
.../goto-instrument/accelerate/acceleration_utils.cpp 2.23% <0.00%> (+<0.01%) ⬆️
src/goto-instrument/accelerate/trace_automaton.cpp 0.00% <ø> (ø)
src/goto-instrument/wmm/data_dp.cpp 75.71% <ø> (ø)
src/goto-programs/goto_program.h 90.70% <ø> (ø)
src/solvers/flattening/arrays.cpp 81.45% <ø> (ø)
src/util/find_symbols.cpp 65.71% <65.48%> (-18.45%) ⬇️
src/pointer-analysis/value_set_fi.cpp 69.83% <66.66%> (+0.04%) ⬆️
... and 34 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6061711...5bf57ef. Read the comment docs.

@tautschnig
Copy link
Collaborator Author

This function is trying to do too much.

1. Type-symbols and 'normal ones' are really very different, and the use-cases won't overlap. I'd make these separate functions.

That makes a lot of sense as far as the user-facing API is concerned (which already has different functions), but I'm not sure the underlying implementation would be very different? Expressions can be contained in types and types in expressions, so the iteration will be rather similar?

2. I don't think next_symbol is used anywhere in this repo. It can go away.

The precondition and postcondition code presumably still supports it, but likely no one cares. I'll clean this up.

3. This leaves bound vs. non-bound symbols. How about passing that as a category to a functor?

With the above changes this might end up as nothing more than a Boolean? I'll give it a try.

@kroening
Copy link
Member

That makes a lot of sense as far as the user-facing API is concerned (which already has different functions), but I'm not sure the underlying implementation would be very different? Expressions can be contained in types and types in expressions, so the iteration will be rather similar?

I can certainly see a case for offering separate, easy functions as part of the user-facing API, which then call one (complex) function internally to avoid code duplication.

The precondition and postcondition code presumably still supports it, but likely no one cares. I'll clean this up.

next_symbol is not to be confused with __CPROVER_old -- I don't believe pre/postconditions ever used next_symbol.

With the above changes this might end up as nothing more than a Boolean? I'll give it a try.

Super!

@tautschnig tautschnig force-pushed the cleanup/find_symbols branch from 3f6ecba to 5c0d9fc Compare May 11, 2022 11:08
@tautschnig
Copy link
Collaborator Author

@kroening I hope that the state at the end of commit series in this PR matches your expectations. The individual commits should be meaningful, but there is some back&forth, so it's probably best to either checkout the branch or just review all code changes at once.

@kroening
Copy link
Member

  1. Comments: you may wish to call the non-bound variables the 'free variables'.
  2. Can you add an explicit comment whether or not bound variables are considered?
  3. Is there a use-case for finding both free and the bound variables in the codebase?

@tautschnig tautschnig force-pushed the cleanup/find_symbols branch from 5c0d9fc to 307b67c Compare May 12, 2022 12:55
@tautschnig
Copy link
Collaborator Author

  1. Comments: you may wish to call the non-bound variables the 'free variables'.

Thank you, done.

  1. Can you add an explicit comment whether or not bound variables are considered?

Done.

  1. Is there a use-case for finding both free and the bound variables in the codebase?

I cannot rule out that such a case exists, but I'm not currently aware of one. There may well be places that currently feature a local lookalike of find_symbols.

@kroening
Copy link
Member

5. Is there a use-case for finding both free and the bound variables in the codebase?

I cannot rule out that such a case exists, but I'm not currently aware of one. There may well be places that currently feature a local lookalike of find_symbols.

Let's go with the PR as is for now. In the long term, it makes sense to change the default semantics of the find_symbol functions to exclude the bound symbols. I strongly suspect that this is what all users want.

@tautschnig tautschnig changed the title Cleanup find_symbols API [depends-on: #6766] Cleanup find_symbols API May 12, 2022
@peterschrammel peterschrammel removed their assignment May 17, 2022
The existing code did not permit configuring what kinds of symbols are
to be found. It had a single user within the code base, and even that
one was not a good use of it for it constructed a single-element set.

The new implementation does not yet make use of potential efficiency
advantages of "has a symbol" over "find the existing symbols," but such
improvements will be done based on further cleanup of find_symbols code.
"F_BOTH" is not a good fit when there are three (and not two) other
entries. While at it, also add support to selectively choose current or
next expression symbols.
Make the two Boolean parameters of find_symbols(expr, set, bool, bool)
optional so that each existing users can safely use that function
(or another, even more suitable, function) instead.
This is never used in CBMC's code base.
Each function used a different approach, with different potential
omissions of symbols. Instead, use a single way to traverse expressions
(and types), and configure the operation done via a parameter. All other
(non-deprecated) functions are now just thin wrappers.
This is just a change in the enum tag.
…_expr

With the removal of support for ID_next_symbol all uses of `has_symbol`
searched for symbol expressions only.
Adds a new symbol kind to distinguish bound from unbound ones. Add
documentation to all non-deprecated functions and clarify whether or not
bound symbols are included.
@tautschnig tautschnig force-pushed the cleanup/find_symbols branch from 307b67c to 5bf57ef Compare May 17, 2022 12:04
@tautschnig tautschnig merged commit 22999aa into diffblue:develop May 17, 2022
@tautschnig tautschnig deleted the cleanup/find_symbols branch May 17, 2022 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants