Skip to content

Added scalar assigns clause to code contracts #5403

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 15, 2020

Conversation

azcwagner
Copy link

@azcwagner azcwagner commented Jun 30, 2020

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

This PR expands code contracts functionality with an 'assigns' clause for defining the writable memory frame of a function. At this time, only scalars are supported, with more complex types to follow.

@karkhaz karkhaz added the aws Bugs or features of importance to AWS CBMC users label Jun 30, 2020
Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this. It is really good to see the extensive testing. The code looks good and given the work that you have already done, this seems a bit cheeky but ... please could you write some documentation? I /think/ this is implementing some kind of data-flow / separating boundary annotation to cut down the number of false alarms when over-approximating function calls in deductive verification. Is this correct? If so, should some of the code be in with the generation of function bodies? Documentation will also make it much easier to review this.

@azcwagner
Copy link
Author

I have added assigns-clause.md under doc/cprover-manual. The intention of the assigns clause is to 1) ensure that when a function call is replaced with the contract, the appropriate memory is havocked, and 2) enforce that no memory outside the assigns clause is written. That is, the assigns clause specifies that everything which does not appear within it will not be touched.

In traditional Hoare-style reasoning, the postcondition is the only guarantee we have after the function is called. Any information about memory which is not mentioned in the postcondition is discarded. Simply replacing the function call with "assert(precondition); assume(postcondition);" does not match the semantics of the contract, because the assume statement leaves lots of information about program memory untouched. The assigns clause is meant to close this gap.

@azcwagner azcwagner closed this Jul 2, 2020
@azcwagner azcwagner reopened this Jul 2, 2020
@codecov
Copy link

codecov bot commented Jul 3, 2020

Codecov Report

Merging #5403 into develop will increase coverage by 0.11%.
The diff coverage is 92.69%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5403      +/-   ##
===========================================
+ Coverage    68.40%   68.52%   +0.11%     
===========================================
  Files         1187     1187              
  Lines        98090    98265     +175     
===========================================
+ Hits         67102    67339     +237     
+ Misses       30988    30926      -62     
Flag Coverage Δ
#cproversmt2 42.96% <42.10%> (-0.01%) ⬇️
#regression 65.68% <92.69%> (+0.12%) ⬆️
#unit 32.26% <42.10%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/ansi-c/c_typecheck_base.h 100.00% <ø> (ø)
src/goto-instrument/code_contracts.h 100.00% <ø> (ø)
src/ansi-c/c_typecheck_code.cpp 81.98% <82.60%> (-0.13%) ⬇️
src/goto-instrument/code_contracts.cpp 83.16% <92.59%> (+46.69%) ⬆️
src/ansi-c/ansi_c_declaration.h 97.72% <100.00%> (+0.16%) ⬆️
src/ansi-c/c_typecheck_base.cpp 81.31% <100.00%> (+0.41%) ⬆️
src/ansi-c/c_typecheck_expr.cpp 75.18% <100.00%> (+0.16%) ⬆️
src/ansi-c/parser.y 78.28% <100.00%> (+0.22%) ⬆️
src/ansi-c/scanner.l 63.94% <100.00%> (+0.03%) ⬆️
src/solvers/strings/string_builtin_function.h 74.39% <0.00%> (-2.44%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aac9d61...071a725. Read the comment docs.

@martin-cs
Copy link
Collaborator

@azcwagner Thanks for clarifying the intent as a "frame" property.

Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why not. I know @tautschnig was leading Code Contract things at one point so it would be good to have his thoughts on this.

It would also be good to see some kind of alignment between the code contract work and the generate function bodies because it feels like the "havoc just these variables" is something they could use as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems overall fine to me, I have a couple of nitpicks though. Especially take care to use the proper constructors for expressions and avoid unnecessary abbreviations.

--enforce-all-contracts
^EXIT=10$
^SIGNAL=0$
^VERIFICATION FAILED$

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add clauses to this test that check for which assigns clauses are being violated and which aren’t? Ditto for other tests which have failures.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the way the assigns clause is implemented adds an assertion before each assignment which requires that the left-hand-side which is about to be assigned actually aliases something in the list of assigns clause targets. They end up being stitched together with logical OR operations, and the whole thing ends up as one assertion per assignment. To identify a specific item from the assigns clause, we would need to restructure the assertion or augment it with some additional information. Is this what you mean?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd read @hannes-steffenhagen-diffblue 's comment as referring to clauses, not targets. So even with the disjunction it should be possible to distinguish the failing clause?

return false;
}

bool code_contractst::add_pointer_checks(const std::string &func_name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

function_name, also this should probably just be an irep_idt if it’s indeed the name of a function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is receiving the function name somewhat directly from the command line options. Is there an advantage to creating an irep_idt from it before this function is called?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the variable name has been updated to function_name now; thanks.

@hannes-steffenhagen-diffblue
Copy link
Contributor

The first commit title should be "Add scalar assigns clause to code contracts”, see this guide we frequently recommend. It should also have a short summary of what the commit is intended to do in its body. The following commits addressing review comments should be fixed up into the first commit as they are all for the same thing, just addressing issues from the first revision.

Copy link
Contributor

@danielsn danielsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if two pointers alias? This would seem to be OK, but I want to make sure

int foo(int* x, int *y) __CPROVER_assigns(*x)
{
  *x = 2;
}

int bar(int *x) {
   return foo(x,x);
}

- Before each assignment statement, *lhs* := *rhs*, add an assertion (structured as a disjunction)

assert(&#8707; *tmp*<sub>*e*</sub>. __CPROVER_same_object(\&(*lhs*), *tmp*<sub>*e*</sub>)
- Before each function call with an assigns clause *a*, add an assertion for each *e*<sub>*a*</sub> &#8712; *exprs*<sub>*a*</sub> (also formulated as a disjunction)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to a function call that doesn't have an assigns clause?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a function doesn't have an assigns clause, we can't make any assumptions about what memory it might write, so a call to such a function gets prepended with assert(FALSE).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure what this assertion is telling us?

@@ -645,6 +645,10 @@ void c_typecheck_baset::typecheck_declaration(
irept contract;

{
exprt spec_assigns =
static_cast<const exprt &>(declaration.find(ID_C_spec_assigns));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we guaranteed this cast is safe?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While find returns an irept object, which is a parent of exprt, the way the parser is currently implemented, we should always be getting an exprt when searching for ID_C_spec_assigns. If someone does something strange in the parser, we might encounter a problem, but I think this is very unlikely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usual thing to do is to add accessors to the relevant irept subclass (e.g. https://github.com/diffblue/cbmc/blob/develop/src/util/std_expr.h#L4195), or create a new subclass for your purposes, that ensures safety. It should wrap and hide the raw irept interface, accepting only things that conform to expectations (in this case exprt) and providing the corresponding downcast on the read side.

For example, if this is particular to ansi_c_declarationt then you might extend that class with

exprt &spec_assigns() { return static_cast<exprt &>(add(ID_C_spec_assigns)); }
const exprt &spec_assigns() const { return static_cast<const exprt &>(find(ID_C_spec_assigns)); }

Copy link
Collaborator

@feliperodri feliperodri Sep 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smowton I added spec_assigns to ansi_c_declarationt for assigns clauses, ensures and requires. We should add a single exprt subclass with accessors, but this is not specific to assigns clauses. We'd need to replace all casts for ensures and requires as well. Thus, I recommend doing so on a separate PR specific for this clean up.

- For each expression *e* &#8712; *exprs*<sub>*c*</sub>, create a temporary variable *tmp*<sub>*e*</sub> to store \&(*e*), the address of *e*, at the start of *f*.
- Before each assignment statement, *lhs* := *rhs*, add an assertion (structured as a disjunction)

assert(&#8707; *tmp*<sub>*e*</sub>. __CPROVER_same_object(\&(*lhs*), *tmp*<sub>*e*</sub>)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using same_object seems overly conservative. What if the pointers passed in are to two distinct locations, but from the same allocation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the scenario you're describing. Could you provide an example?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielsn when you say "pointers passed in," what are you referring to that the pointers are passed "in" to?

@azcwagner azcwagner force-pushed the assigns-clause-scalars branch 3 times, most recently from 45a5fbb to fb3fcaf Compare August 4, 2020 18:13
@feliperodri feliperodri force-pushed the assigns-clause-scalars branch 2 times, most recently from fd45e37 to f7684eb Compare August 7, 2020 04:47
@feliperodri
Copy link
Collaborator

@hannes-steffenhagen-diffblue the build AWS CodeBuild us-east-1 (cbmc-linux-cmake-gcc) seem stale, could you re-trigger?

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main suggestion: move all the static-casts into accessor functions in a wrapper class that extends irept, exprt or whatever else is appropriate, rather than scattering them throughout the code. This will provide a single place where consistency of the irept derivatives in use is checked.

@@ -645,6 +645,10 @@ void c_typecheck_baset::typecheck_declaration(
irept contract;

{
exprt spec_assigns =
static_cast<const exprt &>(declaration.find(ID_C_spec_assigns));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usual thing to do is to add accessors to the relevant irept subclass (e.g. https://github.com/diffblue/cbmc/blob/develop/src/util/std_expr.h#L4195), or create a new subclass for your purposes, that ensures safety. It should wrap and hide the raw irept interface, accepting only things that conform to expectations (in this case exprt) and providing the corresponding downcast on the read side.

For example, if this is particular to ansi_c_declarationt then you might extend that class with

exprt &spec_assigns() { return static_cast<exprt &>(add(ID_C_spec_assigns)); }
const exprt &spec_assigns() const { return static_cast<const exprt &>(find(ID_C_spec_assigns)); }

typecheck_symbol(symbol);

// add code contract (if any); we typecheck this after the
// function body done above, so as to have parameter symbols
// available
symbolt &new_symbol = symbol_table.get_writeable_ref(identifier);

typecheck_assigns_exprs(
static_cast<codet &>(contract), ID_C_spec_assigns);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to_code?

Copy link
Collaborator

@feliperodri feliperodri Sep 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smowton to_code has a precondition to ensure exprt's ID is equal code. As I suggested in the previous comment, I recommend doing a clean up for this casts regarding assigns clauses, ensures, and requires on a dedicated PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@feliperodri Can you provide a pointer to the exact comment you are referring to?

@feliperodri feliperodri force-pushed the assigns-clause-scalars branch 3 times, most recently from ce678a5 to 0b3d015 Compare September 24, 2020 05:13
@feliperodri
Copy link
Collaborator

Main suggestion: move all the static-casts into accessor functions in a wrapper class that extends irept, exprt or whatever else is appropriate, rather than scattering them throughout the code. This will provide a single place where consistency of the irept derivatives in use is checked.

@smowton could you take another look at this PR? As I mentioned in the previous comments, I agree with the suggestion to have a exprt subclass specific for contracts and replace all static-casts; however, this would require to update all instances of ensures and requires as well, so I suggest doing so on a separate PR. The current implementation mirrors the one for ensures and requires.

@feliperodri feliperodri requested a review from smowton September 24, 2020 05:21
Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's quite hard to see the actual changes among all the whitespace cleanup. Please put these in separate commits (or possibly even pull requests) as also suggested in the pull-request template.

@feliperodri feliperodri force-pushed the assigns-clause-scalars branch 2 times, most recently from 5093206 to 6da3630 Compare September 25, 2020 14:26
Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a number of changes I'd like to please see completed before merging, but nothing fundamentally blocking.

## CBMC Assigns Clause

### Introduction
Here is described the syntax and semantics of the "assigns" clause as implemented in CBMC. This construct allows the user to specify a list of memory locations which may be written within a particular scope. While an assigns clause may, in general, be interpreted with either "writes" or "modifies" semantics, this design is based on the former.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It would be good to limit lines to 80 characters or the likes to have smaller diffs with future changes to the text. Applies to this line and also all others in this file.
  2. Do we have documentation of the other code-contract related features? If so, we should make sure this documentation appears within the appropriate context. If not, we'll need that documentation as well. (We should then just track the absence of that documentation in an issue, it's not for this PR to address this problem.)
  3. The wording could do with some improvement. I think I'd just start with "The "assigns" clause allows the user ...," dropping the remainder of the first sentence.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Done.
  2. I didn't find an appropriate documentation for code contracts, so I'm tracking this on issue Improve code contract documentation #5525.
  3. I improved the wording as suggested, but we still need to improve the overall text (see Improve code contract documentation #5525).


### Introduction
Here is described the syntax and semantics of the "assigns" clause as implemented in CBMC. This construct allows the user to specify a list of memory locations which may be written within a particular scope. While an assigns clause may, in general, be interpreted with either "writes" or "modifies" semantics, this design is based on the former.
This means that memory not captured by the assigns clause must not be written within the given scope, even if the value(s) therein are not modified.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we verify that this is the case? If so, the documentation should say so.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tracking this request on #5525.

idx++;
}
// Make sure there are no gotos that go back such that a malloc is between
// the goto and its destination (possible loop).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to what @hannes-steffenhagen-diffblue said - use natural_loops, as is done, e.g., in dump_c. That will directly provide you with a loop body, which can then be searched for those function calls.

@NlightNFotis
Copy link
Contributor

Where are we with this?

Is there anything outstanding, or can this get merged?

@hannes-steffenhagen-diffblue
Copy link
Contributor

@feliperodri @tautschnig Have outstanding comments been addressed / is this ready to merge now?

@feliperodri feliperodri force-pushed the assigns-clause-scalars branch from 6da3630 to 9998062 Compare October 15, 2020 05:09
@feliperodri feliperodri changed the title Added scalar assigns clause to code contracts. Added scalar assigns clause to code contracts Oct 15, 2020
@feliperodri
Copy link
Collaborator

@feliperodri @tautschnig Have outstanding comments been addressed / is this ready to merge now?

@hannes-steffenhagen-diffblue @NlightNFotis this is ready to merge now.

Christopher Wagner and others added 4 commits October 15, 2020 01:21
Expanding CBMC code contracts to include an
assigns clause for scalar variables and
their pointers.
@feliperodri feliperodri force-pushed the assigns-clause-scalars branch 2 times, most recently from a56005b to 943409b Compare October 15, 2020 05:43
@feliperodri feliperodri force-pushed the assigns-clause-scalars branch from 943409b to 071a725 Compare October 15, 2020 07:14
@NlightNFotis NlightNFotis merged commit a0e4c95 into diffblue:develop Oct 15, 2020
@feliperodri feliperodri deleted the assigns-clause-scalars branch October 15, 2020 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants