-
Notifications
You must be signed in to change notification settings - Fork 274
Added scalar assigns clause to code contracts #5403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added scalar assigns clause to code contracts #5403
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this. It is really good to see the extensive testing. The code looks good and given the work that you have already done, this seems a bit cheeky but ... please could you write some documentation? I /think/ this is implementing some kind of data-flow / separating boundary annotation to cut down the number of false alarms when over-approximating function calls in deductive verification. Is this correct? If so, should some of the code be in with the generation of function bodies? Documentation will also make it much easier to review this.
I have added assigns-clause.md under doc/cprover-manual. The intention of the assigns clause is to 1) ensure that when a function call is replaced with the contract, the appropriate memory is havocked, and 2) enforce that no memory outside the assigns clause is written. That is, the assigns clause specifies that everything which does not appear within it will not be touched. In traditional Hoare-style reasoning, the postcondition is the only guarantee we have after the function is called. Any information about memory which is not mentioned in the postcondition is discarded. Simply replacing the function call with "assert(precondition); assume(postcondition);" does not match the semantics of the contract, because the assume statement leaves lots of information about program memory untouched. The assigns clause is meant to close this gap. |
Codecov Report
@@ Coverage Diff @@
## develop #5403 +/- ##
===========================================
+ Coverage 68.40% 68.52% +0.11%
===========================================
Files 1187 1187
Lines 98090 98265 +175
===========================================
+ Hits 67102 67339 +237
+ Misses 30988 30926 -62
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
@azcwagner Thanks for clarifying the intent as a "frame" property. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why not. I know @tautschnig was leading Code Contract things at one point so it would be good to have his thoughts on this.
It would also be good to see some kind of alignment between the code contract work and the generate function bodies because it feels like the "havoc just these variables" is something they could use as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems overall fine to me, I have a couple of nitpicks though. Especially take care to use the proper constructors for expressions and avoid unnecessary abbreviations.
--enforce-all-contracts | ||
^EXIT=10$ | ||
^SIGNAL=0$ | ||
^VERIFICATION FAILED$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add clauses to this test that check for which assigns clauses are being violated and which aren’t? Ditto for other tests which have failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the way the assigns clause is implemented adds an assertion before each assignment which requires that the left-hand-side which is about to be assigned actually aliases something in the list of assigns clause targets. They end up being stitched together with logical OR operations, and the whole thing ends up as one assertion per assignment. To identify a specific item from the assigns clause, we would need to restructure the assertion or augment it with some additional information. Is this what you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd read @hannes-steffenhagen-diffblue 's comment as referring to clauses, not targets. So even with the disjunction it should be possible to distinguish the failing clause?
return false; | ||
} | ||
|
||
bool code_contractst::add_pointer_checks(const std::string &func_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function_name
, also this should probably just be an irep_idt
if it’s indeed the name of a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is receiving the function name somewhat directly from the command line options. Is there an advantage to creating an irep_idt from it before this function is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the variable name has been updated to function_name now; thanks.
The first commit title should be "Add scalar assigns clause to code contracts”, see this guide we frequently recommend. It should also have a short summary of what the commit is intended to do in its body. The following commits addressing review comments should be fixed up into the first commit as they are all for the same thing, just addressing issues from the first revision. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if two pointers alias? This would seem to be OK, but I want to make sure
int foo(int* x, int *y) __CPROVER_assigns(*x)
{
*x = 2;
}
int bar(int *x) {
return foo(x,x);
}
doc/cprover-manual/assigns-clause.md
Outdated
- Before each assignment statement, *lhs* := *rhs*, add an assertion (structured as a disjunction) | ||
|
||
assert(∃ *tmp*<sub>*e*</sub>. __CPROVER_same_object(\&(*lhs*), *tmp*<sub>*e*</sub>) | ||
- Before each function call with an assigns clause *a*, add an assertion for each *e*<sub>*a*</sub> ∈ *exprs*<sub>*a*</sub> (also formulated as a disjunction) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens to a function call that doesn't have an assigns clause?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a function doesn't have an assigns clause, we can't make any assumptions about what memory it might write, so a call to such a function gets prepended with assert(FALSE).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure what this assertion is telling us?
src/ansi-c/c_typecheck_base.cpp
Outdated
@@ -645,6 +645,10 @@ void c_typecheck_baset::typecheck_declaration( | |||
irept contract; | |||
|
|||
{ | |||
exprt spec_assigns = | |||
static_cast<const exprt &>(declaration.find(ID_C_spec_assigns)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we guaranteed this cast is safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While find returns an irept object, which is a parent of exprt, the way the parser is currently implemented, we should always be getting an exprt when searching for ID_C_spec_assigns. If someone does something strange in the parser, we might encounter a problem, but I think this is very unlikely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The usual thing to do is to add accessors to the relevant irept subclass (e.g. https://github.com/diffblue/cbmc/blob/develop/src/util/std_expr.h#L4195), or create a new subclass for your purposes, that ensures safety. It should wrap and hide the raw irept interface, accepting only things that conform to expectations (in this case exprt) and providing the corresponding downcast on the read side.
For example, if this is particular to ansi_c_declarationt
then you might extend that class with
exprt &spec_assigns() { return static_cast<exprt &>(add(ID_C_spec_assigns)); }
const exprt &spec_assigns() const { return static_cast<const exprt &>(find(ID_C_spec_assigns)); }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smowton I added spec_assigns
to ansi_c_declarationt
for assigns clauses, ensures and requires. We should add a single exprt subclass with accessors, but this is not specific to assigns clauses. We'd need to replace all casts for ensures and requires as well. Thus, I recommend doing so on a separate PR specific for this clean up.
doc/cprover-manual/assigns-clause.md
Outdated
- For each expression *e* ∈ *exprs*<sub>*c*</sub>, create a temporary variable *tmp*<sub>*e*</sub> to store \&(*e*), the address of *e*, at the start of *f*. | ||
- Before each assignment statement, *lhs* := *rhs*, add an assertion (structured as a disjunction) | ||
|
||
assert(∃ *tmp*<sub>*e*</sub>. __CPROVER_same_object(\&(*lhs*), *tmp*<sub>*e*</sub>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using same_object
seems overly conservative. What if the pointers passed in are to two distinct locations, but from the same allocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the scenario you're describing. Could you provide an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielsn when you say "pointers passed in," what are you referring to that the pointers are passed "in" to?
45a5fbb
to
fb3fcaf
Compare
fd45e37
to
f7684eb
Compare
@hannes-steffenhagen-diffblue the build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main suggestion: move all the static-casts into accessor functions in a wrapper class that extends irept, exprt or whatever else is appropriate, rather than scattering them throughout the code. This will provide a single place where consistency of the irept derivatives in use is checked.
src/ansi-c/c_typecheck_base.cpp
Outdated
@@ -645,6 +645,10 @@ void c_typecheck_baset::typecheck_declaration( | |||
irept contract; | |||
|
|||
{ | |||
exprt spec_assigns = | |||
static_cast<const exprt &>(declaration.find(ID_C_spec_assigns)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The usual thing to do is to add accessors to the relevant irept subclass (e.g. https://github.com/diffblue/cbmc/blob/develop/src/util/std_expr.h#L4195), or create a new subclass for your purposes, that ensures safety. It should wrap and hide the raw irept interface, accepting only things that conform to expectations (in this case exprt) and providing the corresponding downcast on the read side.
For example, if this is particular to ansi_c_declarationt
then you might extend that class with
exprt &spec_assigns() { return static_cast<exprt &>(add(ID_C_spec_assigns)); }
const exprt &spec_assigns() const { return static_cast<const exprt &>(find(ID_C_spec_assigns)); }
typecheck_symbol(symbol); | ||
|
||
// add code contract (if any); we typecheck this after the | ||
// function body done above, so as to have parameter symbols | ||
// available | ||
symbolt &new_symbol = symbol_table.get_writeable_ref(identifier); | ||
|
||
typecheck_assigns_exprs( | ||
static_cast<codet &>(contract), ID_C_spec_assigns); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_code
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@feliperodri Can you provide a pointer to the exact comment you are referring to?
ce678a5
to
0b3d015
Compare
@smowton could you take another look at this PR? As I mentioned in the previous comments, I agree with the suggestion to have a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's quite hard to see the actual changes among all the whitespace cleanup. Please put these in separate commits (or possibly even pull requests) as also suggested in the pull-request template.
5093206
to
6da3630
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a number of changes I'd like to please see completed before merging, but nothing fundamentally blocking.
doc/cprover-manual/assigns-clause.md
Outdated
## CBMC Assigns Clause | ||
|
||
### Introduction | ||
Here is described the syntax and semantics of the "assigns" clause as implemented in CBMC. This construct allows the user to specify a list of memory locations which may be written within a particular scope. While an assigns clause may, in general, be interpreted with either "writes" or "modifies" semantics, this design is based on the former. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It would be good to limit lines to 80 characters or the likes to have smaller diffs with future changes to the text. Applies to this line and also all others in this file.
- Do we have documentation of the other code-contract related features? If so, we should make sure this documentation appears within the appropriate context. If not, we'll need that documentation as well. (We should then just track the absence of that documentation in an issue, it's not for this PR to address this problem.)
- The wording could do with some improvement. I think I'd just start with "The "assigns" clause allows the user ...," dropping the remainder of the first sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Done.
- I didn't find an appropriate documentation for code contracts, so I'm tracking this on issue Improve code contract documentation #5525.
- I improved the wording as suggested, but we still need to improve the overall text (see Improve code contract documentation #5525).
doc/cprover-manual/assigns-clause.md
Outdated
|
||
### Introduction | ||
Here is described the syntax and semantics of the "assigns" clause as implemented in CBMC. This construct allows the user to specify a list of memory locations which may be written within a particular scope. While an assigns clause may, in general, be interpreted with either "writes" or "modifies" semantics, this design is based on the former. | ||
This means that memory not captured by the assigns clause must not be written within the given scope, even if the value(s) therein are not modified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we verify that this is the case? If so, the documentation should say so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm tracking this request on #5525.
idx++; | ||
} | ||
// Make sure there are no gotos that go back such that a malloc is between | ||
// the goto and its destination (possible loop). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to what @hannes-steffenhagen-diffblue said - use natural_loops
, as is done, e.g., in dump_c
. That will directly provide you with a loop body, which can then be searched for those function calls.
Where are we with this? Is there anything outstanding, or can this get merged? |
@feliperodri @tautschnig Have outstanding comments been addressed / is this ready to merge now? |
6da3630
to
9998062
Compare
@hannes-steffenhagen-diffblue @NlightNFotis this is ready to merge now. |
Expanding CBMC code contracts to include an assigns clause for scalar variables and their pointers.
Signed-off-by: Felipe R. Monteiro <[email protected]>
Signed-off-by: Felipe R. Monteiro <[email protected]>
a56005b
to
943409b
Compare
Signed-off-by: Felipe R. Monteiro <[email protected]>
943409b
to
071a725
Compare
This PR expands code contracts functionality with an 'assigns' clause for defining the writable memory frame of a function. At this time, only scalars are supported, with more complex types to follow.