-
Notifications
You must be signed in to change notification settings - Fork 277
Add function pointer restriction option #5168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add function pointer restriction option #5168
Conversation
src/cbmc/cbmc_parse_options.cpp
Outdated
@@ -78,6 +79,7 @@ Author: Daniel Kroening, [email protected] | |||
|
|||
#include <pointer-analysis/add_failed_symbols.h> | |||
|
|||
#include <goto-programs/simplify_function_pointers.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move above (to the rest of goto-program
includes).
src/cbmc/cbmc_parse_options.cpp
Outdated
@@ -750,7 +752,8 @@ int cbmc_parse_optionst::get_goto_program( | |||
return CPROVER_EXIT_SUCCESS; | |||
} | |||
|
|||
if(cbmc_parse_optionst::process_goto_program(goto_model, options, log)) | |||
if(cbmc_parse_optionst::process_goto_program( | |||
goto_model, options, log, cmdline)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't you just copy the cmdline
part that you need into options
instead of passing the whole thing?
} | ||
|
||
template <typename GotoFunctionT, typename HandlerT> | ||
void for_each_goto_location(GotoFunctionT &&goto_function, HandlerT handle) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's called for_each_goto_location
but inside you iterate over instructions
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not against these but please note the amount of hatred that forall_goto_program_instructions
et al. used to get and try to get some kind of agreement on whether we should be doing more or less of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this here for now, but I don't feel too strongly about this - I'll remove this if people don't like it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with it but I know it was a big source of argument at one point (I don't remember what was decided TBH) so be a little careful. I think it might be in the coding standard...
src/goto-programs/goto_program.h
Outdated
template <typename GotoFunctionT, typename Predicate, typename HandlerT> | ||
void for_each_goto_location_if( | ||
GotoFunctionT &&goto_function, | ||
Predicate predicate, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PredicateT
for consistency?
\*******************************************************************/ | ||
|
||
/// \file | ||
/// Given goto functions and a list function parameters or globals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a list of function parameters
|
||
#include "goto_model.h" | ||
|
||
void simplify_function_pointers(goto_modelt &goto_model); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably deserves documenting as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is there a test for this?
{ | ||
for_each_goto_location( | ||
goto_function.second, [&](goto_programt::targett it) { | ||
if(it->is_function_call()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it would help much, but you could use the for_each_goto_location_if
and split the is_function_call /\ is_dereference
as Predicate
.
function_call.source_location(), | ||
goto_function_symbol_mode, | ||
goto_model.symbol_table); | ||
auto assign_instruction = goto_programt::instructiont{ASSIGN}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is goto_programt::make_assignment
.
assign_instruction.code = | ||
code_assignt{new_function_pointer_symbol.symbol_expr(), | ||
function_pointer_dereference.pointer()}; | ||
goto_function.second.body.instructions.insert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe insert_before
would be more explicit.
PRECONDITION(begin != end); | ||
auto const pointer_to_function = | ||
address_of_exprt{symbol_table.lookup_ref(*begin).symbol_expr()}; | ||
id_iteratort const next = begin + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is const
first or is actual type first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this function entirely, it was there for an earlier version of this that I since discarded.
if(function_type.id() != ID_code) | ||
{ | ||
throw invalid_command_line_argument_exceptiont{ | ||
"not a function pointer: " + id2string(restriction.first), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it gives the same message (as the throw
above), couldn't it be merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe? But I think of them as two different failure conditions that just happen to have the same text in them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, (what about "not a pointer to a function").
{ | ||
throw invalid_command_line_argument_exceptiont{ | ||
"type mismatch: `" + id2string(restriction.first) + "' points to `" + | ||
type2c(function_type, ns) + "', but restriction `" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also util/format
but I'm not sure which should be prefered.
for_each_goto_location_if( | ||
goto_function, | ||
[](goto_programt::targett target) { return target->is_function_call(); }, | ||
[&](goto_programt::targett target) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed thanks
{ | ||
typecheck_function_pointer_restrictions(goto_model, restrictions); | ||
|
||
// we iterate over all restrictions we might have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop below does not seem to iterate over restrictions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vestigial comment, will fix
// we iterate over all restrictions we might have | ||
for(auto &goto_function : goto_model.goto_functions.function_map) | ||
{ | ||
// for each function call, we l |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brain stopped mid sentence
auto const is_calling_pointer_symbol = | ||
to_dereference_expr(original_function_call.function()) | ||
.pointer() == pointer_symbol; | ||
const symbol_exprt &pointer_symbol = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, can we rename this: there already is one pointer_symbol
above? Second, can we move it below, closer to it's use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two are actually the same thing, I'll remove this one
source_location.set_property_class(ID_assertion); | ||
// XXX I want to hide these assertions unless they fail - apparently this is | ||
/// not the correct way to do this? | ||
source_location.set_hide(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't want this here at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll keep it in here for now. I'll remove this when this goes out of draft, I'm still hoping someone knows how to do this correctly.
string2id); | ||
return target_names; | ||
})(); | ||
for(auto const &target_name : target_names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't you run this check on target_name_strings
?
.second) | ||
{ | ||
throw invalid_command_line_argument_exceptiont{ | ||
"function pointer restriction for `" + id2string(pointer_name) + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pointer_name
is a string
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed thanks
} | ||
``` | ||
|
||
This works well enough for simple cases. However, this is a "dumb" replacement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this perhaps isn't customer-facing text, I think it's high enough for us to call existing methods dumb. What about "inefficient" or "non-optimal". (Hedging level scientist.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"very over-approximate" or "too much of an over-approximation for some users".
|
||
### Example | ||
|
||
Take the motivating example. Let us assume that we know for a fact that `call` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe: the motivating example above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Maybe squash a bit.
d9deab0
to
8534a11
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not opposed to this from a user-space point of view, nor the code in general. However there are lots of open issues to do with function pointer removal:
(to mention just a few) and it would be good to try and get some consistency / order / plan here.
As far as I can see, the common theme is to separate the code into two parts:
A. things that get candidate sets of targets for a function pointer : from a user, from the signature, from the signature plus some kind of 'is dirty' analysis for when functions have their address taken, from the expression and the symbol table, from the abstract interpreter, etc.
B. things that use these candidate sets : for transforming the program, for doing AI with the function pointers still in, etc.
Adding this interface / separation should not be too hard.
} | ||
``` | ||
|
||
This works well enough for simple cases. However, this is a "dumb" replacement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"very over-approximate" or "too much of an over-approximation for some users".
``` | ||
|
||
This works well enough for simple cases. However, this is a "dumb" replacement | ||
only based on the signature of the function, so if there are many functions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC there also used to be a global analysis of "which functions have their addresses taken".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martin-cs I believe this is still in place actually, so maybe I should update this to be more accurate? On the other hand, I don't really want to replicate the documentation of remove-function-pointers (if it exists) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be accurate and it should be consolidated with any other documentation for this. I don't think there is any but do have a grep through the manual at least.
return x; | ||
} | ||
|
||
// this is just here to distinguish the behavior from FP removal, which'd include |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
??? I'm not sure what this sentence is saying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because I didn't finish the sentence - this comes back to how FP removal works now, which is to scan for
- matching functions
- which have their addresses taken
so I'm adding this so it looks to FP removal as if an int(*)(int)
could also pint to g
rather than just to f
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that makes a little more sense.
} | ||
|
||
template <typename GotoFunctionT, typename HandlerT> | ||
void for_each_goto_location(GotoFunctionT &&goto_function, HandlerT handle) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not against these but please note the amount of hatred that forall_goto_program_instructions
et al. used to get and try to get some kind of agreement on whether we should be doing more or less of this.
a3ebc2c
to
5040bc6
Compare
This adds a new option, --restrict-function-pointer, to cbmc. This lets a user specify a list of possible pointer targets for specific function pointer call sites, rather than have remove_function_pointers guess possible values. The intended purpose behind this is to prevent excessive symex time wasted on exploring paths the user knows the program can never actually take.
e23741f
to
860117d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 860117d).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/135089826
A use-case question: This feature appears to be something that requires a specialist user, and I therefore don't expect wide usage. How about adding this to goto-instrument, to avoid adding complexity to CBMC's interface? |
@kroening I don't particularly mind; However the goal was for these mappings to be generated by something like e.g. goto-analyzer most of the time, so they hopefully shouldn't require too much expertise to use (hand writing these may be useful in some circumstances as well, but that wasn't the main point). |
We don't mind having this in goto-instrument at all. In general, we don't expect to be generating anything: often we have a good idea of the function that we want to restrict to. Thank you very much for this work, it looks great, and especially thank you for the documentation and test cases. Here are a couple of cases I'm not sure about. #include <assert.h>
typedef int (*fptr_t)(int);
void call(fptr_t fptr)
{
assert(fptr(10) == 10);
}
int f(int x)
{
return x;
}
int g(int x)
{
return x + 1;
}
int h(int x)
{
return x + 2;
}
int main(void)
{
call(nondet_int() ? f : g);
} > cbmc test.c --restrict-function-pointer call.function_pointer_call.1/f
** Results:
test.c function call
[call.assertion.2] line 7 assertion fptr(10) == 10: SUCCESS
[call.assertion.1] line 7 dereferenced function pointer at call.function_pointer_call.1 must be f: FAILURE ☝️is that expected? I'm restricting to Also, if I change the command line to restrict to
Would it be possible for this to be distinct, to make it clear that this is the case of a user error passing in an impossible function? |
@karkhaz thanks for having a look at this. The assertion
is intended to mean “the function pointer we got passed in was, indeed, a pointer to
The In that regard it is really no different than how the already existing |
860117d
to
031dd8e
Compare
031dd8e
to
860117d
Compare
Codecov Report
@@ Coverage Diff @@
## develop #5168 +/- ##
===========================================
+ Coverage 67.26% 67.28% +0.02%
===========================================
Files 1155 1157 +2
Lines 94476 94686 +210
===========================================
+ Hits 63545 63713 +168
- Misses 30931 30973 +42
Continue to review full report at Codecov.
|
860117d
to
fc37b79
Compare
fc37b79
to
860117d
Compare
This adds a new option, --restrict-function-pointer, to cbmc. This lets a user
specify a list of possible pointer targets for calls to a particular function
pointer variable, rather than have remove_function_pointers guess possible
values. The intended purpose behind this is to prevent excessive symex time
wasted on exploring paths the user knows the program can never actually take.