diff --git a/doc/cprover-manual/index.md b/doc/cprover-manual/index.md index abd028c3b92..9be7116c2a9 100644 --- a/doc/cprover-manual/index.md +++ b/doc/cprover-manual/index.md @@ -9,6 +9,7 @@ [A Short Tutorial](cbmc/tutorial/), [Loop Unwinding](cbmc/unwinding/), [Assertion Checking](cbmc/assertions/), +[Restricting function pointers](cbmc/restrict-function-pointer/), [Memory Analyzer](cbmc/memory-analyzer/), [Program Harness](cbmc/goto-harness/) diff --git a/doc/cprover-manual/restrict-function-pointer.md b/doc/cprover-manual/restrict-function-pointer.md new file mode 100644 index 00000000000..5b63b2b0c1a --- /dev/null +++ b/doc/cprover-manual/restrict-function-pointer.md @@ -0,0 +1,180 @@ +[CPROVER Manual TOC](../../) + +## Restricting function pointers + +In this document, we describe the `goto-instrument` feature to replace calls +through function pointers by case distinctions over calls to given sets of +functions. + +### Motivation + +The CPROVER framework includes a goto program transformation pass +`remove_function_pointers()` to resolve calls to function pointers to direct +function calls. The pass is needed by `cbmc`, as symbolic execution itself can't +handle calls to function pointers. In practice, the transformation pass works as +follows: + +Given that there are functions with these signatures available in the program: + +``` +int f(int x); +int g(int x); +int h(int x); +``` + +And we have a call site like this: + +``` +typedef int(*fptr_t)(int x); +void call(fptr_t fptr) { + int r = fptr(10); + assert(r > 0); +} +``` + +Function pointer removal will turn this into code similar to this: + +``` +void call(fptr_t fptr) { + int r; + if(fptr == &f) { + r = f(10); + } else if(fptr == &g) { + r = g(10); + } else if(fptr == &h) { + r = h(10); + } else { + // sanity check + assert(false); + assume(false); + } + return r; +} +``` + +This works well enough for simple cases. However, this is a very simple +replacement only based on the signature of the function (and whether or not they +have their address taken somewhere in the program), so if there are many +functions matching a particular signature, or if some of these functions are +expensive in symex (e.g. functions with lots of loops or recursion), then this +can be a bit cumbersome - especially if we, as a user, already know that a +particular function pointer will only resolve to a single function or a small +set of functions. The `goto-instrument` option `--restrict-function-pointer` +allows to manually specify this set of functions. + +### Example + +Take the motivating example above. Let us assume that we know for a fact that +`call` will always receive pointers to either `f` or `g` during actual +executions of the program, and symbolic execution for `h` is too expensive to +simply ignore the cost of its branch. For this, we will label the places in each +function where function pointers are being called, to this pattern: + +``` +.function_pointer_call. +``` + +where `N` is referring to which function call it is - so the first call to a +function pointer in a function will have `N=1`, the 5th `N=5` etc. + +We can call `goto-instrument --restrict-function-pointer +call.function_pointer_call.1/f,g in.gb out.gb`. This can be read as + +> For the first call to a function pointer in the function `call`, assume that +> it can only be a call to `f` or `g` + +The resulting output (written to goto binary `out.gb`) looks similar to the +original example, except now there will not be a call to `h`: + +``` +void call(fptr_t fptr) { + int r; + if(fptr == &f) { + r = f(10); + } else if(fptr == &g) { + r = g(10); + } else { + // sanity check + assert(false); + assume(false); + } + return r; +} +``` + +Another example: Imagine we have a simple virtual filesystem API and implementation +like this: + +``` +typedef struct filesystem_t filesystem_t; +struct filesystem_t { + int (*open)(filesystem_t *filesystem, const char* file_name); +}; + +int fs_open(filesystem_t *filesystem, const char* file_name) { + filesystem->open(filesystem, file_name); +} + +int nullfs_open(filesystem_t *filesystem, const char* file_name) { + return -1; +} + +filesystem_t nullfs_val = {.open = nullfs_open}; +filesystem *const nullfs = &nullfs_val; + +filesystem_t *get_fs_impl() { + // some fancy logic to determine + // which filesystem we're getting - + // in-memory, backed by a database, OS file system + // - but in our case, we know that + // it always ends up being nullfs + // for the cases we care about + return nullfs; +} +int main(void) { + filesystem_t *fs = get_fs_impl(); + assert(fs_open(fs, "hello.txt") != -1); +} +``` + +In this case, the assumption is that *we* know that in our `main`, `fs` can be +nothing other than `nullfs`; But perhaps due to the logic being too complicated +symex ends up being unable to figure this out, so in the call to `fs_open()` we +end up branching on all functions matching the signature of +`filesystem_t::open`, which could be quite a few functions within the program. +Worst of all, if its address is ever taken in the program, as far as the "dumb" +function pointer removal is concerned it could be `fs_open()` itself due to it +having a matching signature, leading to symex being forced to follow a +potentially infinite recursion until its unwind limit. + +In this case we can again restrict the function pointer to the value which we +know it will have: + +``` +--restrict-function-pointer fs_open.function_pointer_call.1/nullfs_open +``` + +### Loading from file + +If you have many places where you want to restrict function pointers, it'd be a +nuisance to have to specify them all on the command line. In these cases, you +can specify a file to load the restrictions from instead, via the +`--function-pointer-restrictions-file` option, which you can give the name of a +JSON file with this format: + +``` +{ + "function_call_site_name": ["function1", "function2", ...], + ... +} +``` + +**Note:** If you pass in multiple files, or a mix of files and command line +restrictions, the final restrictions will be a set union of all specified +restrictions. + +**Note:** as of now, if something goes wrong during type checking (i.e. making +sure that all function pointer replacements refer to functions in the symbol +table that have the correct type), the error message will refer to the command +line option `--restrict-function-pointer` regardless of whether the restriction +in question came from the command line or a file. diff --git a/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.c b/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.c new file mode 100644 index 00000000000..97c213b654c --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.c @@ -0,0 +1,61 @@ +#include + +typedef int (*fptr_t)(int); + +fptr_t get_f(void); +fptr_t get_g(void); + +void use_fg(int choice, fptr_t fptr, fptr_t gptr) +{ + assert((choice ? fptr : gptr)(10) == 10 + choice); +} + +int main(void) +{ + use_fg(0, get_f(), get_g()); + use_fg(1, get_f(), get_g()); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int h(int x) +{ + return x / 2; +} + +// Below we take the address of f, g, and h. Thus remove_function_pointers() +// would create a case distinction involving f, g, and h in the function +// use_fg() above. +int always_false_cond = 0; + +fptr_t get_f(void) +{ + if(!always_false_cond) + { + return f; + } + else + { + return h; + } +} + +fptr_t get_g(void) +{ + if(!always_false_cond) + { + return g; + } + else + { + return h; + } +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.desc b/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.desc new file mode 100644 index 00000000000..c6fb8983cc1 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.desc @@ -0,0 +1,11 @@ +CORE +test.c +--restrict-function-pointer "use_fg.function_pointer_call.1/f,g" +\[use_fg.assertion.2\] line \d+ assertion \(choice \? fptr : gptr\)\(10\) == 10 \+ choice: SUCCESS +\[use_fg.assertion.1\] line \d+ dereferenced function pointer at use_fg.function_pointer_call.1 must be one of \[(f|g), (f|g)\]: SUCCESS +^EXIT=0$ +^SIGNAL=0$ +-- +-- +This test checks that the selected function pointer is replaced by a case +distinction over both functions f and g. diff --git a/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.json b/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.json new file mode 100644 index 00000000000..90f8c5c7cae --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-complex-expression/test.json @@ -0,0 +1,3 @@ +{ + "use_fg.function_pointer_call.1": ["f"] +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-code-check/test.c b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-code-check/test.c new file mode 100644 index 00000000000..0acdb6e3495 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-code-check/test.c @@ -0,0 +1,70 @@ +#include + +typedef int (*fptr_t)(int); + +fptr_t get_f(void); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) >= 10); +} + +void select_f(void); +void select_g(void); +void select_h(void); + +int main(void) +{ + select_f(); + use_f(get_f()); + select_g(); + use_f(get_f()); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int h(int x) +{ + return x - 1; +} + +int select_function = 0; + +void select_f(void) +{ + select_function = 0; +} + +void select_g(void) +{ + select_function = 1; +} + +void select_h(void) +{ + select_function = 2; +} + +fptr_t get_f(void) +{ + if(select_function == 0) + { + return f; + } + else if(select_function == 1) + { + return g; + } + else + { + return h; + } +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-code-check/test.desc b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-code-check/test.desc new file mode 100644 index 00000000000..f18cbd6515f --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-code-check/test.desc @@ -0,0 +1,12 @@ +CORE +test.c +--restrict-function-pointer use_f.function_pointer_call.1/f,g --show-goto-functions +f\(10\) +g\(10\) +^EXIT=0$ +^SIGNAL=0$ +-- +h\(10\) +-- +This test checks that no call to g appears in the goto program after the +transformation by the restrict function pointer feature. diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-incorrectly/test.c b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-incorrectly/test.c new file mode 100644 index 00000000000..2a2840b81e3 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-incorrectly/test.c @@ -0,0 +1,72 @@ +#include + +typedef int (*fptr_t)(int); + +fptr_t get_f(void); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) >= 10); +} + +void select_f(void); +void select_g(void); +void select_h(void); + +int main(void) +{ + select_f(); + use_f(get_f()); + select_g(); + use_f(get_f()); + select_h(); + use_f(get_f()); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int h(int x) +{ + return x - 1; +} + +int g_select_function = 0; + +void select_f(void) +{ + g_select_function = 0; +} + +void select_g(void) +{ + g_select_function = 1; +} + +void select_h(void) +{ + g_select_function = 2; +} + +fptr_t get_f(void) +{ + if(g_select_function == 0) + { + return f; + } + else if(g_select_function == 1) + { + return g; + } + else + { + return h; + } +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-incorrectly/test.desc b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-incorrectly/test.desc new file mode 100644 index 00000000000..18c7fac80e3 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-incorrectly/test.desc @@ -0,0 +1,18 @@ +CORE +test.c +--restrict-function-pointer use_f.function_pointer_call.1/f,g +\[use_f\.assertion\.1\] line \d+ dereferenced function pointer at use_f.function_pointer_call.1 must be one of \[(f|g), (f|g)\]: FAILURE +^EXIT=10$ +^SIGNAL=0$ +-- +-- +This test checks that multiple restrictions for a given function pointer can be +given. + +The test further checks that the correct safety assertions are generated. The +function pointer restriction feature outputs safety assertions for all calls +that it replaces. That is, when it replaces a call with a case distinction over +a given set of functions, it adds an assertion checking that in the original +program, indeed no other function could have been called at that location. In +this case, the function h could also be called, but the given restrictions only +include f and g, hence the safety assertion fails. diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/restrictions.json b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/restrictions.json new file mode 100644 index 00000000000..e9a374713bf --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/restrictions.json @@ -0,0 +1,3 @@ +{ + "use_f.function_pointer_call.1": ["f"] +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/test.c b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/test.c new file mode 100644 index 00000000000..7eeb597247c --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/test.c @@ -0,0 +1,72 @@ +#include + +typedef int (*fptr_t)(int); + +fptr_t get_f(void); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) >= 10); +} + +void select_f(void); +void select_g(void); +void select_h(void); + +int main(void) +{ + select_f(); + use_f(get_f()); + select_g(); + use_f(get_f()); + select_h(); + use_f(get_f()); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int h(int x) +{ + return x - 1; +} + +int select_function = 0; + +void select_f(void) +{ + select_function = 0; +} + +void select_g(void) +{ + select_function = 1; +} + +void select_h(void) +{ + select_function = 2; +} + +fptr_t get_f(void) +{ + if(select_function == 0) + { + return f; + } + else if(select_function == 1) + { + return g; + } + else + { + return h; + } +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/test.desc b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/test.desc new file mode 100644 index 00000000000..98bb04f6b6a --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file-and-command-line-options/test.desc @@ -0,0 +1,18 @@ +CORE +test.c +--function-pointer-restrictions-file restrictions.json --restrict-function-pointer use_f.function_pointer_call.1/g +\[use_f\.assertion\.1\] line \d+ dereferenced function pointer at use_f.function_pointer_call.1 must be one of \[(f|g), (f|g)\]: FAILURE +^EXIT=10$ +^SIGNAL=0$ +-- +-- +This test checks that the restrictions for a function pointer are the union of +the restrictions given in a file and on the command line. + +The test further checks that the correct safety assertions are generated. The +function pointer restriction feature outputs safety assertions for all calls +that it replaces. That is, when it replaces a call with a case distinction over +a given set of functions, it adds an assertion checking that in the original +program, indeed no other function could have been called at that location. In +this case, the function h could also be called, but the given restrictions only +include f and g, hence the safety assertion fails. diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/restrictions.json b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/restrictions.json new file mode 100644 index 00000000000..9538cc31654 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/restrictions.json @@ -0,0 +1,3 @@ +{ + "use_f.function_pointer_call.1": ["f", "g"] +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/test.c b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/test.c new file mode 100644 index 00000000000..7eeb597247c --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/test.c @@ -0,0 +1,72 @@ +#include + +typedef int (*fptr_t)(int); + +fptr_t get_f(void); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) >= 10); +} + +void select_f(void); +void select_g(void); +void select_h(void); + +int main(void) +{ + select_f(); + use_f(get_f()); + select_g(); + use_f(get_f()); + select_h(); + use_f(get_f()); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int h(int x) +{ + return x - 1; +} + +int select_function = 0; + +void select_f(void) +{ + select_function = 0; +} + +void select_g(void) +{ + select_function = 1; +} + +void select_h(void) +{ + select_function = 2; +} + +fptr_t get_f(void) +{ + if(select_function == 0) + { + return f; + } + else if(select_function == 1) + { + return g; + } + else + { + return h; + } +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/test.desc b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/test.desc new file mode 100644 index 00000000000..7f5caa4b0a4 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-file/test.desc @@ -0,0 +1,18 @@ +CORE +test.c +--function-pointer-restrictions-file restrictions.json +\[use_f\.assertion\.1\] line \d+ dereferenced function pointer at use_f.function_pointer_call.1 must be one of \[(f|g), (f|g)\]: FAILURE +^EXIT=10$ +^SIGNAL=0$ +-- +-- +This test checks the reading of function pointer restrictions from a given json +file. + +The test further checks that the correct safety assertions are generated. The +function pointer restriction feature outputs safety assertions for all calls +that it replaces. That is, when it replaces a call with a case distinction over +a given set of functions, it adds an assertion checking that in the original +program, indeed no other function could have been called at that location. In +this case, the function h could also be called, but the given restrictions only +include f and g, hence the safety assertion fails. diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/restrictions_1.json b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/restrictions_1.json new file mode 100644 index 00000000000..e9a374713bf --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/restrictions_1.json @@ -0,0 +1,3 @@ +{ + "use_f.function_pointer_call.1": ["f"] +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/restrictions_2.json b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/restrictions_2.json new file mode 100644 index 00000000000..a0c410ebec7 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/restrictions_2.json @@ -0,0 +1,3 @@ +{ + "use_f.function_pointer_call.1": ["g"] +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/test.c b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/test.c new file mode 100644 index 00000000000..7eeb597247c --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/test.c @@ -0,0 +1,72 @@ +#include + +typedef int (*fptr_t)(int); + +fptr_t get_f(void); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) >= 10); +} + +void select_f(void); +void select_g(void); +void select_h(void); + +int main(void) +{ + select_f(); + use_f(get_f()); + select_g(); + use_f(get_f()); + select_h(); + use_f(get_f()); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int h(int x) +{ + return x - 1; +} + +int select_function = 0; + +void select_f(void) +{ + select_function = 0; +} + +void select_g(void) +{ + select_function = 1; +} + +void select_h(void) +{ + select_function = 2; +} + +fptr_t get_f(void) +{ + if(select_function == 0) + { + return f; + } + else if(select_function == 1) + { + return g; + } + else + { + return h; + } +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/test.desc b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/test.desc new file mode 100644 index 00000000000..b1a1756ad8b --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions-via-multiple-files/test.desc @@ -0,0 +1,20 @@ +CORE +test.c +--function-pointer-restrictions-file restrictions_1.json --function-pointer-restrictions-file restrictions_2.json +\[use_f\.assertion\.1\] line \d+ dereferenced function pointer at use_f.function_pointer_call.1 must be one of \[(f|g), (f|g)\]: FAILURE +^EXIT=10$ +^SIGNAL=0$ +-- +-- +This test checks that the restrictions for a function pointer are the union of +the restrictions given in multiple files. In this case, the restrictions for the +first function pointer call in use_f are f and g, provided by the files +restrictions_1.json and restrictions_2.json. + +The test further checks that the correct safety assertions are generated. The +function pointer restriction feature outputs safety assertions for all calls +that it replaces. That is, when it replaces a call with a case distinction over +a given set of functions, it adds an assertion checking that in the original +program, indeed no other function could have been called at that location. In +this case, the function h could also be called, but the given restrictions only +include f and g, hence the safety assertion fails. diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions/test.c b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions/test.c new file mode 100644 index 00000000000..ead7fd6ba9c --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions/test.c @@ -0,0 +1,66 @@ +#include + +typedef int (*fptr_t)(int); + +fptr_t get_f(void); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) >= 10); +} + +void select_f(void); +void select_g(void); +void select_h(void); + +int main(void) +{ + select_f(); + use_f(get_f()); + select_g(); + use_f(get_f()); +} + +int f(int x) +{ + return x + 1; +} +int g(int x) +{ + return x; +} +int h(int x) +{ + return x - 1; +} + +int select_function = 0; + +void select_f(void) +{ + select_function = 0; +} +void select_g(void) +{ + select_function = 1; +} +void select_h(void) +{ + select_function = 2; +} + +fptr_t get_f(void) +{ + if(select_function == 0) + { + return f; + } + else if(select_function == 1) + { + return g; + } + else + { + return h; + } +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-multiple-functions/test.desc b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions/test.desc new file mode 100644 index 00000000000..878139b431e --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-multiple-functions/test.desc @@ -0,0 +1,6 @@ +CORE +test.c +--restrict-function-pointer use_f.function_pointer_call.1/f,g +\[use_f\.assertion\.2\] line \d+ assertion fptr\(10\) >= 10: SUCCESS +^EXIT=0$ +^SIGNAL=0$ diff --git a/regression/goto-instrument/restrict-function-pointer-to-single-function-code-check/test.c b/regression/goto-instrument/restrict-function-pointer-to-single-function-code-check/test.c new file mode 100644 index 00000000000..3b9844dc0dc --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-single-function-code-check/test.c @@ -0,0 +1,28 @@ +#include + +typedef int (*fptr_t)(int); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) == 11); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int main(void) +{ + int one = 1; + + // We take the address of f and g. In this case remove_function_pointers() + // would create a case distinction involving both f and g in the function + // use_f() above. + use_f(one ? f : g); +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-single-function-code-check/test.desc b/regression/goto-instrument/restrict-function-pointer-to-single-function-code-check/test.desc new file mode 100644 index 00000000000..ef4e9827636 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-single-function-code-check/test.desc @@ -0,0 +1,11 @@ +CORE +test.c +--restrict-function-pointer use_f.function_pointer_call.1/f --show-goto-functions +f\(10\) +^EXIT=0$ +^SIGNAL=0$ +-- +g\(10\) +-- +This test checks that no call to g appears in the goto program after the +transformation by the restrict function pointer feature. diff --git a/regression/goto-instrument/restrict-function-pointer-to-single-function-incorrectly/test.c b/regression/goto-instrument/restrict-function-pointer-to-single-function-incorrectly/test.c new file mode 100644 index 00000000000..fff24d46413 --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-single-function-incorrectly/test.c @@ -0,0 +1,28 @@ +#include + +typedef int (*fptr_t)(int); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) == 11); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int main(void) +{ + int one = 1; + + // We take the address of f and g. In this case remove_function_pointers() + // would create a case distinction involving both f and g in the function + // use_f() above. + use_f(!one ? f : g); // selects g +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-single-function-incorrectly/test.desc b/regression/goto-instrument/restrict-function-pointer-to-single-function-incorrectly/test.desc new file mode 100644 index 00000000000..8ab6995bf6e --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-single-function-incorrectly/test.desc @@ -0,0 +1,15 @@ +CORE +test.c +--restrict-function-pointer use_f.function_pointer_call.1/f +\[use_f\.assertion\.1\] line \d+ dereferenced function pointer at use_f.function_pointer_call.1 must be f: FAILURE +^EXIT=10$ +^SIGNAL=0$ +-- +-- +The function pointer restriction feature outputs safety assertions for all calls +that it replaces. That is, when it replaces a call with a case distinction over +a given set of functions, it adds an assertion checking that in the original +program, indeed no other function could have been called at that location. This +test verifies this feature, by checking that the safety assertion fails when a +function pointer call is replaced by a call to f, when however g could be called +in the original program. diff --git a/regression/goto-instrument/restrict-function-pointer-to-single-function/test.c b/regression/goto-instrument/restrict-function-pointer-to-single-function/test.c new file mode 100644 index 00000000000..3b9844dc0dc --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-single-function/test.c @@ -0,0 +1,28 @@ +#include + +typedef int (*fptr_t)(int); + +void use_f(fptr_t fptr) +{ + assert(fptr(10) == 11); +} + +int f(int x) +{ + return x + 1; +} + +int g(int x) +{ + return x; +} + +int main(void) +{ + int one = 1; + + // We take the address of f and g. In this case remove_function_pointers() + // would create a case distinction involving both f and g in the function + // use_f() above. + use_f(one ? f : g); +} diff --git a/regression/goto-instrument/restrict-function-pointer-to-single-function/test.desc b/regression/goto-instrument/restrict-function-pointer-to-single-function/test.desc new file mode 100644 index 00000000000..4108fed725e --- /dev/null +++ b/regression/goto-instrument/restrict-function-pointer-to-single-function/test.desc @@ -0,0 +1,12 @@ +CORE +test.c +--restrict-function-pointer use_f.function_pointer_call.1/f +\[use_f\.assertion\.2\] line \d+ assertion fptr\(10\) == 11: SUCCESS +^EXIT=0$ +^SIGNAL=0$ +-- +-- +This test checks that the function f is called for the first function pointer +call in function use_f. Without the restrict function pointer option, the +regular function pointer removal would create a case distinction over both f and +g for the given program (as the address of both is taken). diff --git a/src/goto-instrument/goto_instrument_parse_options.cpp b/src/goto-instrument/goto_instrument_parse_options.cpp index f65c053c6e9..f4aefa1cbb0 100644 --- a/src/goto-instrument/goto_instrument_parse_options.cpp +++ b/src/goto-instrument/goto_instrument_parse_options.cpp @@ -29,6 +29,7 @@ Author: Daniel Kroening, kroening@kroening.com #include #include #include +#include #include #include #include @@ -39,6 +40,7 @@ Author: Daniel Kroening, kroening@kroening.com #include #include #include +#include #include #include #include @@ -1032,6 +1034,20 @@ void goto_instrument_parse_optionst::instrument_goto_program() options.set_option("unwind", cmdline.get_value("unwind")); } + { + parse_function_pointer_restriction_options_from_cmdline(cmdline, options); + + const auto function_pointer_restrictions = + function_pointer_restrictionst::from_options( + options, log.get_message_handler()); + + if(!function_pointer_restrictions.restrictions.empty()) + { + label_function_pointer_call_sites(goto_model); + restrict_function_pointers(goto_model, function_pointer_restrictions); + } + } + // skip over selected loops if(cmdline.isset("skip-loops")) { @@ -1851,6 +1867,7 @@ void goto_instrument_parse_optionst::help() " --no-caching disable caching of intermediate results during transitive function inlining\n" // NOLINT(*) " --log log in json format which code segments were inlined, use with --function-inline\n" // NOLINT(*) " --remove-function-pointers replace function pointers by case statement over function calls\n" // NOLINT(*) + HELP_RESTRICT_FUNCTION_POINTER HELP_REMOVE_CALLS_NO_BODY HELP_REMOVE_CONST_FUNCTION_POINTERS " --add-library add models of C library functions\n" diff --git a/src/goto-instrument/goto_instrument_parse_options.h b/src/goto-instrument/goto_instrument_parse_options.h index ea147141512..42eeed6eca4 100644 --- a/src/goto-instrument/goto_instrument_parse_options.h +++ b/src/goto-instrument/goto_instrument_parse_options.h @@ -23,6 +23,7 @@ Author: Daniel Kroening, kroening@kroening.com #include #include #include +#include #include #include @@ -117,6 +118,7 @@ Author: Daniel Kroening, kroening@kroening.com "(validate-goto-binary)" \ OPT_VALIDATE \ OPT_ANSI_C_LANGUAGE \ + OPT_RESTRICT_FUNCTION_POINTER \ "(ensure-one-backedge-per-target)" \ // empty last line diff --git a/src/goto-programs/Makefile b/src/goto-programs/Makefile index ec7887728ce..429bbc046c3 100644 --- a/src/goto-programs/Makefile +++ b/src/goto-programs/Makefile @@ -28,6 +28,7 @@ SRC = adjust_float_expressions.cpp \ interpreter_evaluate.cpp \ json_expr.cpp \ json_goto_trace.cpp \ + label_function_pointer_call_sites.cpp \ link_goto_model.cpp \ link_to_library.cpp \ loop_ids.cpp \ @@ -50,6 +51,7 @@ SRC = adjust_float_expressions.cpp \ remove_unused_functions.cpp \ remove_vector.cpp \ remove_virtual_functions.cpp \ + restrict_function_pointers.cpp \ rewrite_union.cpp \ resolve_inherited_component.cpp \ safety_checker.cpp \ diff --git a/src/goto-programs/goto_program.h b/src/goto-programs/goto_program.h index 0192b0c8cbf..5d518b53aac 100644 --- a/src/goto-programs/goto_program.h +++ b/src/goto-programs/goto_program.h @@ -1169,6 +1169,30 @@ struct pointee_address_equalt } }; +template +void for_each_instruction_if( + GotoFunctionT &&goto_function, + PredicateT predicate, + HandlerT handler) +{ + auto &&instructions = goto_function.body.instructions; + for(auto it = instructions.begin(); it != instructions.end(); ++it) + { + if(predicate(it)) + { + handler(it); + } + } +} + +template +void for_each_instruction(GotoFunctionT &&goto_function, HandlerT handler) +{ + using iterator_t = decltype(goto_function.body.instructions.begin()); + for_each_instruction_if( + goto_function, [](const iterator_t &) { return true; }, handler); +} + #define forall_goto_program_instructions(it, program) \ for(goto_programt::instructionst::const_iterator \ it=(program).instructions.begin(); \ diff --git a/src/goto-programs/label_function_pointer_call_sites.cpp b/src/goto-programs/label_function_pointer_call_sites.cpp new file mode 100644 index 00000000000..6c8a050d7e4 --- /dev/null +++ b/src/goto-programs/label_function_pointer_call_sites.cpp @@ -0,0 +1,64 @@ +/*******************************************************************\ +Module: Label function pointer call sites +Author: Diffblue Ltd. +\*******************************************************************/ + +/// \file +/// Label function pointer call sites across a goto model + +#include "label_function_pointer_call_sites.h" + +#include + +void label_function_pointer_call_sites(goto_modelt &goto_model) +{ + for(auto &goto_function : goto_model.goto_functions.function_map) + { + std::size_t function_pointer_call_counter = 0; + for_each_instruction_if( + goto_function.second, + [](const goto_programt::targett it) { + return it->is_function_call() && can_cast_expr( + it->get_function_call().function()); + }, + [&](goto_programt::targett it) { + auto const &function_call = it->get_function_call(); + auto const &function_pointer_dereference = + to_dereference_expr(function_call.function()); + auto const &source_location = function_call.source_location(); + auto const &goto_function_symbol_mode = + goto_model.symbol_table.lookup_ref(goto_function.first).mode; + + auto const call_site_symbol_name = + irep_idt{id2string(goto_function.first) + ".function_pointer_call." + + std::to_string(++function_pointer_call_counter)}; + + // insert new function pointer variable into the symbol table + goto_model.symbol_table.insert([&] { + symbolt function_call_site_symbol{}; + function_call_site_symbol.name = function_call_site_symbol.base_name = + function_call_site_symbol.pretty_name = call_site_symbol_name; + function_call_site_symbol.type = + function_pointer_dereference.pointer().type(); + function_call_site_symbol.location = function_call.source_location(); + function_call_site_symbol.is_lvalue = true; + function_call_site_symbol.mode = goto_function_symbol_mode; + return function_call_site_symbol; + }()); + + auto const new_function_pointer = + goto_model.symbol_table.lookup_ref(call_site_symbol_name) + .symbol_expr(); + + // add assignment to the new function pointer variable, followed by a + // call of the new variable + auto const assign_instruction = goto_programt::make_assignment( + code_assignt{new_function_pointer, + function_pointer_dereference.pointer()}, + source_location); + goto_function.second.body.insert_before(it, assign_instruction); + to_code_function_call(it->code).function() = + dereference_exprt{new_function_pointer}; + }); + } +} diff --git a/src/goto-programs/label_function_pointer_call_sites.h b/src/goto-programs/label_function_pointer_call_sites.h new file mode 100644 index 00000000000..f9157df333d --- /dev/null +++ b/src/goto-programs/label_function_pointer_call_sites.h @@ -0,0 +1,30 @@ +/*******************************************************************\ +Module: Label function pointer call sites +Author: Diffblue Ltd. +\*******************************************************************/ + +/// \file +/// Label function pointer call sites across a goto model +/// \see simplify_function_pointers + +#ifndef CPROVER_GOTO_PROGRAMS_LABEL_FUNCTION_POINTER_CALL_SITES_H +#define CPROVER_GOTO_PROGRAMS_LABEL_FUNCTION_POINTER_CALL_SITES_H + +#include "goto_model.h" + +/// This ensures that call instructions can be only one of two things: +/// +/// 1. A "normal" function call to a concrete function +/// 2. A dereference of a symbol expression of a function pointer type +/// +/// This makes following stages dealing with function calls easier, because they +/// only need to be able to handle these two cases. +/// +/// It does this by replacing all CALL instructions to function pointers with an +/// assignment to a new function pointer variable followed by a call to that new +/// function pointer. The name of the introduced variable follows the pattern +/// [function_name].function_pointer_call.[N], where N is the nth call to a +/// function pointer in the function function_name. +void label_function_pointer_call_sites(goto_modelt &goto_model); + +#endif // CPROVER_GOTO_PROGRAMS_LABEL_FUNCTION_POINTER_CALL_SITES_H diff --git a/src/goto-programs/restrict_function_pointers.cpp b/src/goto-programs/restrict_function_pointers.cpp new file mode 100644 index 00000000000..76b34993e89 --- /dev/null +++ b/src/goto-programs/restrict_function_pointers.cpp @@ -0,0 +1,461 @@ +/*******************************************************************\ + +Module: Restrict function pointers + +Author: Diffblue Ltd. + +\*******************************************************************/ + +#include "restrict_function_pointers.h" + +#include +#include +#include +#include + +#include +#include +#include +#include + +namespace +{ +void typecheck_function_pointer_restrictions( + const goto_modelt &goto_model, + const function_pointer_restrictionst &restrictions) +{ + for(auto const &restriction : restrictions.restrictions) + { + auto const function_pointer_sym = + goto_model.symbol_table.lookup(restriction.first); + if(function_pointer_sym == nullptr) + { + throw invalid_command_line_argument_exceptiont{ + id2string(restriction.first) + " not found in the symbol table", + "--restrict-function-pointer"}; + } + auto const &function_pointer_type = function_pointer_sym->type; + if(function_pointer_type.id() != ID_pointer) + { + throw invalid_command_line_argument_exceptiont{ + "not a function pointer: " + id2string(restriction.first), + "--restrict-function-pointer"}; + } + auto const &function_type = + to_pointer_type(function_pointer_type).subtype(); + if(function_type.id() != ID_code) + { + throw invalid_command_line_argument_exceptiont{ + "not a function pointer: " + id2string(restriction.first), + "--restrict-function-pointer"}; + } + auto const &ns = namespacet{goto_model.symbol_table}; + for(auto const &function_pointer_target : restriction.second) + { + auto const function_pointer_target_sym = + goto_model.symbol_table.lookup(function_pointer_target); + if(function_pointer_target_sym == nullptr) + { + throw invalid_command_line_argument_exceptiont{ + "symbol not found: " + id2string(function_pointer_target), + "--restrict-function-pointer"}; + } + auto const &function_pointer_target_type = + function_pointer_target_sym->type; + if(function_type != function_pointer_target_type) + { + throw invalid_command_line_argument_exceptiont{ + "type mismatch: `" + id2string(restriction.first) + "' points to `" + + type2c(function_type, ns) + "', but restriction `" + + id2string(function_pointer_target) + "' has type `" + + type2c(function_pointer_target_type, ns) + "'", + "--restrict-function-pointer"}; + } + } + } +} + +source_locationt make_function_pointer_restriction_assertion_source_location( + source_locationt source_location, + const function_pointer_restrictionst::restrictiont restriction) +{ + std::stringstream comment; + + comment << "dereferenced function pointer at " << restriction.first + << " must be "; + + if(restriction.second.size() == 1) + { + comment << *restriction.second.begin(); + } + else + { + comment << "one of ["; + + join_strings( + comment, restriction.second.begin(), restriction.second.end(), ", "); + + comment << ']'; + } + + source_location.set_comment(comment.str()); + source_location.set_property_class(ID_assertion); + + return source_location; +} + +template +void for_each_function_call(goto_functiont &goto_function, Handler handler) +{ + for_each_instruction_if( + goto_function, + [](goto_programt::targett target) { return target->is_function_call(); }, + handler); +} + +void handle_call( + goto_functiont &goto_function, + goto_modelt &goto_model, + const function_pointer_restrictionst &restrictions, + const goto_programt::targett &location) +{ + PRECONDITION(location->is_function_call()); + + // for each function call, we check if it is using a symbol we have + // restrictions for, and if so branch on its value and insert concrete calls + // to the restriction functions + + // Check if this is calling a function pointer, and if so if it is one + // we have a restriction for + const auto &original_function_call = location->get_function_call(); + + if(!can_cast_expr(original_function_call.function())) + return; + + // because we run the label function pointer calls transformation pass before + // this stage a dereference can only dereference a symbol expression + auto const &called_function_pointer = + to_dereference_expr(original_function_call.function()).pointer(); + PRECONDITION(can_cast_expr(called_function_pointer)); + auto const &pointer_symbol = to_symbol_expr(called_function_pointer); + auto const restriction_iterator = + restrictions.restrictions.find(pointer_symbol.get_identifier()); + + if(restriction_iterator == restrictions.restrictions.end()) + return; + + auto const &restriction = *restriction_iterator; + + // this is intentionally a copy because we're about to change the + // instruction this iterator points to + // if we can, we will replace uses of it by a case distinction over + // given functions the function pointer can point to + auto const original_function_call_instruction = *location; + + *location = goto_programt::make_assertion( + false_exprt{}, + make_function_pointer_restriction_assertion_source_location( + original_function_call_instruction.source_location, restriction)); + + auto const assume_false_location = goto_function.body.insert_after( + location, + goto_programt::make_assumption( + false_exprt{}, original_function_call_instruction.source_location)); + + // this is mutable because we'll update this at the end of each + // loop iteration to always point at the start of the branch + // we created + auto else_location = location; + + auto const end_if_location = goto_function.body.insert_after( + assume_false_location, goto_programt::make_skip()); + + for(auto const &restriction_target : restriction.second) + { + auto new_instruction = original_function_call_instruction; + // can't use get_function_call because that'll return a const ref + const symbol_exprt &function_pointer_target_symbol_expr = + goto_model.symbol_table.lookup_ref(restriction_target).symbol_expr(); + to_code_function_call(new_instruction.code).function() = + function_pointer_target_symbol_expr; + auto const goto_end_if_location = goto_function.body.insert_before( + else_location, + goto_programt::make_goto( + end_if_location, original_function_call_instruction.source_location)); + auto const replaced_instruction_location = + goto_function.body.insert_before(goto_end_if_location, new_instruction); + else_location = goto_function.body.insert_before( + replaced_instruction_location, + goto_programt::make_goto( + else_location, + notequal_exprt{pointer_symbol, + address_of_exprt{function_pointer_target_symbol_expr}})); + } +} +} // namespace + +void restrict_function_pointers( + goto_modelt &goto_model, + const function_pointer_restrictionst &restrictions) +{ + typecheck_function_pointer_restrictions(goto_model, restrictions); + + for(auto &function_item : goto_model.goto_functions.function_map) + { + goto_functiont &goto_function = function_item.second; + + for_each_function_call(goto_function, [&](const goto_programt::targett it) { + handle_call(goto_function, goto_model, restrictions, it); + }); + } +} + +void parse_function_pointer_restriction_options_from_cmdline( + const cmdlinet &cmdline, + optionst &options) +{ + options.set_option( + RESTRICT_FUNCTION_POINTER_OPT, + cmdline.get_values(RESTRICT_FUNCTION_POINTER_OPT)); + options.set_option( + RESTRICT_FUNCTION_POINTER_FROM_FILE_OPT, + cmdline.get_values(RESTRICT_FUNCTION_POINTER_FROM_FILE_OPT)); +} + +function_pointer_restrictionst::restrictionst +function_pointer_restrictionst::merge_function_pointer_restrictions( + function_pointer_restrictionst::restrictionst lhs, + const function_pointer_restrictionst::restrictionst &rhs) +{ + auto &result = lhs; + + for(auto const &restriction : rhs) + { + auto emplace_result = result.emplace(restriction.first, restriction.second); + if(!emplace_result.second) + { + for(auto const &target : restriction.second) + { + emplace_result.first->second.insert(target); + } + } + } + + return result; +} + +function_pointer_restrictionst::restrictionst function_pointer_restrictionst:: + parse_function_pointer_restrictions_from_command_line( + const std::list &restriction_opts) +{ + auto function_pointer_restrictions = + function_pointer_restrictionst::restrictionst{}; + + for(const std::string &restriction_opt : restriction_opts) + { + const auto restriction = + parse_function_pointer_restriction(restriction_opt); + + const bool inserted = function_pointer_restrictions + .emplace(restriction.first, restriction.second) + .second; + + if(!inserted) + { + throw invalid_command_line_argument_exceptiont{ + "function pointer restriction for `" + id2string(restriction.first) + + "' was specified twice", + "--" RESTRICT_FUNCTION_POINTER_OPT}; + } + } + + return function_pointer_restrictions; +} + +function_pointer_restrictionst::restrictionst +function_pointer_restrictionst::parse_function_pointer_restrictions_from_file( + const std::list &filenames, + message_handlert &message_handler) +{ + auto merged_restrictions = function_pointer_restrictionst::restrictionst{}; + for(auto const &filename : filenames) + { + auto const restrictions = + function_pointer_restrictionst::read_from_file(filename, message_handler); + + merged_restrictions = merge_function_pointer_restrictions( + std::move(merged_restrictions), restrictions.restrictions); + } + + return merged_restrictions; +} + +function_pointer_restrictionst::restrictiont +function_pointer_restrictionst::parse_function_pointer_restriction( + const std::string &restriction_opt) +{ + // the format for restrictions is / + // exactly one pointer and at least one target + auto const pointer_name_end = restriction_opt.find('/'); + auto const restriction_format_message = + "the format for restrictions is " + "/"; + + if(pointer_name_end == std::string::npos) + { + throw invalid_command_line_argument_exceptiont{ + "couldn't find '/' in `" + restriction_opt + "'", + "--" RESTRICT_FUNCTION_POINTER_OPT, + restriction_format_message}; + } + + if(pointer_name_end == restriction_opt.size()) + { + throw invalid_command_line_argument_exceptiont{ + "couldn't find names of targets after '/' in `" + restriction_opt + "'", + "--" RESTRICT_FUNCTION_POINTER_OPT, + restriction_format_message}; + } + + if(pointer_name_end == 0) + { + throw invalid_command_line_argument_exceptiont{ + "couldn't find target name before '/' in `" + restriction_opt + "'", + "--" RESTRICT_FUNCTION_POINTER_OPT}; + } + + auto const pointer_name = restriction_opt.substr(0, pointer_name_end); + auto const target_names_substring = + restriction_opt.substr(pointer_name_end + 1); + auto const target_name_strings = split_string(target_names_substring, ','); + + if(target_name_strings.size() == 1 && target_name_strings[0].empty()) + { + throw invalid_command_line_argument_exceptiont{ + "missing target list for function pointer restriction " + pointer_name, + "--" RESTRICT_FUNCTION_POINTER_OPT, + restriction_format_message}; + } + + std::unordered_set target_names; + target_names.insert(target_name_strings.begin(), target_name_strings.end()); + + for(auto const &target_name : target_names) + { + if(target_name == ID_empty_string) + { + throw invalid_command_line_argument_exceptiont( + "leading or trailing comma in restrictions for `" + pointer_name + "'", + "--" RESTRICT_FUNCTION_POINTER_OPT, + restriction_format_message); + } + } + + return std::make_pair(pointer_name, target_names); +} + +function_pointer_restrictionst function_pointer_restrictionst::from_options( + const optionst &options, + message_handlert &message_handler) +{ + auto const restriction_opts = + options.get_list_option(RESTRICT_FUNCTION_POINTER_OPT); + auto const commandline_restrictions = + parse_function_pointer_restrictions_from_command_line(restriction_opts); + auto const file_restrictions = parse_function_pointer_restrictions_from_file( + options.get_list_option(RESTRICT_FUNCTION_POINTER_FROM_FILE_OPT), + message_handler); + return {merge_function_pointer_restrictions( + std::move(file_restrictions), commandline_restrictions)}; +} + +function_pointer_restrictionst +function_pointer_restrictionst::from_json(const jsont &json) +{ + function_pointer_restrictionst::restrictionst restrictions; + + if(!json.is_object()) + { + throw deserialization_exceptiont{"top level item is not an object"}; + } + + for(auto const &restriction : to_json_object(json)) + { + restrictions.emplace(irep_idt{restriction.first}, [&] { + if(!restriction.second.is_array()) + { + throw deserialization_exceptiont{"Value of " + restriction.first + + " is not an array"}; + } + auto possible_targets = std::unordered_set{}; + auto const &array = to_json_array(restriction.second); + std::transform( + array.begin(), + array.end(), + std::inserter(possible_targets, possible_targets.end()), + [&](const jsont &array_element) { + if(!array_element.is_string()) + { + throw deserialization_exceptiont{ + "Value of " + restriction.first + + "contains a non-string array element"}; + } + return irep_idt{to_json_string(array_element).value}; + }); + return possible_targets; + }()); + } + + return function_pointer_restrictionst{restrictions}; +} + +function_pointer_restrictionst function_pointer_restrictionst::read_from_file( + const std::string &filename, + message_handlert &message_handler) +{ + auto inFile = std::ifstream{filename}; + jsont json; + + if(parse_json(inFile, filename, message_handler, json)) + { + throw deserialization_exceptiont{ + "failed to read function pointer restrictions from " + filename}; + } + + return from_json(json); +} + +jsont function_pointer_restrictionst::to_json() const +{ + auto function_pointer_restrictions_json = jsont{}; + auto &restrictions_json_object = + function_pointer_restrictions_json.make_object(); + + for(auto const &restriction : restrictions) + { + auto &targets_array = + restrictions_json_object[id2string(restriction.first)].make_array(); + for(auto const &target : restriction.second) + { + targets_array.push_back(json_stringt{target}); + } + } + + return function_pointer_restrictions_json; +} + +void function_pointer_restrictionst::write_to_file( + const std::string &filename) const +{ + auto function_pointer_restrictions_json = to_json(); + + auto outFile = std::ofstream{filename}; + + if(!outFile) + { + throw system_exceptiont{"cannot open " + filename + + " for writing function pointer restrictions"}; + } + + function_pointer_restrictions_json.output(outFile); +} diff --git a/src/goto-programs/restrict_function_pointers.h b/src/goto-programs/restrict_function_pointers.h new file mode 100644 index 00000000000..e7681847406 --- /dev/null +++ b/src/goto-programs/restrict_function_pointers.h @@ -0,0 +1,109 @@ +/*******************************************************************\ + +Module: Restrict function pointers + +Author: Diffblue Ltd. + +\*******************************************************************/ + +/// \file +/// Given goto functions and a list of function parameters or globals +/// that are function pointers with lists of possible candidates, replace use of +/// these function pointers with calls to the candidate. +/// The purpose here is to avoid unnecessary branching +/// i.e. "there are 600 functions with this signature, but I know it's +/// always going to be one of these two" + +#ifndef CPROVER_GOTO_PROGRAMS_RESTRICT_FUNCTION_POINTERS_H +#define CPROVER_GOTO_PROGRAMS_RESTRICT_FUNCTION_POINTERS_H + +#include +#include + +#include +#include + +#include +#include + +#define RESTRICT_FUNCTION_POINTER_OPT "restrict-function-pointer" +#define RESTRICT_FUNCTION_POINTER_FROM_FILE_OPT \ + "function-pointer-restrictions-file" + +#define OPT_RESTRICT_FUNCTION_POINTER \ + "(" RESTRICT_FUNCTION_POINTER_OPT \ + "):" \ + "(" RESTRICT_FUNCTION_POINTER_FROM_FILE_OPT "):" + +#define HELP_RESTRICT_FUNCTION_POINTER \ + "--" RESTRICT_FUNCTION_POINTER_OPT \ + " /\n" \ + " restrict a function pointer to a set of possible targets\n" \ + " targets must all exist in the symbol table with a matching " \ + "type\n" \ + " works for globals and function parameters right now\n" \ + "--" RESTRICT_FUNCTION_POINTER_FROM_FILE_OPT \ + " \n" \ + " add function pointer restrictions from file" + +void parse_function_pointer_restriction_options_from_cmdline( + const cmdlinet &cmdline, + optionst &options); + +class jsont; +class message_handlert; + +class function_pointer_restrictionst +{ +public: + using restrictionst = + std::unordered_map>; + using restrictiont = restrictionst::value_type; + + const restrictionst restrictions; + + /// parse function pointer restrictions from command line + /// + /// Note: These are are only syntactically checked at this stage, + /// because type checking them requires a goto_modelt + static function_pointer_restrictionst + from_options(const optionst &options, message_handlert &message_handler); + + jsont to_json() const; + static function_pointer_restrictionst from_json(const jsont &json); + + static function_pointer_restrictionst read_from_file( + const std::string &filename, + message_handlert &message_handler); + + void write_to_file(const std::string &filename) const; + +protected: + static restrictionst merge_function_pointer_restrictions( + restrictionst lhs, + const restrictionst &rhs); + + static restrictionst parse_function_pointer_restrictions_from_command_line( + const std::list &restriction_opts); + + static restrictionst parse_function_pointer_restrictions_from_file( + const std::list &filenames, + message_handlert &message_handler); + + static restrictiont + parse_function_pointer_restriction(const std::string &restriction_opt); +}; + +/// Apply function pointer restrictions to a goto_model. Each restriction is a +/// mapping from a pointer name to a set of possible targets. Replace calls of +/// these "restricted" pointers with a branch on the value of the function +/// pointer, comparing it to the set of possible targets. This also adds an +/// assertion that the pointer actually has one of the listed values. +/// +/// Note: This requires label_function_pointer_call_sites to be run +/// before +void restrict_function_pointers( + goto_modelt &goto_model, + const function_pointer_restrictionst &restrictions); + +#endif // CPROVER_GOTO_PROGRAMS_RESTRICT_FUNCTION_POINTERS_H diff --git a/unit/Makefile b/unit/Makefile index d41c1770752..5974c05e6b6 100644 --- a/unit/Makefile +++ b/unit/Makefile @@ -31,7 +31,9 @@ SRC += analyses/ai/ai.cpp \ goto-programs/goto_program_validate.cpp \ goto-programs/goto_trace_output.cpp \ goto-programs/is_goto_binary.cpp \ + goto-programs/label_function_pointer_call_sites.cpp \ goto-programs/osx_fat_reader.cpp \ + goto-programs/restrict_function_pointers.cpp \ goto-programs/remove_returns.cpp \ goto-programs/xml_expr.cpp \ goto-symex/apply_condition.cpp \ diff --git a/unit/goto-programs/label_function_pointer_call_sites.cpp b/unit/goto-programs/label_function_pointer_call_sites.cpp new file mode 100644 index 00000000000..e544a6378d4 --- /dev/null +++ b/unit/goto-programs/label_function_pointer_call_sites.cpp @@ -0,0 +1,87 @@ +/*******************************************************************\ + +Module: Label function pointer call sites unit tests + +Author: Daniel Poetzl + +\*******************************************************************/ + +#include +#include + +#include + +TEST_CASE("Label function pointer call sites", "[core]") +{ + const std::string code = R"( + void f() {} + void g() {} + + void h() + { + f(); + void (*fp)() = f; + fp(); + (1 ? f : g)(); + } + + void main() { h(); } + )"; + + goto_modelt goto_model = get_goto_model_from_c(code); + + label_function_pointer_call_sites(goto_model); + + auto &h = goto_model.goto_functions.function_map.at("h"); + + for_each_instruction_if( + h, + [](goto_programt::const_targett it) { return it->is_function_call(); }, + [](goto_programt::const_targett it) { + static int call_count = 0; + + switch(call_count) + { + case 0: + // first call instruction + REQUIRE(it->get_function_call().function().id() == ID_symbol); + break; + case 1: + { + // second call instruction + const auto &fp_symbol = + to_symbol_expr( + to_dereference_expr(it->get_function_call().function()).pointer()) + .get_identifier(); + REQUIRE(fp_symbol == "h.function_pointer_call.1"); + break; + } + case 2: + { + // third call instruction + const auto &fp_symbol = + to_symbol_expr( + to_dereference_expr(it->get_function_call().function()).pointer()) + .get_identifier(); + REQUIRE(fp_symbol == "h.function_pointer_call.2"); + + auto it_prev = std::prev(it); + const auto &assign = it_prev->get_assign(); + + const auto &lhs = assign.lhs(); + const auto &rhs = assign.rhs(); + + REQUIRE( + to_symbol_expr(lhs).get_identifier() == "h.function_pointer_call.2"); + + REQUIRE(rhs.id() == ID_if); + + break; + } + default: + break; + } + + call_count++; + }); +} diff --git a/unit/goto-programs/module_dependencies.txt b/unit/goto-programs/module_dependencies.txt index 070c3bcc8bb..d0aaf35d60d 100644 --- a/unit/goto-programs/module_dependencies.txt +++ b/unit/goto-programs/module_dependencies.txt @@ -1,3 +1,4 @@ goto-programs +json testing-utils util diff --git a/unit/goto-programs/restrict_function_pointers.cpp b/unit/goto-programs/restrict_function_pointers.cpp new file mode 100644 index 00000000000..149c04f8898 --- /dev/null +++ b/unit/goto-programs/restrict_function_pointers.cpp @@ -0,0 +1,150 @@ +/*******************************************************************\ + +Module: Restrict function pointers unit tests + +Author: Daniel Poetzl + +\*******************************************************************/ + +#include +#include + +#include + +#include + +class fp_restrictionst : public function_pointer_restrictionst +{ + friend void restriction_parsing_test(); + friend void merge_restrictions_test(); +}; + +void restriction_parsing_test() +{ + { + const auto res = + fp_restrictionst::parse_function_pointer_restriction("func1/func2"); + REQUIRE(res.first == "func1"); + REQUIRE(res.second.size() == 1); + REQUIRE(res.second.find("func2") != res.second.end()); + } + + { + const auto res = + fp_restrictionst::parse_function_pointer_restriction("func1/func2,func3"); + REQUIRE(res.first == "func1"); + REQUIRE(res.second.size() == 2); + REQUIRE(res.second.find("func2") != res.second.end()); + REQUIRE(res.second.find("func3") != res.second.end()); + } + + REQUIRE_THROWS_AS( + fp_restrictionst::parse_function_pointer_restriction("func"), + invalid_command_line_argument_exceptiont); + + REQUIRE_THROWS_AS( + fp_restrictionst::parse_function_pointer_restriction("/func"), + invalid_command_line_argument_exceptiont); + + REQUIRE_THROWS_AS( + fp_restrictionst::parse_function_pointer_restriction("func/"), + invalid_command_line_argument_exceptiont); + + REQUIRE_THROWS_AS( + fp_restrictionst::parse_function_pointer_restriction("func/,"), + invalid_command_line_argument_exceptiont); + + REQUIRE_THROWS_AS( + fp_restrictionst::parse_function_pointer_restriction("func1/func2,"), + invalid_command_line_argument_exceptiont); + + REQUIRE_THROWS_AS( + fp_restrictionst::parse_function_pointer_restriction("func1/,func2"), + invalid_command_line_argument_exceptiont); +} + +void merge_restrictions_test() +{ + fp_restrictionst::restrictionst r1; + r1.emplace("fp1", std::unordered_set{"func1", "func2"}); + r1.emplace("fp2", std::unordered_set{"func1"}); + + fp_restrictionst::restrictionst r2; + r2.emplace("fp1", std::unordered_set{"func1", "func3"}); + + fp_restrictionst::restrictionst result = + fp_restrictionst::merge_function_pointer_restrictions(r1, r2); + + REQUIRE(result.size() == 2); + + const auto &fp1_restrictions = result.at("fp1"); + REQUIRE(fp1_restrictions.size() == 3); + REQUIRE(fp1_restrictions.count("func1") == 1); + REQUIRE(fp1_restrictions.count("func2") == 1); + REQUIRE(fp1_restrictions.count("func3") == 1); + + const auto &fp2_restrictions = result.at("fp2"); + REQUIRE(fp2_restrictions.size() == 1); + REQUIRE(fp2_restrictions.count("func1") == 1); +} + +TEST_CASE("Restriction parsing", "[core]") +{ + restriction_parsing_test(); +} + +TEST_CASE("Merge function pointer restrictions", "[core]") +{ + merge_restrictions_test(); +} + +TEST_CASE("Json conversion", "[core]") +{ + // conversion json1 -> restrictions1 -> json2 -> restrictions2 + // then check that restrictions1 == restrictions2 + // + // we use json as a starting point as it is easy to write, and we compare the + // restrictions as it is a canonical representation (in contrast, the json + // representation for the same restrictions can differ, due to the array + // elements appearing in different orders) + + std::istringstream ss( + "{" + " \"use_f.function_pointer_call.1\": [\"f\", \"g\"]," + " \"use_f.function_pointer_call.2\": [\"h\"]" + "}"); + + jsont json1; + + parse_json(ss, "", null_message_handler, json1); + + // json1 -> restrictions1 + const auto function_pointer_restrictions1 = + function_pointer_restrictionst::from_json(json1); + + const auto &restrictions = function_pointer_restrictions1.restrictions; + + REQUIRE(restrictions.size() == 2); + + const auto &fp1_restrictions = + restrictions.at("use_f.function_pointer_call.1"); + REQUIRE(fp1_restrictions.size() == 2); + REQUIRE(fp1_restrictions.count("f") == 1); + REQUIRE(fp1_restrictions.count("g") == 1); + + const auto &fp2_restrictions = + restrictions.at("use_f.function_pointer_call.2"); + REQUIRE(fp2_restrictions.size() == 1); + REQUIRE(fp2_restrictions.count("h") == 1); + + // restrictions1 -> json2 + const auto json2 = function_pointer_restrictions1.to_json(); + + // json2 -> restrictions2 + const auto function_pointer_restrictions2 = + function_pointer_restrictionst::from_json(json2); + + REQUIRE( + function_pointer_restrictions1.restrictions == + function_pointer_restrictions2.restrictions); +}