Skip to content

Constant propagate string-solver function results #5129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

JohnDumbell
Copy link
Contributor

@JohnDumbell JohnDumbell commented Sep 25, 2019

Allows constant propagation to work on a range of general string operations.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@JohnDumbell JohnDumbell requested a review from danpoe September 25, 2019 12:52
@JohnDumbell JohnDumbell self-assigned this Sep 25, 2019
@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch 2 times, most recently from 6cfc3c8 to 851e709 Compare September 25, 2019 13:01
int character = std::stoi(id2string(operand.get(ID_value)), nullptr, 16);
if (isalpha(character))
{
char buffer[33];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Justify 33

auto &operands = string_data.operands();
for(auto & operand : operands)
{
int character = std::stoi(id2string(operand.get(ID_value)), nullptr, 16);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get(ID_value) -> to_constant_expr(...).get_value()

auto &operands = string_data.operands();
for(auto & operand : operands)
{
int character = std::stoi(id2string(operand.get(ID_value)), nullptr, 16);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't parse directly, use numeric_cast_v (

Target numeric_cast_v(const constant_exprt &arg)
)

if(islower(character))
{
sprintf(buffer, "%x", toupper(character));
operand.set(ID_value, buffer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't set value, use from_integer (

constant_exprt from_integer(const mp_integer &int_value, const typet &type);
)

}

return true;
}bool goto_symext::constant_propagate_trim(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line break after }

if(last_index)
{
auto it = std::find_if(
std::next(s1_data.operands().rbegin(), starting_index),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly this use of starting-index

auto &operands = string_data.operands();
for(auto &operand : operands)
{
int character = std::stoi(id2string(operand.get(ID_value)), nullptr, 16);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use arith-tools

char buffer[33];
if(isupper(character))
{
sprintf(buffer, "%x", tolower(character));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use arith-tools

}
}

/// Simplify String.startsWith function when arguments are constant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong comment

lower_case_string_expression(first_value);
lower_case_string_expression(second_value);

// test whether second_value is a prefix of first_value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given you check the sizes are equal, so this is not a prefix check, surely this is just first_value == second_value

@smowton
Copy link
Contributor

smowton commented Sep 25, 2019

Also, tests: these currently only look at the cases where you do expect a change. Make sure to exercise the cases where there should not be a simplification / constant propagation, and to exercise optional arguments like the starting-index for String.find / findLast

Copy link
Contributor

@danpoe danpoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests should also check that constant propagation computes the right result in addition to that there are 0 remaining VCCs after symex.

it != operands.end();)
{
if(it->get_int(ID_value) == 20)
operands.erase(it);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be it = operands.erase(it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think it makes a difference the way I'm using it here though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might work though according to the documentation it becomes invalid when calling erase(it). So I'd say it's safer to use the returned iterator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it's just luck that it works here based on how std::vector's iterators happen to work (i.e. being pointers to the data array, and the array not usually being reallocated on an erase). How about

auto expr_is_space = [](const exprt &e)
{
  const auto &constant_value = to_constant_expr(*it);
  return numeric_cast_v<int>(constant_value) <= ' ';
}

while(!operands.empty() && expr_is_space(operands.front()))
  operands.erase(operands.begin());
while(!operands.empty() && expr_is_space(operands.back()))
  operands.erase(operands.rbegin().base());

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it the current implementation has n^2 worst case runtime due to the repeated shifting of the vector when erasing elements at the beginning. It might be better to use find_if() to get iterators to the first and last non-space characters, and then copy that segment to a new vector (instead of making a copy of the original vector and then modifying it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, modified.

for(auto it = operands.begin();
it != operands.end();)
{
if(it->get_int(ID_value) == 20)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove whitespace characters other than space as well. Also a space is 20 in hex not in decimal.

{
if(it->get_int(ID_value) == 20)
operands.erase(it.base()),
it++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The erase(it.base()) invalidates the iterator it.

@@ -610,6 +610,25 @@ class goto_symext
symex_assignt &symex_assign,
const function_application_exprt &f_l1);

/// Attempt to constant propogate trim operations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

propogate -> propagate

const function_application_exprt &f_l1,
bool to_upper);

/// Attempt to constant proagate character replacement.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proagate -> propagate

const auto &newChar = to_symbol_expr(old_data);

const auto new_char_pointer =
state.propagation.find(newChar.get_identifier());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use try_evaluate_constant() instead which does the lookup in the propagation map and returns an optional constant_exprt.

CORE
Main.class
--function Main.contains --show-vcc --cp `../../../../scripts/format_classpath.sh . ../../../lib/java-models-library/target/core-models.jar`
^Generated [0-9]+ VCC\(s\), 0 remaining after simplification$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since --show-vcc is used, the test only prints the VCCs and checks that we have 0 remaining VCCs after symex. We also need to check though that constant propagation does compute the correct result. We should leave off --show-vcc and check that verification succeeded. (also for the other tests)

Also we should check the negative cases, i.e. here if a string doesn't contain another string.

@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch 2 times, most recently from 55d333c to 00b4edf Compare September 25, 2019 16:53
@JohnDumbell
Copy link
Contributor Author

Applied all comments and added more tests.

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main comment: the tests for negative cases shouldn't just be positive cases with a ! at the start, they should test cases where we give a proper negative response, such as equalsIgnoreCase("abc", "ADC"), for as many different reasons as our code has reasons to reject the comparison (searched-for string too short and contains a mismatch seem like the obvious ones). The goal is to make sure it doesn't accidentally fire every time. Where there are optional index restrictions as for String.compareTo make sure there are cases where the restriction is the reason for it not to work -- e.g. String.lastIndexOf("abc", "abc") == 0 but String.lastIndexOf("abc", "abc", 1) == -1


public void notContains() {
String str = "abcde";
assert !str.contains("cd");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really test anything more than contains, except that operator! is working -- instead I think you should test some cases where contains should return false. assert !str.contains("fgh"), assert !str.contains("abcdef"), and so on -- things that should evaluate to false for different reasons.

CORE
Main.class
--function Main.notContains --cp `../../../../scripts/format_classpath.sh . ../../../lib/java-models-library/target/core-models.jar`
^Generated [1-9]\d* VCC\(s\), [1-9]\d* remaining after simplification$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a non-noprop case I expect these should be decided by the simplifier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cases usually resolve in a VCC but with a directly false guard somewhere. Might be to do with building a trace? Always been like this when I've tested and just assumed that's how things were.

}

public void notEqualsIgnoreCase() {
assert !"abc".equalsIgnoreCase("ABC");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, test some real negatives, not just verification success vs. failure

CORE
Main.class
--function Main.notEqualsIgnoreCase --cp `../../../../scripts/format_classpath.sh . ../../../lib/java-models-library/target/core-models.jar`
^Generated [1-9]\d* VCC\(s\), [1-9]\d* remaining after simplification$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be decided

for(auto &operand : operands)
{
auto &constant_value = to_constant_expr(operand);
char character = numeric_cast_v<int>(constant_value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Careful, Java chars are 16-bit. It's fine if we only treat ASCII for now, but we should decline to const-prop when we're dealing with e.g. strings of Greek letters, which do have case-change semantics but aren't handled by 8-bit islower etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point!

{
std::stringstream stream;
stream << std::hex << toupper(character);
constant_value.set_value(stream.str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly these are 16-bit ints as far as CBMC is concerned, I'd use from_integer rather than hope you get the expected formatting for constant_exprt's value member

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this for example:

exprt hexD800 = from_integer(0xD800, type);

auto &old_data = f_l1.arguments().at(4);

// If it's a symbol we're using the replace(Character, Character) overload.
// There's another that takes CharSequence which isn't dealt with.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still surprised you're looking for the ID of the expression not the type or the signature of the called function, though. What's going on there, why couldn't a symbol_exprt get passed to the other overload?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never did in any of my testing, but yes, it's fragile. I've changed it to check for the signature instead, which is also fragile, just in different ways.

I'll get around to adding the other overload if I have time.

it != operands.end();)
{
if(it->get_int(ID_value) == 20)
operands.erase(it);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it's just luck that it works here based on how std::vector's iterators happen to work (i.e. being pointers to the data array, and the array not usually being reallocated on an erase). How about

auto expr_is_space = [](const exprt &e)
{
  const auto &constant_value = to_constant_expr(*it);
  return numeric_cast_v<int>(constant_value) <= ' ';
}

while(!operands.empty() && expr_is_space(operands.front()))
  operands.erase(operands.begin());
while(!operands.empty() && expr_is_space(operands.back()))
  operands.erase(operands.rbegin().base());

for(auto &operand : operands)
{
auto &constant_value = to_constant_expr(operand);
char character = numeric_cast_v<int>(constant_value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, watch out that you're restricting to ASCII here and refuse to const-prop anything outside that range, since we don't have a C++ widget with the same case semantics as Character.toUpperCase

@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch 2 times, most recently from 4ce1e46 to 047a471 Compare September 26, 2019 14:02
Copy link
Contributor

@romainbrenguier romainbrenguier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only reviewed until Add String.lastIndexOf(...) but some of the comments could apply to the following commits as well

if(character > ' ')
{
end_iter = it.base();
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use std::find_if

const constant_exprt new_char_array_length =
from_integer(new_operands.size(), length_type);

const array_typet new_char_array_type(char_type, new_char_array_length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use curly braces {}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would those go? (And curious about why)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const array_typet new_char_array_type{char_type, new_char_array_length};
it's the preferred way of constructing an object, here are some reasons why https://stackoverflow.com/questions/18222926/why-is-list-initialization-using-curly-braces-better-than-the-alternatives

if(character > ' ')
{
start_iter = it;
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use std::find_if

bool constant_propagate_trim(
statet &state,
symex_assignt &symex_assign,
const function_application_exprt &f_l1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be marked const or even static? 🌵

for(auto &operand : operands)
{
auto &constant_value = to_constant_expr(operand);
unsigned char character = numeric_cast_v<int>(constant_value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 should use numeric_cast_v<unsigned char> or if unavailable narrow

{
if(islower(character))
{
irept hex_value =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why hex_value? this is actually an exprt

{
auto starting_offset = 0;
if(starting_index > 0)
starting_offset = search_string_size - starting_index;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ternary if to define starting_offset and make it const. Avoid using auto when the type is not explicit.

return from_integer(
std::distance(s1_data.operands().begin(), it.base()) - 1,
expr.type());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should return -1 in the other case no? 🍊

Copy link
Contributor Author

@JohnDumbell JohnDumbell Sep 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would think so, but no, the other case works fine without it. Assume it's notched along +1 due to the way the reverse iterator works.

Edit: Yeah, something's off with the way the reverse iterator works in this instance, swapped it to use find_end instead.


if(it != s1_data.operands().end())
return from_integer(
std::distance(s1_data.operands().begin(), it), expr.type());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍊

@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch 2 times, most recently from 4ac83a9 to ec2ad44 Compare September 26, 2019 17:08

// If it's a symbol we're using the replace(Character, Character) overload.
// There's another that takes CharSequence which isn't dealt with.
if(target_overload == "java::java.lang.String.replace:(CC)Ljava/lang/String;")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 check the type of the arguments instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should block, either one achieves what's needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's java specific code in symex

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, right, sure. Use the types, as Romain said.


irep_idt target_overload = f_l1.source_location().get_function();

// If it's a symbol we're using the replace(Character, Character) overload.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace(char, char)

^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
--
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ missing new line

!lower_case_string_expression(second_value))
return simplify_exprt::unchanged(expr);

bool is_equal = first_value == second_value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const or can be inlined in the next expression


auto is_not_whitespace = [](const exprt &expr) {
auto &constant_value = to_constant_expr(expr);
char character = numeric_cast_v<int>(constant_value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

numeric_cast_v if possible otherwise use narrow, but actually the next operation doesn't require a char and could directly be done on an mp_integer

auto &operands = string_data.operands();

auto is_not_whitespace = [](const exprt &expr) {
auto &constant_value = to_constant_expr(expr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ inline below

if(!s_data_opt)
return false;

const array_exprt &string_data = s_data_opt->get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ used only once, can be inlined below


// If we haven't found anything, just want to copy the whole array.
if(start_iter == operands.end())
start_iter = operands.begin();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 this seems wrong, if there is no character which is not a whitespace, then the result should be empty

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually is producing an empty string, but it's not obvious (end_iter must have traversed the whole string too, and so equals operands.begin()). Suggest instead do

exprt::operandst new_operands;
if(start_iter != operands.end())
  new_operands = exprt::operandst{start_iter, end_iter.base()};

@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch 3 times, most recently from dcc9a76 to f232857 Compare September 27, 2019 13:33
@JohnDumbell
Copy link
Contributor Author

Added more tests, added in both character/string replace, did general clean-up pass. If you want any more tests please just list the specific ones.

@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch 2 times, most recently from fae4f4f to c76dd8b Compare September 30, 2019 17:42
@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch 2 times, most recently from 2630a32 to 2fe2434 Compare October 2, 2019 09:29
@codecov-io
Copy link

codecov-io commented Oct 2, 2019

Codecov Report

Merging #5129 into develop will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5129      +/-   ##
===========================================
- Coverage    67.03%   67.01%   -0.02%     
===========================================
  Files         1149     1147       -2     
  Lines        94030    94042      +12     
===========================================
- Hits         63033    63025       -8     
- Misses       30997    31017      +20
Flag Coverage Δ
#cproversmt2 42.59% <0%> (-0.12%) ⬇️
#regression 63.51% <100%> (-0.01%) ⬇️
#unit 31.89% <2.25%> (-0.07%) ⬇️
Impacted Files Coverage Δ
src/goto-symex/goto_symex.h 92.3% <ø> (ø) ⬆️
src/util/simplify_expr.cpp 87.52% <100%> (+0.7%) ⬆️
src/goto-symex/goto_symex.cpp 99.71% <100%> (+0.07%) ⬆️
src/goto-instrument/cover_instrument_other.cpp 63.63% <0%> (-32.2%) ⬇️
...rs/strings/string_constraint_generator_testing.cpp 91.86% <0%> (-8.14%) ⬇️
...bmc/src/java_bytecode/java_static_initializers.cpp 99.11% <0%> (-0.89%) ⬇️
src/solvers/strings/string_refinement.cpp 87.84% <0%> (-0.69%) ⬇️
...lvers/strings/string_constraint_generator_main.cpp 83.33% <0%> (-0.58%) ⬇️
src/goto-symex/symex_goto.cpp 96.1% <0%> (-0.45%) ⬇️
src/util/std_expr.h 89% <0%> (-0.23%) ⬇️
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 633d67a...bdde675. Read the comment docs.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 2fe2434).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/130009103

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking suggestions

^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
--
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to note why this one doesn't work, either here or in a comment in the java file

}

public void stringReplace() {
String str = "abded";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate of replace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One takes 2 characters, other takes 2 strings.

assert str.equals("ghcdgh");
}

public void stringMultiLargerReplace() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Larger and Smaller names the wrong way around here

^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
--
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again mention the missing support, either here or inline in the java file

^Generated [1-9]\d* VCC\(s\), [1-9]\d* remaining after simplification$
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprising! Pi should be affected by the real JDK method; is our models lib wrong here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the builtin upper_case and lower_case are only correct for a (small) subset of unicode (I think ascii and latin-1 supplement) so this is incorrect but expected. It sould be marked as knownbug.

array_exprt characters_to_find(s_data_opt->get().type());
array_exprt characters_to_replace(s_data_opt->get().type());

// Two main ways to perform a replace: characters or strings.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main ways, or ways? Is there a third unusual one?

}
else
{
auto &new_char_array = to_string_expr(new_data);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INVARIANT that the types are as you expect (String, String by the sounds of it)


// If we haven't found anything, just want to copy the whole array.
if(start_iter == operands.end())
start_iter = operands.begin();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually is producing an empty string, but it's not obvious (end_iter must have traversed the whole string too, and so equals operands.begin()). Suggest instead do

exprt::operandst new_operands;
if(start_iter != operands.end())
  new_operands = exprt::operandst{start_iter, end_iter.base()};

auto start_iter =
std::find_if(operands.begin(), operands.end(), is_not_whitespace);

// If our entire string is empty, our end_iter will be at begin(), so just
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"If our entire string is empty" -> "If the entire string is whitespace"

s2_data.operands().begin(),
s2_data.operands().end());
// Searching for empty string is a special case and is simply the
// first match you find. This needs to take into account starting position
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"first match you find" -> "always found at the first searched position"?

@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch from 2fe2434 to b5d8053 Compare October 2, 2019 13:47
@JohnDumbell
Copy link
Contributor Author

@romainbrenguier Mind having another look and seeing if you're blocking issue is dealt with?

const function_application_exprt &f_l1)
{
const auto &f_type = to_mathematical_function_type(f_l1.function().type());
const auto &length_type = f_type.domain().at(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a precondition for f_type.domain().size() == 2

return character > ' ';
};

// Note the points where a trim would trim too.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"to"?

^Generated [1-9]\d* VCC\(s\), [1-9]\d* remaining after simplification$
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the builtin upper_case and lower_case are only correct for a (small) subset of unicode (I think ascii and latin-1 supplement) so this is incorrect but expected. It sould be marked as knownbug.

bool to_upper)
{
const auto &f_type = to_mathematical_function_type(f_l1.function().type());
const auto &length_type = f_type.domain().at(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

precondition

const function_application_exprt &f_l1)
{
const auto &f_type = to_mathematical_function_type(f_l1.function().type());
const auto &length_type = f_type.domain().at(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

precondition

bool constant_propagate_replace(
statet &state,
symex_assignt &symex_assign,
const function_application_exprt &f_l1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌵

@romainbrenguier romainbrenguier dismissed their stale review October 2, 2019 15:42

blocking comments addressed

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: b5d8053).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/130056711

@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch from b5d8053 to bdde675 Compare October 4, 2019 11:10
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: bdde675).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/130408178

@JohnDumbell JohnDumbell force-pushed the jd/feature/constant_string_functions branch from bdde675 to db1f844 Compare October 7, 2019 16:23
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: db1f844).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/130783671

@JohnDumbell JohnDumbell merged commit 357c1b1 into diffblue:develop Oct 8, 2019
@JohnDumbell JohnDumbell deleted the jd/feature/constant_string_functions branch October 8, 2019 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants