Add validation tests for trace XSD #5278

hannes-steffenhagen-diffblue · 2020-03-23T16:47:09Z

We have a XSD for cbmc --trace --xml-ui in our documentation, this adds
a new test run which essentially validates the XML output of all tests
in regression/cbmc against that XSD. Not all tests in there actually
produce traces, but most do so we just exclude the few that don't.

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
White-space or formatting changes outside the feature-related changed lines are in commits of their own.

NlightNFotis

LGTM, but cannot comment on the gradle build files and the .bat files.

codecov-io · 2020-03-24T16:21:11Z

Codecov Report

Merging #5278 into develop will increase coverage by 0.09%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop    #5278      +/-   ##
===========================================
+ Coverage    67.95%   68.04%   +0.09%     
===========================================
  Files         1170     1170              
  Lines        96362    96362              
===========================================
+ Hits         65478    65571      +93     
+ Misses       30884    30791      -93

Flag	Coverage Δ
#cproversmt2	`42.40% <ø> (ø)`
#regression	`65.25% <ø> (+0.80%)`	⬆️
#unit	`31.76% <ø> (ø)`

Impacted Files	Coverage Δ
src/ansi-c/expr2c.cpp	`68.45% <0.00%> (+0.07%)`	⬆️
src/cbmc/cbmc_parse_options.cpp	`76.99% <0.00%> (+0.23%)`	⬆️
src/solvers/smt2/smt2_conv.cpp	`57.46% <0.00%> (+0.26%)`	⬆️
src/util/pointer_offset_size.cpp	`94.85% <0.00%> (+0.32%)`	⬆️
src/solvers/flattening/boolbv_get.cpp	`76.66% <0.00%> (+1.33%)`	⬆️
src/goto-symex/show_vcc.cpp	`56.75% <0.00%> (+2.70%)`	⬆️
src/goto-checker/properties.cpp	`70.90% <0.00%> (+2.72%)`	⬆️
src/goto-symex/build_goto_trace.cpp	`91.72% <0.00%> (+4.13%)`	⬆️
src/goto-programs/printf_formatter.cpp	`55.29% <0.00%> (+4.70%)`	⬆️
src/goto-checker/report_util.cpp	`47.90% <0.00%> (+4.94%)`	⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5c1f7bb...8241b26. Read the comment docs.

hannes-steffenhagen-diffblue · 2020-03-30T11:03:13Z

regression/validate-trace-xml-schema/validate-xsd/gradle/wrapper/gradle-wrapper.properties

@@ -0,0 +1,5 @@
+distributionBase=GRADLE_USER_HOME


Everything in the gradle directory as well as the gradlew and gradlew.bat scripts are autogenerated by gradle so the build can work without a user having to have a compatible gradle version locally. So they're meant to be here, but shouldn't be reviewed.

thk123

This seems good - some naming stuff in the python file I'd really like to see tidied up. Also, I sadly must insist on some tests for the program than runs the validation.

thk123 · 2020-03-31T10:07:34Z

regression/Makefile

@@ -2,35 +2,36 @@
 # running tests in parallel, it is important that these directories are
 # listed with decreasing runtimes (i.e. longest running at the top)
 DIRS = cbmc \
-       cbmc-library \


❓ I assume this is just a reordering of the Makefile? If it is, it'd be best pulled out into its own commit so can be clearly stated as such

thk123 · 2020-03-31T10:08:59Z

regression/validate-trace-xml-schema/CMakeLists.txt

+            PROPERTIES
+            LABELS "CBMC;CORE")
+else()
+    message(WARNING "JDK not found, skipping trace schema tests")


❓ This stops the tests being configured? I wonder if we should configure a test that always fails? That way, if someone configures it locally, they don't need JDK, but if CI fails on this step, it doesn't silently pass.

Not all our CI passes have both Java and Python 3 available.

I guess I'd rather we explicitly skip this test on the jobs that don't have the requirements.

thk123 · 2020-03-31T10:09:48Z

regression/validate-trace-xml-schema/check.sh

+python_minor_version=$(python3 --version | cut -d ' ' -f 2 | cut -d '.' -f 2)
+
+if [ $python_minor_version -lt 5 ]; then
+    echo "python version less than 3.5, skipping XSD tests"


💡 I think I would exit with a non-zero exit code here, to prevent silent failures on CI 👻

This is not supposed to fail CI

What I'm suggesting is it should fail CI (and the CIs where this doesn't work, we disable it in the configuration file). Currently if a config breaks in a way that means we have the wrong version of python, this test will just be disabled and we'll never know.

thk123 · 2020-03-31T10:10:02Z

regression/validate-trace-xml-schema/check.sh

+
+if ! command -v javac > /dev/null; then
+    echo "javac not found, skipping XSD tests"
+    exit


thk123 · 2020-03-31T10:10:12Z

regression/validate-trace-xml-schema/check.sh

+
+if [ $javac_major_version -eq 1 ] || [ $javac_minor_version -ge 8 ]; then
+    echo "JDK version less than 8, skipping XSD tests"
+    exit


regression/validate-trace-xml-schema/check.py

thk123 · 2020-03-31T10:22:06Z

regression/validate-trace-xml-schema/check.py

+class TestSpec:
+    def __init__(self, args, is_knownbug, is_future, is_thorough):
+        self.args = args
+        self.is_knownbug = is_knownbug
+        self.is_future = is_future
+        self.is_thorough = is_thorough


I think we've written a .desc python parser before (possibly for test-gen) 💡 Pull out that into a separate python module so that it may be reused

For reference, we [did](https://github.com/diffblue/cover/blob/develop/test-gen/regression/common.py#L50 🔒), but not trivial to unify these two at the moment.

thk123 · 2020-03-31T10:29:19Z

regression/validate-trace-xml-schema/check.py

+
+    def read_spec(self, spec_path):
+        with open(spec_path) as spec_file:
+            spec = spec_file.readline().split()


💡 I guess I would use the word test_kind rather than spec, which has 3 means in this file:

the XSD specificaiton

the entire .desc file

The first line of the desc file

Would like to see this addressed, as I keep thinking is_knownbug = 'KNOWNBUG' in spec is a bug, since KNOWNBUG might appear elsewhere in the spec, except here spec is just the first line.

Where "this" is:

Rename spec to spec_kind since spec makes it seem like it is the whole contents of the desc file.

regression/validate-trace-xml-schema/check.py

thk123 · 2020-03-31T11:43:13Z

I've checked the test appears in the output for one job, but just to be safe, could you break the output trace to ensure the CI fails (e.g. just change one of the fields or something so is nonsense, ideally one that doesn't cause any other tests to fail) (maybe add a field?)

thk123

One remaining blocking comment (answering my question about the test you've labelled as failing locally).

I also strongly believe scripts that we run in CI should silently fail (e.g. when python not installed)

thk123 · 2020-04-14T10:25:37Z

regression/validate-trace-xml-schema/check.py

+
+# some tests in the cbmc suite don't work for the trace checks for one reason or another
+ExcludedTests = list(map(lambda s: os.path.join(test_base_dir, s), [
+    # these tests except input from stdin


expect? accept? I don't think you mean except here though

⛏️ This seems outstanding

thk123 · 2020-04-14T10:26:37Z

regression/validate-trace-xml-schema/check.py

+    # this one doesn't work for me locally at all
+    # 'integer-assignments1/test.desc',


Huh? You mean the test fails locally? Perhaps we could raise an issue to look into that?

❓ outstanding, this needs to be addressed IMO

To be explicit about what needs addressing:

shouldn't check in "me" since no way of knowing who "me" is as soon as this line gets moved.

Further, I'd like to understand what you mean "doesn't work locally at all" means!

thk123 · 2020-04-14T10:29:04Z

regression/validate-trace-xml-schema/check.sh

+python_minor_version=$(python3 --version | cut -d ' ' -f 2 | cut -d '.' -f 2)
+
+if [ $python_minor_version -lt 5 ]; then
+    echo "python version less than 3.5, skipping XSD tests"


What I'm suggesting is it should fail CI (and the CIs where this doesn't work, we disable it in the configuration file). Currently if a config breaks in a way that means we have the wrong version of python, this test will just be disabled and we'll never know.

thk123 · 2020-04-14T10:30:27Z

regression/validate-trace-xml-schema/check.py

+
+    def read_spec(self, spec_path):
+        with open(spec_path) as spec_file:
+            spec = spec_file.readline().split()


Would like to see this addressed, as I keep thinking is_knownbug = 'KNOWNBUG' in spec is a bug, since KNOWNBUG might appear elsewhere in the spec, except here spec is just the first line.

Where "this" is:

Rename spec to spec_kind since spec makes it seem like it is the whole contents of the desc file.

thk123

I give up on trying to get the CI to fail without dependencies:

for make builds on windows, can't get all the dependencies, and make test doesn't offer the opportunity to filter out tests AFAICT
for cmake builds, though you can filter out the tests, the info about what dependency gets lost which makes it a pain to debug, so I'll leave it

I'll raise a separate PR addressing my other comments, since I've fixed them, but let's get this in first (needs a rebase which is probably easier for you to do?

We have a XSD for cbmc --trace --xml-ui in our documentation, this adds a new test run which essentially validates the XML output of all tests in regression/cbmc against that XSD. Not all tests in there actually produce traces, but most do so we just exclude the few that don't.

hannes-steffenhagen-diffblue force-pushed the ci/xsd-validation branch from 7bf9de6 to cebeb28 Compare March 23, 2020 16:59

NlightNFotis reviewed Mar 23, 2020

View reviewed changes

hannes-steffenhagen-diffblue requested review from chrisr-diffblue, kroening, tautschnig, peterschrammel and smowton as code owners March 23, 2020 18:05

hannes-steffenhagen-diffblue added the do not review label Mar 25, 2020

hannes-steffenhagen-diffblue force-pushed the ci/xsd-validation branch from 2a610b1 to 8e6c02d Compare March 25, 2020 15:38

hannes-steffenhagen-diffblue removed the do not review label Mar 27, 2020

hannes-steffenhagen-diffblue force-pushed the ci/xsd-validation branch from cc77b55 to 063eaec Compare March 27, 2020 09:46

hannes-steffenhagen-diffblue commented Mar 30, 2020

View reviewed changes

thk123 suggested changes Mar 31, 2020

View reviewed changes

hannes-steffenhagen-diffblue force-pushed the ci/xsd-validation branch from 88287c5 to 77e41e5 Compare April 8, 2020 12:07

thk123 reviewed Apr 14, 2020

View reviewed changes

thk123 mentioned this pull request Apr 22, 2020

Fix typo in trace validation script #5315

Merged

4 tasks

thk123 approved these changes Apr 30, 2020

View reviewed changes

hannes-steffenhagen-diffblue force-pushed the ci/xsd-validation branch 3 times, most recently from 18ca4aa to ffd35ff Compare May 1, 2020 15:58

hannes-steffenhagen-diffblue force-pushed the ci/xsd-validation branch from ffd35ff to cd578d9 Compare May 1, 2020 20:17

peterschrammel approved these changes May 6, 2020

View reviewed changes

thk123 merged commit ac0402f into diffblue:develop May 6, 2020

		# this one doesn't work for me locally at all
		# 'integer-assignments1/test.desc',

Add validation tests for trace XSD #5278

Add validation tests for trace XSD #5278

Uh oh!

Conversation

hannes-steffenhagen-diffblue commented Mar 23, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NlightNFotis left a comment

Choose a reason for hiding this comment

Uh oh!

codecov-io commented Mar 24, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

thk123 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

thk123 Apr 14, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

thk123 commented Mar 31, 2020

Uh oh!

thk123 left a comment

Choose a reason for hiding this comment

Uh oh!

thk123 Apr 14, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

thk123 Apr 14, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

thk123 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

hannes-steffenhagen-diffblue commented Mar 23, 2020 •

edited

Loading

codecov-io commented Mar 24, 2020 •

edited

Loading

thk123 Apr 14, 2020 •

edited

Loading

thk123 Apr 14, 2020 •

edited

Loading

thk123 Apr 14, 2020 •

edited

Loading