
[analyzer] Don't assume third iteration in loops #119388


Merged
merged 9 commits into llvm:main on Jan 2, 2025

Conversation

NagyDonat
Contributor

This commit ensures that if the loop condition is opaque (the analyzer cannot determine whether it's true or false) and there were at least two iterations, then the analyzer doesn't make the unjustified assumption that it can enter yet another iteration.

Note that the presence of a loop suggests that the developer thought that two iterations can happen (otherwise an if would've been sufficient), but it does not imply that the developer expected three or four iterations -- and in fact there are many false positives where a loop iterates over a two-element (or three-element) data structure, but the analyzer cannot understand the loop condition and blindly assumes that there may be three or more iterations. (In particular, analyzing the FFMPEG project produces 100+ such false positives.)
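To make this concrete, here is a minimal sketch of that false-positive pattern (a made-up example with invented names, not taken from FFMPEG or from the tests of this PR): the loop visits at most the two elements of `pair`, but the condition is a call whose result the analyzer cannot evaluate, so the old behavior lets it assume a third iteration and ArrayBoundV2 reports an out-of-bounds read.

```c
int has_next(int i);      // opaque to the analyzer: the return value is unknown

int sum_pair(void) {
  int pair[2] = {10, 20};
  int sum = 0;
  for (int i = 0; has_next(i); ++i) // condition is ambiguous in every iteration
    sum += pair[i];       // an assumed third iteration reads pair[2],
                          // which ArrayBoundV2 flags as out of bounds
  return sum;
}
```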

Moreover, this provides some performance improvements in the sense that the analyzer won't waste time on traversing the execution paths with 3 or 4 iterations in a loop (which are very similar to the paths with 2 iterations) and therefore will be able to traverse more branches elsewhere on the ExplodedGraph.

This logic is disabled if the user enables the widen-loops analyzer option (which is disabled by default), because the "simulate one final iteration after the invalidation" execution path would be suppressed by the "exit the loop if the loop condition is opaque and there were at least two iterations" logic. If we want to support loop widening, we would need to create a follow-up commit which ensures that it "plays nicely" with this logic.
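To illustrate the interaction (a made-up sketch with simplified numbers, not one of the failing loop-widening.c tests): when a loop exhausts the analyzer's unrolling budget, widening invalidates everything the loop may touch, which makes the loop condition opaque, and then simulates one final iteration before exiting. The new "exit after two ambiguous iterations" rule would suppress exactly that final widened iteration, which is why it is disabled when widen-loops is on.

```c
void after_a_long_loop(int *out) {
  int buf[4] = {0};
  for (int i = 0; i < 1000; ++i)  // concrete bound, far beyond the unrolling budget
    buf[i % 4] = i;
  // With widen-loops=true the analyzer reaches this point through the widened
  // "one final iteration" path; without widening the path is simply dropped.
  *out = buf[0];
}
```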


To evaluate the effects of this commit, I also implemented an intermediate version which doesn't change the traversal of the exploded graph: instead of eagerly discarding the "assumes a third iteration" code paths, it analyzes them (wasting some time) and then discards the results found on them.

I analyzed our usual set of open source projects with these three versions (upstream, intermediate, this commit), enabling the checker groups core, cplusplus, deadcode, nullability, security, unix and valist, plus the checker alpha.security.ArrayBoundV2.

Comparison between upstream (a few weeks old main revision) and the intermediate version:

| Project | Only in intermediate | Only in upstream | Evaluation of eliminated reports |
|---|---|---|---|
| memcached | 0 new reports | 0 resolved reports | |
| tmux | 0 new reports | 0 resolved reports | |
| curl | 0 new reports | 7 resolved reports | all ArrayBoundV2: 6 clear FPs and 1 arguably true positive |
| twin | 0 new reports | 4 resolved reports | all ArrayBoundV2: one confusing mess and 3 FPs like this |
| vim | 0 new reports | 1 resolved reports | clear ArrayBoundV2 FP |
| openssl | 0 new reports | 2 resolved reports | both clear ArrayBoundV2 FPs |
| sqlite | 0 new reports | 1 resolved reports | ArrayBoundV2, probably FP |
| ffmpeg | 0 new reports | 131 resolved reports | 128 ArrayBoundV2 reports -- I looked at 10 randomly picked ones (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and those are all FPs, two UndefinedBinaryOperatorResult false positives [1, 2], and a complex NullDereference false positive |
| postgres | 0 new reports | 5 resolved reports | three ArrayBoundV2 FPs (two simple 1 2 and one very creative), a confusing and probably FP NullDereference report and an uninitialized.ArraySubscript in code generated by yacc |
| tinyxml2 | 0 new reports | 0 resolved reports | |
| libwebm | 0 new reports | 2 resolved reports | both straightforward ArrayBoundV2 FPs like this one |
| xerces | 0 new reports | 1 resolved reports | ArrayBoundV2 FP in a test file |
| bitcoin | 0 new reports | 0 resolved reports | |
| protobuf | 0 new reports | 0 resolved reports | |
| qtbase | 0 new reports | 6 resolved reports | 5 ArrayBoundV2 results (three FPs 1 2 3 and two results 1 2 that I don't understand); one confusing NonNullParamChecker result |

Note that it's not surprising that most of the eliminated false positives come from the alpha checker alpha.security.ArrayBoundV2: as the analyzer engine was always making these unjustified assumptions, any checker that was affected by them was stuck in alpha -- and only checkers that are less dependent on values manipulated in loops were brought out of the alpha state.

Comparison between intermediate and this commit:

| Project | Only in this commit | Only in intermediate |
|---|---|---|
| memcached | 0 new reports | 0 resolved reports |
| tmux | 3 new reports | 0 resolved reports |
| curl | 3 new reports | 1 resolved reports |
| twin | 1 new reports | 1 resolved reports |
| vim | 17 new reports | 3 resolved reports |
| openssl | 1 new reports | 0 resolved reports |
| sqlite | 6 new reports | 7 resolved reports |
| ffmpeg | 11 new reports | 18 resolved reports |
| postgres | 15 new reports | 5 resolved reports |
| tinyxml2 | 0 new reports | 0 resolved reports |
| libwebm | 3 new reports | 2 resolved reports |
| xerces | 6 new reports | 6 resolved reports |
| bitcoin | 1 new reports | 1 resolved reports |
| protobuf | 2 new reports | 0 resolved reports |
| qtbase | 21 new reports | 26 resolved reports |

These differences are just random perturbations: as we don't waste time on analyzing the discarded execution paths, the "smart" graph traversal algorithms pick different branches and find different results. As this perturbation is negligible compared to the total number of reports (among the largest projects: 1.8% on qtbase, 0.4% on postgres, 0.3% on ffmpeg), I think it's completely fine that this number of random results is replaced by approximately the same number of other random results. (This perturbation would happen only once, when the user upgrades to a clang version that includes this commit.)

@llvmbot added the clang (Clang issues not falling into any other category) and clang:static analyzer labels on Dec 10, 2024

@NagyDonat
Contributor Author

NagyDonat commented Dec 10, 2024

The difference between this PR and my earlier PR #109804 "Suppress out of bounds reports after weak loop assumptions" is:

| This PR | Earlier PR #109804 |
|---|---|
| affects all checkers | only affects ArrayBoundV2 |
| doesn't traverse unjustified paths | only suppresses reports after traversing the paths |
| only discards assuming third or later iteration | also suppresses reports after assuming zero iterations |

Slightly offtopic remark: I'm planning to create a separate commit to filter out branches where the analyzer assumes that 0 iterations happen in a loop (with an opaque condition). I'll probably put that logic behind an analyzer option (e.g. assume-min-iterations where I suggest 1 as the sane default value), because there is probably user demand for a pedantic analysis mode that aggressively assumes that loops can have 0 iterations. (This PR doesn't introduce an analyzer option, because I don't think that "I want the analyzer to assume that 3+ iterations can happen" is something that the users would want to use.)
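As an illustration of the kind of report that such an assume-min-iterations option would control (a made-up sketch, not one of the reports from the evaluation above): if the analyzer assumes that the loop runs zero times, it concludes that an uninitialized value is returned, which is typically a false positive when the function is only ever called with a positive count.

```c
int last_item(int count, const int *items) {
  int last;
  for (int i = 0; i < count; ++i)   // opaque condition: count is unconstrained
    last = items[i];
  return last;  // on the assumed zero-iteration path this is a garbage value
}
```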

These follow-up ideas can be discussed at the umbrella discourse thread for my loop handling improvement plans.

Contributor

@steakhal steakhal left a comment

Thank you for the ping. I've been really busy lately with stabilizing Z3 refutation.

I really liked the extensive evaluation, which helped a lot in forming my opinion here.
The diffs all look all right. I only spot-checked some of the reports, and I agree with your evaluation.

The content of the patch also looks correct. I only found a couple of Grammarly-level spelling nits, but that's it.
I wonder how you found out that loop widening may conflict with this heuristic; were any tests broken that raised your awareness?

I have one concern with didEagerlyAssumeBifurcateAt: it is bound and overwritten at the most recent eager split, which to my knowledge may not be the loop condition if we have something like this:

```c
while (coin()) {
  if (coin()) { // #inner_coin
    ++num;
  }
}
```

Each time the inner_coin would spoil the LastEagerlyAssumeBifurcationAt trait between the loop iterations, right?

Storing const Expr * pointers is brittle for recursive functions too, where the same expression could appear in different LocationContexts. I wonder if we could make an example where it would confuse this heuristic.

I'm also not entirely convinced about nested loops, i.e. whether this heuristic would kick in there. This is basically a variant of the coin case I showed an example of above.

I don't have a counter example just yet, but I'm not sure I can spare the time to craft one.
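To make the shapes concrete, here is a rough sketch of the nested-loop-plus-recursion case I have in mind (purely illustrative; I'm not claiming it actually confuses the heuristic):

```c
int coin(void);

void nested(int depth) {
  while (coin()) {                  // outer loop: eagerly assumed condition
    for (int i = 0; coin(); ++i) {
      // the inner condition is the most recent eager split by the time the
      // outer condition is evaluated again
    }
  }
  if (depth > 0)
    nested(depth - 1);              // recursion: the same condition expressions,
                                    // but in a different LocationContext
}
```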

Once again, big kudos for the extensive data. I have no objections overall; I only have these minor questions before I'd approve this.

@NagyDonat
Contributor Author

Thanks for the review, I'll fix the minor remarks soon.

I wonder how you found out that loop widening may conflict with this heuristic; were any tests broken that raised your awareness?

Yes, some (IIRC three) testcases fail in loop-widening.c.

I have one concern with didEagerlyAssumeBifurcateAt: it is bound and overwritten at the most recent eager split, which to my knowledge may not be the loop condition if we have something like this:

```c
while (coin()) {
  if (coin()) { // #inner_coin
    ++num;
  }
}
```

Each time the inner_coin would spoil the LastEagerlyAssumeBifurcationAt trait between the loop iterations, right?

Storing const Expr * pointers is brittle for recursive functions too, where the same expression could appear in different LocationContexts. I wonder if we could make an example where it would confuse this heuristic.

I'm also not entirely convinced about nested loops, i.e. whether this heuristic would kick in there. This is basically a variant of the coin case I showed an example of above.

I don't have a counter example just yet, but I'm not sure I can spare the time to craft one.

These situations do not cause problems for my goals, because I only intend to use LastEagerlyAssumeBifurcationAt as very short-term storage between the node where the eager assumption happens and the processBranch call directly after it, within the same iteration. I did not intend to reuse this information in later iterations of the same loop.

However, it's nice that you highlighted this question, because it made me realize that my implementation is buggy (it does not do what I wanted). For example, in

```c
void foo(int x) {
  while (x == 0) {
    // body that does not change x and does not trigger any EagerlyAssume
  }
}
```

EagerlyAssume would succeed at the beginning of the first iteration, and this would set LastEagerlyAssumeBifurcationAt to the expression x == 0. This would stay in the state during the subsequent iterations, and at the beginning of the third iteration the code that I added would notice its presence and prevent the modeling of the third iteration. (I want to stop continuing the loop after the second iteration only if the loop condition is ambiguous at the beginning of that particular iteration, not when it was only ambiguous before some earlier iteration.)

To fix this problem, it would be sufficient to e.g. ensure that evalEagerlyAssumeBifurcation sets LastEagerlyAssumeBifurcationAt to nullptr when the bifurcation fails (= the value is not ambiguous = there is only one child node). However, there might be better solutions to implement this "did EagerlyAssume split the state right now directly before this processBranch callback?" check that I need. (E.g. perhaps we could walk a few steps backwards on the ExplodedGraph -- but I don't know what kinds of nodes should I expect and I'm afraid that what I could write would run into pitfalls in unusual cases.)

<rant>From a higher level POV I think that EagerlyAssume is an ugly hack which makes the ExplodedGraph harder to understand or reason about. I'm certain that there are (or at least were) some clever corner cases where these eager assumptions provide a better analysis (so I don't hope that we can just disable it), but I'm still annoyed that it exists.</rant>

Co-authored-by: Balazs Benics <[email protected]>
@steakhal
Contributor

[...] To fix this problem, it would be sufficient to e.g. ensure that evalEagerlyAssumeBifurcation sets LastEagerlyAssumeBifurcationAt to nullptr [...]

Sounds good to me. Let's zero it out after it's "used"/"consumed".

[...] there might be better solutions to implement this "did EagerlyAssume split the state right now directly before this processBranch callback?" check that I need. (E.g. perhaps we could walk a few steps backwards on the ExplodedGraph -- but I don't know what kinds of nodes should I expect and I'm afraid that what I could write would run into pitfalls in unusual cases.)

Yes, it could get messy. For instance if we have multiple parent nodes due to a merge. Let's not do traversals.

I'm good with the change once the last remark is fixed.
BTW, have you measured the running time implications of this patch? How much do we save?

@NagyDonat
Contributor Author

NagyDonat commented Dec 20, 2024

[...] To fix this problem, it would be sufficient to e.g. ensure that evalEagerlyAssumeBifurcation sets LastEagerlyAssumeBifurcationAt to nullptr [...]

Sounds good to me. Let's zero it out after it's "used"/"consumed".

I took a different approach (which was already mostly implemented when I saw this suggestion): I'm setting the stored Expr * to nullptr when I encounter an EagerlyAssume which does not succeed (i.e. at most one branch is feasible). I also added a dedicated test file which shows that this change works (and demonstrates other features and limitations of the commit).

[...] there might be better solutions to implement this "did EagerlyAssume split the state right now directly before this processBranch callback?" check that I need. (E.g. perhaps we could walk a few steps backwards on the ExplodedGraph -- but I don't know what kinds of nodes should I expect and I'm afraid that what I could write would run into pitfalls in unusual cases.)

Yes, it could get messy. For instance if we have multiple parent nodes due to a merge. Let's not do traversals.

OK, if you say so, I'm very happy to avoid it 😀

I'm good with the change once the last remark is fixed. BTW, have you measured the running time implications of this patch? How much do we save?

I didn't expect running time implications: as we discard these execution paths, the analyzer will traverse other execution paths instead of them -- so any performance gains should appear as "more results" instead of "faster analysis". Unfortunately the improvements that I hoped for didn't appear in the experimental results: the "final" version of this commit doesn't produce significantly more results compared to the "intermediate" version which just suppresses the results (after "wasting time" on traversing the unjustified paths).

I think we don't see visible performance improvements because my observation that "difference between 2 or 3 iterations is irrelevant on the later parts of the execution path" does not imply that we can save a significant amount of calculations by driving each of those paths through 2 iterations (instead of roughly half 2 iterations, half 3 iterations) -- because those paths differ in many assumptions, so we cannot merge them even if we ignore the iteration count in that particular loop.

Additionally, my measurements are on stable open source projects, where real results are much rarer than false positives (even on stable checkers), so the number of results that we get is an imperfect and somewhat noisy proxy of the actual improvements.

@Xazax-hun
Collaborator

Xazax-hun commented Dec 23, 2024

Do you expect any changes for less obvious ways of looping?

  • Recursion
  • while(true) if (cond) break;
  • Like above but with gotos

Would be nice to have some tests for those as well.

@NagyDonat
Contributor Author

The logic that I added only affects the handling of the condition part of loop statements, so it has absolutely no effect on the handling of code that implements similar looping behavior with different language features. I added a testcase to document this (and, as an example, to show the behavior of `while (1) if (cond) break;`, which is IMO the alternative that differs the least from a regular loop).
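For reference, rough sketches of the looping constructs in question (illustrative only; the function names are made up and only the `while (1) if (cond) break;` form is covered by the testcase added in this PR). The new heuristic looks only at the condition of loop statements, so these are all handled by the ordinary branch logic:

```c
int coin(void);

void loop_with_break(void) {
  while (1) {        // concrete loop condition, so the new logic never fires
    if (!coin())     // the opaque check is an ordinary branch, not a loop condition
      break;
  }
}

void loop_with_goto(void) {
again:
  if (coin())        // again an ordinary branch; the loop is formed by the goto
    goto again;
}

void loop_with_recursion(void) {
  if (coin())        // looping via recursion: also unaffected
    loop_with_recursion();
}
```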

@NagyDonat
Contributor Author

If I understand correctly, I handled every review suggestion.

@steakhal If there are no additional suggestions, I'd be grateful for an approval. (My follow-up commit is mostly ready, I'll release it soon after merging this.)

Contributor

@steakhal steakhal left a comment

If I understand correctly, I handled every review suggestion.

@steakhal If there are no additional suggestions, I'd be grateful for an approval. (My follow-up commit is mostly ready, I'll release it soon after merging this.)

My bad. I thought I had already approved this one.
I checked the PR again and it still looks good.

Thank you for your hard work on fixing loops.

@steakhal
Contributor

steakhal commented Jan 2, 2025

One more thing. Please mention this in the release notes.

@NagyDonat
Contributor Author

My bad. I thought I had already approved this one.

No problem, I didn't intend to merge this during the Christmas period (when I was on vacation and didn't want to monitor the effects of the change).

Please mention this in the release notes.

I added a paragraph to the release notes. Is it OK (= not too short, not too long, clear enough)?

@NagyDonat NagyDonat merged commit bb27d5e into llvm:main Jan 2, 2025
6 of 9 checks passed
@NagyDonat NagyDonat deleted the dont-assume-third-iteration branch January 2, 2025 14:51
@llvm-ci
Collaborator

llvm-ci commented Jan 2, 2025

LLVM Buildbot has detected a new failure on builder openmp-s390x-linux running on systemz-1 while building clang at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/6318

Here is the relevant piece of the build log for reference:
Step 6 (test-openmp) failure: 1200 seconds without output running [b'ninja', b'-j 4', b'check-openmp'], attempting to kill
...
PASS: ompd-test :: openmp_examples/example_2.c (438 of 449)
PASS: ompd-test :: openmp_examples/example_3.c (439 of 449)
PASS: ompd-test :: openmp_examples/example_4.c (440 of 449)
PASS: ompd-test :: openmp_examples/example_5.c (441 of 449)
PASS: ompd-test :: openmp_examples/example_task.c (442 of 449)
UNSUPPORTED: ompd-test :: openmp_examples/ompd_bt.c (443 of 449)
PASS: ompd-test :: openmp_examples/fibonacci.c (444 of 449)
UNSUPPORTED: ompd-test :: openmp_examples/ompd_parallel.c (445 of 449)
PASS: ompd-test :: openmp_examples/parallel.c (446 of 449)
PASS: ompd-test :: openmp_examples/nested.c (447 of 449)
command timed out: 1200 seconds without output running [b'ninja', b'-j 4', b'check-openmp'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1349.913078

@NagyDonat
Contributor Author

The LLVM Buildbot failure is clearly unrelated to this commit -- it seems to be straightforward flakiness in a completely unrelated part of the project.

NagyDonat added a commit to Ericsson/llvm-project that referenced this pull request Apr 24, 2025
This reverts commit bb27d5e.

DO NOT MERGE, this is only for comparison.