@jmid
Collaborator

@jmid jmid commented Apr 25, 2022

Investigating long running times: the following Thread test took so long that I suspected a deadlock/infinite loop (or a bug in the run-time):

   dune exec src/neg_tests/thread_lin_tests.exe -- -v -s 219916560

Because of other recent fixes enhancing reproducibility, I was able to pinpoint the following cmd triple:

([(Member 982); (Add_node 919); (Member 443); (Member 66); (Add_node 3); (Add_node 3); (Member 5)],
 [(Add_node 9); (Member 64); (Member 5); (Add_node 72); (Add_node 989); (Member 0); (Add_node 4); (Add_node 9); (Add_node 78); (Member 67); (Member 56); (Member 76); (Member 223); (Add_node 6); (Member 89)],
 [(Add_node 855); (Add_node 36); (Member 1); (Member 142); (Add_node 6); (Add_node 4); (Member 36); (Member 4); (Member 1); (Add_node 7); (Add_node 54); (Member 9); (Add_node 8)])

with a 15-element and a 13-element cmd list running in parallel and needing interleaving.

Depending on the scheduling, 50 repetitions of the above would take between 1min39sec and 15min to run, with most computation time being spent in the interleaving search. Since the interleaving search is costly (exponential in input length, I believe), we reduce the input cmd list size. This brings the interleaving search time down significantly (to around 1min, worst case).
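To give a rough feel for the cost: two cmd lists of lengths m and n can be merged in C(m+n, m) order-preserving ways, which is an upper bound on what an exhaustive interleaving search may have to consider. The sketch below only computes that count; the name `interleavings` is made up for illustration and is not part of Lin or STM, and the real search may prune many candidates early.

```ocaml
(* Illustrative only: [interleavings] is a hypothetical helper, not library code.
   It counts the order-preserving merges of two lists of lengths m and n,
   i.e. the binomial coefficient C(m+n, m). *)
let interleavings m n =
  let rec go acc i =
    if i > m then acc
    (* acc = C(n+i-1, i-1) here, so the division below is exact *)
    else go (acc * (n + i) / i) (i + 1)
  in
  go 1 1

let () =
  (* The 15- and 13-element lists from the counterexample above *)
  Printf.printf "15 vs 13: %d\n" (interleavings 15 13);
  (* Two shorter lists, chosen only for comparison *)
  Printf.printf "6 vs 6: %d\n" (interleavings 6 6)
```

For the 15- and 13-element lists this bound is about 37 million, which fits the observation that most of the time went into the interleaving search.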

While we are at it, we similarly adjust STM's par_len to at most 12.

@jmid
Collaborator Author

jmid commented Apr 26, 2022

I remembered an old remark in Claessen-al:ICFP09 (state-machine based):

"We generate parallel test cases by parallelizing a suffix of an eqc_statem test case, separating it into two lists
of commands of roughly equal length, [...]"

I then changed arb_cmds_par to such an approach, as there's a smaller chance of triggering concurrency issues when running a (0 or) 1-element cmd list in parallel with, say, a 10-element cmd list.
This has a statistically significant impact, which then led me to reduce rep_count back to 100 from 125 for Thread.
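A minimal sketch of the suffix-splitting idea from the quote, assuming a plain list of commands: keep a sequential prefix and cut the remaining suffix into two halves of roughly equal length. The function `split_triple` and its `seq_len` parameter are hypothetical names for illustration, not the actual arb_cmds_par generator.

```ocaml
(* Illustrative only: not the arb_cmds_par implementation from this PR. *)
let split_triple ~seq_len cmds =
  (* take_drop k xs = (first k elements of xs, the remaining elements) *)
  let rec take_drop k xs =
    match k, xs with
    | 0, _ | _, [] -> ([], xs)
    | k, x :: rest ->
      let (taken, dropped) = take_drop (k - 1) rest in
      (x :: taken, dropped)
  in
  let (seq_pref, suffix) = take_drop seq_len cmds in
  (* Round up, so the first parallel list gets the extra element
     when the suffix has odd length *)
  let half = (List.length suffix + 1) / 2 in
  let (par1, par2) = take_drop half suffix in
  (seq_pref, par1, par2)
```

With this shape the two parallel lists differ in length by at most one, so neither degenerates to the 0- or 1-element case mentioned above unless the suffix itself is that short.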

As all three interpretations (Domain, Thread, Effect) are affected by the arb_cmds_par change, I noticed that Effect tests in neg_tests started taking needlessly long. I therefore reduced their test count from 20,000 down to 1,000 like the others. They still trigger the expected errors.

@jmid jmid merged commit a64b114 into main May 2, 2022
@jmid jmid deleted the fix-long-interleaving branch May 2, 2022 07:12