-
Notifications
You must be signed in to change notification settings - Fork 336
Description
We're seeing a significant performance regression after updating from 1.0.79 to 1.0.80 when validating large JSON documents ("large" = up to 300 MB non-pretty-printed). This regression is so severe that we had to revert to 1.0.79 because it makes 1.0.80 unusable for us.
- Using 1.0.79, validating our largest document (330 MB) takes ~10 seconds on my dev laptop. This matches my expectations based on the max runtime metrics we're getting from production.
- Using 1.0.80, I can't tell you how long validating the same document takes because it hasn't finished after 30 minutes.
- Using 1.0.80, our metrics show spikes up to several hours, and that's just the runs that weren't interrupted by something. I don't even know if those hours are the really big documents or just the moderately big ones, so the top end might be even worse.
The regression is also clearly visible with https://github.com/networknt/json-schema-validator-perftest, even if less extreme (both runs are on my dev laptop using the same JVM etc):
- 1.0.79:
Initial validation :11 ms
Iteration 0 (in 4 ms)
Iteration 20 (in 58 ms)
Iteration 40 (in 92 ms)
Iteration 60 (in 127 ms)
Iteration 80 (in 164 ms)
Iteration 100 (in 199 ms)
Iteration 120 (in 221 ms)
Iteration 140 (in 242 ms)
Iteration 160 (in 262 ms)
Iteration 180 (in 282 ms)
Iteration 200 (in 302 ms)
Iteration 220 (in 321 ms)
Iteration 240 (in 340 ms)
Iteration 260 (in 360 ms)
Iteration 280 (in 379 ms)
Iteration 300 (in 398 ms)
Iteration 320 (in 417 ms)
Iteration 340 (in 436 ms)
Iteration 360 (in 455 ms)
Iteration 380 (in 474 ms)
Iteration 400 (in 493 ms)
Iteration 420 (in 511 ms)
Iteration 440 (in 530 ms)
Iteration 460 (in 549 ms)
Iteration 480 (in 568 ms)
END -- time in ms: 586
- 1.0.80:
Initial validation :25 ms
Iteration 0 (in 19 ms)
Iteration 20 (in 164 ms)
Iteration 40 (in 307 ms)
Iteration 60 (in 427 ms)
Iteration 80 (in 543 ms)
Iteration 100 (in 616 ms)
Iteration 120 (in 688 ms)
Iteration 140 (in 760 ms)
Iteration 160 (in 834 ms)
Iteration 180 (in 907 ms)
Iteration 200 (in 979 ms)
Iteration 220 (in 1051 ms)
Iteration 240 (in 1125 ms)
Iteration 260 (in 1197 ms)
Iteration 280 (in 1268 ms)
Iteration 300 (in 1339 ms)
Iteration 320 (in 1414 ms)
Iteration 340 (in 1485 ms)
Iteration 360 (in 1557 ms)
Iteration 380 (in 1629 ms)
Iteration 400 (in 1702 ms)
Iteration 420 (in 1773 ms)
Iteration 440 (in 1845 ms)
Iteration 460 (in 1917 ms)
Iteration 480 (in 1991 ms)
END -- time in ms: 2060
I profiled my test program and found that a lot of time is being spent on HashSet operations in CollectorContext.copyEvaluatedProperties and RefValidator.validate. I'm suspecting the problems were introduced by #714 since it touched exactly those files. I have some screenshots of IntelliJ's profiler results to illustrate this, but not much else unfortunately:


Unfortunately, I can't share the schema or the documents, and crafting a synthetic example would be a lot of work that I'd rather like to avoid.
(Yes, I did see #564, but to be quite honest I didn't want to lump in such a significant regression from one release to the next with a year-old issue, even if they are related.)