-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
greatly improve capabilities of the fuzzer #23416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It's the other way around. You don't need to guarantee atomicity or ordering of the writes with respect to other memory loads and stores. Also, depending on the ordering, atomics don't cause the writes to have side effects.
|
Thanks for clarifying, it makes a lot more sense when thinking about the side effects. I will push a commit reverting this once I figure out the CI failures. |
6c5255e
to
74b4724
Compare
Adds a new fuzz test for zig fmt. This fuzz test checks that Ast.render succeeds for parsed inputs. Additionally, for inputs it knows that Ast.render cannot change the order of, it checks that they are not rewritten. This fuzz test has been very successful. Using ziglang#23416, it has found three bugs (one was a TODO); two of them I have fixed and the other is in ziglang#23754. I have run the test for 650,000,000 iterations (about 2 hours) and haven't found any more bugs. Some functions in the fuzz test have instrumentation disabled because their branches are not interesting to the fuzzer; doing this found a ~40% boost to the runs per second. Additionally, the fuzz test handles tokenization itself since so it can determine if the input can be rewritten.
01b391a
to
6ccee4a
Compare
Adds a new fuzz test for zig fmt. This fuzz test checks that Ast.render succeeds for parsed inputs. Additionally, for inputs it knows that Ast.render cannot change the order of, it checks that they are not rewritten. This fuzz test has been very successful. Using ziglang#23416, it has found three bugs (one was a TODO); two of them I have fixed and the other is in ziglang#23754. I have run the test for 650,000,000 iterations (about 2 hours) and haven't found any more bugs. Some functions in the fuzz test have instrumentation disabled because their branches are not interesting to the fuzzer; doing this found a ~40% boost to the runs per second. Additionally, the fuzz test handles tokenization itself since so it can determine if the input can be rewritten.
6ccee4a
to
bba1afc
Compare
Adds a new fuzz test for zig fmt. This fuzz test checks that Ast.render succeeds for parsed inputs. Additionally, for inputs it knows that Ast.render cannot change the order of, it checks that they are not rewritten. This fuzz test has been very successful. Using ziglang#23416, it has found three bugs (one was a TODO); two of them I have fixed and the other is in ziglang#23754. I have run the test for 650,000,000 iterations (about 2 hours) and haven't found any more bugs. Some functions in the fuzz test have instrumentation disabled because their branches are not interesting to the fuzzer; doing this found a ~40% boost to the runs per second. Additionally, the fuzz test handles tokenization itself since so it can determine if the input can be rewritten.
2c28018
to
a772397
Compare
8971c27
to
8c0283a
Compare
FWIW I built this branch and tested it on my project. It seems to be give 1/3 of the iterations/sec of current master release of zig. It is a fuzz test on this project https://github.com/steelcake/olive I am running it with I'm not super experienced in fuzz testing so it might be something wrong with the test as well. Being able to run all fuzz tests with single command would be a huge improvement for me. Currently not able to do it because of this bug #23738 and this branch should fix it from what I understand (didn't test as need to change the setup for it). I also have other fuzz tests on the olive project and a bunch more in this project https://github.com/steelcake/arrow-zig if it is useful for testing. Didn't have time to test if this branch is able to find bugs faster than the master release |
I just took a brief look at this. It seems your fuzz test is very slow due to it making massive allocations every run (example) which get As for why the new fuzzer is slower for your test, it seems that the inputs it's trying take longer for your fuzz test to run. The actual fuzzer's overhead is less then master's. Also, using FlameGraph on the fuzzer process (possibly with Debug instead of Release) is a very good way to debug performance. |
Makes sense! Thank you. Didn't even realize how slow it was until I tested it. Since I run it on a server normally and don't have access to the webpage. |
f594904
to
1b59c28
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this excellent work, and for diligently keeping it up to date, and for exercising a lot of patience while it sat unreviewed. Really appreciate that.
Loris and I played with this code today also on top of some macOS fuzzing enhancements along with Matthew's debug info enhancements. The speedup in particular is really impressive.
The thing that has given me pause all this time is the change to llvm.zig to enable .rodata load tracing. While that feature might be nice, I think it would be good to evaluate separately because:
- It is in direct conflict with writing a generator ("smith"), which would be the more encouraged fuzzing strategy, especially with a potential enhancement to standard library to add a helper API for converting fuzzing inputs to concrete test inputs. I'll file a related issue on that topic after this review. When a fuzz test has a generator, the load tracing only serves to slow down the fuzzer.
- It increases the dependency on LLVM code instrumentation, which may change, be less efficient than zig-based one (manually emitting different ABI for constant data loads in LLVM backend), and eventually we're going to need to figure out how to enable these passes via clang CLI directly.
Would you be willing to reduce the scope of this PR to remove the load tracing components?
If you're up for that and address the other minor things here, I'd be happy to move forward with this. You can also expect more prompt collaboration from ZSF core team since we're shifting focus to fuzzing in the weeks ahead.
var buf: [256]u8 = undefined; | ||
var fw = f.writer(&buf); | ||
const end = f.getEndPos() catch |e| panic("failed to get fuzzer log file end: {t}", .{e}); | ||
fw.seekTo(end) catch |e| panic("failed to seek to fuzzer log file end: {t}", .{e}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be done on every call to log, and it also will be a bit problematic for multi-threaded fuzzing. Perhaps the changes to logging can be reverted for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, the point of this change is to allow multiple fuzzing processes to share the same log file (that is what the exclusive locking is for). Though, I now realize that I can also cache the file (like before) and use lock
with it, but the seek is still necessary to make sure it is only appended to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it accomplishes that goal because the different processes will still be stepping all over each other. A different strategy will be needed. There are plenty of ways to tackle the problem with different tradeoffs. I think it's better to leave it unchanged for now and solve that problem independently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how they would be stepping over each other, the file is exclusively locked above (which blocks any other processes also trying to exclusively lock the file (basically a mutex)). Sorry for the continued debate, but I want to have a fix in since it's very helpful when debugging the fuzzer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, they won't be stepping over each other because of the exclusive lock. Sorry for not noticing that at first.
Still, I don't think it should be opening, locking, and seeking a file with every log statement. Couldn't each process write to its own log file (perhaps with the pid in the filename)?
I was also confused why this was needed because the changeset does not introduce multiprocessed fuzzing, but I realized that multiple processes are in fact used when there is more than one fuzz test (related: #22900 and #22901). Had to remember that since I've been focusing on other stuff for a while :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a patch ready to push with everything else to remove the opening every time. However, I do not see much benefit in giving each process its own file since logging should be extremely infrequent anyways, and the changes here also identify which test the message belongs to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I think it's fair that it doesn't need to block the PR.
I agree in the idea that rodata tracing should not be used only on the basis of taking program constants. The one concern I have with removing it is that it is also used to determine fresh inputs. However, I will remove it for now and reevaluate later how important this is when used in a smith setting. |
This PR significantly improves the capabilities of the fuzzer. The changes made to the fuzzer to accomplish this feat mostly include tracking memory reads from .rodata to determine fresh inputs, new mutations (especially the ones that insert const values from .rodata reads and __sanitizer_conv_const_cmp), and minimizing found inputs. Additionally, the runs per second has greatly been increased due to generating smaller inputs and avoiding clearing the 8-bit pc counters. An additional feature added is that the length of the input file is now stored and the old input file is rerun upon start. Other changes made to the fuzzer include more logical initialization, using one shared file `in` for inputs, creating corpus files with proper sizes, and using hexadecimal-numbered corpus files for simplicity. Furthermore, I added several new fuzz tests to gauge the fuzzer's efficiency. I also tried to add a test for zstandard decompression, which it crashed within 60,000 runs (less than a second.) Bug fixes include: * Fixed a race conditions when multiple fuzzer processes needed to use the same coverage file. * Web interface stats now update even when unique runs is not changing. * Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns since they are valid whitespace.
This can be re-evaluated at a later time, but at the moment the performance and stability concerns hold it back. Additionally, it promotes a non-smithing approach to fuzz tests.
1b59c28
to
7c6ccca
Compare
Related: #25281 |
This PR significantly improves the capabilities of the fuzzer. For comparison, here is a ten minute head to head between the old and new fuzzer implementations (with newly included fuzz tests):
-- Old --
(note: Unique Runs is highly inflated due of the inefficiency of the old implementation)
-- New --
NOTE: You have to rebuild the compiler due to new fuzzing instrumentation being enabled for memory loads.
The changes made to the fuzzer to accomplish this feat mostly include tracking memory reads from .rodata to determine new runs, new mutations (especially the ones that insert const values from .rodata reads and __sanitizer_conv_const_cmp), and minimizing found inputs. Additionally, the runs per second has greatly been increased due to generating smaller inputs and avoiding clearing the 8-bit pc counters.
An additional feature added is that the length of the input file is now stored and the old input file is rerun upon start, though this does not close #20803 since it does not output the input (though it can be very easily retrieved from the cache directory.)
Other changes made to the fuzzer include a more logical initialization interface, using one shared file
in
for inputs, creating corpus files with proper sizes, and using hexadecimal-numbered corpus files to simplify the code a bit.Additionally, volatile was removed from MemoryMappedList since all that is needed is a guarantee that compiler has done the writes, which is already accomplished with atomic ordering.Furthermore, I added several new fuzz tests to gauge the fuzzer's efficiency. I also tried to add a test for zstandard decompression, which it crashed within 60,000 runs (less than a second.)
Bug fixes include:
Possible Improvements:
@returnAddress
es in determining unique runs.Nevertheless, I feel like the fuzzer is in a viable place to start being useful (as demonstrated with the find in #23413)