greatly improve capabilities of the fuzzer #23416

gooncreeper · 2025-03-31T00:34:58Z

This PR significantly improves the capabilities of the fuzzer. For comparison, here are the results of a ten-minute head-to-head run of the old and new fuzzer implementations (with the newly included fuzz tests):

-- Old --

Total Runs: 49020931
Unique Runs: 1044131 (2.1%)
Speed (Runs/Second): 81696
Coverage: 2069 / 15866 (13.0%)

(note: Unique Runs is highly inflated due to the inefficiency of the old implementation)

-- New --

Total Runs: 537039526
Unique Runs: 1511 (0.0%)
Speed (Runs/Second): 894950
Coverage: 3000 / 15719 (19.1%)

Examples: `while(C)i(){}else|`
          `{y:n()align(b)addrspace`
          `switch(P){else=>`
          `[:l]align(_:r:l)R`
          `(if(b){defer{nosuspend`
          `union(enum(I))`

NOTE: You have to rebuild the compiler due to new fuzzing instrumentation being enabled for memory loads.

The changes made to the fuzzer to accomplish this feat mostly include tracking memory reads from .rodata to determine new runs, new mutations (especially the ones that insert const values from .rodata reads and __sanitizer_cov_trace_const_cmp), and minimizing found inputs. Additionally, the runs per second have increased greatly due to generating smaller inputs and avoiding clearing the 8-bit pc counters.
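As a rough illustration of the constant-inserting mutation mentioned above, here is a minimal sketch. It is written in Python purely for illustration (the PR's code is Zig), and the function name, the `recorded` list, and the sample input are hypothetical rather than taken from the PR. It assumes the fuzzer has already collected constant byte strings that the target compared against or loaded.

```python
import random


def insert_constant(data: bytes, constants: list[bytes], rng: random.Random) -> bytes:
    """Return a copy of `data` with one recorded constant spliced in at a random offset."""
    if not constants:
        return data  # nothing has been recorded yet, so there is nothing to insert
    constant = rng.choice(constants)
    offset = rng.randint(0, len(data))  # inclusive upper bound, so appending is possible
    return data[:offset] + constant + data[offset:]


# Hypothetical constants, as if recorded from comparisons against grammar keywords.
recorded = [b"while", b"else", b"align"]
rng = random.Random(0)
mutated = insert_constant(b"(C){}", recorded, rng)
```

The appeal of this kind of mutation is that a whole keyword arrives in one step instead of having to be discovered byte by byte.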

An additional feature is that the length of the input file is now stored and the old input file is rerun on startup. This does not close #20803, since the input is not output, though it can be retrieved very easily from the cache directory.

Other changes made to the fuzzer include a more logical initialization interface, using one shared file `in` for inputs, creating corpus files with proper sizes, and using hexadecimal-numbered corpus files to simplify the code a bit. ~~Additionally, volatile was removed from MemoryMappedList since all that is needed is a guarantee that the compiler has done the writes, which is already accomplished with atomic ordering.~~

Furthermore, I added several new fuzz tests to gauge the fuzzer's efficiency. I also tried adding a test for zstandard decompression, which the fuzzer crashed within 60,000 runs (less than a second).

Bug fixes include:

* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns since they are valid whitespace.
* Supersedes #23180 (fuzzer: don't remove or modify byte of empty input).
* Fixed a race condition when multiple fuzzer processes needed to use the same coverage file. Closes #23738 (Fuzzer crash on first run with many fuzzing tests).

Possible Improvements:

* Remove the 8-bit pc counting code in favor of a call to a sanitizer function that updates a flag if a new pc hit happened (similar to how the __sanitizer_cov_load functions already operate).
* A less basic input minimization function. It could also try splitting inputs in two between each byte to see if both halves hit the same pcs. This is useful as smaller inputs are usually much more efficient.
* Deterministic mutations when a new input is found.
* Culling corpus inputs that are redundant because smaller inputs already hit their pcs and memory addresses.
* Applying multiple mutations during dry spells.
* Prioritizing some corpus inputs.
* Creating a list of the most successful input splices (which would likely contain grammar keywords) and creating a custom mutation for adding them.
* Removing some less-efficient mutations.
* Storing effective mutations on disk for the benefit of future runs.
* Counting __sanitizer_cov @returnAddresses in determining unique runs.
* Optimizing the __sanitizer_cov_trace_const_cmp methods (the use of an ArrayHashMap is not too fast).
* Processor affinity.
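The corpus-culling idea in the list above could look roughly like the following sketch. It is Python for illustration only, not code from the PR; `cull_corpus`, the feature numbers, and the sample inputs are all hypothetical. It assumes each corpus input has a known set of "features" (pcs and memory addresses it hits), represented here as integers.

```python
def cull_corpus(corpus: dict[bytes, set[int]]) -> list[bytes]:
    """Keep inputs, smallest first, that hit at least one feature not yet covered."""
    covered: set[int] = set()
    kept: list[bytes] = []
    # Visit smaller inputs first so that they win over larger ones with the same features.
    for data in sorted(corpus, key=lambda d: (len(d), d)):
        features = corpus[data]
        if not features <= covered:  # this input contributes something new
            kept.append(data)
            covered |= features
    return kept


# Hypothetical corpus: each input maps to the set of features it hits.
corpus = {
    b"x": {1},
    b"if": {1, 2},
    b"if(b)": {1, 2, 3},
    b"if(b){}": {1, 2, 3},
}
kept = cull_corpus(corpus)
```

This greedy pass is cheap but not minimal: in the example, `b"x"` is kept even though `b"if"` later covers its only feature, so a second pass or a proper set-cover heuristic would be needed to remove such entries.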

Nevertheless, I feel like the fuzzer is in a viable place to start being useful (as demonstrated by the find in #23413).

andrewrk · 2025-03-31T06:58:55Z

> Additionally, volatile was removed from MemoryMappedList since all that is needed is a guarantee that compiler has done the writes, which is already accomplished with atomic ordering.

It's the other way around. You don't need to guarantee atomicity or ordering of the writes with respect to other memory loads and stores. Also, depending on the ordering, atomics don't cause the writes to have side effects.

volatile on the other hand is exactly the right tool for the job. It only causes the volatile writes to be considered to have side effects, and to be ordered with respect to each other, which is exactly the required semantics.

gooncreeper · 2025-03-31T15:08:11Z

> It's the other way around. You don't need to guarantee atomicity or ordering of the writes with respect to other memory loads and stores. Also, depending on the ordering, atomics don't cause the writes to have side effects.
>
> volatile on the other hand is exactly the right tool for the job. It only causes the volatile writes to be considered to have side effects, and to be ordered with respect to each other, which is exactly the required semantics.

Thanks for clarifying, it makes a lot more sense when thinking about the side effects. I will push a commit reverting this once I figure out the CI failures.

Adds a new fuzz test for zig fmt. This fuzz test checks that Ast.render succeeds for parsed inputs. Additionally, for inputs whose order it knows Ast.render cannot change, it checks that they are not rewritten. This fuzz test has been very successful. Using ziglang#23416, it has found three bugs (one was a TODO); two of them I have fixed and the other is in ziglang#23754. I have run the test for 650,000,000 iterations (about 2 hours) and haven't found any more bugs. Some functions in the fuzz test have instrumentation disabled because their branches are not interesting to the fuzzer; doing this gave a ~40% boost to the runs per second. Additionally, the fuzz test handles tokenization itself so that it can determine whether the input can be rewritten.

ozgrakkurt · 2025-08-16T06:26:01Z

FWIW I built this branch and tested it on my project.

It seems to give 1/3 of the iterations/sec of the current master release of zig.

It is a fuzz test on this project https://github.com/steelcake/olive

I am running it with `zig build fuzz_roundtrip --fuzz -Doptimize=ReleaseSafe`

I'm not super experienced in fuzz testing so it might be something wrong with the test as well.

Being able to run all fuzz tests with a single command would be a huge improvement for me. I am currently not able to do it because of this bug #23738, and this branch should fix it from what I understand (I didn't test this as I need to change the setup for it).

I also have other fuzz tests on the olive project and a bunch more in this project https://github.com/steelcake/arrow-zig if it is useful for testing.

I didn't have time to test whether this branch is able to find bugs faster than the master release.

gooncreeper · 2025-08-16T15:02:45Z

> FWIW I built this branch and tested it on my project.
>
> It seems to give 1/3 of the iterations/sec of the current master release of zig.

I just took a brief look at this. It seems your fuzz test is very slow because it makes massive allocations every run (example), which get @memset(undefined) by the allocator each time. I don't know how your library works, though it would usually be possible to use a small static buffer with a fixed buffer allocator (static, since calling into DebugAllocator each iteration is slow).

As for why the new fuzzer is slower for your test, it seems that the inputs it's trying take longer for your fuzz test to run. The actual fuzzer's overhead is less than master's.

Also, using FlameGraph on the fuzzer process (possibly with Debug instead of Release) is a very good way to debug performance.

ozgrakkurt · 2025-08-17T01:54:26Z

> > FWIW I built this branch and tested it on my project.
> > It seems to give 1/3 of the iterations/sec of the current master release of zig.
>
> I just took a brief look at this. It seems your fuzz test is very slow because it makes massive allocations every run (example), which get @memset(undefined) by the allocator each time. I don't know how your library works, though it would usually be possible to use a small static buffer with a fixed buffer allocator (static, since calling into DebugAllocator each iteration is slow).
>
> As for why the new fuzzer is slower for your test, it seems that the inputs it's trying take longer for your fuzz test to run. The actual fuzzer's overhead is less than master's.
>
> Also, using FlameGraph on the fuzzer process (possibly with Debug instead of Release) is a very good way to debug performance.

Makes sense! Thank you. I didn't even realize how slow it was until I tested it, since I normally run it on a server and don't have access to the webpage.

andrewrk

Thank you for this excellent work, and for diligently keeping it up to date, and for exercising a lot of patience while it sat unreviewed. Really appreciate that.

Loris and I played with this code today also on top of some macOS fuzzing enhancements along with Matthew's debug info enhancements. The speedup in particular is really impressive.

The thing that has given me pause all this time is the change to llvm.zig to enable .rodata load tracing. While that feature might be nice, I think it would be good to evaluate separately because:

* It is in direct conflict with writing a generator ("smith"), which would be the more encouraged fuzzing strategy, especially with a potential enhancement to the standard library to add a helper API for converting fuzzing inputs to concrete test inputs. I'll file a related issue on that topic after this review. When a fuzz test has a generator, the load tracing only serves to slow down the fuzzer.
* It increases the dependency on LLVM code instrumentation, which may change and may be less efficient than a Zig-based one (manually emitting a different ABI for constant data loads in the LLVM backend), and eventually we're going to need to figure out how to enable these passes via the clang CLI directly.

Would you be willing to reduce the scope of this PR to remove the load tracing components?

If you're up for that and address the other minor things here, I'd be happy to move forward with this. You can also expect more prompt collaboration from ZSF core team since we're shifting focus to fuzzing in the weeks ahead.

lib/compiler/test_runner.zig

andrewrk · 2025-09-18T20:44:09Z

lib/fuzzer.zig

+    var buf: [256]u8 = undefined;
+    var fw = f.writer(&buf);
+    const end = f.getEndPos() catch |e| panic("failed to get fuzzer log file end: {t}", .{e});
+    fw.seekTo(end) catch |e| panic("failed to seek to fuzzer log file end: {t}", .{e});


I don't think this should be done on every call to log, and it will also be a bit problematic for multi-threaded fuzzing. Perhaps the changes to logging can be reverted for now?

gooncreeper · 2025-09-18

Well, the point of this change is to allow multiple fuzzing processes to share the same log file (that is what the exclusive locking is for). I now realize that I can also cache the file (like before) and lock it, but the seek is still necessary to make sure the file is only appended to.

andrewrk · 2025-09-18

I don't think it accomplishes that goal because the different processes will still be stepping all over each other. A different strategy will be needed. There are plenty of ways to tackle the problem with different tradeoffs. I think it's better to leave it unchanged for now and solve that problem independently.

gooncreeper · 2025-09-18

I don't see how they would be stepping over each other: the file is exclusively locked above, which blocks any other process that also tries to lock it exclusively (basically a mutex). Sorry for the continued debate, but I want to have a fix in since it's very helpful when debugging the fuzzer.

andrewrk · 2025-09-18

I see, they won't be stepping over each other because of the exclusive lock. Sorry for not noticing that at first.

Still, I don't think it should be opening, locking, and seeking a file with every log statement. Couldn't each process write to its own log file (perhaps with the pid in the filename)?

I was also confused why this was needed because the changeset does not introduce multiprocessed fuzzing, but I realized that multiple processes are in fact used when there is more than one fuzz test (related: #22900 and #22901). Had to remember that since I've been focusing on other stuff for a while :-)

gooncreeper · 2025-09-18

I have a patch ready to push with everything else to remove the opening every time. However, I do not see much benefit in giving each process its own file since logging should be extremely infrequent anyway, and the changes here also identify which test the message belongs to.

andrewrk · 2025-09-18

Well, I think it's fair that it doesn't need to block the PR.
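The lock-then-seek pattern debated in this thread can be sketched as follows. This is Python for illustration only (the PR's code is Zig), it is POSIX-only because it relies on `fcntl.flock`, and the `SharedLog` class, the test names, and the log path are hypothetical. The file is opened once and cached, deliberately without `O_APPEND`, so the explicit seek to the end is what keeps writers from overwriting each other.

```python
import fcntl
import os
import tempfile


class SharedLog:
    """Log file that several processes may append to (POSIX only)."""

    def __init__(self, path: str) -> None:
        # Opened once and cached; no O_APPEND, so the seek in log() is required.
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)

    def log(self, test_name: str, message: str) -> None:
        line = f"[{test_name}] {message}\n".encode()
        fcntl.flock(self._fd, fcntl.LOCK_EX)  # blocks while another holder has the lock
        try:
            os.lseek(self._fd, 0, os.SEEK_END)  # someone else may have grown the file
            os.write(self._fd, line)
        finally:
            fcntl.flock(self._fd, fcntl.LOCK_UN)

    def close(self) -> None:
        os.close(self._fd)


# Two independent handles stand in for two fuzzer processes.
log_path = os.path.join(tempfile.mkdtemp(), "fuzzer.log")
first, second = SharedLog(log_path), SharedLog(log_path)
first.log("tokenizer", "started")
second.log("parser", "started")
first.log("tokenizer", "new input found")
first.close()
second.close()
with open(log_path) as f:
    lines = f.read().splitlines()
```

Without the seek, the third write would land at the first handle's stale offset and overwrite the line written through the second handle. Tagging each line with the test name is what lets one shared file stay readable.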

gooncreeper · 2025-09-18T21:45:27Z

I agree with the idea that rodata tracing should not be used only on the basis of taking program constants. The one concern I have with removing it is that it is also used to determine fresh inputs. However, I will remove it for now and reevaluate later how important this is when used in a smith setting.

This PR significantly improves the capabilities of the fuzzer.

The changes made to the fuzzer to accomplish this feat mostly include tracking memory reads from .rodata to determine fresh inputs, new mutations (especially the ones that insert const values from .rodata reads and __sanitizer_cov_trace_const_cmp), and minimizing found inputs. Additionally, the runs per second have increased greatly due to generating smaller inputs and avoiding clearing the 8-bit pc counters.

An additional feature added is that the length of the input file is now stored and the old input file is rerun upon start.

Other changes made to the fuzzer include more logical initialization, using one shared file `in` for inputs, creating corpus files with proper sizes, and using hexadecimal-numbered corpus files for simplicity.

Furthermore, I added several new fuzz tests to gauge the fuzzer's efficiency. I also tried to add a test for zstandard decompression, which the fuzzer crashed within 60,000 runs (less than a second).

Bug fixes include:

* Fixed a race condition when multiple fuzzer processes needed to use the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns since they are valid whitespace.

andrewrk · 2025-09-19T00:15:59Z

Related: #25281

gooncreeper force-pushed the improved-fuzzer branch 2 times, most recently from 6c5255e to 74b4724 Compare May 1, 2025 23:10

gooncreeper mentioned this pull request May 4, 2025: add a fuzz test for zig fmt #23793 (Open)

gooncreeper force-pushed the improved-fuzzer branch 2 times, most recently from 01b391a to 6ccee4a Compare July 10, 2025 19:54

gooncreeper force-pushed the improved-fuzzer branch from 6ccee4a to bba1afc Compare July 12, 2025 16:41

gooncreeper force-pushed the improved-fuzzer branch 3 times, most recently from 2c28018 to a772397 Compare July 20, 2025 20:21

gooncreeper force-pushed the improved-fuzzer branch 2 times, most recently from 8971c27 to 8c0283a Compare August 2, 2025 02:18

gooncreeper force-pushed the improved-fuzzer branch 2 times, most recently from f594904 to 1b59c28 Compare September 13, 2025 13:31

andrewrk requested changes Sep 18, 2025

gooncreeper added 3 commits September 18, 2025 18:56

add some new fuzz tests

b905c65

fuzzer: remove rodata load tracing

7c6ccca

This can be re-evaluated at a later time, but at the moment the performance and stability concerns hold it back. Additionally, it promotes a non-smithing approach to fuzz tests.

gooncreeper force-pushed the improved-fuzzer branch from 1b59c28 to 7c6ccca Compare September 18, 2025 22:57

andrewrk added the `release notes` (This PR should be mentioned in the release notes.) and `fuzzing` labels Sep 19, 2025

andrewrk enabled auto-merge September 19, 2025 05:04

andrewrk merged commit 164c598 into ziglang:master Sep 19, 2025
18 checks passed

AdamGoertz mentioned this pull request Sep 19, 2025: Add length range to FuzzInputOptions #20914 (Closed)

Atomk mentioned this pull request Sep 21, 2025: write fuzz inputs to a shared memory region before running a task #20803 (Open)

McSinyx mentioned this pull request Sep 22, 2025: fuzzer: don't remove or modify byte of empty input #23180 (Closed)
