Skip to content

Conversation

gooncreeper
Copy link
Contributor

@gooncreeper gooncreeper commented Mar 31, 2025

This PR significantly improves the capabilities of the fuzzer. For comparison, here is a ten minute head to head between the old and new fuzzer implementations (with newly included fuzz tests):

-- Old --

Total Runs: 49020931
Unique Runs: 1044131 (2.1%)
Speed (Runs/Second): 81696
Coverage: 2069 / 15866 (13.0%)

(note: Unique Runs is highly inflated due of the inefficiency of the old implementation)

-- New --

Total Runs: 537039526
Unique Runs: 1511 (0.0%)
Speed (Runs/Second): 894950
Coverage: 3000 / 15719 (19.1%)
Examples: `while(C)i(){}else|`
          `{y:n()align(b)addrspace`
          `switch(P){else=>`
          `[:l]align(_:r:l)R`
          `(if(b){defer{nosuspend`
          `union(enum(I))`

NOTE: You have to rebuild the compiler due to new fuzzing instrumentation being enabled for memory loads.

The changes made to the fuzzer to accomplish this feat mostly include tracking memory reads from .rodata to determine new runs, new mutations (especially the ones that insert const values from .rodata reads and __sanitizer_conv_const_cmp), and minimizing found inputs. Additionally, the runs per second has greatly been increased due to generating smaller inputs and avoiding clearing the 8-bit pc counters.

An additional feature added is that the length of the input file is now stored and the old input file is rerun upon start, though this does not close #20803 since it does not output the input (though it can be very easily retrieved from the cache directory.)

Other changes made to the fuzzer include a more logical initialization interface, using one shared file in for inputs, creating corpus files with proper sizes, and using hexadecimal-numbered corpus files to simplify the code a bit. Additionally, volatile was removed from MemoryMappedList since all that is needed is a guarantee that compiler has done the writes, which is already accomplished with atomic ordering.

Furthermore, I added several new fuzz tests to gauge the fuzzer's efficiency. I also tried to add a test for zstandard decompression, which it crashed within 60,000 runs (less than a second.)

Bug fixes include:

Possible Improvements:

  • Remove the 8-bit pc counting code prefer a call to a sanitizer function that updates a flag if a new pc hit happened (similar to how the __sanitizer_cov_load functions already operate).
  • Less basic input minimization function. It could also try splitting inputs into two between each byte to see if they both hit the same pcs. This is useful as smaller inputs are usually much more efficient.
  • Deterministic mutations when a new input is found.
  • Culling out corpus inputs that are redundant due to smaller inputs already hitting their pcs and memory addresses.
  • Applying multiple mutations during dry spells.
  • Prioritizing some corpus inputs.
  • Creating a list of the most successful input splices (which would likely contain grammar keywords) and creating a custom mutation for adding them.
  • Removing some less-efficient mutations.
  • Store effective mutations to the disk for the benefit of future runs.
  • Counting __sanitizer_cov @returnAddresses in determining unique runs.
  • Optimize __sanitizer_cov_trace_const_cmp methods (the use of an ArrayHashMap is not too fast).
  • Processor affinity

Nevertheless, I feel like the fuzzer is in a viable place to start being useful (as demonstrated with the find in #23413)

@andrewrk
Copy link
Member

Additionally, volatile was removed from MemoryMappedList since all that is needed is a guarantee that compiler has done the writes, which is already accomplished with atomic ordering.

It's the other way around. You don't need to guarantee atomicity or ordering of the writes with respect to other memory loads and stores. Also, depending on the ordering, atomics don't cause the writes to have side effects.

volatile on the other hand is exactly the right tool for the job. It only causes the volatile writes to be considered to have side effects, and to be ordered with respect to each other, which is exactly the required semantics.

@gooncreeper
Copy link
Contributor Author

It's the other way around. You don't need to guarantee atomicity or ordering of the writes with respect to other memory loads and stores. Also, depending on the ordering, atomics don't cause the writes to have side effects.

volatile on the other hand is exactly the right tool for the job. It only causes the volatile writes to be considered to have side effects, and to be ordered with respect to each other, which is exactly the required semantics.

Thanks for clarifying, it makes a lot more sense when thinking about the side effects. I will push a commit reverting this once I figure out the CI failures.

@gooncreeper gooncreeper force-pushed the improved-fuzzer branch 2 times, most recently from 6c5255e to 74b4724 Compare May 1, 2025 23:10
gooncreeper added a commit to gooncreeper/zig that referenced this pull request May 4, 2025
Adds a new fuzz test for zig fmt. This fuzz test checks that Ast.render
succeeds for parsed inputs. Additionally, for inputs it knows that
Ast.render cannot change the order of, it checks that they are not
rewritten.

This fuzz test has been very successful. Using ziglang#23416, it has found
three bugs (one was a TODO); two of them I have fixed and the other is
in ziglang#23754. I have run the test for 650,000,000 iterations (about 2
hours) and haven't found any more bugs.

Some functions in the fuzz test have instrumentation disabled because
their branches are not interesting to the fuzzer; doing this found a
~40% boost to the runs per second. Additionally, the fuzz test handles
tokenization itself since so it can determine if the input can be
rewritten.
@gooncreeper gooncreeper force-pushed the improved-fuzzer branch 2 times, most recently from 01b391a to 6ccee4a Compare July 10, 2025 19:54
gooncreeper added a commit to gooncreeper/zig that referenced this pull request Jul 11, 2025
Adds a new fuzz test for zig fmt. This fuzz test checks that Ast.render
succeeds for parsed inputs. Additionally, for inputs it knows that
Ast.render cannot change the order of, it checks that they are not
rewritten.

This fuzz test has been very successful. Using ziglang#23416, it has found
three bugs (one was a TODO); two of them I have fixed and the other is
in ziglang#23754. I have run the test for 650,000,000 iterations (about 2
hours) and haven't found any more bugs.

Some functions in the fuzz test have instrumentation disabled because
their branches are not interesting to the fuzzer; doing this found a
~40% boost to the runs per second. Additionally, the fuzz test handles
tokenization itself since so it can determine if the input can be
rewritten.
gooncreeper added a commit to gooncreeper/zig that referenced this pull request Jul 16, 2025
Adds a new fuzz test for zig fmt. This fuzz test checks that Ast.render
succeeds for parsed inputs. Additionally, for inputs it knows that
Ast.render cannot change the order of, it checks that they are not
rewritten.

This fuzz test has been very successful. Using ziglang#23416, it has found
three bugs (one was a TODO); two of them I have fixed and the other is
in ziglang#23754. I have run the test for 650,000,000 iterations (about 2
hours) and haven't found any more bugs.

Some functions in the fuzz test have instrumentation disabled because
their branches are not interesting to the fuzzer; doing this found a
~40% boost to the runs per second. Additionally, the fuzz test handles
tokenization itself since so it can determine if the input can be
rewritten.
@gooncreeper gooncreeper force-pushed the improved-fuzzer branch 3 times, most recently from 2c28018 to a772397 Compare July 20, 2025 20:21
@gooncreeper gooncreeper force-pushed the improved-fuzzer branch 2 times, most recently from 8971c27 to 8c0283a Compare August 2, 2025 02:18
@ozgrakkurt
Copy link
Contributor

FWIW I built this branch and tested it on my project.

It seems to be give 1/3 of the iterations/sec of current master release of zig.

It is a fuzz test on this project https://github.com/steelcake/olive

I am running it with zig build fuzz_roundtrip --fuzz -Doptimize=ReleaseSafe

I'm not super experienced in fuzz testing so it might be something wrong with the test as well.

Being able to run all fuzz tests with single command would be a huge improvement for me. Currently not able to do it because of this bug #23738 and this branch should fix it from what I understand (didn't test as need to change the setup for it).

I also have other fuzz tests on the olive project and a bunch more in this project https://github.com/steelcake/arrow-zig if it is useful for testing.

Didn't have time to test if this branch is able to find bugs faster than the master release

@gooncreeper
Copy link
Contributor Author

FWIW I built this branch and tested it on my project.

It seems to be give 1/3 of the iterations/sec of current master release of zig.

I just took a brief look at this. It seems your fuzz test is very slow due to it making massive allocations every run (example) which get @memset(undefined) by the allocator each time. I don't know how your library works, though usually using a small static (since calling into DebugAllocator each iteration is slow) buffer with a fixed buffer allocator would be possible.

As for why the new fuzzer is slower for your test, it seems that the inputs it's trying take longer for your fuzz test to run. The actual fuzzer's overhead is less then master's.

Also, using FlameGraph on the fuzzer process (possibly with Debug instead of Release) is a very good way to debug performance.

@ozgrakkurt
Copy link
Contributor

FWIW I built this branch and tested it on my project.
It seems to be give 1/3 of the iterations/sec of current master release of zig.

I just took a brief look at this. It seems your fuzz test is very slow due to it making massive allocations every run (example) which get @memset(undefined) by the allocator each time. I don't know how your library works, though usually using a small static (since calling into DebugAllocator each iteration is slow) buffer with a fixed buffer allocator would be possible.

As for why the new fuzzer is slower for your test, it seems that the inputs it's trying take longer for your fuzz test to run. The actual fuzzer's overhead is less then master's.

Also, using FlameGraph on the fuzzer process (possibly with Debug instead of Release) is a very good way to debug performance.

Makes sense! Thank you. Didn't even realize how slow it was until I tested it. Since I run it on a server normally and don't have access to the webpage.

@gooncreeper gooncreeper force-pushed the improved-fuzzer branch 2 times, most recently from f594904 to 1b59c28 Compare September 13, 2025 13:31
Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this excellent work, and for diligently keeping it up to date, and for exercising a lot of patience while it sat unreviewed. Really appreciate that.

Loris and I played with this code today also on top of some macOS fuzzing enhancements along with Matthew's debug info enhancements. The speedup in particular is really impressive.

The thing that has given me pause all this time is the change to llvm.zig to enable .rodata load tracing. While that feature might be nice, I think it would be good to evaluate separately because:

  • It is in direct conflict with writing a generator ("smith"), which would be the more encouraged fuzzing strategy, especially with a potential enhancement to standard library to add a helper API for converting fuzzing inputs to concrete test inputs. I'll file a related issue on that topic after this review. When a fuzz test has a generator, the load tracing only serves to slow down the fuzzer.
  • It increases the dependency on LLVM code instrumentation, which may change, be less efficient than zig-based one (manually emitting different ABI for constant data loads in LLVM backend), and eventually we're going to need to figure out how to enable these passes via clang CLI directly.

Would you be willing to reduce the scope of this PR to remove the load tracing components?

If you're up for that and address the other minor things here, I'd be happy to move forward with this. You can also expect more prompt collaboration from ZSF core team since we're shifting focus to fuzzing in the weeks ahead.

Comment on lines +30 to +29
var buf: [256]u8 = undefined;
var fw = f.writer(&buf);
const end = f.getEndPos() catch |e| panic("failed to get fuzzer log file end: {t}", .{e});
fw.seekTo(end) catch |e| panic("failed to seek to fuzzer log file end: {t}", .{e});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be done on every call to log, and it also will be a bit problematic for multi-threaded fuzzing. Perhaps the changes to logging can be reverted for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, the point of this change is to allow multiple fuzzing processes to share the same log file (that is what the exclusive locking is for). Though, I now realize that I can also cache the file (like before) and use lock with it, but the seek is still necessary to make sure it is only appended to.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it accomplishes that goal because the different processes will still be stepping all over each other. A different strategy will be needed. There are plenty of ways to tackle the problem with different tradeoffs. I think it's better to leave it unchanged for now and solve that problem independently.

Copy link
Contributor Author

@gooncreeper gooncreeper Sep 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how they would be stepping over each other, the file is exclusively locked above (which blocks any other processes also trying to exclusively lock the file (basically a mutex)). Sorry for the continued debate, but I want to have a fix in since it's very helpful when debugging the fuzzer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, they won't be stepping over each other because of the exclusive lock. Sorry for not noticing that at first.

Still, I don't think it should be opening, locking, and seeking a file with every log statement. Couldn't each process write to its own log file (perhaps with the pid in the filename)?

I was also confused why this was needed because the changeset does not introduce multiprocessed fuzzing, but I realized that multiple processes are in fact used when there is more than one fuzz test (related: #22900 and #22901). Had to remember that since I've been focusing on other stuff for a while :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a patch ready to push with everything else to remove the opening every time. However, I do not see much benefit in giving each process its own file since logging should be extremely infrequent anyways, and the changes here also identify which test the message belongs to.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think it's fair that it doesn't need to block the PR.

@gooncreeper
Copy link
Contributor Author

gooncreeper commented Sep 18, 2025

I agree in the idea that rodata tracing should not be used only on the basis of taking program constants. The one concern I have with removing it is that it is also used to determine fresh inputs. However, I will remove it for now and reevaluate later how important this is when used in a smith setting.

This PR significantly improves the capabilities of the fuzzer.

The changes made to the fuzzer to accomplish this feat mostly include
tracking memory reads from .rodata to determine fresh inputs, new
mutations (especially the ones that insert const values from .rodata
reads and __sanitizer_conv_const_cmp), and minimizing found inputs.
Additionally, the runs per second has greatly been increased due to
generating smaller inputs and avoiding clearing the 8-bit pc counters.

An additional feature added is that the length of the input file is now
stored and the old input file is rerun upon start.

Other changes made to the fuzzer include more logical initialization,
using one shared file `in` for inputs, creating corpus files with
proper sizes, and using hexadecimal-numbered corpus files for
simplicity.

Furthermore, I added several new fuzz tests to gauge the fuzzer's
efficiency. I also tried to add a test for zstandard decompression,
which it crashed within 60,000 runs (less than a second.)

Bug fixes include:
* Fixed a race conditions when multiple fuzzer processes needed to use
the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns
since they are valid whitespace.
This can be re-evaluated at a later time, but at the moment the
performance and stability concerns hold it back. Additionally, it
promotes a non-smithing approach to fuzz tests.
@andrewrk
Copy link
Member

Related: #25281

@andrewrk andrewrk added release notes This PR should be mentioned in the release notes. fuzzing labels Sep 19, 2025
@andrewrk andrewrk enabled auto-merge September 19, 2025 05:04
@andrewrk andrewrk merged commit 164c598 into ziglang:master Sep 19, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fuzzing release notes This PR should be mentioned in the release notes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

write fuzz inputs to a shared memory region before running a task Fuzzer crash on first run with many fuzzing tests

3 participants