Fix check for empty monitor directory when setting starting counter, which leads to off-by-POLL error in monitor times #179
The default workdir location is tools/captain/workdir, which can grow quite large and result in large contexts that Docker is unable to handle. Given that there is nothing in the tools/ directory that the fuzzing containers need, we can just ignore it.
ddfuzz: added ddfuzz fuzzer
Merge dev with 1.2
k-scheduler: ensure canaries are compiled in
k-scheduler: bug fixes to work with targets with 'non-wrapper' drivers
Update latest from dev
Hi Caroline! 👋 Nice finding, thanks for the PR and the detailed report. I agree this does look like an issue. Do you think the following is a "better" (for some definition of better!) fix?

    if [ -z "$(ls $MONITOR)" ]; then
        counter=0
    else
        polls=("$MONITOR"/*)
        timestamps=($(sort -n < <(basename -a "${polls[@]}")))
        # ...
    fi

I agree that a check for directory contents is probably cleaner. Then, if that passes we can be assured that What do you think?
Hi Adrian :)! Thanks for getting back to me. Yeah, I think the fix you've proposed is cleaner; checking that the expansion fails in the particular way I checked seems brittle. And polls is only used in the else branch, so it's ok to push its definition into the else branch. It's also more understandable, as the if condition no longer relies on knowledge of bash arrays. Unless there's some system on which I've applied your suggestion in a second commit.
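It appears that f35ace4 previously fixed this in dev by using

    polls=($(ls ${MONITOR}))
    if [ ${#polls[@]} -eq 0 ]; then

Thus the merge conflict. Do you have a preference on which version to maintain?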
Hi Caroline,
Yeah I did notice the merge conflict when I changed the target branch over
to dev. Sorry, I probably should have checked the dev branch first, but
unfortunately magma isn’t really maintained full-time anymore.
I think I still prefer the fix we came up with, so let’s go with that. We
should probably look to make a new release as well; the v1.2 branch is
pretty old now.
On Tue, 16 Sep 2025 at 10:44 am, Caroline Lemieux wrote:
> carolemieux left a comment (HexHive/magma#179)
>
> It appears that f35ace4 previously fixed this in dev by using
>
>     polls=($(ls ${MONITOR}))
>     if [ ${#polls[@]} -eq 0 ]; then
>
> Thus the merge conflict.
> Do you have a preference on which version to maintain?
Sounds good, adopted our new version in the merge. Bumping the version makes sense, as the observed behaviour will change with this change to run.sh.
The off-by-POLL error

While investigating the creation/modification times for files in the Magma-generated monitor/ directory, I noticed that the modification times appeared to be POLL seconds off the names of the files in the monitor directories. I.e., the file named X was actually created X-POLL seconds after the start of the fuzzing run, rather than X seconds after the start of the fuzzing run.

A clearer witness of the issue is that the earliest file in monitor/ is named 5 (or whatever POLL is) rather than 0.

What this means is that applying exp2json.py on experiment directories leads to reporting times to reach/trigger bugs that are 1 poll later than they actually occur. I suspect this is unlikely to have any significant empirical effect (if fuzzer A reaches in 500s and fuzzer B in 1005s, that the times are actually 495s and 1000s doesn't really change any conclusions), but to the best of my understanding this is an error. This PR proposes a fix.

I suppose it's possible at this point that this fix might affect users of Magma who are used to things starting at POLL rather than 0, and thus you may not want to apply it.

Root causing and fix
The culprit appears to be this code in magma/run.sh:

If $MONITOR is an empty directory, polls=("$MONITOR"/*) will not be an array of size 0, but rather an array containing "$MONITOR/*" (at least, on Ubuntu). So this falls into the else case; since the directory is empty, the last timestamp doesn't usually end up being a number, but e.g. a variable name which is not defined, which then becomes empty, setting counter to start at POLL.

off_by_one_error.sh, attached, isolates this particular bit of code.

./off_by_one_error.sh creates tmp_monitor_dir/monitor and either leaves the directory empty (if using argument 1) or puts some files in (if using argument 0). It then runs through the problematic code to demonstrate the unexpected behaviour.

Here is what the code does with a non-empty monitor directory:
This behaviour seems reasonable. There could still be an off-by-one issue here if we write 20 to be empty, and put the result of the first poll in 25, but I haven't thought about that case.
The bug shows up more clearly with an empty directory:
The core issue is this line of code:
as polls is literally an array containing the element 'monitor/*', not an empty array. (see above)
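The glob behaviour described here can be reproduced in isolation. The following is a minimal sketch, not code from run.sh: count_glob_entries and the scratch directory are hypothetical names introduced for illustration. It assumes bash, where an unmatched pattern is kept as a literal string unless the nullglob option is set.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report how many elements a glob over DIR expands to.
count_glob_entries() {
    local dir=$1
    local entries=("$dir"/*)   # unmatched pattern stays literal unless nullglob is on
    echo "${#entries[@]}"
}

scratch=$(mktemp -d)           # throwaway location, stands in for the real workdir
mkdir "$scratch/monitor"       # empty monitor directory

shopt -u nullglob              # bash default
echo "default:  $(count_glob_entries "$scratch/monitor") element(s)"

shopt -s nullglob              # unmatched patterns now expand to nothing
echo "nullglob: $(count_glob_entries "$scratch/monitor") element(s)"

shopt -u nullglob              # restore the default
rm -rf "$scratch"
```

With default options the empty directory yields one element (the literal pattern); with nullglob it yields zero. Whether enabling nullglob is appropriate inside run.sh is a separate question, since the option changes every later glob in the script.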
My suggested fix:
This checks that the polls array contains only the literal "$MONITOR"'/*', i.e. 'monitor/*'. This will lead to the counter being started at 0 if literally the file '*' exists in the $MONITOR folder, which I suppose is incorrect.
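To make the caveat about a file named '*' concrete, here is a rough sketch of a literal-pattern check. It is an illustration of the idea only, not necessarily the exact code in this PR: looks_empty and the scratch directories are hypothetical, and the check assumes nullglob is off (the bash default).

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the glob over DIR came back as the
# literal, unexpanded pattern. Assumes nullglob is off.
looks_empty() {
    local dir=$1
    local polls=("$dir"/*)
    [ "${#polls[@]}" -eq 1 ] && [ "${polls[0]}" = "$dir/*" ]
}

scratch=$(mktemp -d)
mkdir "$scratch/empty" "$scratch/polled" "$scratch/star"
touch "$scratch/polled/0" "$scratch/polled/5"
touch "$scratch/star/*"        # a file literally named '*'

for d in empty polled star; do
    if looks_empty "$scratch/$d"; then
        echo "$d: treated as empty"
    else
        echo "$d: treated as non-empty"
    fi
done
rm -rf "$scratch"
```

The star directory is reported as empty even though it holds one file, which is the false positive mentioned above. Monitor dumps are named after numeric timestamps, so this case should not arise in practice, but it shows why the check is brittle.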
You can check out the fix in action on the smaller shell script here: off_by_one_error_fix.sh

This pull request applies this fix to run.sh.

This was the most minimal fix to me, but it is somewhat inelegant to check for this literal file to check if a directory is empty. Perhaps we could use ls to check for the number of files in monitor, and set counter to 0 in that case. Then we could expand polls only in the else branch, where we will use that content.

And, in writing this PR, I noted that it's possible the resume behaviour also has an off-by-one error, but I'm not sure about this.
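For reference, the ls-based alternative could look roughly like the sketch below. starting_counter is a hypothetical wrapper and not a function in run.sh, and the else branch assumes that a resumed run should continue one POLL after the newest existing dump, which is my reading of the current behaviour rather than something confirmed here.

```bash
#!/usr/bin/env bash
# starting_counter DIR POLL: hypothetical helper, not part of run.sh.
starting_counter() {
    local monitor=$1 poll=$2
    if [ -z "$(ls "$monitor")" ]; then
        # No dumps yet: the first poll should be named 0.
        echo 0
    else
        # The glob is only expanded once we know it will match something.
        local polls=("$monitor"/*)
        local timestamps=($(basename -a "${polls[@]}" | sort -n))
        local last=${timestamps[${#timestamps[@]}-1]}
        # Assumption: resume one poll after the newest dump.
        echo $(( last + poll ))
    fi
}

scratch=$(mktemp -d)           # throwaway location for the demo
mkdir "$scratch/monitor"
echo "empty:     $(starting_counter "$scratch/monitor" 5)"
touch "$scratch/monitor/0" "$scratch/monitor/5" "$scratch/monitor/10"
echo "non-empty: $(starting_counter "$scratch/monitor" 5)"
rm -rf "$scratch"
```

With POLL set to 5, the empty directory gives 0 and the directory holding 0, 5 and 10 gives 15. The sketch relies on the dump names being plain integers, since the unquoted command substitution would otherwise be subject to word splitting and globbing.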