Skip to content

Conversation

carolemieux
Copy link

The off-by-POLL error

While investigating the creation/modification times for files in the Magma-generated monitor/ directory, I noticed that the modification times appeared to be POLL-seconds off the names of the monitor directories. I.e., the file named X was actually created X-POLL seconds after the start of the fuzzing run, rather than X seconds after the start of the fuzzing run.

A more clear witness of the issue is that the earliest file in monitor/ is named 5 (or whatever POLL is) rather than 0.

What this means is that applying exp2json.py on experiment directories leads to reporting time to reach/trigger bugs that are 1 poll later than they actually appear. I suspect this is unlikely to have any significant empirical effect (if fuzzer A reaches in 500s and fuzzer B in 1005s, that the times are actually 495s and 1000s doesn't really change any conclusions), but to the best of my understanding this is an error. This PR proposes a fix.

I suppose it's possible at this point that this fix might affect users of MAGMA who are used to things starting at POLL rather than 0, and thus you may not want to apply it.

Root causing and fix

The culprit appears to be this code in magma/run.sh:

rm -f "$MONITOR/tmp"*
polls=("$MONITOR"/*)
if [ ${#polls[@]} -eq 0 ]; then
    counter=0
else
    timestamps=($(sort -n < <(basename -a "${polls[@]}")))
    last=${timestamps[-1]}
    counter=$(( last + POLL ))
fi

if $MONITOR is an empty directory, polls=("$MONITOR"/*) will not be an array of size 0, but rather an array containing "$MONITOR/*" (at least, on ubuntu). So this falls into the else case; since the directory is empty, the last timestamp doesn't usually end up being a number, but e.g. a variable name which is not defined, which then becomes empty, setting counter to start at POLL.

off_by_one_error.sh, attached, isolates this particular bit of code.
off_by_one_error.sh

./off_by_one_error.sh creates tmp_monitor_dir/monitor and either leave the directory empty (if using argument 1) or put some files in (if using argument 0). It then runs through the problematic code to demonstrate the unexpected behaviour.

Here is what the code does with a non-empty monitor directory:

$ ./off_by_one_error.sh 0
====Non-empty monitor directory is tmp_monitor_dir/monitor, ls output:====
10  15  5
============================================================
Output of $MONITOR/*: tmp_monitor_dir/monitor/10 tmp_monitor_dir/monitor/15 tmp_monitor_dir/monitor/5
Content of polls: tmp_monitor_dir/monitor/10 tmp_monitor_dir/monitor/15 tmp_monitor_dir/monitor/5
Output of ${#polls[@]}: 3
polls is non-empty, getting all timestamps in monitor dir
all timestamps: 5 10 15
last timestamp: 15
setting timestamp to last + poll
Counter starts at: 20

This behaviour seems reasonable. There could still be an off-by-one issue here if we write 20 to be empty, and put the result of the first poll in 25, but I haven't thought about that case.

In more clear bugginess, here's what happens with an empty directory.

$ ./off_by_one_error.sh 1
====Empty monitor directory is tmp_monitor_dir/monitor, ls output:====
============================================================
Output of $MONITOR/*: tmp_monitor_dir/monitor/*
Content of polls: tmp_monitor_dir/monitor/*
Output of ${#polls[@]}: 1
polls is non-empty, getting all timestamps in monitor dir
all timestamps: monitor1 monitor2 off_by_one_error_fix.sh off_by_one_error.sh tmp_monitor_dir tmp_monitor_dirs
last timestamp: tmp_monitor_dirs
setting timestamp to last + poll
Counter starts at: 5

The core issue is this line of code:

if [ ${#polls[@]} -eq 0 ]; then

as polls is literally an array containing the element 'monitor/*', not an empty array. (see above)

My suggested fix:

if [ ${#polls[@]} -eq 1 ] && [ "${polls[0]}" = "$MONITOR"'/*' ]; then

This checks that the polls array contains only the literal "$MONITOR"'/*', i.e. 'monitor/*'. This will lead to the counter being started at 0 if literally the file '*' exists in the $MONITOR folder, which I suppose is incorrect.

You can check out the fix in action on the smaller shell script here:
off_by_one_error_fix.sh

This pull request applies this fix to run.sh.

This was the most minimal fix to me, but it is somewhat inelegant to check for this literal file to check if a directory is empty. Perhaps , we could use ls to check for the number of files in monitor, and set counter to 0 in that case. Then we could expand polls only in the else branch, where we will use that content.

And, in writing this PR, I noted that it's possible the resume behaviour also has an off-by-one error, but I'm not sure about this.

adrianherrera and others added 9 commits September 22, 2021 09:02
The default workdir location is tools/captain/workdir, which can grow quite
large and result in large contexts that Docker is unable to handle. Given that
there is nothing in the tools/ directory that the fuzzing containers need, we
can just ignore it.
ddfuzz: added ddfuzz fuzzer
k-scheduler: ensure canaries are compiled in
k-scheduler: bug fixes to work with targets with 'non-wrapper' drivers
@adrianherrera
Copy link
Member

Hi Caroline! 👋

Nice finding, thanks for the PR and the detailed report. I agree this does look like an issue. Do you think the following is a "better" (for some definition of better!) fix?

if [ -z "$(ls $MONITOR)" ]; then
    counter = 0
else
    polls=("$MONITOR"/*)
    timestamps=($(sort -n < <(basename -a "${polls[@]}")))
    # ...
fi

I agree that a check for directory contents is probably cleaner. Then, if that passes we can be assured that polls is a valid array.

What do you think?

@carolemieux
Copy link
Author

carolemieux commented Sep 12, 2025

Hi Adrian :)! Thanks for getting back to me.

Yeah, I think the fix you've proposed is cleaner; checking that the expansion fails in the particular way I checked seems brittle. And polls is only used in the else branch, so it's ok to push its definition into the else branch.

It's also more understandable, as the if condition no longer relies on knowledge of bash arrays. Unless there's some system on which ls on an empty directory doesn't return the empty string.... I hope that's not the case

I've applied your suggestion in a second commit.

@adrianherrera adrianherrera changed the base branch from v1.2 to dev September 14, 2025 09:41
@carolemieux
Copy link
Author

It appears that f35ace4 previously fixed this in dev by using

polls=($(ls ${MONITOR}))
if [ ${#polls[@]} -eq 0 ]; then

Thus the merge conflict.
Do you have a preference on which version to maintain?

@adrianherrera
Copy link
Member

adrianherrera commented Sep 16, 2025 via email

@carolemieux
Copy link
Author

Sounds good, adopted our new version in the merge.

Bumping version makes sense as the observed behaviour will change with this change to run.sh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants