test: fix flaky test-fs-stat-bigint #58043
Conversation
The current allowable delta values cause some flakiness in the test-fs-stat-bigint tests, so the changes here increment those values by an arbitrary number to fix the flakiness.
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main   #58043      +/-   ##
==========================================
- Coverage   90.28%   90.26%   -0.02%
==========================================
  Files         630      630
  Lines      186150   186328     +178
  Branches    36468    36503      +35
==========================================
+ Hits       168059   168186     +127
- Misses      10975    11013      +38
- Partials     7116     7129      +13
```
RaisinTen left a comment:
Adding a magic number feels more like a mitigation than a fix.
Logically, it does make sense to expect the numbers to be between the start and end times.
I'm curious if this was flaky for all of these subtests or if it can be isolated to only some of the following:
- fs calls like stat / fstat / lstat
- conditions like sync / promise / callback
- stat object properties like atime / mtime / ctime / birthtime
Investigating that might help in figuring out what's really causing this flakiness.
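To make that isolation concrete, here is a rough, hypothetical sketch (not code from the test or this PR) of how a single combination, sync `lstat` with only the `atime` comparison, could be exercised in a tight loop; the temp-file setup, iteration count, and slack value are placeholder assumptions.

```js
'use strict';
// Hypothetical isolation harness: repeatedly compare the atimeMs reported by
// synchronous lstat in bigint and number form for a single temp file.
const assert = require('assert');
const fs = require('fs');
const os = require('os');
const path = require('path');

const file = path.join(os.tmpdir(), `stat-bigint-isolate-${process.pid}`);
fs.writeFileSync(file, 'x');

for (let i = 0; i < 300; i++) {
  const startTime = process.hrtime.bigint();
  const bigintStats = fs.lstatSync(file, { bigint: true });
  const numStats = fs.lstatSync(file);
  const endTime = process.hrtime.bigint();

  // Same shape of check as the test: the two atimeMs values should agree
  // within the time it took to run both lstat calls, plus 1 ms of slack.
  const allowableDelta = Math.ceil(Number(endTime - startTime) / 1e6) + 1;
  const delta = Math.abs(Number(bigintStats.atimeMs) - numStats.atimeMs);
  assert(delta <= allowableDelta,
         `iteration ${i}: atimeMs delta ${delta} exceeds ${allowableDelta}`);
}
```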
Referenced diff from the PR:

```js
// this number is computed by taking the difference between the start and end
// times of the stat calls, converting it from nanoseconds to milliseconds,
// and adding a small buffer to account for potential timing variations
const allowableTimeDelta = Math.ceil(Number(endTime - startTime) / 1e6) + 50;
```
The magic number needs to be passed through common.platformTimeout() to translate to equivalent numbers on other platforms
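For illustration, a minimal sketch of that suggestion, assuming `startTime`/`endTime` are `process.hrtime.bigint()` samples taken around the stat calls as in the existing test, and that the file sits under `test/parallel/` so `require('../common')` resolves:

```js
'use strict';
// Sketch only: route the extra buffer through common.platformTimeout() so the
// allowance scales on slower platforms instead of being a raw magic number.
const common = require('../common');
const fs = require('fs');

const startTime = process.hrtime.bigint();
const numStats = fs.statSync(__filename);
const bigintStats = fs.statSync(__filename, { bigint: true });
const endTime = process.hrtime.bigint();

const allowableTimeDelta =
  Math.ceil(Number(endTime - startTime) / 1e6) + common.platformTimeout(50);
console.log(numStats.atimeMs, Number(bigintStats.atimeMs), allowableTimeDelta);
```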
RaisinTen left a comment:
Can you check if measuring the start time before file creation (the getFilename() call creates the file) gets rid of the flakiness?
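A rough sketch of what this suggestion could look like, with a plain `writeFileSync()` standing in for the test's `getFilename()` helper (the temp path is an assumption, not the test's code):

```js
'use strict';
// Sketch of the suggestion: sample startTime before the file is created, so a
// timestamp assigned at creation cannot fall before startTime.
const fs = require('fs');
const os = require('os');
const path = require('path');

const startTime = process.hrtime.bigint();
const filename = path.join(os.tmpdir(), `stat-bigint-sketch-${process.pid}`);
fs.writeFileSync(filename, 'hello'); // file creation now happens inside the window
const bigintStats = fs.statSync(filename, { bigint: true });
const numStats = fs.statSync(filename);
const endTime = process.hrtime.bigint();

// Any creation-time timestamp now lies inside [startTime, endTime].
console.log(Number(endTime - startTime) / 1e6, 'ms window',
            Number(bigintStats.atimeMs), numStats.atimeMs);
```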
Sure, I've just checked 👍 (on the sync test); it does not seem to help with the flakiness 🤔
Thanks for the helpful suggestion 🙂, however I've tested all the isolated possibilities above and I could not find any version that does not include flakiness 😓 I've run subsets of the tests/checks (only the promise tests, only the callback test, only the stat tests, only the fstat tests, only the atime checks, only the mtime checks, etc.) with a high repetition count (300). I did not run combinations of them (only atime checks in fstat promise tests, only mtime checks in lstat callback tests, etc.), since given the above I don't imagine that combinations would yield any different result 😕 I've also not measured the level of flakiness of each isolated version, however by running them and manually looking at the results it feels to me like there is not much difference between the flakiness of each version (or of the full test file as a whole).
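For context, one way to get that kind of repetition count is a small driver like the hypothetical sketch below (not necessarily how the runs above were performed; the test path assumes a nodejs/node checkout as the working directory):

```js
'use strict';
// Hypothetical repetition harness: run the test 300 times and count failures.
const { spawnSync } = require('child_process');

let failures = 0;
for (let i = 0; i < 300; i++) {
  const result = spawnSync(process.execPath,
                           ['test/parallel/test-fs-stat-bigint.js'],
                           { stdio: 'ignore' });
  if (result.status !== 0) failures++;
}
console.log(`${failures}/300 runs failed`);
```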
@RaisinTen, could we query the Jenkins results to understand if the flakiness is present for all platforms or just a subset? (or if it's even failing in CI)
The Jenkins results are logged in the reliability repo but I wasn't able to find anything useful there because the linked CI logs are pretty old, so the data is gone. However, there are some interesting bits of information about this in the main issue tracker. Check this out: #45612 and the discussions on James' attempted fix, which is similar to yours. Based on the data there, the time-based flakiness in this test is known to repro only on M1 macOS due to the unexpectedly large difference in the 2 `atime` values.
Ah, sorry, I forgot to check if there was an issue already tracking this 😓 Also, sorry, I made a mistake in my testing earlier: it does seem that the flakiness only applies to `atime`.

My testing: basically, with this change:

```diff
 function verifyStats(bigintStats, numStats, allowableDelta) {
   // allowableDelta: It's possible that the file stats are updated between the
   // two stat() calls so allow for a small difference.
   for (const key of Object.keys(numStats)) {
     const val = numStats[key];
+    if (key === 'atimeMs') continue;
     if (isDate(val)) {
       const time = val.getTime();
       const time2 = bigintStats[key].getTime();
       assert(
```

I can see that the flakiness goes away.
One solution here could be to check if the key is `atimeMs` and skip the comparison for it. But again, that might be seen as a mitigation rather than an actual fix.
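As a purely hypothetical illustration of that kind of special-casing (none of these names or values come from the PR), the check could instead give the access-time keys a wider allowance rather than skipping them outright:

```js
// Hypothetical helper: keep the strict delta for every field but give
// atime-related keys a larger, assumed margin.
const ATIME_EXTRA_MS = 50; // assumed value, purely illustrative

function allowanceFor(key, allowableDelta) {
  return key.startsWith('atime') ? allowableDelta + ATIME_EXTRA_MS
                                 : allowableDelta;
}

// e.g. allowanceFor('atimeMs', 2) === 52, allowanceFor('mtimeMs', 2) === 2
```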
I feel like this might just be highly dependent on the OS/platform scheduling and something outside of the control of node 😕
Based on the docs (I hope it's the same on your M1 macOS too), there are only a few documented causes for the atime to change.

Can you check what's causing the atime to go outside the expected window? You can try to wildly simplify the test locally, make sure the flake repros, and use commands like fs_usage, dtruss, etc. to see what's causing this change.
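For illustration, a wildly simplified repro along those lines might look like the sketch below (the file name, iteration count, and 2 ms threshold are arbitrary assumptions, not values from the test); it could be left running while a tool like fs_usage or dtruss watches the file in another terminal.

```js
'use strict';
// Simplified repro sketch: stat the same file twice per iteration and report
// how far atimeMs drifts between the bigint and number stat results.
const fs = require('fs');
const os = require('os');
const path = require('path');

const file = path.join(os.tmpdir(), `atime-repro-${process.pid}`);
fs.writeFileSync(file, 'x');

for (let i = 0; i < 300; i++) {
  const bigintStats = fs.statSync(file, { bigint: true });
  const numStats = fs.statSync(file);
  const drift = Math.abs(Number(bigintStats.atimeMs) - numStats.atimeMs);
  if (drift > 2) {
    console.log(`iteration ${i}: atimeMs drifted by ${drift} ms`);
  }
}
```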
Thank you so very much for the fantastic suggestion @RaisinTen 🫶 Of course, I will look into this as soon as I can. Anyway, what do you think of this current PR? Should I maybe close it and keep the conversation in #45612? (I'm asking just to make sure that this PR doesn't create unnecessary noise in the repo 🙂)
Since the current approach of this PR is similar to James' approach in #45925 which was closed, I don't think this should land in its current state. The investigation happening here is very useful and I think we can figure out the root cause and solve this. I don't have any strong opinions on whether the conversation should happen here or in #45612, as long as it happens somewhere. So I'm fine with whichever channel you prefer. Thanks a lot for looking into this. :)
Yes, I think that the content of this PR would likely change significantly anyway, so I might as well close it and reopen one with the correct/agreed approach when we come up with one.
Ok 🙂 If it's all OK with you, then I'll just close this one and continue the conversation in #45612 when I have updates 🙂
It's my pleasure 😄 Thank you so very much for all the support and helpful suggestions! 🫶
As I've been running the repo's tests I noticed that on my machine[^1] `test-fs-stat-bigint` seems rather flaky, and such flakiness is caused by the allowable deltas in the test, which generally seem to be very low (`1` or `2`), so here I'm proposing to increment such values by `50`[^2]. Besides the above I've applied some minor code refactoring.

Note
I'd like to point out that I am adding this `50` magic number; previously a magic number of `5` was being used, but it was removed in #31726. Naturally I do agree that magic numbers are not great, but I'd argue that the presence of a magic number is preferable to the presence of flakiness.

Footnotes

[^1]: macOS Sequoia 15.4; Darwin XXX 24.4.0 Darwin Kernel Version 24.4.0: Wed Mar 19 21:16:34 PDT 2025; root:xnu-11417.101.15~1/RELEASE_ARM64_T6000 arm64

[^2]: `50` is a completely arbitrary number. I have tried various other numbers, such as `5` and `10`; they significantly reduced but did not completely get rid of the flakiness, so I figured the number might as well be increased to eliminate the flakiness entirely. My guess is that, in the grand scheme of things, something like a 50 millisecond delta is very small and acceptable anyway, but this is really a wild guess of mine; I'm waiting to see if anyone pushes back on it.