Skip to content

[compiler-rt] ThreadDescriptorSize is flaky (fails if other tests call InitTlsSize() first) #112399

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mgorny opened this issue Oct 15, 2024 · 3 comments · Fixed by #112438
Closed

Comments

@mgorny
Copy link
Member

mgorny commented Oct 15, 2024

Today, two tests started failing for me on main:

FAIL: SanitizerCommon-Unit :: ./Sanitizer-i386-Test/1/12 (3455 of 7309)
******************** TEST 'SanitizerCommon-Unit :: ./Sanitizer-i386-Test/1/12' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt_build/lib/sanitizer_common/tests/./Sanitizer-i386-Test-SanitizerCommon-Unit-878-1-12.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=12 GTEST_SHARD_INDEX=1 /var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt_build/lib/sanitizer_common/tests/./Sanitizer-i386-Test
--

Script:
--
/var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt_build/lib/sanitizer_common/tests/./Sanitizer-i386-Test --gtest_filter=SanitizerLinux.ThreadDescriptorSize
--
/var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:208: Failure
Expected equality of these values:
  0u
    Which is: 0
  ThreadDescriptorSize()
    Which is: 1216


/var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp:208
Expected equality of these values:
  0u
    Which is: 0
  ThreadDescriptorSize()
    Which is: 1216



********************
FAIL: SanitizerCommon-Unit :: ./Sanitizer-x86_64-Test/10/13 (3639 of 7309)
******************** TEST 'SanitizerCommon-Unit :: ./Sanitizer-x86_64-Test/10/13' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt_build/lib/sanitizer_common/tests
/./Sanitizer-x86_64-Test-SanitizerCommon-Unit-878-10-13.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=13 GTEST_SHARD_INDEX=10 /var/tmp/portag
e/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt_build/lib/sanitizer_common/tests/./Sanitizer-x86_64-Test
--

Script:
--
/var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt_build/lib/sanitizer_common/tests/./Sanitizer-x86_6
4-Test --gtest_filter=SanitizerLinux.ThreadDescriptorSize
--
/var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cp
p:208: Failure
Expected equality of these values:
  0u
    Which is: 0
  ThreadDescriptorSize()
    Which is: 2368


/var/tmp/portage/sys-libs/compiler-rt-sanitizers-20.0.0_pre20241015/work/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cp
p:208
Expected equality of these values:
  0u
    Which is: 0
  ThreadDescriptorSize()
    Which is: 2368



********************

A bisect points out that the failure started happening with aa44f59. Curious facts:

  • Tests fail only on one of the two systems I've tested. The "failing" system is -j12, the good one is -j32; I'm going to test later if switching the other to -j12 triggers the issue.
  • If I remove the newly added test (TEST(SanitizerCommon, PrintThreadHistory)), they start passing again.
  • If I remove the contents of the new test but leave the empty function in place, they still fail.

All this considered, my educated guess is that something is leaking from some other test, and the addition of this one caused other tests to be reorganized into shards that trigger this failure.

@mgorny
Copy link
Member Author

mgorny commented Oct 15, 2024

Oh, that test assumes that InitTlsSize() wasn't called earlier. Lemme try to figure out what ends up in that shard…

@mgorny
Copy link
Member Author

mgorny commented Oct 15, 2024

This assertions was introduced in 999313d, so perhaps that's why it haven't exploded in anyone else's face yet. There are two other tests that indirectly call InitTlsSize() via InitializePlatformEarly():

compiler-rt/lib/sanitizer_common/tests/sanitizer_common_test.cpp-TEST(SanitizerCommon, ThreadStackTlsMain)
compiler-rt/lib/sanitizer_common/tests/sanitizer_common_test.cpp-TEST(SanitizerCommon, ThreadStackTlsWorker)

And indeed, the first one is being called earlier in the same worker:

[ RUN      ] SanitizerCommon.ThreadStackTlsMain
[       OK ] SanitizerCommon.ThreadStackTlsMain (0 ms)

@mgorny mgorny changed the title [compiler-rt] ThreadDescriptorSize failing (on some systems) since aa44f59abf399f81585898fb95e66518ef3591af [compiler-rt] ThreadDescriptorSize is flaky (fails if other tests call InitTlsSize() first) Oct 15, 2024
@vitalybuka vitalybuka self-assigned this Oct 15, 2024
vitalybuka added a commit that referenced this issue Oct 15, 2024
Important part of the test to have correct
`ThreadDescriptorSize` after `InitTlsSize()`.

It's not a problem if another test called
`InitTlsSize()` before.

Fixes #112399.
DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this issue Oct 16, 2024
Important part of the test to have correct
`ThreadDescriptorSize` after `InitTlsSize()`.

It's not a problem if another test called
`InitTlsSize()` before.

Fixes llvm#112399.
@mgorny
Copy link
Member Author

mgorny commented Oct 16, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants