Skip to content

[test] Fixes #120514 #125450

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

[test] Fixes #120514 #125450

wants to merge 1 commit into from

Conversation

Colibrow
Copy link
Contributor

@Colibrow Colibrow commented Feb 3, 2025

No description provided.

@llvmbot
Copy link
Member

llvmbot commented Feb 3, 2025

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Pengying Xu (Colibrow)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/125450.diff

1 Files Affected:

  • (modified) lld/test/ELF/bp-section-orderer.s (+6-4)
diff --git a/lld/test/ELF/bp-section-orderer.s b/lld/test/ELF/bp-section-orderer.s
index 1f3776280eae0e..ef7659e338aa7f 100644
--- a/lld/test/ELF/bp-section-orderer.s
+++ b/lld/test/ELF/bp-section-orderer.s
@@ -37,14 +37,18 @@
 
 # RUN: ld.lld -o out.cd a.o --verbose-bp-section-orderer --bp-compression-sort=data 2>&1 | FileCheck %s --check-prefix=BP-COMPRESSION-DATA
 # RUN: llvm-nm -jn out.cd | tr '\n' , | FileCheck %s --check-prefix=CDATA
-# CDATA: s4,s2,s1,s5,s3,F,C,E,D,B,A,_start,d4,d1,d3,d2,
+# CDATA-DAG: s4,s2,s1
+# CDATA-DAG: s5,s3
+# CDATA: F,C,E,D,B,A,_start,d4,d1,d3,d2,
 
 # RUN: ld.lld -o out.cb a.o --verbose-bp-section-orderer --bp-compression-sort=both 2>&1 | FileCheck %s --check-prefix=BP-COMPRESSION-BOTH
 # RUN: llvm-nm -jn out.cb | tr '\n' , | FileCheck %s --check-prefix=CDATA
 
 # RUN: ld.lld -o out.cbs a.o --verbose-bp-section-orderer --bp-compression-sort=both --irpgo-profile=a.profdata --bp-startup-sort=function 2>&1 | FileCheck %s --check-prefix=BP-COMPRESSION-BOTH
 # RUN: llvm-nm -jn out.cbs | tr '\n' , | FileCheck %s --check-prefix=CBOTH-STARTUP
-# CBOTH-STARTUP: s4,s2,s1,s5,s3,A,B,C,F,E,D,_start,d4,d1,d3,d2,
+# CBOTH-STARTUP-DAG: s4,s2,s1,
+# CBOTH-STARTUP-DAG: s5,s3
+# CBOTH-STARTUP: A,B,C,F,E,D,_start,d4,d1,d3,d2,
 
 # BP-COMPRESSION-FUNC: Ordered 7 sections using balanced partitioning
 # BP-COMPRESSION-DATA: Ordered 9 sections using balanced partitioning
@@ -100,8 +104,6 @@ E
 D
 s2
 s1
-r3
-r2
 
 #--- a.c
 const char s5[] = "engineering";

@Colibrow
Copy link
Contributor Author

Colibrow commented Feb 3, 2025

@MaskRay The #120514 failed on some LLD tests on different hosts. I try to not setting the order between s5 and s4 to fix the test issue. Also, remove any redundant items from the orderfile. Could you please review?

@MaskRay
Copy link
Member

MaskRay commented Feb 3, 2025

@MaskRay The #120514 failed on some LLD tests on different hosts. I try to not setting the order between s5 and s4 to fix the test issue. Also, remove any redundant items from the orderfile. Could you please review?

On what hosts do the tests have different behaviors? Why is there a difference?

@MaskRay
Copy link
Member

MaskRay commented Feb 3, 2025

If there is a problem due to DenseMap, DenseSet iteration order, we should fix the source code to avoid host differences.

@Colibrow
Copy link
Contributor Author

Colibrow commented Feb 3, 2025

@MaskRay The #120514 failed on some LLD tests on different hosts. I try to not setting the order between s5 and s4 to fix the test issue. Also, remove any redundant items from the orderfile. Could you please review?

On what hosts do the tests have different behaviors? Why is there a difference?

https://lab.llvm.org/buildbot/#/builders/24/builds/4900
https://lab.llvm.org/buildbot/#/builders/25/builds/6143

If there is a problem due to DenseMap, DenseSet iteration order, we should fix the source code to avoid host differences.

I haven't found the reason which the problem reproduced on my local mac build which flip the order of s5 and s4..

@MaskRay
Copy link
Member

MaskRay commented Feb 3, 2025

Perhaps fixed by c92f204 or 046dd4b ...

@zmodem
Copy link
Collaborator

zmodem commented Feb 3, 2025

Perhaps fixed by c92f204 or 046dd4b ...

No, we hit the error also at b693e1c which comes after: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket/8723973565647921217/+/u/package_clang/stdout

zmodem added a commit that referenced this pull request Feb 3, 2025
The ELF/bp-section-orderer.s test is failing on some buildbots due to
what seems like non-determinism issues, see comments on the original PR
and #125450

Reverting to green the build.

This reverts commit 0154dce and
follow-up commits 046dd4b and
c92f204.
@zmodem
Copy link
Collaborator

zmodem commented Feb 3, 2025

Reverted in f3c4b58 to unbreak the bots for now.

@Colibrow
Copy link
Contributor Author

Colibrow commented Feb 3, 2025

@MaskRay I was able to reproduce this problem in the Mac environment and printed the nodesForDataCompression information before and after the BalancePartition algorithm. I found that the input before the algorithm was consistent, but there was a difference in order after running. Is this something in the BP algorithm? some randomness or something..
@ellishg Hi ellis, do you have any idea? thx.

@Colibrow
Copy link
Contributor Author

Colibrow commented Feb 3, 2025

std::nth_element(Nodes.begin(), NodesMid, Nodes.end(), [](auto &L, auto &R) {
return L.InputOrderIndex < R.InputOrderIndex;
});

Maybe I am wrong. Splitting the nodes using std::nth_element causes a slight difference because this API doesn't guarantee the order of initialization. In my opinion, it shouldn't affect the final order, but it does. When I switched to llvm::sort, it seems to match.

@ellishg
Copy link
Contributor

ellishg commented Feb 3, 2025

std::nth_element(Nodes.begin(), NodesMid, Nodes.end(), [](auto &L, auto &R) {
return L.InputOrderIndex < R.InputOrderIndex;
});

Maybe I am wrong. Splitting the nodes using std::nth_element causes a slight difference because this API doesn't guarantee the order of initialization. In my opinion, it shouldn't affect the final order, but it does. When I switched to llvm::sort, it seems to match.

std::nth_element() was deliberately chosen over sort() for runtime complexity.

Are you seeing non-deterministic behavior on different platforms? Or have you seen this on the same platform?

MaskRay added a commit that referenced this pull request Feb 3, 2025
Exposed by the test added in the reverted #120514

* Fix libstdc++/libc++ differences due to nth_element. #125450 (comment)
* Fix LLVM_ENABLE_REVERSE_ITERATION=1 differences
* Fix potential issue in `currentSize += D::getSize(*sections[*sectionIdxs.begin()])` where DenseSet was used, though not covered by a test
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 3, 2025
Exposed by the test added in the reverted #120514

* Fix libstdc++/libc++ differences due to nth_element. llvm/llvm-project#125450 (comment)
* Fix LLVM_ENABLE_REVERSE_ITERATION=1 differences
* Fix potential issue in `currentSize += D::getSize(*sections[*sectionIdxs.begin()])` where DenseSet was used, though not covered by a test
@Colibrow
Copy link
Contributor Author

Colibrow commented Feb 4, 2025

I mainly tested on mac arm64 and debian-x86_64, for the same input,
mac-log:https://gist.github.com/Colibrow/145c6f0ba77cd23cfbf6752db212ed35
linux-log:https://gist.github.com/Colibrow/7ea824b956094bc640df64028c640ea1
linux-using-llvmsort-log:https://gist.github.com/Colibrow/ef7ed8901f4738eb4032e13ad3f6b039

which in linux-nth-element:

before nth_element {ID=12 InputOrder=5 Utilities={23,24,25,26,27,28,29,30,31,32,33,34,35} Bucket=None}
{ID=7 InputOrder=0 Utilities={1,2,3,4,5} Bucket=None}
{ID=8 InputOrder=1 Utilities={6,7,8,9,10,11,12,13,14,15,16,17,18,19} Bucket=None}
{ID=9 InputOrder=2 Utilities={20,21,1,2,22,3,4,5} Bucket=None}
{ID=10 InputOrder=3 Utilities={20,21,9,22,11,15,16,18,19} Bucket=None}
{ID=11 InputOrder=4 Utilities={6,7,8,10,12,13,14,17} Bucket=None}
{ID=12 InputOrder=5 Utilities={23,24,25,26,27,28,29,30,31,32,33,34,35} Bucket=None}
{ID=13 InputOrder=6 Utilities={25,26,36,37,29,38,32,39,35,40} Bucket=None}
{ID=14 InputOrder=7 Utilities={36,37,38,39,40} Bucket=None}
{ID=15 InputOrder=8 Utilities={23,24,25,26,27,28,29,30,31,32,33,34,35} Bucket=None}
after nth_element {ID=11 InputOrder=4 Utilities={6,7,8,10,12,13,14,17} Bucket=None}
{ID=8 InputOrder=1 Utilities={6,7,8,9,10,11,12,13,14,15,16,17,18,19} Bucket=None}
{ID=9 InputOrder=2 Utilities={20,21,1,2,22,3,4,5} Bucket=None}
{ID=10 InputOrder=3 Utilities={20,21,9,22,11,15,16,18,19} Bucket=None}
{ID=7 InputOrder=0 Utilities={1,2,3,4,5} Bucket=None}
{ID=12 InputOrder=5 Utilities={23,24,25,26,27,28,29,30,31,32,33,34,35} Bucket=None}
{ID=13 InputOrder=6 Utilities={25,26,36,37,29,38,32,39,35,40} Bucket=None}
{ID=14 InputOrder=7 Utilities={36,37,38,39,40} Bucket=None}
{ID=15 InputOrder=8 Utilities={23,24,25,26,27,28,29,30,31,32,33,34,35} Bucket=None}

After nth_element, the ID:11 becomes the first item, while the other two builds follow the original input order, resulting in the same final output.
Hope the log helps.

Copy link
Contributor

@ellishg ellishg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proper fix is in 2f6e3df, which seems ok to me. Since it should be fixed I think we can close this PR.

@Colibrow
Copy link
Contributor Author

Colibrow commented Feb 5, 2025

Closed.

@Colibrow Colibrow closed this Feb 5, 2025
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
The ELF/bp-section-orderer.s test is failing on some buildbots due to
what seems like non-determinism issues, see comments on the original PR
and llvm#125450

Reverting to green the build.

This reverts commit 0154dce and
follow-up commits 046dd4b and
c92f204.
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
Exposed by the test added in the reverted llvm#120514

* Fix libstdc++/libc++ differences due to nth_element. llvm#125450 (comment)
* Fix LLVM_ENABLE_REVERSE_ITERATION=1 differences
* Fix potential issue in `currentSize += D::getSize(*sections[*sectionIdxs.begin()])` where DenseSet was used, though not covered by a test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants