Skip to content

[LLD][COFF] Keep hasData true in NullChunk constructor #124368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 25, 2025

Conversation

cjacek
Copy link
Contributor

@cjacek cjacek commented Jan 24, 2025

NullChunk instances do write data, even if it's always zero. Setting hasData to false
causes Writer::assignAddresses to ignore them when calculating rawSize. This typically
isn't an issue, as null chunks are usually positioned within a section, and later
chunks adjust the size accordingly.

However, on ARM64EC, the auxiliary IAT is placed at the end of the .rdata section and
terminates with a null chunk. As a result, rawSize is never updated to account for it,
and space for the null chunk is not allocated. Consequently, when NullChunk::writeTo
is called, it receives an invalid pointer - either pointing to the next section or
beyond the allocated buffer.

NullChunk instances do write data, even if it's always zero. Setting hasData to false
causes Writer::assignAddresses to ignore them when calculating rawSize. This usually
isn't an issue, as null chunks are typically in the middle of a section and later
chunks adjust the size. However, on ARM64EC, the auxiliary IAT is placed at the end
of the .rdata section and ends with a null chunk, making this problematic.
@cjacek
Copy link
Contributor Author

cjacek commented Jan 24, 2025

This should fix CI failure in #124189.

@llvmbot
Copy link
Member

llvmbot commented Jan 24, 2025

@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-lld

Author: Jacek Caban (cjacek)

Changes

NullChunk instances do write data, even if it's always zero. Setting hasData to false causes Writer::assignAddresses to ignore them when calculating rawSize. This usually isn't an issue, as null chunks are typically in the middle of a section and later chunks adjust the size. However, on ARM64EC, the auxiliary IAT is placed at the end of the .rdata section and ends with a null chunk, making this problematic.


Full diff: https://github.com/llvm/llvm-project/pull/124368.diff

2 Files Affected:

  • (modified) lld/COFF/DLL.cpp (-1)
  • (modified) lld/test/COFF/arm64ec-import.test (+13)
diff --git a/lld/COFF/DLL.cpp b/lld/COFF/DLL.cpp
index 6a3f8eb21e8475..ae3a8047b7008f 100644
--- a/lld/COFF/DLL.cpp
+++ b/lld/COFF/DLL.cpp
@@ -132,7 +132,6 @@ class ImportDirectoryChunk : public NonSectionChunk {
 class NullChunk : public NonSectionChunk {
 public:
   explicit NullChunk(size_t n, uint32_t align) : size(n) {
-    hasData = false;
     setAlignment(align);
   }
   explicit NullChunk(COFFLinkerContext &ctx)
diff --git a/lld/test/COFF/arm64ec-import.test b/lld/test/COFF/arm64ec-import.test
index 033c27884be02c..bb2b772081d590 100644
--- a/lld/test/COFF/arm64ec-import.test
+++ b/lld/test/COFF/arm64ec-import.test
@@ -160,6 +160,19 @@ BASERELOC-NEXT:     Type: DIR64
 BASERELOC-NEXT:     Address: 0x5020
 BASERELOC-NEXT:   }
 
+
+Build with -filealign:8 to enable precise size checking.
+
+RUN: lld-link -machine:arm64ec -dll -noentry -out:out-size.dll loadconfig-arm64ec.obj icall.obj hybmp.obj \
+RUN:          test.obj test-arm64ec.lib test2-arm64ec.lib -filealign:8
+
+RUN: llvm-readobj --headers out-size.dll | FileCheck --check-prefix=RDATA-HEADER %s
+
+RDATA-HEADER:      Name: .rdata (2E 72 64 61 74 61 00 00)
+RDATA-HEADER-NEXT: VirtualSize: 0x2030
+RDATA-HEADER-NEXT: VirtualAddress: 0x3000
+RDATA-HEADER-NEXT: RawDataSize: 8240
+
 #--- test.s
     .section .test, "r"
     .globl arm64ec_data_sym

@llvmbot
Copy link
Member

llvmbot commented Jan 24, 2025

@llvm/pr-subscribers-lld-coff

Author: Jacek Caban (cjacek)

Changes

NullChunk instances do write data, even if it's always zero. Setting hasData to false causes Writer::assignAddresses to ignore them when calculating rawSize. This usually isn't an issue, as null chunks are typically in the middle of a section and later chunks adjust the size. However, on ARM64EC, the auxiliary IAT is placed at the end of the .rdata section and ends with a null chunk, making this problematic.


Full diff: https://github.com/llvm/llvm-project/pull/124368.diff

2 Files Affected:

  • (modified) lld/COFF/DLL.cpp (-1)
  • (modified) lld/test/COFF/arm64ec-import.test (+13)
diff --git a/lld/COFF/DLL.cpp b/lld/COFF/DLL.cpp
index 6a3f8eb21e8475..ae3a8047b7008f 100644
--- a/lld/COFF/DLL.cpp
+++ b/lld/COFF/DLL.cpp
@@ -132,7 +132,6 @@ class ImportDirectoryChunk : public NonSectionChunk {
 class NullChunk : public NonSectionChunk {
 public:
   explicit NullChunk(size_t n, uint32_t align) : size(n) {
-    hasData = false;
     setAlignment(align);
   }
   explicit NullChunk(COFFLinkerContext &ctx)
diff --git a/lld/test/COFF/arm64ec-import.test b/lld/test/COFF/arm64ec-import.test
index 033c27884be02c..bb2b772081d590 100644
--- a/lld/test/COFF/arm64ec-import.test
+++ b/lld/test/COFF/arm64ec-import.test
@@ -160,6 +160,19 @@ BASERELOC-NEXT:     Type: DIR64
 BASERELOC-NEXT:     Address: 0x5020
 BASERELOC-NEXT:   }
 
+
+Build with -filealign:8 to enable precise size checking.
+
+RUN: lld-link -machine:arm64ec -dll -noentry -out:out-size.dll loadconfig-arm64ec.obj icall.obj hybmp.obj \
+RUN:          test.obj test-arm64ec.lib test2-arm64ec.lib -filealign:8
+
+RUN: llvm-readobj --headers out-size.dll | FileCheck --check-prefix=RDATA-HEADER %s
+
+RDATA-HEADER:      Name: .rdata (2E 72 64 61 74 61 00 00)
+RDATA-HEADER-NEXT: VirtualSize: 0x2030
+RDATA-HEADER-NEXT: VirtualAddress: 0x3000
+RDATA-HEADER-NEXT: RawDataSize: 8240
+
 #--- test.s
     .section .test, "r"
     .globl arm64ec_data_sym

@mstorsjo
Copy link
Member

However, on ARM64EC, the auxiliary IAT is placed at the end of the .rdata section and ends with a null chunk, making this problematic.

Can you elaborate on how this is problematic? Usually we do this for e.g. the .bss section, extending the virtual size of .data without having actual data in the binary for it. Can you explain what actually goes wrong in #124189 due to the raw data size covering the null chunks?

@cjacek
Copy link
Contributor Author

cjacek commented Jan 25, 2025

In NullChunk::writeTo, we do write data using memset. However, we don't allocate space for that data in Writer::assignAddresses, meaning the pointer passed later to writeTo is invalid, it either points to the next section or beyond the allocated buffer.

To create behavior similar to .bss, we'd need to remove the memset call, but that's not what MSVC does. In most cases, this would save only 8 bytes, which doesn't seem worth diverging from MSVC. In the specific problematic ARM64X test, the entire auxiliary IAT consists only of null chunks, and the first one has a 0x1000 alignment, which could result in larger savings. However, this is a corner case.

@mstorsjo
Copy link
Member

In NullChunk::writeTo, we do write data using memset. However, we don't allocate space for that data in Writer::assignAddresses, meaning the pointer passed later to writeTo is invalid, it either points to the next section or beyond the allocated buffer.

To create behavior similar to .bss, we'd need to remove the memset call, but that's not what MSVC does. In most cases, this would save only 8 bytes, which doesn't seem worth diverging from MSVC. In the specific problematic ARM64X test, the entire auxiliary IAT consists only of null chunks, and the first one has a 0x1000 alignment, which could result in larger savings. However, this is a corner case.

Ok, I see. Yeah we don’t need to do this like bss; this should be fine if you just amend the commit message to explain that we do need to allocate actual space for it as we do memset it.

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if the commit message is clarified

@cjacek cjacek merged commit 77c325b into llvm:main Jan 25, 2025
12 checks passed
@cjacek cjacek deleted the has-data branch January 25, 2025 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants