Skip to content

[LLD][COFF] Ensure .bss is merged at the end of a section. #137677

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jeremyd2019
Copy link
Contributor

Because it is full of zeros, it is expected that as much of it as possible is elided from the actual image, and that cannot happen if there is initialized data in the section after it.

@jeremyd2019 jeremyd2019 force-pushed the lld-merge-bss-at-end-of-data branch from ae66955 to 6cefe38 Compare April 28, 2025 18:15
@jeremyd2019 jeremyd2019 changed the title [LLD][COFF] Merge .bss at end of .data section. [LLD][COFF] Ensure .bss is merged at the end of a section. Apr 28, 2025
@jeremyd2019 jeremyd2019 marked this pull request as ready for review April 28, 2025 18:19
@llvmbot
Copy link
Member

llvmbot commented Apr 28, 2025

@llvm/pr-subscribers-platform-windows
@llvm/pr-subscribers-lld-coff

@llvm/pr-subscribers-lld

Author: None (jeremyd2019)

Changes

Because it is full of zeros, it is expected that as much of it as possible is elided from the actual image, and that cannot happen if there is initialized data in the section after it.


Full diff: https://github.com/llvm/llvm-project/pull/137677.diff

2 Files Affected:

  • (modified) lld/COFF/Writer.cpp (+31-21)
  • (added) lld/test/COFF/merge-data-bss.test (+92)
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index a5582cc8074d1..00e63cb90c2d0 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -215,6 +215,7 @@ class Writer {
   void appendImportThunks();
   void locateImportTables();
   void createExportTable();
+  void mergeSection(const std::map<StringRef, StringRef>::value_type &p);
   void mergeSections();
   void sortECChunks();
   void appendECImportTables();
@@ -1566,6 +1567,30 @@ void Writer::createSymbolAndStringTable() {
   fileSize = alignTo(fileOff, ctx.config.fileAlign);
 }
 
+void Writer::mergeSection(const std::map<StringRef, StringRef>::value_type &p) {
+  StringRef toName = p.second;
+  if (p.first == toName)
+    return;
+  StringSet<> names;
+  while (true) {
+    if (!names.insert(toName).second)
+      Fatal(ctx) << "/merge: cycle found for section '" << p.first << "'";
+    auto i = ctx.config.merge.find(toName);
+    if (i == ctx.config.merge.end())
+      break;
+    toName = i->second;
+  }
+  OutputSection *from = findSection(p.first);
+  OutputSection *to = findSection(toName);
+  if (!from)
+    return;
+  if (!to) {
+    from->name = toName;
+    return;
+  }
+  to->merge(from);
+}
+
 void Writer::mergeSections() {
   llvm::TimeTraceScope timeScope("Merge sections");
   if (!pdataSec->chunks.empty()) {
@@ -1594,28 +1619,13 @@ void Writer::mergeSections() {
   }
 
   for (auto &p : ctx.config.merge) {
-    StringRef toName = p.second;
-    if (p.first == toName)
-      continue;
-    StringSet<> names;
-    while (true) {
-      if (!names.insert(toName).second)
-        Fatal(ctx) << "/merge: cycle found for section '" << p.first << "'";
-      auto i = ctx.config.merge.find(toName);
-      if (i == ctx.config.merge.end())
-        break;
-      toName = i->second;
-    }
-    OutputSection *from = findSection(p.first);
-    OutputSection *to = findSection(toName);
-    if (!from)
-      continue;
-    if (!to) {
-      from->name = toName;
-      continue;
-    }
-    to->merge(from);
+    if (p.first != ".bss")
+      mergeSection(p);
   }
+
+  auto it = ctx.config.merge.find(".bss");
+  if (it != ctx.config.merge.end())
+    mergeSection(*it);
 }
 
 // EC targets may have chunks of various architectures mixed together at this
diff --git a/lld/test/COFF/merge-data-bss.test b/lld/test/COFF/merge-data-bss.test
new file mode 100644
index 0000000000000..a821f8d6e9048
--- /dev/null
+++ b/lld/test/COFF/merge-data-bss.test
@@ -0,0 +1,92 @@
+# RUN: yaml2obj %s -o %t.obj
+# RUN: lld-link /out:%t.exe /entry:main /subsystem:console /force \
+# RUN:   /merge:.other=.data %t.obj /debug
+# RUN: llvm-readobj --sections %t.exe | FileCheck %s
+
+# CHECK:      Name: .data
+# CHECK-NEXT: VirtualSize: 0x2018
+# CHECK-NEXT: VirtualAddress: 0x3000
+# CHECK-NEXT: RawDataSize: 512
+# CHECK-NOT: Name: .other
+
+--- !COFF
+header:
+  Machine:         IMAGE_FILE_MACHINE_AMD64
+  Characteristics: [  ]
+sections:
+  - Name:            .text
+    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
+    Alignment:       4
+    SectionData:     '90'
+    SizeOfRawData:   1
+  - Name:            .data
+    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
+    Alignment:       4
+    SectionData:     '010000000000000002'
+    SizeOfRawData:   9
+  - Name:            .bss
+    Characteristics: [ IMAGE_SCN_CNT_UNINITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
+    Alignment:       4
+    SectionData:     ''
+    SizeOfRawData:   8192
+  - Name:            .other
+    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
+    Alignment:       4
+    SectionData:     '030000000000000004'
+    SizeOfRawData:   9
+symbols:
+  - Name:            .text
+    Value:           0
+    SectionNumber:   1
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_STATIC
+    SectionDefinition:
+      Length:          1
+      NumberOfRelocations: 0
+      NumberOfLinenumbers: 0
+      CheckSum:        4027552580
+      Number:          1
+  - Name:            .data
+    Value:           0
+    SectionNumber:   2
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_STATIC
+    SectionDefinition:
+      Length:          9
+      NumberOfRelocations: 0
+      NumberOfLinenumbers: 0
+      CheckSum:        4185224559
+      Number:          2
+  - Name:            .bss
+    Value:           0
+    SectionNumber:   3
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_STATIC
+    SectionDefinition:
+      Length:          8192
+      NumberOfRelocations: 0
+      NumberOfLinenumbers: 0
+      CheckSum:        0
+      Number:          3
+  - Name:            .other
+    Value:           0
+    SectionNumber:   4
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_STATIC
+    SectionDefinition:
+      Length:          9
+      NumberOfRelocations: 0
+      NumberOfLinenumbers: 0
+      CheckSum:        1054931164
+      Number:          4
+  - Name:            main
+    Value:           0
+    SectionNumber:   1
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_EXTERNAL
+...

@jeremyd2019
Copy link
Contributor Author

Split from #136180 as requested by @mstorsjo

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, with just one question just to be sure, and one request for an added comment.

After approving, I'd wait a day or two here in case someone else might want to speak up about this (e.g. with suggestions about other ways of doing it?), but this looks acceptable to me.

Thanks!

@@ -1566,6 +1567,30 @@ void Writer::createSymbolAndStringTable() {
fileSize = alignTo(fileOff, ctx.config.fileAlign);
}

void Writer::mergeSection(const std::map<StringRef, StringRef>::value_type &p) {
StringRef toName = p.second;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this factorized function is just a direct move of the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, with indents reduced and continues turned into returns.

Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
@jeremyd2019 jeremyd2019 force-pushed the lld-merge-bss-at-end-of-data branch from 6cefe38 to 48dcc96 Compare April 28, 2025 21:35
Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants