[llvm] Implement address sanitizer on AIX (2/6) #129926

Open
wants to merge 2 commits into base: main

jakeegan
Member

@jakeegan jakeegan commented Mar 5, 2025

This PR contains the backend changes needed for the address sanitizer on AIX. It updates the shadow-mapping instrumentation for AIX. Because the address ranges on 64-bit AIX are large, the default mapping would result in a large shadow region, so we diverge from the default implementation and use HIGH_BITS to limit the shadow region.

previous PR: #129925
next PR: #131866

@llvmbot
Member

llvmbot commented Mar 5, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-debuginfo

Author: Jake Egan (jakeegan)

Changes

The PR includes llvm changes needed for the address sanitizer on AIX.

clang PR: #129925
compiler-rt PR: TBD


Full diff: https://github.com/llvm/llvm-project/pull/129926.diff

6 Files Affected:

  • (modified) llvm/lib/DebugInfo/Symbolize/SymbolizableObjectFile.cpp (+1-1)
  • (modified) llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp (+21-3)
  • (modified) llvm/test/DebugInfo/Symbolize/ELF/data-command-symtab.yaml (+3-3)
  • (modified) llvm/test/DebugInfo/Symbolize/ELF/symtab-file2.yaml (+1-1)
  • (modified) llvm/test/tools/llvm-symbolizer/data.s (+2-2)
  • (modified) llvm/tools/llvm-nm/llvm-nm.cpp (+6)
diff --git a/llvm/lib/DebugInfo/Symbolize/SymbolizableObjectFile.cpp b/llvm/lib/DebugInfo/Symbolize/SymbolizableObjectFile.cpp
index d5e1dc759df5c..f9d7a5ab9f145 100644
--- a/llvm/lib/DebugInfo/Symbolize/SymbolizableObjectFile.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/SymbolizableObjectFile.cpp
@@ -331,7 +331,7 @@ DIGlobal SymbolizableObjectFile::symbolizeData(
   std::string FileName;
   getNameFromSymbolTable(ModuleOffset.Address, Res.Name, Res.Start, Res.Size,
                          FileName);
-  Res.DeclFile = FileName;
+  Res.DeclFile = FileName.empty() ? Res.Name : FileName;
 
   // Try and get a better filename:lineno pair from the debuginfo, if present.
   DILineInfo DL = DebugInfoContext->getLineInfoForDataAddress(ModuleOffset);
diff --git a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
index bbe7040121649..3af2493b0f440 100644
--- a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -119,6 +119,9 @@ static const uint64_t kNetBSDKasan_ShadowOffset64 = 0xdfff900000000000;
 static const uint64_t kPS_ShadowOffset64 = 1ULL << 40;
 static const uint64_t kWindowsShadowOffset32 = 3ULL << 28;
 static const uint64_t kEmscriptenShadowOffset = 0;
+static const uint64_t kAIXShadowOffset32 = 0x40000000;
+// 64-BIT AIX is not yet ready.
+static const uint64_t kAIXShadowOffset64 = 0x0a01000000000000ULL;
 
 // The shadow memory space is dynamically allocated.
 static const uint64_t kWindowsShadowOffset64 = kDynamicShadowSentinel;
@@ -128,6 +131,8 @@ static const size_t kMaxStackMallocSize = 1 << 16;  // 64K
 static const uintptr_t kCurrentStackFrameMagic = 0x41B58AB3;
 static const uintptr_t kRetiredStackFrameMagic = 0x45E0360E;
 
+static const uint32_t kAIXHighBits = 6;
+
 const char kAsanModuleCtorName[] = "asan.module_ctor";
 const char kAsanModuleDtorName[] = "asan.module_dtor";
 static const uint64_t kAsanCtorAndDtorPriority = 1;
@@ -468,6 +473,7 @@ namespace {
 ///   shadow = (mem >> Scale) + &__asan_shadow
 struct ShadowMapping {
   int Scale;
+  int HighBits;
   uint64_t Offset;
   bool OrShadowOffset;
   bool InGlobal;
@@ -487,6 +493,7 @@ static ShadowMapping getShadowMapping(const Triple &TargetTriple, int LongSize,
   bool IsLinux = TargetTriple.isOSLinux();
   bool IsPPC64 = TargetTriple.getArch() == Triple::ppc64 ||
                  TargetTriple.getArch() == Triple::ppc64le;
+  bool IsAIX = TargetTriple.isOSAIX();
   bool IsSystemZ = TargetTriple.getArch() == Triple::systemz;
   bool IsX86_64 = TargetTriple.getArch() == Triple::x86_64;
   bool IsMIPSN32ABI = TargetTriple.isABIN32();
@@ -526,6 +533,8 @@ static ShadowMapping getShadowMapping(const Triple &TargetTriple, int LongSize,
       Mapping.Offset = kWindowsShadowOffset32;
     else if (IsEmscripten)
       Mapping.Offset = kEmscriptenShadowOffset;
+    else if (IsAIX)
+      Mapping.Offset = kAIXShadowOffset32;
     else
       Mapping.Offset = kDefaultShadowOffset32;
   } else {  // LongSize == 64
@@ -533,7 +542,9 @@ static ShadowMapping getShadowMapping(const Triple &TargetTriple, int LongSize,
     // space is always available.
     if (IsFuchsia)
       Mapping.Offset = 0;
-    else if (IsPPC64)
+    else if (IsAIX)
+      Mapping.Offset = kAIXShadowOffset64;
+    else if (IsPPC64 && !IsAIX)
       Mapping.Offset = kPPC64_ShadowOffset64;
     else if (IsSystemZ)
       Mapping.Offset = kSystemZ_ShadowOffset64;
@@ -592,13 +603,16 @@ static ShadowMapping getShadowMapping(const Triple &TargetTriple, int LongSize,
   // SystemZ, we could OR the constant in a single instruction, but it's more
   // efficient to load it once and use indexed addressing.
   Mapping.OrShadowOffset = !IsAArch64 && !IsPPC64 && !IsSystemZ && !IsPS &&
-                           !IsRISCV64 && !IsLoongArch64 &&
+                           !IsRISCV64 && !IsLoongArch64 && !IsAIX &&
                            !(Mapping.Offset & (Mapping.Offset - 1)) &&
                            Mapping.Offset != kDynamicShadowSentinel;
   bool IsAndroidWithIfuncSupport =
       IsAndroid && !TargetTriple.isAndroidVersionLT(21);
   Mapping.InGlobal = ClWithIfunc && IsAndroidWithIfuncSupport && IsArmOrThumb;
 
+  if (IsAIX && LongSize == 64)
+    Mapping.HighBits = kAIXHighBits;
+
   return Mapping;
 }
 
@@ -1326,7 +1340,11 @@ static bool isUnsupportedAMDGPUAddrspace(Value *Addr) {
 
 Value *AddressSanitizer::memToShadow(Value *Shadow, IRBuilder<> &IRB) {
   // Shadow >> scale
-  Shadow = IRB.CreateLShr(Shadow, Mapping.Scale);
+  if (TargetTriple.isOSAIX() && TargetTriple.getArch() == Triple::ppc64)
+    Shadow = IRB.CreateLShr(IRB.CreateShl(Shadow, Mapping.HighBits),
+                            Mapping.Scale + Mapping.HighBits);
+  else
+    Shadow = IRB.CreateLShr(Shadow, Mapping.Scale);
   if (Mapping.Offset == 0) return Shadow;
   // (Shadow >> scale) | offset
   Value *ShadowBase;
diff --git a/llvm/test/DebugInfo/Symbolize/ELF/data-command-symtab.yaml b/llvm/test/DebugInfo/Symbolize/ELF/data-command-symtab.yaml
index 83af3111c5dd6..fa2514fda5459 100644
--- a/llvm/test/DebugInfo/Symbolize/ELF/data-command-symtab.yaml
+++ b/llvm/test/DebugInfo/Symbolize/ELF/data-command-symtab.yaml
@@ -7,15 +7,15 @@
 
 # CHECK:       func
 # CHECK-NEXT:  4096 1
-# CHECK-NEXT:  ??:?
+# CHECK-NEXT:  func:0
 # CHECK-EMPTY:
 # CHECK-NEXT:  data
 # CHECK-NEXT:  8192 2
-# CHECK-NEXT:  ??:?
+# CHECK-NEXT:  data:0
 # CHECK-EMPTY:
 # CHECK-NEXT:  notype
 # CHECK-NEXT:  8194 3
-# CHECK-NEXT:  ??:?
+# CHECK-NEXT:  notype:0
 # CHECK-EMPTY:
 
 --- !ELF
diff --git a/llvm/test/DebugInfo/Symbolize/ELF/symtab-file2.yaml b/llvm/test/DebugInfo/Symbolize/ELF/symtab-file2.yaml
index f86a934240d20..37cd8c25b8695 100644
--- a/llvm/test/DebugInfo/Symbolize/ELF/symtab-file2.yaml
+++ b/llvm/test/DebugInfo/Symbolize/ELF/symtab-file2.yaml
@@ -74,7 +74,7 @@ Symbols:
 
 # CHECK3:      code
 # CHECK3-NEXT: 4096 2
-# CHECK3-NEXT: ??:?
+# CHECK3-NEXT: code:0
 # CHECK3-EMPTY:
 
 --- !ELF
diff --git a/llvm/test/tools/llvm-symbolizer/data.s b/llvm/test/tools/llvm-symbolizer/data.s
index cc9503c59141a..f08a59bb2f18b 100644
--- a/llvm/test/tools/llvm-symbolizer/data.s
+++ b/llvm/test/tools/llvm-symbolizer/data.s
@@ -7,11 +7,11 @@
 
 # CHECK:      d1
 # CHECK-NEXT: 0 8
-# CHECK-NEXT: ??:?
+# CHECK-NEXT: d1:0
 # CHECK-EMPTY:
 # CHECK-NEXT: d2
 # CHECK-NEXT: 8 4
-# CHECK-NEXT: ??:?
+# CHECK-NEXT: d2:0
 # CHECK-EMPTY:
 
 d1:
diff --git a/llvm/tools/llvm-nm/llvm-nm.cpp b/llvm/tools/llvm-nm/llvm-nm.cpp
index e7c3e36dd38d2..5a5f84b3915d2 100644
--- a/llvm/tools/llvm-nm/llvm-nm.cpp
+++ b/llvm/tools/llvm-nm/llvm-nm.cpp
@@ -697,6 +697,12 @@ static void printLineNumbers(symbolize::LLVMSymbolizer &Symbolizer,
       return;
     break;
   }
+  case 'a':
+    return;
+  case 'd':
+    return;
+  case 'r':
+    return;
   case 't':
   case 'T': {
     Expected<DILineInfo> ResOrErr = Symbolizer.symbolizeCode(*Obj, Address);

@@ -1326,7 +1340,11 @@ static bool isUnsupportedAMDGPUAddrspace(Value *Addr) {

 Value *AddressSanitizer::memToShadow(Value *Shadow, IRBuilder<> &IRB) {
   // Shadow >> scale
-  Shadow = IRB.CreateLShr(Shadow, Mapping.Scale);
+  if (TargetTriple.isOSAIX() && TargetTriple.getArch() == Triple::ppc64)
+    Shadow = IRB.CreateLShr(IRB.CreateShl(Shadow, Mapping.HighBits),
Collaborator

Will ShadowBase have HighBits set?
Is it possible to avoid that, so you don't need to diverge from the default implementation?

Collaborator

Also, does this mean that 0x090100000f000000 will have the same shadow as 0x000100000f000000?

Member Author

Unfortunately I think we have to diverge from the default implementation because the address ranges on 64-bit AIX are large, resulting in a large shadow region, so we use HIGH_BITS to limit it. More details on the mapping:
https://github.com/llvm/llvm-project/blob/e9e590e350df5877d717bb667f20bbc60a9de0ee/compiler-rt/lib/asan/asan_mapping_aix64.h

0x090100000f000000 is mapped to 0x0020200001e00000, while 0x000100000f000000 is mapped to 0x0000200001e00000.
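The two mappings quoted above can be reproduced with a small standalone check. This is only a sketch: it mirrors the shift sequence from the `memToShadow` change in the diff (shift left by `HighBits`, then logical shift right by `Scale + HighBits`), it assumes a shadow scale of 3, and it leaves out the shadow offset because the quoted results appear to be pre-offset values. `memToShadowNoOffset`, `kHighBits`, `kScale`, and `verifyQuotedMappings` are hypothetical names used for illustration, not LLVM or compiler-rt symbols.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kHighBits = 6; // matches kAIXHighBits in the diff
constexpr unsigned kScale = 3;    // assumption: default ASan shadow scale

// Hypothetical helper: drop the top HighBits bits, then apply the scale.
// The shadow offset is deliberately not added here.
constexpr std::uint64_t memToShadowNoOffset(std::uint64_t Mem,
                                            unsigned HighBits,
                                            unsigned Scale) {
  return (Mem << HighBits) >> (Scale + HighBits);
}

void verifyQuotedMappings() {
  // The two addresses from the review thread map to different values,
  // because bit 56 lies below the six discarded high bits.
  assert(memToShadowNoOffset(0x090100000f000000ULL, kHighBits, kScale) ==
         0x0020200001e00000ULL);
  assert(memToShadowNoOffset(0x000100000f000000ULL, kHighBits, kScale) ==
         0x0000200001e00000ULL);
  // Addresses that differ only in the top six bits do collide under
  // this arithmetic, e.g. 0x09... and 0x01... with the same low bits.
  assert(memToShadowNoOffset(0x010100000f000000ULL, kHighBits, kScale) ==
         memToShadowNoOffset(0x090100000f000000ULL, kHighBits, kScale));
}
```

The last assertion is purely an arithmetic observation about the shift sequence; whether both of those addresses can be valid application memory on AIX is a question for the linked `asan_mapping_aix64.h`.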

Collaborator

I'm still confused about the purpose of HIGH_BITS.
If we just do:

#define MEM_TO_SHADOW(mem)                                       \
  ((((mem)) >> ((ASAN_SHADOW_SCALE))) + \
   ASAN_SHADOW_OFFSET)

the mapping should be:

// Default AIX64 mapping:
// || `[0x0fffff8000000000, 0x0fffffffffffffff]` || HighMem    ||
// || `[0x0c00fff000000000, 0x0c00ffffffffffff]` || HighShadow ||
// || `[0x0b41000000000000, 0x0b41003fffffffff]` || MidShadow  ||
// || `[0x0b21020000000000, 0x0b21020fffffffff]` || Mid2Shadow ||
// || `[0x0b01020000000000, 0x0b01020fffffffff]` || Mid3Shadow ||
// || `[0x0a01000000000000, 0x0a01000fffffffff]` || LowShadow  ||
// || `[0x0a00000000000000, 0x0a0001ffffffffff]` || MidMem     ||
// || `[0x0900100000000000, 0x0900107fffffffff]` || Mid2Mem    ||
// || `[0x0800100000000000, 0x0800107fffffffff]` || Mid3Mem    ||
// || `[0x0000000000000000, 0x0000007fffffffff]` || LowMem     ||

That is the same amount of shadow. Why does this not work?
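The shadow rows in the table above follow from the plain `MEM_TO_SHADOW` formula when the offset is 0x0a01000000000000 (the `kAIXShadowOffset64` value in the diff). The sketch below checks that arithmetic only, assuming a shadow scale of 3; it says nothing about whether those ranges are actually available on AIX. `defaultMemToShadow`, `kShadowOffset`, `kShadowScale`, and `checkDefaultMappingTable` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kShadowOffset = 0x0a01000000000000ULL; // kAIXShadowOffset64 in the diff
constexpr unsigned kShadowScale = 3; // assumption: default ASan shadow scale

// Hypothetical helper for the default mapping: (mem >> scale) + offset.
constexpr std::uint64_t defaultMemToShadow(std::uint64_t Mem) {
  return (Mem >> kShadowScale) + kShadowOffset;
}

void checkDefaultMappingTable() {
  // LowMem -> LowShadow
  assert(defaultMemToShadow(0x0000000000000000ULL) == 0x0a01000000000000ULL);
  assert(defaultMemToShadow(0x0000007fffffffffULL) == 0x0a01000fffffffffULL);
  // Mid3Mem -> Mid3Shadow
  assert(defaultMemToShadow(0x0800100000000000ULL) == 0x0b01020000000000ULL);
  assert(defaultMemToShadow(0x0800107fffffffffULL) == 0x0b01020fffffffffULL);
  // Mid2Mem -> Mid2Shadow
  assert(defaultMemToShadow(0x0900100000000000ULL) == 0x0b21020000000000ULL);
  assert(defaultMemToShadow(0x0900107fffffffffULL) == 0x0b21020fffffffffULL);
  // MidMem -> MidShadow
  assert(defaultMemToShadow(0x0a00000000000000ULL) == 0x0b41000000000000ULL);
  assert(defaultMemToShadow(0x0a0001ffffffffffULL) == 0x0b41003fffffffffULL);
  // HighMem -> HighShadow
  assert(defaultMemToShadow(0x0fffff8000000000ULL) == 0x0c00fff000000000ULL);
  assert(defaultMemToShadow(0x0fffffffffffffffULL) == 0x0c00ffffffffffffULL);
}
```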

Collaborator

@vitalybuka vitalybuka left a comment

Please check comments inline

github-actions bot commented Mar 9, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@jakeegan jakeegan changed the title from "[llvm] Implement address sanitizer on AIX (2/3)" to "[llvm] Implement address sanitizer on AIX (2/n)" on Mar 24, 2025
@jakeegan jakeegan changed the title from "[llvm] Implement address sanitizer on AIX (2/n)" to "[llvm] Implement address sanitizer on AIX (2/6)" on Apr 30, 2025
@jakeegan
Member Author

Ping

4 participants