Skip to content

[sancov] Introduce optional callback for stack-depth tracking #138323

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

kees
Copy link
Contributor

@kees kees commented May 2, 2025

Normally -fsanitize-coverage=stack-depth inserts inline arithmetic to update thread_local __sancov_lowest_stack. To support stack depth tracking in the Linux kernel, which does not implement traditional thread_local storage, provide the option to call a function instead.

This matches the existing "stackleak" implementation that is supported in Linux via a GCC plugin. To make this coverage more performant, a minimum estimated stack depth can be chosen to enable the callback mode, skipping instrumentation of functions with smaller stacks.

With -fsanitize-coverage-stack-depth-callback-min set greater than 0, the __sanitize_cov_stack_depth() callback will be injected when the estimated stack depth is greater than or equal to the given minimum.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. compiler-rt:sanitizer llvm:transforms labels May 2, 2025
@kees kees requested a review from vitalybuka May 2, 2025 18:46
@llvmbot
Copy link
Member

llvmbot commented May 2, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-clang-codegen

Author: Kees Cook (kees)

Changes

Normally -fsanitize-coverage=stack-depth inserts inline arithmetic to update thread_local __sancov_lowest_stack. To support stack depth tracking in the Linux kernel, which does not implement traditional thread_local storage, provide the option to call a function instead.

This matches the existing "stackleak" implementation that is supported in Linux via a GCC plugin. To make this coverage more performant, a minimum estimated stack depth can be chosen to enable the callback mode, skipping instrumentation of functions with smaller stacks.

With -fsanitize-coverage-stack-depth-callback-min set greater than 0, the __sanitize_cov_stack_depth() callback will be injected when the estimated stack depth is greater than or equal to the given minimum.


Full diff: https://github.com/llvm/llvm-project/pull/138323.diff

7 Files Affected:

  • (modified) clang/include/clang/Basic/CodeGenOptions.def (+1)
  • (modified) clang/include/clang/Driver/Options.td (+5)
  • (modified) clang/include/clang/Driver/SanitizerArgs.h (+1)
  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+1)
  • (modified) clang/lib/Driver/SanitizerArgs.cpp (+14)
  • (modified) llvm/include/llvm/Transforms/Utils/Instrumentation.h (+1)
  • (modified) llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp (+50-15)
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 927972015c3dc..452b1e325afb2 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -305,6 +305,7 @@ CODEGENOPT(SanitizeCoveragePCTable, 1, 0) ///< Create a PC Table.
 CODEGENOPT(SanitizeCoverageControlFlow, 1, 0) ///< Collect control flow
 CODEGENOPT(SanitizeCoverageNoPrune, 1, 0) ///< Disable coverage pruning.
 CODEGENOPT(SanitizeCoverageStackDepth, 1, 0) ///< Enable max stack depth tracing
+VALUE_CODEGENOPT(SanitizeCoverageStackDepthCallbackMin , 32, 0) ///< Enable stack depth tracing callbacks.
 CODEGENOPT(SanitizeCoverageTraceLoads, 1, 0) ///< Enable tracing of loads.
 CODEGENOPT(SanitizeCoverageTraceStores, 1, 0) ///< Enable tracing of stores.
 CODEGENOPT(SanitizeBinaryMetadataCovered, 1, 0) ///< Emit PCs for covered functions.
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 561b0498c549c..5823899bdb558 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2361,6 +2361,11 @@ def fsanitize_coverage_ignorelist : Joined<["-"], "fsanitize-coverage-ignorelist
     HelpText<"Disable sanitizer coverage instrumentation for modules and functions "
              "that match the provided special case list, even the allowed ones">,
     MarshallingInfoStringVector<CodeGenOpts<"SanitizeCoverageIgnorelistFiles">>;
+def fsanitize_coverage_stack_depth_callback_min_EQ :
+  Joined<["-"], "fsanitize-coverage-stack-depth-callback-min=">,
+  Group<f_clang_Group>, MetaVarName<"<M>">,
+  HelpText<"Use callback for max stack depth tracing with minimum stack depth M">,
+  MarshallingInfoInt<CodeGenOpts<"SanitizeCoverageStackDepthCallbackMin">>;
 def fexperimental_sanitize_metadata_EQ : CommaJoined<["-"], "fexperimental-sanitize-metadata=">,
   Group<f_Group>,
   HelpText<"Specify the type of metadata to emit for binary analysis sanitizers">;
diff --git a/clang/include/clang/Driver/SanitizerArgs.h b/clang/include/clang/Driver/SanitizerArgs.h
index 528e3b400f3dc..1213cd8dcd3b6 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -34,6 +34,7 @@ class SanitizerArgs {
   std::vector<std::string> CoverageIgnorelistFiles;
   std::vector<std::string> BinaryMetadataIgnorelistFiles;
   int CoverageFeatures = 0;
+  int StackDepthCallbackMin = 0;
   int BinaryMetadataFeatures = 0;
   int OverflowPatternExclusions = 0;
   int MsanTrackOrigins = 0;
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index c9ceb49ce5ceb..42c59377688b2 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -255,6 +255,7 @@ getSancovOptsFromCGOpts(const CodeGenOptions &CGOpts) {
   Opts.InlineBoolFlag = CGOpts.SanitizeCoverageInlineBoolFlag;
   Opts.PCTable = CGOpts.SanitizeCoveragePCTable;
   Opts.StackDepth = CGOpts.SanitizeCoverageStackDepth;
+  Opts.StackDepthCallbackMin = CGOpts.SanitizeCoverageStackDepthCallbackMin;
   Opts.TraceLoads = CGOpts.SanitizeCoverageTraceLoads;
   Opts.TraceStores = CGOpts.SanitizeCoverageTraceStores;
   Opts.CollectControlFlow = CGOpts.SanitizeCoverageControlFlow;
diff --git a/clang/lib/Driver/SanitizerArgs.cpp b/clang/lib/Driver/SanitizerArgs.cpp
index ff08bffdbde1f..414f68e4423e8 100644
--- a/clang/lib/Driver/SanitizerArgs.cpp
+++ b/clang/lib/Driver/SanitizerArgs.cpp
@@ -751,6 +751,16 @@ SanitizerArgs::SanitizerArgs(const ToolChain &TC,
       options::OPT_fno_sanitize_ignorelist,
       clang::diag::err_drv_malformed_sanitizer_ignorelist, DiagnoseErrors);
 
+  // Verify that -fsanitize-coverage-stack-depth-callback-min is >= 0.
+  if (Arg *A = Args.getLastArg(options::OPT_fsanitize_coverage_stack_depth_callback_min_EQ)) {
+      StringRef S = A->getValue();
+      if (S.getAsInteger(0, StackDepthCallbackMin) || StackDepthCallbackMin < 0) {
+          if (DiagnoseErrors)
+            D.Diag(clang::diag::err_drv_invalid_value)
+                << A->getAsString(Args) << S;
+      }
+  }
+
   // Parse -f[no-]sanitize-memory-track-origins[=level] options.
   if (AllAddedKinds & SanitizerKind::Memory) {
     if (Arg *A =
@@ -1269,6 +1279,10 @@ void SanitizerArgs::addArgs(const ToolChain &TC, const llvm::opt::ArgList &Args,
   addSpecialCaseListOpt(Args, CmdArgs, "-fsanitize-coverage-ignorelist=",
                         CoverageIgnorelistFiles);
 
+  if (StackDepthCallbackMin)
+    CmdArgs.push_back(Args.MakeArgString("-fsanitize-coverage-stack-depth-callback-min=" +
+                                         Twine(StackDepthCallbackMin)));
+
   if (!GPUSanitize) {
     // Translate available BinaryMetadataFeatures to corresponding clang-cc1
     // flags. Does not depend on any other sanitizers. Unsupported on GPUs.
diff --git a/llvm/include/llvm/Transforms/Utils/Instrumentation.h b/llvm/include/llvm/Transforms/Utils/Instrumentation.h
index 0e2c0d9bfa605..0b2ccf6180e1c 100644
--- a/llvm/include/llvm/Transforms/Utils/Instrumentation.h
+++ b/llvm/include/llvm/Transforms/Utils/Instrumentation.h
@@ -158,6 +158,7 @@ struct SanitizerCoverageOptions {
   bool PCTable = false;
   bool NoPrune = false;
   bool StackDepth = false;
+  int  StackDepthCallbackMin = 0;
   bool TraceLoads = false;
   bool TraceStores = false;
   bool CollectControlFlow = false;
diff --git a/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp b/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
index e52269637b92d..b8212b42b9d0c 100644
--- a/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+++ b/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
@@ -33,6 +33,7 @@
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/SpecialCaseList.h"
 #include "llvm/Support/VirtualFileSystem.h"
+#include "llvm/Support/raw_ostream.h"
 #include "llvm/TargetParser/Triple.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/ModuleUtils.h"
@@ -86,6 +87,7 @@ const char SanCovPCsSectionName[] = "sancov_pcs";
 const char SanCovCFsSectionName[] = "sancov_cfs";
 const char SanCovCallbackGateSectionName[] = "sancov_gate";
 
+const char SanCovStackDepthCallbackName[] = "__sanitizer_cov_stack_depth";
 const char SanCovLowestStackName[] = "__sancov_lowest_stack";
 const char SanCovCallbackGateName[] = "__sancov_should_track";
 
@@ -152,6 +154,12 @@ static cl::opt<bool> ClStackDepth("sanitizer-coverage-stack-depth",
                                   cl::desc("max stack depth tracing"),
                                   cl::Hidden);
 
+static cl::opt<int> ClStackDepthCallbackMin(
+    "sanitizer-coverage-stack-depth-callback-min",
+    cl::desc("max stack depth tracing should use callback and only when "
+	     "stack depth more than specified"),
+    cl::Hidden);
+
 static cl::opt<bool>
     ClCollectCF("sanitizer-coverage-control-flow",
                 cl::desc("collect control flow for each function"), cl::Hidden);
@@ -202,6 +210,8 @@ SanitizerCoverageOptions OverrideFromCL(SanitizerCoverageOptions Options) {
   Options.PCTable |= ClCreatePCTable;
   Options.NoPrune |= !ClPruneBlocks;
   Options.StackDepth |= ClStackDepth;
+  Options.StackDepthCallbackMin = std::max(Options.StackDepthCallbackMin,
+                                           ClStackDepthCallbackMin.getValue());
   Options.TraceLoads |= ClLoadTracing;
   Options.TraceStores |= ClStoreTracing;
   Options.GatedCallbacks |= ClGatedCallbacks;
@@ -271,6 +281,7 @@ class ModuleSanitizerCoverage {
   DomTreeCallback DTCallback;
   PostDomTreeCallback PDTCallback;
 
+  FunctionCallee SanCovStackDepthCallback;
   FunctionCallee SanCovTracePCIndir;
   FunctionCallee SanCovTracePC, SanCovTracePCGuard;
   std::array<FunctionCallee, 4> SanCovTraceCmpFunction;
@@ -514,6 +525,8 @@ bool ModuleSanitizerCoverage::instrumentModule() {
   SanCovTracePCGuard =
       M.getOrInsertFunction(SanCovTracePCGuardName, VoidTy, PtrTy);
 
+  SanCovStackDepthCallback = M.getOrInsertFunction(SanCovStackDepthCallbackName, VoidTy);
+
   for (auto &F : M)
     instrumentFunction(F);
 
@@ -1078,22 +1091,44 @@ void ModuleSanitizerCoverage::InjectCoverageAtBlock(Function &F, BasicBlock &BB,
     Store->setNoSanitizeMetadata();
   }
   if (Options.StackDepth && IsEntryBB && !IsLeafFunc) {
-    // Check stack depth.  If it's the deepest so far, record it.
     Module *M = F.getParent();
-    auto FrameAddrPtr = IRB.CreateIntrinsic(
-        Intrinsic::frameaddress,
-        IRB.getPtrTy(M->getDataLayout().getAllocaAddrSpace()),
-        {Constant::getNullValue(Int32Ty)});
-    auto FrameAddrInt = IRB.CreatePtrToInt(FrameAddrPtr, IntptrTy);
-    auto LowestStack = IRB.CreateLoad(IntptrTy, SanCovLowestStack);
-    auto IsStackLower = IRB.CreateICmpULT(FrameAddrInt, LowestStack);
-    auto ThenTerm = SplitBlockAndInsertIfThen(
-        IsStackLower, &*IP, false,
-        MDBuilder(IRB.getContext()).createUnlikelyBranchWeights());
-    IRBuilder<> ThenIRB(ThenTerm);
-    auto Store = ThenIRB.CreateStore(FrameAddrInt, SanCovLowestStack);
-    LowestStack->setNoSanitizeMetadata();
-    Store->setNoSanitizeMetadata();
+    if (Options.StackDepthCallbackMin) {
+      // In callback mode, only add call when stack depth reaches minimum.
+      const DataLayout &DL = M->getDataLayout();
+      uint32_t EstimatedStackSize = 0;
+
+      // Make an estimate on the stack usage.
+      for (auto &I : F.getEntryBlock()) {
+          if (auto *AI = dyn_cast<AllocaInst>(&I)) {
+              if (AI->isStaticAlloca()) {
+                  uint32_t TypeSize = DL.getTypeAllocSize(AI->getAllocatedType());
+                  EstimatedStackSize += TypeSize;
+              } else {
+                  // Over-estimate dynamic sizes.
+                  EstimatedStackSize += 4096;
+              }
+          }
+      }
+
+      if (EstimatedStackSize >= Options.StackDepthCallbackMin)
+        IRB.CreateCall(SanCovStackDepthCallback)->setCannotMerge();
+    } else {
+      // Check stack depth.  If it's the deepest so far, record it.
+      auto FrameAddrPtr = IRB.CreateIntrinsic(
+          Intrinsic::frameaddress,
+          IRB.getPtrTy(M->getDataLayout().getAllocaAddrSpace()),
+          {Constant::getNullValue(Int32Ty)});
+      auto FrameAddrInt = IRB.CreatePtrToInt(FrameAddrPtr, IntptrTy);
+      auto LowestStack = IRB.CreateLoad(IntptrTy, SanCovLowestStack);
+      auto IsStackLower = IRB.CreateICmpULT(FrameAddrInt, LowestStack);
+      auto ThenTerm = SplitBlockAndInsertIfThen(
+          IsStackLower, &*IP, false,
+          MDBuilder(IRB.getContext()).createUnlikelyBranchWeights());
+      IRBuilder<> ThenIRB(ThenTerm);
+      auto Store = ThenIRB.CreateStore(FrameAddrInt, SanCovLowestStack);
+      LowestStack->setNoSanitizeMetadata();
+      Store->setNoSanitizeMetadata();
+    }
   }
 }
 

@kees kees requested review from bwendling and morehouse May 2, 2025 18:47
Copy link

github-actions bot commented May 2, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@kees
Copy link
Contributor Author

kees commented May 2, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
View the diff from clang-format here.

Whoops, yes, I've fixed these now.

intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request May 2, 2025
Wire up stackleak to Clang's proposed[1] stack depth tracking callback
option. While __noinstr already contained __no_sanitize_coverage, it was
still needed for __init and __head section markings. This is needed to
make sure the callback is not executed in unsupported contexts.

Link: llvm/llvm-project#138323 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
lougovsk pushed a commit to lougovsk/linux that referenced this pull request May 2, 2025
Wire up stackleak to Clang's proposed[1] stack depth tracking callback
option. While __noinstr already contained __no_sanitize_coverage, it was
still needed for __init and __head section markings. This is needed to
make sure the callback is not executed in unsupported contexts.

Link: llvm/llvm-project#138323 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Message-Id: <20250502190129.246328-4-kees@kernel.org>
linux-riscv-bot pushed a commit to linux-riscv/linux that referenced this pull request May 2, 2025
Wire up stackleak to Clang's proposed[1] stack depth tracking callback
option. While __noinstr already contained __no_sanitize_coverage, it was
still needed for __init and __head section markings. This is needed to
make sure the callback is not executed in unsupported contexts.

Link: llvm/llvm-project#138323 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
Copy link
Contributor

@melver melver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also missing flag and IR tests.

@kees
Copy link
Contributor Author

kees commented May 4, 2025

This is also missing flag and IR tests.

Oh, yes, I will add those. Thanks!

@kees kees force-pushed the stackleak branch 2 times, most recently from 729d249 to 027ff82 Compare May 5, 2025 03:38
@kees kees requested a review from melver May 5, 2025 03:39
Copy link
Contributor

@melver melver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM - but let's also wait for others to comment.

Documentation of this feature is lacking (and afaik also wasn't added in https://reviews.llvm.org/D36839). Given this will be used in the kernel, some kind of official documentation might be good to have.

Btw, if you want a runtime test, it's possible to add one similar to the one in compiler-rt/trunk/test/sanitizer_common/TestCases/sanitizer_coverage_stack_depth.cc.

@melver melver self-requested a review May 5, 2025 07:02
@kees kees force-pushed the stackleak branch 2 times, most recently from 18bc755 to 084b20d Compare May 6, 2025 04:05
@kees
Copy link
Contributor Author

kees commented May 6, 2025

Okay, a couple small clean-ups, and I've also solved the lack of dynamic stack size tracking.

dyn_cast<ConstantInt>(AI->getArraySize()))
Bytes *= arraySize->getZExtValue();
}
EstimatedStackSize += Bytes;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to do this accounting and not just use something like __builtin_frame_address ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're asking why I can't use Intrinsic::frameaddress in the callback case? (This is what the __sancov_lowest_stack mode does.) At this point the actual size of the stack space is unknown (since it hasn't gone through CodeGen). We need to know this early to minimize the number of places where instrumentation is happening. So to estimate how much space is in use, this manually adds it up to decide whether or not to insert the callback call. (In the __sancov_lowest_stack mode, the instrumentation is always added.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a more detailed comment to the accounting loop (and moved the insertion calculation there since it's only needed for the callback case). And I've tweaked the documentation a bit more with a short example.

kees added 2 commits May 5, 2025 23:06
Normally -fsanitize-coverage=stack-depth inserts inline arithmetic to
update thread_local __sancov_lowest_stack. To support stack depth
tracking in the Linux kernel, which does not implement traditional
thread_local storage, provide the option to call a function instead.

This matches the existing "stackleak" implementation that is supported
in Linux via a GCC plugin. To make this coverage more performant, a
minimum estimated stack depth can be chosen to enable the callback mode,
skipping instrumentation of functions with smaller stacks.

With -fsanitize-coverage-stack-depth-callback-min set greater than 0,
the __sanitize_cov_stack_depth() callback will be injected when the
estimated stack depth is greater than or equal to the given minimum.
Add documentation to detail the behaviors of the stack-depth tracer,
including the new -fsanitize-coverage-stack-depth-callback-min=N
argument.
// If dynamic alloca found, always add call.
bool dynamic_alloca = false;
// Find an insertion point after last "alloca".
llvm::Instruction *InsertBefore = NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nullptr

// In callback mode, only add call when stack depth reaches minimum.
uint32_t EstimatedStackSize = 0;
// If dynamic alloca found, always add call.
bool dynamic_alloca = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HasDynamicAlloc?


The maximum depth of a function call graph is stored in the thread-local
``__sancov_lowest_stack`` variable. Instrumentation is inserted in every
non-leaf function to check the stack pointer against this variable,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You write "stack pointer" here, but in the below it's "frame pointer", which are different.

if (stack < __sancov_lowest_stack)
__sancov_lowest_stack = stack;

If ``-fsanitize-coverage-stack-depth-callback-min=N`` is also used, the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"If -fsanitize-coverage-stack-depth-callback-min=N (where N > 0) is also used ..." ?

Because right now it's unclear what behaviour is supposed to be if N==0. Per current implementation N==0 disables this feature.


.. code-block:: c++

thread_local uintptr_t __sancov_lowest_stack;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extern ?

Because the compiler doesn't define the TLS variable.

case of a dynamically sized stack, the callback is unconditionally added.

The callback takes no arguments and is responsible for determining the
stack pointer and doing any needed comparisons and storage. A roughtly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/roughtly/roughly/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category compiler-rt:sanitizer llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants