Skip to content

[lld][WebAssembly] Allow --trace-symbol to work with symbols with custom import names #96119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Jun 19, 2024

No description provided.

@llvmbot
Copy link
Member

llvmbot commented Jun 19, 2024

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-wasm

Author: Sam Clegg (sbc100)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/96119.diff

3 Files Affected:

  • (modified) lld/test/wasm/trace-symbol.s (+13-1)
  • (modified) lld/wasm/SymbolTable.cpp (+31-6)
  • (modified) lld/wasm/SymbolTable.h (+4)
diff --git a/lld/test/wasm/trace-symbol.s b/lld/test/wasm/trace-symbol.s
index 88e5c6f5829e3..3d79fa96f0942 100644
--- a/lld/test/wasm/trace-symbol.s
+++ b/lld/test/wasm/trace-symbol.s
@@ -1,19 +1,28 @@
+# Test -y symbol and -trace-symbol=symbol
+
 # RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown %p/Inputs/ret32.s -o %t.ret32.o
 # RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown -o %t.start.o %s
 # RUN: wasm-ld -o %t.wasm %t.start.o %t.ret32.o -y ret32 -y _start | FileCheck %s -check-prefix=BOTH
 # RUN: wasm-ld -o %t.wasm %t.ret32.o %t.start.o -y ret32 -y _start | FileCheck %s -check-prefix=REVERSED
 
-# check alias
+# check long argument form
 # RUN: wasm-ld -o %t.wasm %t.start.o %t.ret32.o -trace-symbol=_start | FileCheck %s -check-prefixes=JUST-START
 
+# check import names can be traced
+# RUN: wasm-ld -o %t.wasm %t.start.o %t.ret32.o -trace-symbol=foo_external | FileCheck %s -check-prefixes=IMPORT-NAME
+
 .functype ret32 (f32) -> (i32)
 
+.functype foo_import () -> ()
+.import_name foo_import, foo_external
+
 .globl  _start
 _start:
   .functype _start () -> ()
   f32.const 0.0
   call ret32
   drop
+  call foo_import
   end_function
 
 # BOTH:          start.o: definition of _start
@@ -26,3 +35,6 @@ _start:
 
 # JUST-START: start.o: definition of _start
 # JUST-START-NOT: ret32
+
+# IMPORT-NAME: start.o: reference to foo_external
+# IMPORT-NAME-NOT: ret32
diff --git a/lld/wasm/SymbolTable.cpp b/lld/wasm/SymbolTable.cpp
index 00c347ea3ef24..65f394f14bb30 100644
--- a/lld/wasm/SymbolTable.cpp
+++ b/lld/wasm/SymbolTable.cpp
@@ -507,6 +507,15 @@ static void setImportAttributes(T *existing,
   }
 }
 
+static void traceImport(std::optional<StringRef> importName, InputFile *file) {
+  if (importName.has_value()) {
+    auto name = importName.value();
+    if (symtab->isTraced(name)) {
+      printTraceSymbolUndefined(name, file);
+    }
+  }
+}
+
 Symbol *SymbolTable::addUndefinedFunction(StringRef name,
                                           std::optional<StringRef> importName,
                                           std::optional<StringRef> importModule,
@@ -526,6 +535,7 @@ Symbol *SymbolTable::addUndefinedFunction(StringRef name,
     printTraceSymbolUndefined(name, file);
 
   auto replaceSym = [&]() {
+    traceImport(importName, file);
     replaceSymbol<UndefinedFunction>(s, name, importName, importModule, flags,
                                      file, sig, isCalledDirectly);
   };
@@ -560,6 +570,7 @@ Symbol *SymbolTable::addUndefinedFunction(StringRef name,
         replaceSym();
     }
     if (existingUndefined) {
+      traceImport(importName, file);
       setImportAttributes(existingUndefined, importName, importModule, flags,
                           file);
       if (isCalledDirectly)
@@ -612,10 +623,11 @@ Symbol *SymbolTable::addUndefinedGlobal(StringRef name,
   if (s->traced)
     printTraceSymbolUndefined(name, file);
 
-  if (wasInserted)
+  if (wasInserted) {
+    traceImport(importName, file);
     replaceSymbol<UndefinedGlobal>(s, name, importName, importModule, flags,
                                    file, type);
-  else if (auto *lazy = dyn_cast<LazySymbol>(s))
+  } else if (auto *lazy = dyn_cast<LazySymbol>(s))
     lazy->extract();
   else if (s->isDefined())
     checkGlobalType(s, file, type);
@@ -638,10 +650,11 @@ Symbol *SymbolTable::addUndefinedTable(StringRef name,
   if (s->traced)
     printTraceSymbolUndefined(name, file);
 
-  if (wasInserted)
+  if (wasInserted) {
+    traceImport(importName, file);
     replaceSymbol<UndefinedTable>(s, name, importName, importModule, flags,
                                   file, type);
-  else if (auto *lazy = dyn_cast<LazySymbol>(s))
+  } else if (auto *lazy = dyn_cast<LazySymbol>(s))
     lazy->extract();
   else if (s->isDefined())
     checkTableType(s, file, type);
@@ -664,10 +677,11 @@ Symbol *SymbolTable::addUndefinedTag(StringRef name,
   if (s->traced)
     printTraceSymbolUndefined(name, file);
 
-  if (wasInserted)
+  if (wasInserted) {
+    traceImport(importName, file);
     replaceSymbol<UndefinedTag>(s, name, importName, importModule, flags, file,
                                 sig);
-  else if (auto *lazy = dyn_cast<LazySymbol>(s))
+  } else if (auto *lazy = dyn_cast<LazySymbol>(s))
     lazy->extract();
   else if (s->isDefined())
     checkTagType(s, file, sig);
@@ -828,9 +842,20 @@ bool SymbolTable::getFunctionVariant(Symbol* sym, const WasmSignature *sig,
 // Set a flag for --trace-symbol so that we can print out a log message
 // if a new symbol with the same name is inserted into the symbol table.
 void SymbolTable::trace(StringRef name) {
+  tracingEnabled = true;
   symMap.insert({CachedHashStringRef(name), -1});
 }
 
+bool SymbolTable::isTraced(StringRef name) {
+  // Early exit in the default case to avoid symbol hashmap lookup.
+  if (!tracingEnabled)
+    return false;
+  auto it = symMap.find(CachedHashStringRef(name));
+  if (it == symMap.end())
+    return false;
+  return it->second == -1 || symVector[it->second]->traced;
+}
+
 void SymbolTable::wrap(Symbol *sym, Symbol *real, Symbol *wrap) {
   // Swap symbols as instructed by -wrap.
   int &origIdx = symMap[CachedHashStringRef(sym->getName())];
diff --git a/lld/wasm/SymbolTable.h b/lld/wasm/SymbolTable.h
index 42ebb8be8eb3f..f631b10961206 100644
--- a/lld/wasm/SymbolTable.h
+++ b/lld/wasm/SymbolTable.h
@@ -50,6 +50,8 @@ class SymbolTable {
 
   void trace(StringRef name);
 
+  bool isTraced(StringRef name);
+
   Symbol *addDefinedFunction(StringRef name, uint32_t flags, InputFile *file,
                              InputFunction *function);
   Symbol *addDefinedData(StringRef name, uint32_t flags, InputFile *file,
@@ -132,6 +134,8 @@ class SymbolTable {
 
   // For LTO.
   std::unique_ptr<BitcodeCompiler> lto;
+
+  bool tracingEnabled = false;
 };
 
 extern SymbolTable *symtab;

@sbc100 sbc100 requested a review from aheejin June 20, 2024 04:02
Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Is the intention to print the import name, not the internal name in tracing?
  • traceImport seems to be added to addUndefined[Function|Global|Tag] but not addUndefinedData. Does addUndefinedData not need the tracing?

@@ -526,6 +535,7 @@ Symbol *SymbolTable::addUndefinedFunction(StringRef name,
printTraceSymbolUndefined(name, file);

auto replaceSym = [&]() {
traceImport(importName, file);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaceSym also seems to be called when wasInserted was not true below:

      // If the existing undefined functions is not called directly then let
      // this one take precedence.  Otherwise the existing function is either
      // directly called or defined, in which case we need a function variant.
      if (existingUndefined && !existingUndefined->isCalledDirectly)
        replaceSym();
      else if (getFunctionVariant(s, sig, file, &s))
        replaceSym();

Is that OK to run this again in this case?

@@ -526,6 +535,7 @@ Symbol *SymbolTable::addUndefinedFunction(StringRef name,
printTraceSymbolUndefined(name, file);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We run printTraceSymbolUndefined here and possibly again when wasInserted is true below. Does this mean s->traced and wasInserted are never true at the same time?
Would it be clear if we check two conditions at a single place to run this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, s->traced depends only on the command line flags and will be true if and only if --trace-symbol=foo is passed for a given symbol foo. It can be true or false independently of everything else going on here.

I'm not sure what you mean by "and then again". I only see a single call to printTraceSymbolUndefined in this function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function contains a call to traceImport which calls printTraceSymbolUndefined. So I guess if s->traced and wasInserted are both true it can be called twice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes in that case you potentially see a trace entry for the "name" and the "import name". Not that those are normally different but if you happened to trace both then I would expect two trace entires in the log yes.

@sbc100
Copy link
Collaborator Author

sbc100 commented Jun 20, 2024

  • Is the intention to print the import name, not the internal name in tracing?

Yes exactly. Imagine I have a symbols called __internal_fd_write which is imported as fd_write (i.e. the symbol __internal_fd_write has an import name of fd_write.

Prior to this change, if a user links with --trace-symbol=fd_write they would see no output when this import is used.

After this change they would see a report of fd_write being undefined/imported in a certain object file.

  • traceImport seems to be added to addUndefined[Function|Global|Tag] but not addUndefinedData. Does addUndefinedData not need the tracing?

No that now not intended it should used anywhere an importName can exist. I'll see if I overlooked that.

@aheejin
Copy link
Member

aheejin commented Jun 21, 2024

  • Is the intention to print the import name, not the internal name in tracing?

Yes exactly. Imagine I have a symbols called __internal_fd_write which is imported as fd_write (i.e. the symbol __internal_fd_write has an import name of fd_write.

Prior to this change, if a user links with --trace-symbol=fd_write they would see no output when this import is used.

After this change they would see a report of fd_write being undefined/imported in a certain object file.

What if we have another symbol (which is may or may not be imported) whose name is fd_write? Then if a user specifies fd_write, does that point to the symbol with that name or this symbol whose import name is fd_write? Is this even possible?

@sbc100
Copy link
Collaborator Author

sbc100 commented Jun 21, 2024

  • Is the intention to print the import name, not the internal name in tracing?

Yes exactly. Imagine I have a symbols called __internal_fd_write which is imported as fd_write (i.e. the symbol __internal_fd_write has an import name of fd_write.
Prior to this change, if a user links with --trace-symbol=fd_write they would see no output when this import is used.
After this change they would see a report of fd_write being undefined/imported in a certain object file.

What if we have another symbol (which is may or may not be imported) whose name is fd_write? Then if a user specifies fd_write, does that point to the symbol with that name or this symbol whose import name is fd_write? Is this even possible?

Both would get printed. The point is to trace any and all occurrence of the symbol as they pertain the global symbol table.

@sbc100
Copy link
Collaborator Author

sbc100 commented Jun 21, 2024

  • traceImport seems to be added to addUndefined[Function|Global|Tag] but not addUndefinedData. Does addUndefinedData not need the tracing?

No that now not intended it should used anywhere an importName can exist. I'll see if I overlooked that.

I remember now, we don't see this for data symbols because data symbols cannot have importName set. See how there is no importName argument to the addUndefinedData method.

@@ -612,10 +623,11 @@ Symbol *SymbolTable::addUndefinedGlobal(StringRef name,
if (s->traced)
printTraceSymbolUndefined(name, file);

if (wasInserted)
if (wasInserted) {
traceImport(importName, file);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

traceImport sounds like telling the program to start tracing the import, which doesn't seem to be the case; it just prints the import.

By the way, I'm not very familiar with the code structure here so I might be missing something, but should we call printTraceSymbolUndefined and traceImport separately from several places? Can we do both in one function and preferably one place, when importName is not null?

@sbc100 sbc100 force-pushed the import_name_t_tracing branch from 5cbb61e to 08e16d6 Compare July 12, 2024 22:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants