Description
Given the following C++ input file:
#include <iostream>
struct A { virtual void f() = 0; };
struct A1 : public A { virtual void f() { std::cout << "A1::f()\n"; } };
struct A2 : public A { virtual void f() { std::cout << "A2::f()\n"; } };
struct A3 : public A { virtual void f() { std::cout << "A3::f()\n"; } };
struct B { virtual void f() = 0; };
struct B1 : public B { virtual void f() { std::cout << "B1::f()\n"; } };
struct B2 : public B { virtual void f() { std::cout << "B2::f()\n"; } };
struct B3 : public B { virtual void f() { std::cout << "B3::f()\n"; } };
struct C { virtual void f(int) = 0; };
struct C1 : public C { virtual void f(int) { std::cout << "C1::f(int)\n"; } };
void Af(A *a) { a->f(); }
void Bf(B *b) { b->f(); }
void Cf(C *c) { c->f(0); }
int main() {
    Af(new A1());
    Af(new A2());
    Bf(new B1());
    Bf(new B2());
    Cf(new C1());
    return 0;
}
And using the Attributor to generate a call graph:
clang++ example.cpp -c -o outputs/example.ll -flto -fvisibility=hidden -fwhole-program-vtables -fsanitize=cfi-icall -O0 -Xclang -disable-O0-optnone
opt outputs/example.ll -passes=attributor --attributor-assume-closed-world --attributor-print-call-graph -disable-output | c++filt | awk 'BEGIN{FS=OFS="\042"} { for (i=2; i<=NF; i+=2) { gsub(/</, "\\\<", $i); gsub(/>/, "\\\>", $i); }} 1' >outputs/example.callgraph.dot
dot outputs/example.callgraph.dot -Tsvg -o outputs/example.callgraph.svg
In the resulting call graph, Af(A*) is assumed to call (among others) A1::f(), A2::f(), B1::f(), B2::f(), and C1::f(int).
This can be made more precise in (at least) two ways:
- Calling a function through a pointer of an incompatible type is UB in at least C and C++.
C standard 6.3.2.3 paragraph 8:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.
C++ standard 7.6.1.3 paragraph 5:
Calling a function through an expression whose function type E is different from the function type F of the called function’s definition results in undefined behavior unless the type “pointer to F” can be converted to the type “pointer to E” via a function pointer conversion (7.3.14).
Hence, C1::f(int) can be removed from the list of possible callees.
- The vtable information encoded in the Type Metadata (e.g., emitted through the use of -flto -fvisibility=hidden -fwhole-program-vtables -fsanitize=cfi-icall) can be used to restrict the set to A1::f() and A2::f() only.
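The first refinement above can be illustrated with a toy model. This is a minimal sketch, not Attributor code: Callee, exampleCandidates, and filterBySignature are hypothetical names, and function types are written as plain strings in an LLVM-IR-like spelling (the implicit this pointer appears as a ptr parameter), which is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a potential callee: a demangled name plus its
// function type, spelled as a string for illustration only.
struct Callee {
    std::string name;
    std::string type;
};

// The callees listed above for the indirect call in Af(A*).
std::vector<Callee> exampleCandidates() {
    return {
        {"A1::f()", "void (ptr)"},
        {"A2::f()", "void (ptr)"},
        {"B1::f()", "void (ptr)"},
        {"B2::f()", "void (ptr)"},
        {"C1::f(int)", "void (ptr, i32)"},
    };
}

// Keep only the candidates whose function type equals the call site's type.
// Assumption: exact string equality stands in for type compatibility; the
// actual language rules may accept slightly more than this.
std::vector<std::string> filterBySignature(const std::vector<Callee>& candidates,
                                           const std::string& callSiteType) {
    assert(!callSiteType.empty() && "call site must have a function type");
    std::vector<std::string> kept;
    for (const Callee& c : candidates) {
        if (c.type == callSiteType) {
            kept.push_back(c.name);
        }
    }
    return kept;
}
```

Calling filterBySignature(exampleCandidates(), "void (ptr)") drops C1::f(int) but keeps both the A and the B implementations, since they share a function type; separating those requires the vtable information from the second refinement.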
It seems that whole-program devirtualization (WPD) uses some of this information: e.g., the call c->f(0) in Cf is converted to a direct call to C1::f(int), as that is the only possible callee, but the Attributor does not.
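The vtable-based refinement, and the single-callee case that WPD already handles, can be sketched with a similar toy model. Everything here is an assumption made for illustration: each vtable carries the set of type identifiers it is compatible with (loosely mirroring what type metadata records for a class and its bases) and a list of slots. VTable, exampleVTables, calleesForTypeId, hasSingleCallee, and the identifier strings are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a vtable annotated with type identifiers.
struct VTable {
    std::set<std::string> typeIds;   // the class itself and its bases
    std::vector<std::string> slots;  // virtual function implementations
};

// Vtables for the classes that are instantiated in the example program.
std::vector<VTable> exampleVTables() {
    return {
        {{"A", "A1"}, {"A1::f()"}},
        {{"A", "A2"}, {"A2::f()"}},
        {{"B", "B1"}, {"B1::f()"}},
        {{"B", "B2"}, {"B2::f()"}},
        {{"C", "C1"}, {"C1::f(int)"}},
    };
}

// Collect the implementations found in the given slot of every vtable that
// is compatible with the static type used at the call site.
std::vector<std::string> calleesForTypeId(const std::vector<VTable>& vtables,
                                          const std::string& typeId,
                                          std::size_t slot) {
    assert(!typeId.empty() && "call site must name a type identifier");
    std::vector<std::string> callees;
    for (const VTable& vt : vtables) {
        if (vt.typeIds.count(typeId) != 0 && slot < vt.slots.size()) {
            callees.push_back(vt.slots[slot]);
        }
    }
    return callees;
}

// A virtual call can be turned into a direct call when exactly one
// implementation remains.
bool hasSingleCallee(const std::vector<std::string>& callees) {
    return callees.size() == 1;
}
```

With this model, calleesForTypeId(exampleVTables(), "A", 0) yields only A1::f() and A2::f(), while the query for "C" yields the single callee C1::f(int), which corresponds to the direct call described above.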
From what I can tell from
llvm-project/llvm/lib/Transforms/IPO/Attributor.cpp, lines 1072 to 1079 in db3bc49
llvm-project/llvm/lib/Transforms/IPO/AttributorAttributes.cpp, lines 12227 to 12232 in db3bc49
Are there any plans, or interest in contributions, to make use of the function signature and Type Metadata in the Attributor call graph as well?