Open
Description
Bugzilla Link | 31521 |
Version | trunk |
OS | All |
CC | @anarazel,@pitrou,@weliveindetail |
Extended Description
We need to be able to profile the performance of JIT'd code. This bug should serve as an umbrella for ORC clients and developers to discuss ORC profiling support.
Activity
lhames commented on Jan 3, 2017
ExecutionEngine and MCJIT currently support OProfile and Intel profiling via the JITEventListener interface - it should be easy to adapt that code to work with ObjectLinkingLayer's callbacks (NotifyLoaded and NotifyFinalized), allowing easy migration for existing clients. If anyone wants to dive on this please feel free (either file a new bug blocking this, or just make notes inline here). Otherwise I'll get to it when I can.
anarazel commented on Mar 26, 2018
I've used out-of-tree patches for this for a while. I'm not quite sure exactly what changes you had in mind, but I'm going to open a review with what I have, and then we can go from there?
(waiting for a recompile to open a phab review)
anarazel commented on Mar 26, 2018
See https://reviews.llvm.org/D44890 and also https://reviews.llvm.org/D44892
weliveindetail commented on May 20, 2019
This comes up again for JITLink. I had a quick-fix, but due to intermediate changes it didn't fit in anymore: https://reviews.llvm.org/D61065
Correct me if I am wrong, but I think the plan is to:
Maybe the functionality can be encapsulated in a utility class.
lhames commented on May 20, 2019
Yep. The aim is to be able to write profiling support as an ObjectLinkingLayer::Plugin subclass. Ditto for debugging support.
weliveindetail commented on May 21, 2019
Ok nice, is LocalEHFrameRegistrationPlugin a good role model here for subclassing ObjectLinkingLayer::Plugin?
https://github.com/llvm/llvm-project/blob/a2fbe2bc/llvm/include/llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h#L142
anarazel commented on May 21, 2019
Does this mean that you currently expect JITEventListener based profiling support to be broken by your changes? Or just that it'd be more efficient to do so via ObjectLinkingLayer::Plugin?
weliveindetail commented on May 21, 2019
It still works with RuntimeDyld, using Orc's RTDyldObjectLinkingLayer.
It does NOT YET work with the new JITLink, using Orc's new ObjectLinkingLayer.
lhames commented on May 21, 2019
Yes it is. The plugin API is new though, so it might not provide everything you need to know (e.g. you only get access to the AtomGraph so far, not the underlying buffer/object). There's plenty of scope for us to tweak the API at the moment, since there are few clients.
lhames commented on May 21, 2019
As Stefan mentioned: RuntimeDyld and RTDyldObjectLinkingLayer continue to be supported, and they will continue to support JITEventListeners. RuntimeDyld/RTDyldObjectLinkingLayer will not be replaced until JITLink and ObjectLinkingLayer can provide a truly viable alternative.
As for the new listener/plugin interfaces: JITLink allows rich interaction with the linker's data structures, which RuntimeDyld did not. I don't know that it will make plugins more efficient, but I think it will allow a wider range of plugins, and allow JITLink to generate more efficient code, while keeping the plugin mechanism efficient.
My hope is that it will be possible to write a JITEventListener wrapper for backwards compatibility. The big caveat will be dead-stripping/atom layout: JITLink deletes unreachable code and reorganizes section contents. That will break any JITEventListener that expects the relocated section content to line up exactly with the section content in the object file. We might be able to address this by making dead-stripping and layout optional (or pluggable, with variants provided that mimic the object layout).
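To illustrate the layout caveat, here is a minimal sketch of why a "section base plus object-file offset" calculation stops working once blocks are reordered or dead-stripped. The `PlacedBlock` struct and both functions are hypothetical stand-ins invented for this example; they are not JITLink or JITEventListener types, and the real data structures differ.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one linked block (NOT a JITLink type).
struct PlacedBlock {
  std::uint64_t ObjOffset;                // offset inside the object-file section
  std::uint64_t Size;                     // size of the block in bytes
  std::optional<std::uint64_t> FinalAddr; // std::nullopt if dead-stripped
};

// What a listener effectively assumes when it expects the relocated section
// to mirror the object-file section byte for byte.
std::uint64_t naiveAddress(std::uint64_t SectionBase, std::uint64_t ObjOffset) {
  return SectionBase + ObjOffset;
}

// Layout-aware lookup: find the block covering ObjOffset and translate
// through that block's final address. Returns std::nullopt when the offset
// falls in a dead-stripped block or in no block at all.
std::optional<std::uint64_t>
mappedAddress(const std::vector<PlacedBlock> &Blocks, std::uint64_t ObjOffset) {
  for (const PlacedBlock &B : Blocks) {
    assert(B.Size > 0 && "blocks are assumed to be non-empty in this sketch");
    if (ObjOffset >= B.ObjOffset && ObjOffset - B.ObjOffset < B.Size) {
      if (!B.FinalAddr)
        return std::nullopt;
      return *B.FinalAddr + (ObjOffset - B.ObjOffset);
    }
  }
  return std::nullopt;
}
```

With three 16-byte blocks where the middle one is stripped and the other two are swapped, `naiveAddress` and `mappedAddress` disagree for every offset, which is the mismatch a compatibility wrapper would have to hide or avoid.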
lhames commented on Mar 15, 2023
Related issue: @lucasreis1 has been seeing some issues with the existing perf support: #58174.
lhames commented on Mar 15, 2023
@pchintalapudi is looking at making ELF debug sections available in the LinkGraph (they were previously skipped during ELF LinkGraph construction). That's step 1 here, since the perf event listeners all need to read debug info.
The next question is what metadata do we need, and in what form?
E.g. Should we just dump the original object file to disk? We could make the original object file available as a section in the graph to facilitate that, but what would we do for LinkGraphs created directly via the LinkGraph APIs? Or should the profiling support plugin synthesize a new object file from the sections in the Graph? That's my initial preference, but I wonder how much work the object synthesizer will be.
Finally there's the question of where the metadata massaging should happen (controller or executor). The registration (and deregistration) itself should definitely happen on the executor side, so that will need to be implemented in the ORC runtime.
llvmbot commented on Mar 15, 2023
@llvm/issue-subscribers-jitlink
vchuravy commented on Mar 16, 2023
x-ref: #60883
Once we have settled the perf side of things, we will have to do the same for VTune/ITTAPI (cc: @ekovanova & @abrown)
vchuravy commented on Mar 16, 2023
At least for perf, the tools all look at the mapped files of the process being profiled (and the VMA must be valid in that process), so my understanding is that the write to the `mmap` file needs to happen on the executor side.
llvmbot commented on Jun 21, 2023
@llvm/issue-subscribers-julialang
vchuravy commented on Jun 21, 2023
The PR for initial perf integration is https://reviews.llvm.org/D146169
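For context on the executor-side `mmap` mentioned earlier in the thread: perf's jitdump mechanism expects the profiled process itself to map the dump file as executable, so that `perf record` logs an mmap event it can later use to locate the file. The following is a minimal Linux-only sketch of that marker step, under the assumption that it runs inside the executor process. `JitDumpMarker`, `openJitDumpMarker`, and `closeJitDumpMarker` are hypothetical names for this example and are not ORC runtime or LLVM APIs; the patch linked above may do this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper type (NOT an LLVM or ORC runtime API).
struct JitDumpMarker {
  int Fd = -1;           // descriptor used for appending jitdump records
  void *Addr = nullptr;  // address of the marker mapping
  std::size_t Size = 0;  // length of the marker mapping
  bool valid() const { return Addr != nullptr; }
};

// Opens (or creates) the dump file and maps one page of it with PROT_EXEC.
// The mapping is never executed; it only exists so that the profiler records
// an mmap event for this file in the profiled process.
JitDumpMarker openJitDumpMarker(const char *Path) {
  JitDumpMarker M;
  long PageSize = sysconf(_SC_PAGESIZE);
  if (PageSize <= 0)
    return M;
  M.Fd = open(Path, O_CREAT | O_TRUNC | O_RDWR, 0666);
  if (M.Fd < 0)
    return M;
  void *Addr = mmap(nullptr, static_cast<std::size_t>(PageSize),
                    PROT_READ | PROT_EXEC, MAP_PRIVATE, M.Fd, 0);
  if (Addr == MAP_FAILED) {
    close(M.Fd);
    M.Fd = -1;
    return M;
  }
  M.Addr = Addr;
  M.Size = static_cast<std::size_t>(PageSize);
  return M;
}

// Unmaps the marker and closes the file; safe to call on an invalid marker.
void closeJitDumpMarker(JitDumpMarker &M) {
  if (M.Addr) {
    int Rc = munmap(M.Addr, M.Size);
    assert(Rc == 0 && "munmap of marker page failed");
    (void)Rc;
  }
  if (M.Fd >= 0)
    close(M.Fd);
  M = JitDumpMarker();
}
```

Because the mapping has to appear in the address space of the process being profiled, a controller running in a different process cannot perform this step on the executor's behalf. Note that the `PROT_EXEC` mapping can fail on filesystems mounted `noexec`, in which case the sketch returns an invalid marker.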
mgood7123 commented on Oct 9, 2023
for debugging or for profiling?
compilers including clang can emit optimized code that can be profiled relatively fine, although `???` symbols will always appear in various places due to optimization and lack of debug info
vchuravy commented on Mar 22, 2024
VTune support landed in #83957