Skip to content

[MLIR] Auto generate source location for python bindings #112923

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

grypp
Copy link
Member

@grypp grypp commented Oct 18, 2024

This PR introduces the get_source_location() function, which:

Traverses the call stack to locate the frame just before the one whose filename contains 'python_packages'. Returns a source location using the selected frame, useful for debugging or tracking the source of execution. The python_packages is the install directory for the python bindings, it gives the most meaningful source location.

The get_source_location() gets called for each OP building.

This PR introduces the `get_source_location()` function, which:

Traverses the call stack to locate the frame just before the one whose filename contains 'python_packages'.
Returns a source location using the selected frame, useful for debugging or tracking the source of execution. The `python_packages` is the install directory for the python bindings, it gives the most meaningful source location.

The `get_source_location()` gets called for each OP building.
@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir:python MLIR Python bindings mlir labels Oct 18, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-mlir-core

@llvm/pr-subscribers-mlir

Author: Guray Ozen (grypp)

Changes

This PR introduces the get_source_location() function, which:

Traverses the call stack to locate the frame just before the one whose filename contains 'python_packages'. Returns a source location using the selected frame, useful for debugging or tracking the source of execution. The python_packages is the install directory for the python bindings, it gives the most meaningful source location.

The get_source_location() gets called for each OP building.


Full diff: https://github.com/llvm/llvm-project/pull/112923.diff

3 Files Affected:

  • (modified) mlir/python/mlir/dialects/_ods_common.py (+33)
  • (modified) mlir/test/python/ir/location.py (+30)
  • (modified) mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp (+3)
diff --git a/mlir/python/mlir/dialects/_ods_common.py b/mlir/python/mlir/dialects/_ods_common.py
index d40d936cdc83d6..768a09fcc0e540 100644
--- a/mlir/python/mlir/dialects/_ods_common.py
+++ b/mlir/python/mlir/dialects/_ods_common.py
@@ -2,6 +2,7 @@
 #  See https://llvm.org/LICENSE.txt for license information.
 #  SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
+import inspect
 from typing import (
     List as _List,
     Optional as _Optional,
@@ -50,6 +51,38 @@ def segmented_accessor(elements, raw_segments, idx):
     return elements[start:end]
 
 
+
+
+def get_source_location():
+    """
+    Returns a source location from the frame just before the one whose
+    filename includes 'python_packages'.
+    """
+    frame = inspect.currentframe()
+    outer_frames = inspect.getouterframes(frame)
+    
+    # Traverse the frames in reverse order, excluding the current frame
+    selected_frame = None
+    for i in range(len(outer_frames)-1, -1, -1):
+        current_frame = outer_frames[i]
+        if 'python_packages' in current_frame.filename:
+            # Select the frame before the one containing 'python_packages'
+            selected_frame = outer_frames[i+1] if i-1 >= 0 else current_frame
+            break
+    if selected_frame is None:
+        # If no frame containing 'python_packages' is found, use the last frame
+        selected_frame = outer_frames[-1]
+    
+    # Create file location using the selected frame
+    file_loc = _cext.ir.Location.file(
+            selected_frame.filename,
+            selected_frame.lineno,
+            0
+        )
+    loc = _cext.ir.Location.name(selected_frame.function, childLoc = file_loc)
+    return loc
+
+
 def equally_sized_accessor(
     elements, n_simple, n_variadic, n_preceding_simple, n_preceding_variadic
 ):
diff --git a/mlir/test/python/ir/location.py b/mlir/test/python/ir/location.py
index f66d6c501dcf5c..6503b76f5c9c80 100644
--- a/mlir/test/python/ir/location.py
+++ b/mlir/test/python/ir/location.py
@@ -2,6 +2,7 @@
 
 import gc
 from mlir.ir import *
+from mlir.dialects import arith
 
 
 def run(f):
@@ -150,3 +151,32 @@ def testLocationCapsule():
 
 
 run(testLocationCapsule)
+
+
+
+# CHECK-LABEL: TEST: autoGeneratedLocation
+def autoGeneratedLocation():
+    def generator() :
+        return arith.ConstantOp(value=123, result=IntegerType.get_signless(32)) 
+    with Context() as ctx, Location.unknown():
+        module = Module.create()
+        with InsertionPoint(module.body):
+            a = arith.ConstantOp(value=42, result=IntegerType.get_signless(32))
+            b = arith.AddIOp(a, generator())
+        module.operation.print(enable_debug_info=True)
+
+#CHECK: module {
+#CHECK:  %{{.*}} = arith.constant 42 : i32 loc(#loc4)
+#CHECK:  %{{.*}} = arith.constant 123 : i32 loc(#loc5)
+#CHECK:  %{{.*}} = arith.addi %{{.*}}, %{{.*}} : i32 loc(#loc6)
+#CHECK: } loc(#loc)
+    
+#CHECK: #loc = loc(unknown)
+#CHECK: #loc1 = loc({{.*}}:164:0)
+#CHECK: #loc2 = loc({{.*}}:160:0)
+#CHECK: #loc3 = loc({{.*}}:165:0)
+#CHECK: #loc4 = loc("autoGeneratedLocation"(#loc1))
+#CHECK: #loc5 = loc("generator"(#loc2))
+#CHECK: #loc6 = loc("autoGeneratedLocation"(#loc3))
+
+run(autoGeneratedLocation)
\ No newline at end of file
diff --git a/mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp b/mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
index 0c5c936f5addee..bcea716748bd31 100644
--- a/mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
+++ b/mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
@@ -35,6 +35,7 @@ constexpr const char *fileHeader = R"Py(
 from ._ods_common import _cext as _ods_cext
 from ._ods_common import (
     equally_sized_accessor as _ods_equally_sized_accessor,
+    get_source_location as _get_source_location,
     get_default_loc_context as _ods_get_default_loc_context,
     get_op_result_or_op_results as _get_op_result_or_op_results,
     get_op_result_or_value as _get_op_result_or_value,
@@ -491,6 +492,8 @@ constexpr const char *initTemplate = R"Py(
     attributes = {{}
     regions = None
     {1}
+    if loc is None:
+      loc = _get_source_location()
     super().__init__(self.build_generic({2}))
 )Py";
 

Copy link

github-actions bot commented Oct 18, 2024

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r a3cd269fbebecb6971e216a9c29ad8933ad7b0fc...c18c3370b70cb8746c0a85cab8600bcefa3455a1 mlir/python/mlir/dialects/_ods_common.py mlir/test/python/ir/location.py
View the diff from darker here.
--- python/mlir/dialects/_ods_common.py	2024-10-18 15:39:51.000000 +0000
+++ python/mlir/dialects/_ods_common.py	2024-10-18 16:17:37.263737 +0000
@@ -163,11 +163,13 @@
     if isinstance(op, _cext.ir.OpView):
         op = op.operation
     return (
         list(get_op_results_or_values(op))
         if len(op.results) > 1
-        else get_op_result_or_value(op) if len(op.results) > 0 else op
+        else get_op_result_or_value(op)
+        if len(op.results) > 0
+        else op
     )
 
 
 ResultValueTypeTuple = _cext.ir.Operation, _cext.ir.OpView, _cext.ir.Value
 ResultValueT = _Union[ResultValueTypeTuple]

Copy link
Member

@matthias-springer matthias-springer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having mandatory location information was an important point in the design of MLIR. In C++, you cannot create an op without a location. We somewhat abandoned that principle in Python by making loc optional. It makes sense to me to infer the location in the Python interpreter when possible.

#CHECK: #loc5 = loc("generator"(#loc2))
#CHECK: #loc6 = loc("autoGeneratedLocation"(#loc3))

run(autoGeneratedLocation)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: new line

@makslevental
Copy link
Contributor

makslevental commented Oct 18, 2024

I have a version of this in mlir-python-extras.

It roughly looks the same so not many comments on impl but what I'll say is that I've found there's no way to make this robust because of how floppy both python control flow and the module system can be; did you know you can create objects that walk/talk like a module out of thin air? Those "modules" will have no filename attribute. Seems like an edge case I know but I suspect Jax people will not like this because of their whole stacked interpreters thing...

In addition the frame_info API is not stable and won't be so at minimum, in this PR you'll at least need the same sys.version_info.minor >= 11 check I have.

Testing is also quite annoying because of line numbers moving around due to imports.

Unforuntately, despite how nice it would be to have this feature, I would not recommend this get merged.

@grypp
Copy link
Member Author

grypp commented Oct 21, 2024

I completely agree with you that it’s nearly impossible to automatically set the location for every case. (PR also checks the python_packages directory and noticed that there may be another library there, but it doesn't include mlir/.)

That said, we could position this PR to benefit basic Python binding users. For advance users, such as with JAX or a custom DSL, they can (and should) set the location themselves. FWIW, the PR doesn't overwrite when loc != None. As @matthias-springer mentioned the location must be set in C++ side. But python implicitly set it to unknown.

I’d love to hear your final thoughts on this. I am fine either way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mlir:core MLIR Core Infrastructure mlir:python MLIR Python bindings mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants