Named tensors with typed spaces #477

michaelosthege · 2023-10-15T14:37:37Z

I took the branch from #407 and added a pytensor.xtensor.spaces module that defines types to distinguish between "unordered spaces" (BaseSpace) and "ordered spaces" (OrderedSpace).

BaseSpace and OrderedSpace are similar to sets & tuples, but do not implement some operations that would mess up interpreting them as dims.

One idea here is to apply the mathematical operations not only to the variables, but also to their spaces.

For example:

# Addition between two variables uses bilateral broadcasting
Space(["a", "b"]) + Space({"c"}) -> Space({"a", "b", "c"})

This matches broadcasting in xarray:

a = xarray.DataArray([[1,2,3]], dims=["a", "b"])
b = xarray.DataArray([1,2,3,4], dims=["c"])
assert set((b + a).dims) == {"a", "b", "c"}

However, xarray.DataArray.dims are tuples, so the dims of a sum depend on operand order, and addition of xarray.DataArray variables is not commutative with respect to dims:

assert (a + b).dims == (b + a).dims  # AssertionError

In contrast, with this PR the resulting dims become an unordered space, and the resulting XTensorType instances are equal:

xa = ptx.as_xtensor(a)
xb = ptx.as_xtensor(b)
xc = xa + xb

xa.type  # XTensorType(int32, OrderedSpace('a', 'b'), (1, 3))
xb.type  # XTensorType(int32, OrderedSpace('c'), (4,))
xc.type  # XTensorType(float64, Space{'c', 'a', 'b'}, (None, None, None))

assert (xa + xb).type == (xb + xa).type
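As an illustration only, the order dependence can be mimicked with plain tuples and frozensets. The helpers `ordered_union` and `unordered_union` are hypothetical names, not part of xarray or of this PR, and the tuple rule (keep first-seen order) is an assumption chosen to reproduce the xarray behaviour shown above.

```python
from typing import FrozenSet, Tuple


def ordered_union(left: Tuple[str, ...], right: Tuple[str, ...]) -> Tuple[str, ...]:
    """Toy model of tuple-valued dims: keep dims in first-seen order (assumption)."""
    return left + tuple(d for d in right if d not in left)


def unordered_union(left: Tuple[str, ...], right: Tuple[str, ...]) -> FrozenSet[str]:
    """Toy model of an unordered space: a plain set union."""
    return frozenset(left) | frozenset(right)


a_dims = ("a", "b")
b_dims = ("c",)

# With tuples the result depends on operand order ...
ordered_ab = ordered_union(a_dims, b_dims)
ordered_ba = ordered_union(b_dims, a_dims)

# ... while the set-based result does not.
unordered_ab = unordered_union(a_dims, b_dims)
unordered_ba = unordered_union(b_dims, a_dims)
```

Both orderings contain the same dims; only the tuple representation makes them compare unequal.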

This was basic math, but we could introduce XOps with XOp.infer_space methods that can implement broadcasting rules for any operation:

class XOp(Op):
    def infer_space(self, fgraph, node, input_spaces) -> BaseSpace:
        raise NotImplementedError()


class SumOverTime(XOp):
    def infer_space(self, fgraph, node, input_spaces) -> BaseSpace:
        [s] = input_spaces
        if "time" not in s:
            raise ValueError("No time dim to sum over.")
        return Space(s - {"time"})
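To try the idea without PyTensor, here is a minimal stand-alone sketch. It assumes spaces can be modelled as frozensets and drops the `fgraph` and `node` arguments; `ToyXOp`, `ToyAdd` and `ToySumOver` are hypothetical stand-ins, not classes from this PR.

```python
from typing import FrozenSet, Sequence


class ToyXOp:
    """Stand-in for an XOp: maps input spaces to an output space."""

    def infer_space(self, input_spaces: Sequence[FrozenSet[str]]) -> FrozenSet[str]:
        raise NotImplementedError()


class ToyAdd(ToyXOp):
    """Elementwise op: the output space is the union of all input spaces."""

    def infer_space(self, input_spaces: Sequence[FrozenSet[str]]) -> FrozenSet[str]:
        return frozenset().union(*input_spaces)


class ToySumOver(ToyXOp):
    """Reduction op: removes one named dim from the single input space."""

    def __init__(self, dim: str) -> None:
        self.dim = dim

    def infer_space(self, input_spaces: Sequence[FrozenSet[str]]) -> FrozenSet[str]:
        [s] = input_spaces  # exactly one input is expected
        if self.dim not in s:
            raise ValueError(f"No {self.dim!r} dim to sum over.")
        return s - {self.dim}


added = ToyAdd().infer_space([frozenset({"a", "b"}), frozenset({"c"})])
summed = ToySumOver("time").infer_space([frozenset({"time", "chain"})])
```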

Similarly, this should allow us to implement dot products requiring OrderedSpace inputs to produce an OrderedSpace output, or a specify_dimorder XOp that orders a BaseSpace into an OrderedSpace.
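A rough sketch of what these two operations could look like at the level of spaces, with tuples standing in for OrderedSpace and frozensets for BaseSpace. Both function bodies are assumptions for illustration; in particular, contracting the last dim of the left input with the first dim of the right input is just one possible dot-product convention, not something this PR specifies.

```python
from typing import FrozenSet, Sequence, Tuple


def specify_dimorder(space: FrozenSet[str], order: Sequence[str]) -> Tuple[str, ...]:
    """Turn an unordered space into an ordered one, given an explicit order."""
    if len(set(order)) != len(order) or set(order) != set(space):
        raise ValueError("order must list every dim of the space exactly once")
    return tuple(order)


def dot_space(left: Tuple[str, ...], right: Tuple[str, ...]) -> Tuple[str, ...]:
    """Output space of a matrix-style product of two ordered spaces."""
    if not isinstance(left, tuple) or not isinstance(right, tuple):
        raise TypeError("dot requires ordered spaces")
    if not left or not right or left[-1] != right[0]:
        raise ValueError("last dim of left must match first dim of right")
    return left[:-1] + right[1:]


ordered = specify_dimorder(frozenset({"a", "b"}), ["b", "a"])
product = dot_space(("i", "k"), ("k", "j"))
```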

Looking at the (None, None, None) shape from the code block above, I wonder if we should type XTensorType.shape as a Mapping[DimLike, int | ScalarVariable] 🤔

(cherry picked from commit 5b0c472)

ricardoV94 · 2023-10-16T09:15:09Z

> Looking at the (None, None, None) shape from the code block above, I wonder if we should type XTensorType.shape as a Mapping[DimLike, int | ScalarVariable]

PyTensor variables shouldn't show up in the attributes of Variable types.

michaelosthege · 2023-10-19T20:48:10Z

I had a few more thoughts on this and found that, even for unordered spaces, we need to know which index each dimension has in the underlying array. With that information one can then index into shape as well.

Now the question is where this information should be kept. Either the XTensorType keeps it, ~~or we don't actually make the BaseSpace unordered~~ no, then space math would not work.

Maybe it's enough to keep a is_ordered: bool and a dims: tuple? A .space property could create the corresponding BaseSpace/OrderedSpace if needed 🤔
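A minimal sketch of that last option, assuming a frozen dataclass as a stand-in for the type. `ToyXType`, `axis_of` and `length_of` are hypothetical names, and tuple/frozenset stand in for OrderedSpace/BaseSpace.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple, Union


@dataclass(frozen=True)
class ToyXType:
    dims: Tuple[str, ...]              # storage order in the underlying array
    shape: Tuple[Optional[int], ...]   # static lengths, None if unknown
    is_ordered: bool = False

    @property
    def space(self) -> Union[Tuple[str, ...], FrozenSet[str]]:
        """Ordered types expose a tuple, unordered ones a frozenset."""
        return self.dims if self.is_ordered else frozenset(self.dims)

    def axis_of(self, dim: str) -> int:
        """Index of a named dim in the underlying array."""
        return self.dims.index(dim)

    def length_of(self, dim: str) -> Optional[int]:
        """Look up the static length of a dim by name."""
        return self.shape[self.axis_of(dim)]


xc_type = ToyXType(dims=("c", "a", "b"), shape=(4, 1, 3))
```

Note that the generated dataclass equality still compares the dims tuples, so two unordered types that store the same dims in a different order would compare unequal unless `__eq__` is customised.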

ferrine · 2023-12-12T10:00:56Z

pytensor/xtensor/basic.py

+class XTensorFromTensor(Op):
+    __props__ = ("dims",)
+
+    def __init__(self, dims: Iterable[DimLike]):


dims should be Sequence since Iterable can exhaust...

What do you mean by "exhaust" here?
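One likely reading of "exhaust", shown as a small stand-alone sketch: a one-shot iterable such as a generator can only be consumed once, so a second pass over the same object yields nothing. `make_dims` is a hypothetical helper used only for this demonstration.

```python
from typing import Iterator


def make_dims() -> Iterator[str]:
    """A generator is an Iterable, but it can only be consumed once."""
    yield "a"
    yield "b"


dims_iter = make_dims()
first_pass = list(dims_iter)   # consumes the generator
second_pass = list(dims_iter)  # nothing left to iterate over

# Materialising into a tuple (a Sequence) makes repeated iteration safe.
dims_seq = tuple(make_dims())
repeat_one = list(dims_seq)
repeat_two = list(dims_seq)
```

Accepting a Sequence, or converting the argument with `tuple(dims)` in `__init__`, avoids this.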

ferrine · 2023-12-12T10:01:24Z

pytensor/xtensor/basic.py

+class XElemwise(Op):
+    __props__ = ("scalar_op",)
+
+    def __init__(self, scalar_op):


missing type hints

ferrine · 2023-12-12T10:03:12Z

pytensor/xtensor/spaces.py

+        ...
+
+
+class Dim(DimLike):


As far as I remember, it is bad practice for a class to inherit from a protocol.

Dim can just be a dataclass instead.

According to the explanation in the PEP I would disagree: https://peps.python.org/pep-0544/#explicitly-declaring-implementation

By inheriting the protocol, we enable type checkers to warn about incomplete/incorrect implementations.
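A small sketch of that argument, assuming a hypothetical protocol with a single abstract method `label`; the real DimLike in this PR may look different. Static type checkers flag the incomplete subclass, and because the member is declared abstract, instantiating it also fails at runtime.

```python
from abc import abstractmethod
from typing import Protocol


class DimLike(Protocol):
    """Hypothetical protocol, for illustration only."""

    @abstractmethod
    def label(self) -> str:
        ...


class Dim(DimLike):
    """Explicit subclass that implements the protocol member."""

    def __init__(self, name: str) -> None:
        self.name = name

    def label(self) -> str:
        return self.name


class BrokenDim(DimLike):
    """Explicit subclass that forgets to implement label()."""

    def __init__(self, name: str) -> None:
        self.name = name


def try_instantiate(cls: type) -> str:
    """Report whether a class can be instantiated."""
    try:
        cls("time")
    except TypeError:
        return "TypeError"
    return "ok"


complete = try_instantiate(Dim)
incomplete = try_instantiate(BrokenDim)
```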

ricardoV94 and others added 5 commits October 9, 2023 22:05

POC named tensors

7c5c9a3

(cherry picked from commit 5b0c472)

WIP: Type dims and spaces

05520fd

Add spaces tests and fix typing

c7bafc6

Fix various typing issues in xtensor module

bb3e38a

Make XTensor use spaces

165a090

michaelosthege added the request discussion label Oct 15, 2023
