Description
Describe the issue:
The Max op, which is a subclass of CAReduce, fails for 64-bit unsigned integers. This is also evident in PR #731 and its test. The exact value at which uint64 starts failing is 9223372036854775.
The error can also be reproduced with the following example:
Reproducible code example:
import pytensor.tensor as pt
import numpy as np
dtype="uint64"
n = pt.vector("n", shape=(None,), dtype=dtype)
test_n = np.array([0, 9223372036854775], dtype)
assert pt.max(n).dtype == dtype
print(pt.max(n).eval({n: test_n})) # 9223372036854776
print(test_n.max()) # 9223372036854775
assert pt.max(n).eval({n: test_n}) == test_n.max() # Fails, returns 1 larger
Error message:
No response
PyTensor version information:
Python 3.11.8
PyTensor 2.20.0
Context for the issue:
No response
Activity
aseyboldt commented on May 14, 2024
Consider me intrigued...
Sounds like somewhere there is a conversion to float64, because that exact number lies above 2**53 and cannot be represented exactly by a float64; it rounds up by one:
np.float64(9223372036854775).astype(np.int64)
is our magic 9223372036854776.
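A quick, self-contained way to see that round-trip with plain NumPy (nothing PyTensor-specific):

import numpy as np

x = np.uint64(9223372036854775)
f = np.float64(x)       # rounds to the nearest representable double
print(np.uint64(f))     # 9223372036854776, one larger than x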
aseyboldt commented on May 14, 2024
This seems to have been around for a long time.
The maximum and minimum are implemented as a CAReduce reduction (pytensor.scalar.basic.ScalarMaximum). There seem to be two problems with this implementation: both scalar Ops specify inf and -inf as identity elements, but integer types do not even have such an identity. The direct source of the bug, though, seems to be the generated C code: it tries to ignore nans, but by doing so it implicitly converts the intermediate values to floats.
I guess it might make sense to split the ScalarOps: One for floats (where we have an identity) and one for ints (where we don't). And then use the nan-check only for floats.
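As a rough illustration (a Python emulation of the effect, not the actual generated C kernel), computing the running maximum through float64 with a NaN guard reproduces the off-by-one:

import numpy as np

x = np.array([0, 9223372036854775], dtype="uint64")

def nan_ignoring_max(values):
    acc = -np.inf              # the declared identity; only meaningful for floats
    for v in values:
        fv = np.float64(v)     # the NaN check drags every candidate through float64
        if not np.isnan(fv) and fv > acc:
            acc = fv
    return acc

print(int(nan_ignoring_max(x)))   # 9223372036854776 -- off by one
print(int(x.max()))               # 9223372036854775 -- pure integer max is exact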
Edit: this was introduced here: aesara-devs/aesara#297
ricardoV94 commented on May 14, 2024
The conversion story makes sense. Regarding infty, the code actually initializes the accumulator to zero when the dtype is an (unsigned) integer.
The number was not a hard cutoff: it starts working again with larger values and then fails again, in case that helps.
And yes, this bug has likely been around for a while, but it was hidden when max/min could still be constant folded, which triggered the C implementation of MaxAndArgmax instead; that implementation just calls numpy C code and handles this correctly.
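That non-monotonic behaviour is consistent with float64's integer spacing: above 2**53 only every other integer is representable (then every 4th, every 8th, and so on), so the round-trip error appears and disappears as the value grows. A quick check:

import numpy as np

for v in (9223372036854775, 9223372036854776, 9223372036854777):
    print(v, int(np.float64(v)))
# 9223372036854775 9223372036854776   (odd, rounds up)
# 9223372036854776 9223372036854776   (even, exact)
# 9223372036854777 9223372036854776   (odd, rounds down)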
Split up TestMinMax::test_uint