Merged
Changes from 4 commits
21 changes: 18 additions & 3 deletions pytensor/tensor/rewriting/math.py
@@ -1905,12 +1905,27 @@ def local_reciprocal_canon(fgraph, node):
 @register_canonicalize
 @node_rewriter([pt_pow])
 def local_pow_canonicalize(fgraph, node):
-    cst = get_underlying_scalar_constant_value(
+    """
+    Rewrites for exponential functions with straight-forward simplifications:
+    1. x ** 0 -> 1
+    2. x ** 1 -> x
+    3. 1 ** x -> 1
+
+    In all cases, the shape of the output is the result of broadcasting the shapes of the inputs.
+    """
+
+    cst_base = get_underlying_scalar_constant_value(
+        node.inputs[0], only_process_constants=True, raise_not_constant=False
+    )
+    if cst_base == 1:
+        return [broadcast_arrays(*node.inputs)[0].astype(node.outputs[0].dtype)]
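
For readers unfamiliar with the broadcasting rule the docstring refers to, here is a minimal pure-Python sketch of how a broadcast output shape follows from two input shapes. The helper name `broadcast_shape` is hypothetical and not part of PyTensor; it assumes the usual NumPy-style rule (align shapes from the right, dimensions of length 1 stretch).

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the NumPy-style broadcast of two static shapes (hypothetical helper)."""
    result = []
    # Walk both shapes from the trailing dimension; missing dims count as 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} do not broadcast")
    return tuple(reversed(result))


# 1 ** x with a scalar base and a (4, 5, 6) exponent: the result takes x's shape.
print(broadcast_shape((), (4, 5, 6)))   # (4, 5, 6)
# A (3, 1) base against a (1, 4) exponent broadcasts to (3, 4).
print(broadcast_shape((3, 1), (1, 4)))  # (3, 4)
```

This is why replacing `1 ** x` with a bare constant 1 would not be enough: the replacement still has to carry the broadcast shape of both inputs.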
Member

Could this be cleaned up a bit by storing inp_idx in the three branches and doing something like the following?

inp_idx = None
if case1:
    inp_idx = 1
elif case2:
    inp_idx = 0
...

if inp_idx is None:
    return None

new_out = broadcast_arrays(*node.inputs)[inp_idx]

if new_out.dtype != node.out.dtype:
    new_out = cast(...)

return [new_out]

Member

My concern about alloc_like applies to the old code as well.

Member Author

I don't think it can be as clean as this, because in the x ** 0 case we're not using either input value in the output. But I tried to make it look more like this.
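
To make the trade-off in this exchange concrete, here is a rough pure-Python sketch of the branch selection being discussed. It is not the PR's code: `pick_rewrite` and its return convention are made up for illustration, and `None` is a stand-in for "this input is not a constant" (what the real helper returns in that case is not shown in the diff). It shows why a single `inp_idx` does not cover all three cases: `x ** 0` needs a constant 1 rather than either input.

```python
def pick_rewrite(cst_base, cst_exponent):
    """Decide which simplification applies (illustrative only).

    Returns ("input", idx) when the result is input `idx` broadcast to the
    output shape, ("ones", None) when it is a constant 1 broadcast to the
    output shape, or None when no rewrite applies.
    """
    if cst_base == 1:
        # 1 ** x -> 1: the base itself (input 0) already holds the answer.
        return ("input", 0)
    if cst_exponent == 0:
        # x ** 0 -> 1: neither input holds the value, so no inp_idx fits.
        return ("ones", None)
    if cst_exponent == 1:
        # x ** 1 -> x: the base (input 0) is the answer.
        return ("input", 0)
    return None


print(pick_rewrite(1, None))    # 1 ** x
print(pick_rewrite(None, 0))    # x ** 0
print(pick_rewrite(None, 1))    # x ** 1
print(pick_rewrite(None, 2.5))  # no rewrite applies
```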


+    cst_exponent = get_underlying_scalar_constant_value(
         node.inputs[1], only_process_constants=True, raise_not_constant=False
     )
-    if cst == 0:
+    if cst_exponent == 0:
         return [alloc_like(1, node.outputs[0], fgraph)]
-    if cst == 1:
+    if cst_exponent == 1:
         return [alloc_like(node.inputs[0], node.outputs[0], fgraph)]
Member

This could cause infinite recursion if ShapeOpt is not running, because it will fall back to alloc(1, *pow(1, x).shape).

Could you use pt.broadcast_arrays(*node.inputs)[0] instead?
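
The recursion concern can be illustrated with a toy rewriter. This is a deliberately simplified model, not PyTensor's machinery: expressions are nested tuples, both rewrite functions are hypothetical, and "the shape of the original node" is modelled by embedding the original expression inside the replacement. If the replacement still refers to `pow(1, x)`, as `alloc(1, *pow(1, x).shape)` would, the rule can match again on the next pass; building the result from the inputs removes the `pow` node entirely.

```python
def contains_pow(expr):
    """True if a ("pow", ...) node appears anywhere in the nested-tuple expression."""
    if not isinstance(expr, tuple):
        return False
    return expr[0] == "pow" or any(contains_pow(arg) for arg in expr[1:])


def rewrite_via_output_shape(expr):
    """Replace pow(1, x) with alloc(1, shape(pow(1, x))): keeps the old node alive."""
    if isinstance(expr, tuple) and expr[0] == "pow" and expr[1] == 1:
        return ("alloc", 1, ("shape", expr))
    return expr


def rewrite_via_inputs(expr):
    """Replace pow(1, x) with the base broadcast against the inputs: no pow left."""
    if isinstance(expr, tuple) and expr[0] == "pow" and expr[1] == 1:
        return ("broadcast_first", expr[1], expr[2])
    return expr


graph = ("pow", 1, "x")
print(contains_pow(rewrite_via_output_shape(graph)))  # True: pow survives inside shape(...)
print(contains_pow(rewrite_via_inputs(graph)))        # False: pow is gone
```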

Member Author

Sure, but does that mean we should also change local_canonicalize_pow? I copied the return from there.

Member

Lines to that rewrite?

Member

Found it, yes. Why don't you combine your changes with that rewrite?

Member Author

Can you check how it looks now? I had to set the dtype to that of the output as well; I'm not sure if there's a better way.



19 changes: 19 additions & 0 deletions tests/tensor/rewriting/test_math.py
@@ -4571,3 +4571,22 @@ def test_log_kv_stabilization():
         out.eval({x: 1000.0}, mode=mode),
         -1003.2180912984705,
     )
+
+
+@pytest.mark.parametrize("shape", [(), (4, 5, 6)], ids=["scalar", "tensor"])
+def test_pow_1_rewrite(shape):
+    x = pt.tensor("x", shape=shape)
+    z = 1**x
+
+    assert isinstance(z.owner.op, Elemwise) and isinstance(
+        z.owner.op.scalar_op, ps.basic.Pow
+    )
+
+    f = pytensor.function([x], z)
+    assert not any(
+        isinstance(node.op, Elemwise) and isinstance(node.op.scalar_op, ps.basic.Pow)
+        for node in f.maker.fgraph.toposort()
+    )
+
+    x_val = np.random.random(shape).astype(config.floatX)
+    np.testing.assert_allclose(z.eval({x: x_val}), f(x_val))