Fix bug in local_elemwise_alloc and add tests #767

Merged

Conversation

@ricardoV94 (Contributor) commented Jan 19, 2022

The rewrite would return a new graph identical to the original, resulting in an endless rewrite cycle:

import aesara
import aesara.tensor as at

x = at.matrix("x")
y = at.matrix("y")
z = at.mul(at.alloc(x, 1, 5), y)

with aesara.config.change_flags(optimizer_verbose=True):
    f = aesara.function([x, y], z, mode="FAST_RUN")
optimizer: rewrite local_elemwise_alloc replaces Elemwise{mul,no_inplace}.0 with Elemwise{mul,no_inplace}.0
optimizer: rewrite local_elemwise_alloc replaces Elemwise{mul,no_inplace}.0 with Elemwise{mul,no_inplace}.0
optimizer: rewrite local_elemwise_alloc replaces Elemwise{mul,no_inplace}.0 with Elemwise{mul,no_inplace}.0
optimizer: rewrite local_elemwise_alloc replaces Elemwise{mul,no_inplace}.0 with Elemwise{mul,no_inplace}.0
optimizer: rewrite local_elemwise_alloc replaces Elemwise{mul,no_inplace}.0 with Elemwise{mul,no_inplace}.0
optimizer: rewrite local_elemwise_alloc replaces Elemwise{mul,no_inplace}.0 with Elemwise{mul,no_inplace}.0
optimizer: rewrite local_elemwise_alloc replaces Elemwise{mul,no_inplace}.0 with Elemwise{mul,no_inplace}.0
optimizer: rewrite local_elemwise_alloc replaces Elemwise{mul,no_inplace}.0 with Elemwise{mul,no_inplace}.0
...

There are still flaws in the rewrite logic, as indicated by the newly triggered assertion.

First showed up in pymc-devs/pymc#5369

TODO:

  • Rewrite the docstrings to make clear what is going on in this rewrite
  • Make sure the tests cover all rewrite cases
  • Consider removing the special DimShuffle-of-Alloc logic


codecov bot commented Jan 19, 2022

Codecov Report

Merging #767 (7e3792c) into main (240827c) will increase coverage by 0.00%.
The diff coverage is 95.00%.


@@           Coverage Diff           @@
##             main     #767   +/-   ##
=======================================
  Coverage   78.35%   78.35%           
=======================================
  Files         152      152           
  Lines       47685    47687    +2     
  Branches    10881    10880    -1     
=======================================
+ Hits        37364    37367    +3     
+ Misses       7773     7772    -1     
  Partials     2548     2548           
Impacted Files Coverage Δ
aesara/tensor/basic_opt.py 85.22% <95.00%> (+0.07%) ⬆️

@brandonwillard added the "bug" and "important" labels on Jan 19, 2022

@brandonwillard (Member) left a comment

Can you add a MWE that reproduces the issue in the description (and explicitly shows the problematic graphs)?

@ricardoV94 (Contributor, Author)

Can you add a MWE that reproduces the issue in the description?

Done

@brandonwillard (Member) commented Jan 19, 2022

I'm starting to see that local_elemwise_alloc is one of those old rewrites that needs to be completely rewritten.

For instance, some of its DimShuffle handling seems like it should be split off into independent rewrites (or may already be covered by existing ones).

@ricardoV94 (Contributor, Author) commented Jan 19, 2022

I don't quite get how this rewrite is supposed to work, now that I think about it. If we start with Elemwise(Alloc1(x, shape), Alloc2(y, shape)) and then remove Alloc1 because we have Alloc2, giving Elemwise(x, Alloc2(y, shape)), how do we prevent Alloc2 from being removed in the next iteration? And if we do remove it, we lose all the shape information, no? In other words, these new tests should actually fail:

(lambda x, y: at.mul(at.alloc(x, 15, 1), at.alloc(y, 15, 1))),
(lambda x, y: at.mul(at.alloc(x, 15, 2), at.alloc(y, 15, 2))),

Never mind, that's what the assert_op_in does
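
For illustration, a small script for the double-Alloc case discussed above (the scalar inputs, shapes, and the expected form of the rewritten graph are assumptions for this sketch, not taken from the PR):

import aesara
import aesara.tensor as at

x = at.scalar("x")
y = at.scalar("y")
# Both inputs are broadcast to the same shape via Alloc
z = at.mul(at.alloc(x, 15, 2), at.alloc(y, 15, 2))

f = aesara.function([x, y], z, mode="FAST_RUN")
# Inspect the rewritten graph: per the discussion above, one Alloc can be
# dropped while an Assert around the remaining input should preserve the
# shape information.
aesara.dprint(f)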

@brandonwillard (Member) commented Jan 19, 2022

I don't quite get how this rewrite is supposed to work, now that I think about it. If we start with Elemwise(Alloc1(x, shape), Alloc2(y, shape)) and then remove Alloc1 because we have Alloc2, giving Elemwise(x, Alloc2(y, shape)), how do we prevent Alloc2 from being removed in the next iteration? And if we do remove it, we lose all the shape information, no? In other words, these new tests should actually fail:

(lambda x, y: at.mul(at.alloc(x, 15, 1), at.alloc(y, 15, 1))),
(lambda x, y: at.mul(at.alloc(x, 15, 2), at.alloc(y, 15, 2))),

Never mind, that's what the assert_op_in does

Yeah, the docstring and logic are both unnecessarily confusing and convoluted. That—combined with a lack of quality (e.g. direct) unit tests—almost necessarily gives rise to these kinds of issues.

This is yet another example of why we need to move toward formalizing our rewrites and their descriptions. In this case, the docstring uses its own notation (e.g. y.TensorType(BROADCAST CONDITION)), and that only serves to make things more confusing.

@ricardoV94 (Contributor, Author) commented Jan 19, 2022

I cleaned up the test logic a bit in a separate commit. The two test conditions with the dimshuffle fail (now that I removed "specialize" from the include), because those Allocs remain in the Assert that is added around y.

This is not necessarily a failure, but it suggests the original rewrite was not quite able to deliver what it was meant to?
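
For reference, restricting which rewrites run, e.g. including local_elemwise_alloc while excluding "specialize" as mentioned above, can be done through Mode.including/excluding. A minimal sketch, with shapes and the exact setup assumed rather than copied from the PR's tests:

import aesara
import aesara.tensor as at
from aesara.compile.mode import get_default_mode

x = at.scalar("x")
y = at.scalar("y")
z = at.mul(at.alloc(x, 15, 2), at.alloc(y, 15, 2))

# Apply local_elemwise_alloc but skip the "specialize" stage, so that any
# Alloc kept inside the added Assert remains visible in the final graph.
mode = get_default_mode().including("local_elemwise_alloc").excluding("specialize")
f = aesara.function([x, y], z, mode=mode)
aesara.dprint(f)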

@brandonwillard (Member)

This is not necessarily a failure, but it suggests the original rewrite was not quite able to deliver what it was meant to?

Wouldn't surprise me, but, while we're fixing this, let's at least make our assumptions/goals very clear in the docstring.

Also, we need to do something about that DimShuffle complexity. I think it's nearly doubling the logic in this rewrite and it doesn't even need to be there.

Is there a DimShuffle lift/removal that can be applied to Allocs independently? For example, DimShuffle{"x", ...}(Alloc(y, s1, ...)) -> Alloc(y, 1, s1, ...).
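
Purely as an illustration of that suggestion, an independent lift could look roughly like the sketch below. Nothing here is from the PR: the rewriter name, the restriction to DimShuffles that only prepend broadcastable dimensions, the import paths (assuming the aesara.graph.opt module of this era), and the lack of registration are all assumptions.

import aesara.tensor as at
from aesara.graph.opt import local_optimizer
from aesara.tensor.basic import Alloc
from aesara.tensor.elemwise import DimShuffle


@local_optimizer([DimShuffle])
def local_dimshuffle_alloc_lift(fgraph, node):
    # Hypothetical sketch: DimShuffle{"x", ..., 0, 1, ...}(Alloc(y, s1, ...))
    # -> Alloc(y, 1, ..., s1, ...), only when the DimShuffle does nothing
    # but prepend broadcastable dimensions.
    alloc_var = node.inputs[0]
    if not (alloc_var.owner and isinstance(alloc_var.owner.op, Alloc)):
        return None

    new_order = tuple(node.op.new_order)
    ndim = alloc_var.ndim
    n_new = len(new_order) - ndim
    if n_new <= 0 or new_order != ("x",) * n_new + tuple(range(ndim)):
        return None

    value, *shape = alloc_var.owner.inputs
    return [at.alloc(value, *([1] * n_new), *shape)]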

@ricardoV94 (Contributor, Author) commented Jan 19, 2022

This is not necessarily a failure, but it suggests the original rewrite was not quite able to deliver what it was meant to?

Wouldn't surprise me, but, while we're fixing this, let's at least make our assumptions/goals very clear in the docstring.

Also, we need to do something about that DimShuffle complexity. I think it's nearly doubling the logic in this rewrite and it doesn't even need to be there.

Is there a DimShuffle lift/removal that can be applied to Allocs independently? For example, DimShuffle{"x", ...}(Alloc(y, s1, ...)) -> Alloc(y, 1, s1, ...).

Yes, that could be done, but according to the comments in the docstrings, that would be less efficient in the general case (where the Allocs stay in the graph). I have no basis to evaluate that statement, though.

@brandonwillard (Member) commented Jan 19, 2022

Yes, that could be done, but according to the comments in the docstrings, that would be less efficient in the general case (where the Allocs stay in the graph)

I don't see how that could be a real problem (for at least the case of adding broadcastable dimensions to an Alloc), and, if it actually was realistically more efficient, wouldn't we want an independent "specialization" rewrite that changes all such Allocs into DimShuffles? It wouldn't make any sense to limit that to the context of this rewrite only.

Here's a quick and dirty check:

import numpy as np


def alloc(v, shape):
    # Mimic Alloc: allocate an array of the given shape and fill it with `v`
    res = np.empty(shape)
    res[...] = v
    return res


%timeit alloc(2, (1000, 1000))
258 µs ± 8.04 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit alloc(2, (1, 1, 1, 1, 1000, 1000, 1, 1, 1, 1))
278 µs ± 19.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit np.expand_dims(alloc(2, (1000, 1000)), (0, 1, 2, 3, -1, -2, -3, -4))
282 µs ± 10.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


%timeit alloc(2, (5000, 5000))
21.7 ms ± 2.15 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit alloc(2, (1, 1, 1, 1, 5000, 5000, 1, 1, 1, 1))
21 ms ± 166 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit np.expand_dims(alloc(2, (5000, 5000)), (0, 1, 2, 3, -1, -2, -3, -4))
21.4 ms ± 764 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

The old codebase has some confusing "efficiency"-based comments like that—ones that I'm not sure were ever necessarily true or relevant.

Regardless of the practical truth of these kinds of statements, if we consider the entire context (e.g. the ability to add complementary rewrites/canonicalizations, the fact that we do multiple passes and have distinct rewrite stages, etc.), they may be completely irrelevant.

@brandonwillard (Member) commented Jan 19, 2022

The docstring of local_canonicalize_alloc mentions this Alloc-to-DimShuffle topic and says Theano/Theano#4072 provides reasons for why it's needed. I don't see anything about performance—or any real justifications other than that it might've helped resolve a different problem by generating graphs that some logic could already handle.

In other words, the rewrites and canonical form for Alloc are a mess.

Update: Yes, that's definitely the reason, because local_useless_dimshuffle_in_reshape won't be applied unless those Allocs are (partially) converted to DimShuffles.
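
As a hedged illustration of that dependency (the shapes and the claim about which rewrites end up firing are inferred from the comment above, not verified against the codebase):

import aesara
import aesara.tensor as at

x = at.vector("x")
# The Alloc only prepends broadcastable (length-1) dimensions, and the
# Reshape makes them irrelevant; local_useless_dimshuffle_in_reshape can only
# clean this up once the Alloc has been (partially) turned into a DimShuffle.
y = at.reshape(at.alloc(x, 1, 1, x.shape[0]), (x.shape[0],))

f = aesara.function([x], y)
aesara.dprint(f)  # ideally neither an Alloc nor a DimShuffle survives here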

The rewrite would sometimes return a new graph identical to the original,
resulting in divergence.
@brandonwillard force-pushed the fix_local_elemwise_alloc_bug branch from e4c025f to 7e3792c on January 20, 2022 at 02:13
@brandonwillard merged commit 7c1558a into aesara-devs:main on Jan 20, 2022