Skip to content

BUG: signal: Fix calculation of extended image indices in convolve2d. #16134

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 13, 2022

Conversation

WarrenWeckesser
Copy link
Member

@WarrenWeckesser WarrenWeckesser commented May 7, 2022

To handle the boundary arguments "wrap" and "symm", the convolve2d
code computes indices into the first argument that might be negative or
larger than the size of the image. The code adjusts negative or big
values so that the adjusted index refers to the correct data element
within the array. The problem was that this adjustment code assumed
that it would only require that the data be extended once. E.g. if
an axis had length 100, it assumed that the extended index would be
in [-100, 200]. This is not true in general; the kernel (i.e. the
second argument of convolve2d) can be any size, and so the required
extension could be any size. If the kernel was too big, the adjusted
indices might still be negative or too big, resulting in attempts to
access memory outside of input array's data. That could result in
reading garbage, or a segmentation fault.

The fix is to use a couple helper functions that can handle arbitrarily
large kernel dimensions when computing the extended data.

Closes gh-8684.
Closes gh-8814.
Closes gh-12686.

@WarrenWeckesser WarrenWeckesser added defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.signal C/C++ Items related to the internal C/C++ code base labels May 7, 2022
@WarrenWeckesser WarrenWeckesser added this to the 1.9.0 milestone May 7, 2022
To handle the `boundary` arguments "wrap" and "symm", the `convolve2d`
code computes indices into the first argument that might be negative or
larger than the size of the image.  The code adjusts negative or big
values so that the adjusted index refers to the correct data element
within the array.  The problem was that this adjustment code assumed
that it would only require that the data be extended *once*. E.g. if
an axis had length 100, it assumed that the extended index would be
in [-100, 200].  This is not true in general; the kernel (i.e. the
second argument of `convolve2d`) can be any size, and so the required
extension could be any size.  If the kernel was too big, the adjusted
indices might still be negative or too big, resulting in attempts to
access memory outside of input array's data.  That could result in
reading garbage, or a segmentation fault.

The fix is to use a couple helper functions that can handle arbitrarily
large kernel dimensions when computing the extended data.

Closes scipygh-8684.
Closes scipygh-8814.
Closes scipygh-12686.
Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty straghtforward to me, +1 for merge from my end.

@WarrenWeckesser anyone else you think we should ping before merge? If not, feel free to merge in a couple of days if nobody else says anything

@WarrenWeckesser
Copy link
Member Author

Thanks @larsoner.

anyone else you think we should ping...

@endolith is involved with scipy.signal, though admittedly convolve2d is more of an image processing function than a DSP function.

@coderforlife, you did a deep dive into the C code and correctly analyzed the problem in #12686 (back in 2020!), so you might be interested in taking a look now that the problem is finally being fixed.

@@ -99,6 +99,19 @@ static OneMultAddFunction *OneMultAdd[]={NULL,
CLONGDOUBLE_onemultadd,
NULL, NULL, NULL, NULL};

static int64_t
reflect_symm_index(int64_t j, int64_t m)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can there be a comment nearby to explain what j and m are? Like docstrings in python functions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (boundary == REFLECT) ind0 = -1-ind0;
else if (boundary == CIRCULAR) ind0 = Ns[0] + ind0;
if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
else bounds_pad_flag = true;
}
else if (ind0 >= Ns[0]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this check just be removed now since it's included in the index functions and both branches are identical?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I made the suggested changes locally instead of using the github "suggested changes" UI.

if (boundary == REFLECT) ind1 = -1-ind1;
else if (boundary == CIRCULAR) ind1 = Ns[1] + ind1;
if (boundary == REFLECT) ind1 = reflect_symm_index(ind1, Ns[1]);
else if (boundary == CIRCULAR) ind1 = circular_wrap_index(ind1, Ns[1]);
else bounds_pad_flag = true;
}
else if (ind1 >= Ns[1]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this removed too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 179 to 188
if (ind0 < 0) {
if (boundary == REFLECT) ind0 = -1-ind0;
else if (boundary == CIRCULAR) ind0 = Ns[0] + ind0;
if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
else bounds_pad_flag = true;
}
else if (ind0 >= Ns[0]) {
if (boundary == REFLECT) ind0 = Ns[0]+Ns[0]-1-ind0;
else if (boundary == CIRCULAR) ind0 = ind0 - Ns[0];
if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
else bounds_pad_flag = true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (ind0 < 0) {
if (boundary == REFLECT) ind0 = -1-ind0;
else if (boundary == CIRCULAR) ind0 = Ns[0] + ind0;
if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
else bounds_pad_flag = true;
}
else if (ind0 >= Ns[0]) {
if (boundary == REFLECT) ind0 = Ns[0]+Ns[0]-1-ind0;
else if (boundary == CIRCULAR) ind0 = ind0 - Ns[0];
if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
else bounds_pad_flag = true;
}
if ((ind0 < 0) || (ind0 >= Ns[0])) {
if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
else bounds_pad_flag = true;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion feature doesn't seem to be working

Copy link
Member

@endolith endolith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good otherwise

@coderforlife
Copy link

I did look it over and it looked good. Sorry, I couldn't make a PR myself, I discovered the problem while working on cupy's implementation of these functions.

Provide comments for the C functions reflect_symm_index(j, m)
and circular_wrap_index(j, m).

Remove some repeated C code that can be covered in a
single code block.
@WarrenWeckesser
Copy link
Member Author

Thanks @endolith, all good suggestions that I've implemented.

And thanks for taking a look, @coderforlife. No need to apologize for not submitting a PR. Reporting and analyzing a problem does not obligate you to fix it!

@WarrenWeckesser
Copy link
Member Author

By the way, the horrible code alignment and mix of tabs and spaces in _firfilter.c makes reading that code more challenging than necessary. I don't want to mix a big reformatting job with this PR, but perhaps in a separate maintenance PR this file (and several other C files here and in scipy.special) can be cleaned up with clang-format or some other formatter.

@ilayn
Copy link
Member

ilayn commented May 10, 2022

If you deem necessary, then we can review it by hiding whitespace changes. Usually I agree it is a mess to have these changes but C files don't see too much change over time hence I think we can do it for this one occasion.

Copy link
Member

@endolith endolith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code all looks good to me now, comments are great. Not sure about the test failures

@WarrenWeckesser
Copy link
Member Author

WarrenWeckesser commented May 10, 2022

The test failures are not related to this PR. They are the result of changes in NumPy; see #16152.

@WarrenWeckesser
Copy link
Member Author

@ilayn, I won't reformat the file in this PR. That can be done in a separate PR, after we've decided the extent to which we want to clean up our C and C++ files, and which style to use when we do it.

@larsoner
Copy link
Member

@ilayn, I won't reformat the file in this PR. That can be done in a separate PR, after we've decided the extent to which we want to clean up our C and C++ files, and which style to use when we do it.

Agreed this should be a separate PR, and that we should probably do it. @WarrenWeckesser want to get the ball rolling maybe with a post suggesting your preferred solution/formatter to the scipy-dev list? If nobody objects we can do it and add the commit to .git-blame-ignore-revs like we're looking at doing for NumpyDoc: numpy/numpydoc#391 (comment)

As noted by @WarrenWeckesser the failures are unrelated and it looks like all comments have been addressed, so I'll merge in a couple of days unless others ask for more changes or time to look!

@ilayn
Copy link
Member

ilayn commented May 10, 2022

after we've decided the extent to which we want to clean up our C and C++ files, and which style to use when we do it.

That shed would not be painted very soon, hence my comment 😃 But no problem, I don't see it as a blocker anyways. +1 from me too.

@larsoner
Copy link
Member

Thanks @WarrenWeckesser !

@WarrenWeckesser WarrenWeckesser deleted the convolve2d branch May 13, 2022 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C/C++ Items related to the internal C/C++ code base defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.signal
Projects
None yet
6 participants