BUG: signal: Fix calculation of extended image indices in convolve2d. #16134

WarrenWeckesser · 2022-05-07T00:03:52Z

To handle the boundary arguments "wrap" and "symm", the convolve2d
code computes indices into the first argument that might be negative or
larger than the size of the image. The code adjusts negative or big
values so that the adjusted index refers to the correct data element
within the array. The problem was that this adjustment code assumed
that it would only require that the data be extended once. E.g. if
an axis had length 100, it assumed that the extended index would be
in [-100, 200]. This is not true in general; the kernel (i.e. the
second argument of convolve2d) can be any size, and so the required
extension could be any size. If the kernel was too big, the adjusted
indices might still be negative or too big, resulting in attempts to
access memory outside of input array's data. That could result in
reading garbage, or a segmentation fault.

The fix is to use a couple helper functions that can handle arbitrarily
large kernel dimensions when computing the extended data.

Closes gh-8684.
Closes gh-8814.
Closes gh-12686.

To handle the `boundary` arguments "wrap" and "symm", the `convolve2d` code computes indices into the first argument that might be negative or larger than the size of the image. The code adjusts negative or big values so that the adjusted index refers to the correct data element within the array. The problem was that this adjustment code assumed that it would only require that the data be extended *once*. E.g. if an axis had length 100, it assumed that the extended index would be in [-100, 200]. This is not true in general; the kernel (i.e. the second argument of `convolve2d`) can be any size, and so the required extension could be any size. If the kernel was too big, the adjusted indices might still be negative or too big, resulting in attempts to access memory outside of input array's data. That could result in reading garbage, or a segmentation fault. The fix is to use a couple helper functions that can handle arbitrarily large kernel dimensions when computing the extended data. Closes scipygh-8684. Closes scipygh-8814. Closes scipygh-12686.

larsoner

Looks pretty straghtforward to me, +1 for merge from my end.

@WarrenWeckesser anyone else you think we should ping before merge? If not, feel free to merge in a couple of days if nobody else says anything

WarrenWeckesser · 2022-05-09T13:06:42Z

Thanks @larsoner.

anyone else you think we should ping...

@endolith is involved with scipy.signal, though admittedly convolve2d is more of an image processing function than a DSP function.

@coderforlife, you did a deep dive into the C code and correctly analyzed the problem in #12686 (back in 2020!), so you might be interested in taking a look now that the problem is finally being fixed.

endolith · 2022-05-09T18:51:42Z

scipy/signal/_firfilter.c

@@ -99,6 +99,19 @@ static OneMultAddFunction *OneMultAdd[]={NULL,
 					 CLONGDOUBLE_onemultadd,
                                         NULL, NULL, NULL, NULL};

+static int64_t
+reflect_symm_index(int64_t j, int64_t m)
+{


Can there be a comment nearby to explain what j and m are? Like docstrings in python functions

endolith · 2022-05-09T20:08:02Z

scipy/signal/_firfilter.c

-	  if (boundary == REFLECT) ind0 = -1-ind0;
-	  else if (boundary == CIRCULAR) ind0 = Ns[0] + ind0;
+	  if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
+	  else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
 	  else bounds_pad_flag = true;
 	}
 	else if (ind0 >= Ns[0]) {


Can this check just be removed now since it's included in the index functions and both branches are identical?

Done. I made the suggested changes locally instead of using the github "suggested changes" UI.

endolith · 2022-05-09T20:08:19Z

scipy/signal/_firfilter.c

-	      if (boundary == REFLECT) ind1 = -1-ind1;
-	      else if (boundary == CIRCULAR) ind1 = Ns[1] + ind1;
+	      if (boundary == REFLECT) ind1 = reflect_symm_index(ind1, Ns[1]);
+	      else if (boundary == CIRCULAR) ind1 = circular_wrap_index(ind1, Ns[1]);
 	      else bounds_pad_flag = true;
 	    }
 	    else if (ind1 >= Ns[1]) {


and this removed too

endolith · 2022-05-09T20:30:16Z

scipy/signal/_firfilter.c

 	if (ind0 < 0) {
-	  if (boundary == REFLECT) ind0 = -1-ind0;
-	  else if (boundary == CIRCULAR) ind0 = Ns[0] + ind0;
+	  if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
+	  else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
 	  else bounds_pad_flag = true;
 	}
 	else if (ind0 >= Ns[0]) {
-	  if (boundary == REFLECT) ind0 = Ns[0]+Ns[0]-1-ind0;
-	  else if (boundary == CIRCULAR) ind0 = ind0 - Ns[0];
+	  if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);
+	  else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);
 	  else bounds_pad_flag = true;
 	}


Suggested change

if (ind0 < 0) {

if (boundary == REFLECT) ind0 = -1-ind0;

else if (boundary == CIRCULAR) ind0 = Ns[0] + ind0;

if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);

else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);

else bounds_pad_flag = true;

}

else if (ind0 >= Ns[0]) {

if (boundary == REFLECT) ind0 = Ns[0]+Ns[0]-1-ind0;

else if (boundary == CIRCULAR) ind0 = ind0 - Ns[0];

if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);

else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);

else bounds_pad_flag = true;

}

if ((ind0 < 0) || (ind0 >= Ns[0])) {

if (boundary == REFLECT) ind0 = reflect_symm_index(ind0, Ns[0]);

else if (boundary == CIRCULAR) ind0 = circular_wrap_index(ind0, Ns[0]);

else bounds_pad_flag = true;

}

Suggestion feature doesn't seem to be working

endolith

Looks good otherwise

coderforlife · 2022-05-10T02:35:07Z

I did look it over and it looked good. Sorry, I couldn't make a PR myself, I discovered the problem while working on cupy's implementation of these functions.

Provide comments for the C functions reflect_symm_index(j, m) and circular_wrap_index(j, m). Remove some repeated C code that can be covered in a single code block.

WarrenWeckesser · 2022-05-10T06:40:49Z

Thanks @endolith, all good suggestions that I've implemented.

And thanks for taking a look, @coderforlife. No need to apologize for not submitting a PR. Reporting and analyzing a problem does not obligate you to fix it!

WarrenWeckesser · 2022-05-10T06:47:54Z

By the way, the horrible code alignment and mix of tabs and spaces in _firfilter.c makes reading that code more challenging than necessary. I don't want to mix a big reformatting job with this PR, but perhaps in a separate maintenance PR this file (and several other C files here and in scipy.special) can be cleaned up with clang-format or some other formatter.

ilayn · 2022-05-10T12:06:16Z

If you deem necessary, then we can review it by hiding whitespace changes. Usually I agree it is a mess to have these changes but C files don't see too much change over time hence I think we can do it for this one occasion.

endolith

Code all looks good to me now, comments are great. Not sure about the test failures

WarrenWeckesser · 2022-05-10T16:21:06Z

The test failures are not related to this PR. They are the result of changes in NumPy; see #16152.

WarrenWeckesser · 2022-05-10T16:26:11Z

@ilayn, I won't reformat the file in this PR. That can be done in a separate PR, after we've decided the extent to which we want to clean up our C and C++ files, and which style to use when we do it.

larsoner · 2022-05-10T16:38:51Z

@ilayn, I won't reformat the file in this PR. That can be done in a separate PR, after we've decided the extent to which we want to clean up our C and C++ files, and which style to use when we do it.

Agreed this should be a separate PR, and that we should probably do it. @WarrenWeckesser want to get the ball rolling maybe with a post suggesting your preferred solution/formatter to the scipy-dev list? If nobody objects we can do it and add the commit to .git-blame-ignore-revs like we're looking at doing for NumpyDoc: numpy/numpydoc#391 (comment)

As noted by @WarrenWeckesser the failures are unrelated and it looks like all comments have been addressed, so I'll merge in a couple of days unless others ask for more changes or time to look!

ilayn · 2022-05-10T19:35:37Z

after we've decided the extent to which we want to clean up our C and C++ files, and which style to use when we do it.

That shed would not be painted very soon, hence my comment 😃 But no problem, I don't see it as a blocker anyways. +1 from me too.

larsoner · 2022-05-13T17:06:18Z

Thanks @WarrenWeckesser !

WarrenWeckesser added defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.signal C/C++ Items related to the internal C/C++ code base labels May 7, 2022

WarrenWeckesser added this to the 1.9.0 milestone May 7, 2022

WarrenWeckesser requested review from larsoner and ilayn as code owners May 7, 2022 00:03

WarrenWeckesser force-pushed the convolve2d branch from 4dcd6a7 to 315bc75 Compare May 7, 2022 15:55

larsoner approved these changes May 9, 2022

View reviewed changes

endolith reviewed May 9, 2022

View reviewed changes

endolith suggested changes May 9, 2022

View reviewed changes

MAINT: signal: Add comments, simplify code.

b553136

Provide comments for the C functions reflect_symm_index(j, m) and circular_wrap_index(j, m). Remove some repeated C code that can be covered in a single code block.

endolith approved these changes May 10, 2022

View reviewed changes

stefanv approved these changes May 10, 2022

View reviewed changes

dschmitz89 mentioned this pull request May 11, 2022

ENH: Adding DIRECT algorithm to scipy.optimize #14300

Merged

larsoner merged commit 8f55987 into scipy:main May 13, 2022

WarrenWeckesser deleted the convolve2d branch May 13, 2022 17:20

Uh oh!

BUG: signal: Fix calculation of extended image indices in convolve2d. #16134

BUG: signal: Fix calculation of extended image indices in convolve2d. #16134

Uh oh!

Conversation

WarrenWeckesser commented May 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

larsoner left a comment

Choose a reason for hiding this comment

Uh oh!

WarrenWeckesser commented May 9, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

endolith left a comment

Choose a reason for hiding this comment

Uh oh!

coderforlife commented May 10, 2022

Uh oh!

WarrenWeckesser commented May 10, 2022

Uh oh!

WarrenWeckesser commented May 10, 2022

Uh oh!

ilayn commented May 10, 2022

Uh oh!

endolith left a comment

Choose a reason for hiding this comment

Uh oh!

WarrenWeckesser commented May 10, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

WarrenWeckesser commented May 10, 2022

Uh oh!

larsoner commented May 10, 2022

Uh oh!

ilayn commented May 10, 2022

Uh oh!

larsoner commented May 13, 2022

Uh oh!

Uh oh!

WarrenWeckesser commented May 7, 2022 •

edited

Loading

WarrenWeckesser commented May 10, 2022 •

edited

Loading