Add `fmodf16` #469

tgross35 · 2025-01-24T05:09:30Z

No description provided.

This can replace `fmod` and `fmodf`. As part of this change I was able to replace some of the `while` loops with `leading_zeros`.

tgross35 · 2025-01-24T06:04:49Z

`fmodf128` seems to be unbearably slow (as in, the slowest thing we test), so I am splitting it off into #470 until I can look into it further.

sjrd · 2025-03-25T16:19:51Z

src/math/generic/fmod.rs

+
+    /* normalize x and y */
+    if ex == 0 {
+        let i = ix << F::EXP_BITS;


I think this was mistranslated from the previous loop. It fails to account for the sign bit. I believe it should be

Suggested change

-        let i = ix << F::EXP_BITS;
+        let i = ix << (F::EXP_BITS + 1);

I discovered this because I translated this implementation to Wasm by hand for use in the Scala.js-to-Wasm backend:
scala-js/scala-js#5147
My test cases with subnormals failed with the formula as is, but do pass with the added + 1.
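To make the off-by-one concrete, here is a minimal sketch for `f32` only (so `EXP_BITS` is hard-coded to 8). The names `normalize_loop`, `normalize_clz` and `msb_position` are hypothetical and not taken from the crate. The loop form is an assumption about the shape of the `while` loop that was replaced, since the original loop is not quoted in this thread.

```rust
// Illustration only: f32 has 8 exponent bits and 23 stored significand bits.
const EXP_BITS: u32 = 8;

/// Loop-style normalization of a positive subnormal (assumed shape of the
/// replaced `while` loop). `ix` must have a zero exponent field and a
/// nonzero significand, otherwise the loop would not terminate.
fn normalize_loop(mut ix: u32) -> (i32, u32) {
    let mut ex = 0i32;
    // Shift out the sign bit and the exponent field.
    let mut i = ix << (EXP_BITS + 1);
    while i >> 31 == 0 {
        ex -= 1;
        i <<= 1;
    }
    ix <<= -ex + 1;
    (ex, ix)
}

/// The same normalization using `leading_zeros`, with the initial shift
/// left as a parameter so both variants can be compared.
fn normalize_clz(mut ix: u32, shift: u32) -> (i32, u32) {
    let i = ix << shift;
    let ex = -(i.leading_zeros() as i32);
    ix <<= -ex + 1;
    (ex, ix)
}

/// Index of the most significant set bit (requires `ix != 0`).
fn msb_position(ix: u32) -> u32 {
    31 - ix.leading_zeros()
}
```

With a shift of `EXP_BITS + 1`, `normalize_clz` agrees with `normalize_loop` and leaves the most significant bit at position 23, where the implicit bit of a normal `f32` sits. With a shift of `EXP_BITS`, the most significant bit ends up at position 24 instead.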

tgross35 · 2025-04-01

Thanks for finding this; I think you are correct. I'll make sure the failures get tested with the fix. I'll double-check against all the cases from scala-js/scala-js#5147, but do you happen to know off the top of your head which ones failed?

sjrd · 2025-04-01

Among the tests I wrote, the following fail without the `+ 1`s.

Tests are test(expectedResult, x, y):

For Double (binary64):

test(2.0696e-320, 2.1, 3.123e-320)
test(1.772535e-318, 2.1, 2.253547e-318)

For Float (binary32):

test(8.085e-42f, 2.1f, 8.858e-42f)
test(6.1636e-40f, 2.1f, 6.39164e-40f)
test(4.77036e-40f, 5.5f, 6.39164e-40f)
test(-5.64734e-40f, -151.189f, 6.39164e-40f)

Basically they're cases where x is a normal but y is a subnormal. If they're both subnormals, the two errors "cancel out". And if x is a subnormal but y is a normal, then x < y and so the earlier branch exits early before getting here.
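For cross-checking cases like these independently of the implementation under review, one option is a slow exact reference built on integer arithmetic. The sketch below is an assumption-laden illustration for `f32` only. The names `decompose` and `fmod_reference` are hypothetical and it is not part of the crate's test suite. It writes |x| = mx·2^ex and |y| = my·2^ey exactly, then computes (mx·2^(ex−ey)) mod my by repeated doubling.

```rust
/// Split a finite, non-negative f32 into (m, e) with value == m * 2^e exactly.
fn decompose(v: f32) -> (u64, i32) {
    let bits = v.to_bits();
    let exp = ((bits >> 23) & 0xff) as i32;
    let frac = u64::from(bits & 0x007f_ffff);
    if exp == 0 {
        // Subnormal (or zero): no implicit bit, fixed exponent.
        (frac, -149)
    } else {
        // Normal: make the implicit bit explicit.
        (frac | (1 << 23), exp - 150)
    }
}

/// Slow reference remainder with the sign of `x`.
/// Assumes both inputs are finite and `y` is nonzero.
fn fmod_reference(x: f32, y: f32) -> f32 {
    assert!(x.is_finite() && y.is_finite() && y != 0.0);
    let (ax, ay) = (x.abs(), y.abs());
    if ax < ay {
        return x;
    }
    let (mx, ex) = decompose(ax);
    let (my, ey) = decompose(ay);
    // |x| >= |y| implies ex >= ey, so the loop count is non-negative.
    let mut r = mx % my;
    for _ in 0..(ex - ey) {
        r = (r * 2) % my;
    }
    // r < my < 2^24, so r * 2^ey is exact in f64 and representable in f32.
    let pow2 = f64::from_bits(((ey + 1023) as u64) << 52);
    let magnitude = (r as f64 * pow2) as f32;
    magnitude.copysign(x)
}
```

With these definitions, `fmod_reference(2.1, 8.858e-42)` evaluates to `8.085e-42_f32`, which is the first Float case in the list above.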

tgross35 · 2025-04-15

Thanks for the detailed info; I opened #535 to look at this.

sjrd · 2025-03-25T16:22:00Z

src/math/generic/fmod.rs

+        ex -= i.leading_zeros() as i32;
+        ix <<= -ex + 1;
+    } else {
+        ix &= F::Int::MAX >> F::EXP_BITS;


There is something similar here. I believe the intent was also to have the `+ 1`:

Suggested change

-        ix &= F::Int::MAX >> F::EXP_BITS;
+        ix &= F::Int::MAX >> (F::EXP_BITS + 1);

However in this branch, it doesn't make any difference. The only bit for which it can change the result is anyway or'ed back to 1 on the very next line. But it seems more clearly correct with the + 1, as it then creates a mask that exactly matches the mantissa bits, rather than "the mantissa bits plus arbitrarily the low-order bit of the exponent".
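A small sketch of why the two masks are interchangeable in the normal branch, again specialized to `f32`. The helper name `normal_significand` is hypothetical. The "very next line" is assumed to set the implicit bit with an OR, as described in the comment above.

```rust
// Illustration only: f32 has 8 exponent bits and 23 stored significand bits.
const EXP_BITS: u32 = 8;
const SIG_BITS: u32 = 23;

/// Extract the significand of a normal f32 and make the implicit bit explicit.
/// `mask_shift` selects between the two masks under discussion.
fn normal_significand(bits: u32, mask_shift: u32) -> u32 {
    let mut ix = bits;
    // EXP_BITS + 1 keeps exactly the 23 stored significand bits.
    // EXP_BITS additionally keeps the low-order bit of the exponent field.
    ix &= u32::MAX >> mask_shift;
    // Setting the implicit bit overwrites that extra bit either way.
    ix |= 1 << SIG_BITS;
    ix
}
```

Here `u32::MAX >> (EXP_BITS + 1)` is `0x007f_ffff`, exactly the significand field, while `u32::MAX >> EXP_BITS` is `0x00ff_ffff`. The one extra bit is bit 23, which the OR sets unconditionally, so both variants return the same value.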

From discussion at [1], our loop count calculation is incorrect, causing an issue with subnormal numbers. Add test cases for known failures.

[1]: rust-lang#469 (comment)

Discussed at [1], there was an off-by-one mistake when converting from the loop routine to using `leading_zeros` for normalization.

Currently, using `EXP_BITS` has the effect that `ix` after the branch has its MSB _one bit to the left_ of the implicit bit's position, whereas a shift by `EXP_BITS + 1` ensures that the MSB is exactly at the implicit bit's position, matching what is done for normals (where the implicit bit is set to be explicit). This doesn't seem to have any effect in our implementation since the failing test cases from [1] appear to still have correct results.

Since the result of using `EXP_BITS + 1` is more consistent with what is done for normals, apply this here.

[1]: rust-lang#469 (comment)

tgross35 force-pushed the generic-mod branch from 4b38072 to 39ceb34 on January 24, 2025 05:13

Add a generic version of fmod

566e8e1

This can replace `fmod` and `fmodf`. As part of this change I was able to replace some of the `while` loops with `leading_zeros`.

tgross35 force-pushed the generic-mod branch 2 times, most recently from 7dea0e4 to 52441d9, on January 24, 2025 06:02

tgross35 mentioned this pull request Jan 24, 2025

Add fmodf128 #470

Merged

Add fmodf16 using the generic implementation

856e314

tgross35 force-pushed the generic-mod branch from 52441d9 to 856e314 on January 24, 2025 06:04

tgross35 changed the title ~~Add fmodf16 and fmodf128~~ Add fmodf16 Jan 24, 2025

tgross35 enabled auto-merge January 24, 2025 06:05

tgross35 merged commit dae4f1a into rust-lang:master Jan 24, 2025
35 checks passed

tgross35 deleted the generic-mod branch January 24, 2025 06:41

sjrd reviewed Mar 25, 2025

sjrd mentioned this pull request Apr 6, 2025

Wasm: Implement fmod on the Scala side. scala-js/scala-js#5149

Merged

tgross35 mentioned this pull request Apr 15, 2025

fmod: Correct the normalization of subnormals #535

Merged

tgross35 added a commit that referenced this pull request Apr 16, 2025

fmod: Add regression tests for subnormal issue

e64f55e

From discussion at [1] our loop count calculation is incorrect, causing an issue with subnormal numbers. Add test cases for known failures. [1]: #469 (comment)

tgross35 added a commit that referenced this pull request Apr 18, 2025

Merge pull request #469 from tgross35/generic-mod

6a178e1

Add `fmodf16`

tgross35 added a commit that referenced this pull request Apr 18, 2025

fmod: Add regression tests for subnormal issue

34b7092

From discussion at [1] our loop count calculation is incorrect, causing an issue with subnormal numbers. Add test cases for known failures. [1]: #469 (comment)
