Skip to content
This repository was archived by the owner on Apr 28, 2025. It is now read-only.

Add fmodf16 #469

Merged
merged 2 commits into from
Jan 24, 2025
Merged

Add fmodf16 #469

merged 2 commits into from
Jan 24, 2025

Conversation

tgross35
Copy link
Contributor

No description provided.

This can replace `fmod` and `fmodf`. As part of this change I was able
to replace some of the `while` loops with `leading_zeros`.
@tgross35 tgross35 force-pushed the generic-mod branch 2 times, most recently from 7dea0e4 to 52441d9 Compare January 24, 2025 06:02
@tgross35 tgross35 mentioned this pull request Jan 24, 2025
@tgross35
Copy link
Contributor Author

fmodf128 seems to be unbearably slow (as in, the slowest thing we test) so I am splitting that off until I can look into it further #470.

@tgross35 tgross35 changed the title Add fmodf16 and fmodf128 Add fmodf16 Jan 24, 2025
@tgross35 tgross35 enabled auto-merge January 24, 2025 06:05
@tgross35 tgross35 merged commit dae4f1a into rust-lang:master Jan 24, 2025
35 checks passed
@tgross35 tgross35 deleted the generic-mod branch January 24, 2025 06:41

/* normalize x and y */
if ex == 0 {
let i = ix << F::EXP_BITS;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was mistranslated from the previous loop. It fails to account for the sign bit. I believe it should be

Suggested change
let i = ix << F::EXP_BITS;
let i = ix << (F::EXP_BITS + 1);

I discovered this because I translated this implementation to Wasm by hand for use in the Scala.js-to-Wasm backend:
scala-js/scala-js#5147
My test cases with subnormals failed with the formula as is, but do pass with the added + 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding this, I think you are correct. I'll make sure the failures get tested with the fix. I'll double check with all the cases from scala-js/scala-js#5147, but do you happen to know off the top of your head which ones failed?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Among the tests I wrote, the following fail without the + 1s.

Tests are test(expectedResult, x, y):

For Double (binary64):

test(2.0696e-320, 2.1, 3.123e-320)
test(1.772535e-318, 2.1, 2.253547e-318)

For Float (binary32):

test(8.085e-42f, 2.1f, 8.858e-42f)
test(6.1636e-40f, 2.1f, 6.39164e-40f)
test(4.77036e-40f, 5.5f, 6.39164e-40f)
test(-5.64734e-40f, -151.189f, 6.39164e-40f)

Basically they're cases where x is a normal but y is a subnormal. If they're both subnormals, the two errors "cancel out". And if x is a subnormal but y is a normal, then x < y and so the earlier branch exits early before getting here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed info, opened #535 to look at this.

ex -= i.leading_zeros() as i32;
ix <<= -ex + 1;
} else {
ix &= F::Int::MAX >> F::EXP_BITS;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is something similar here. I believe the intent was also to have the + 1 :

Suggested change
ix &= F::Int::MAX >> F::EXP_BITS;
ix &= F::Int::MAX >> (F::EXP_BITS + 1);

However in this branch, it doesn't make any difference. The only bit for which it can change the result is anyway or'ed back to 1 on the very next line. But it seems more clearly correct with the + 1, as it then creates a mask that exactly matches the mantissa bits, rather than "the mantissa bits plus arbitrarily the low-order bit of the exponent".

tgross35 added a commit to tgross35/rust-libm that referenced this pull request Apr 15, 2025
From discussion at [1] our loop count calculation is incorrect, causing
an issue with subnormal numbers. Add test cases for known failures.

[1]: rust-lang#469 (comment)
tgross35 added a commit to tgross35/rust-libm that referenced this pull request Apr 16, 2025
Discussed at [1], there was an off-by-one mistake when converting from
the loop routine to using `leading_zeros` for normalization.

Currently, using `EXP_BITS` has the effect that `ix` after the branch
has its MSB _one bit to the left_ of the implicit bit's position,
whereas a shift by `EXP_BITS + 1` ensures that the MSB is at the
implicit bit's position, matching what is done for normals. This doesn't
seem to have any effect in our implementation since the failing test
cases from [1] appear to still have correct results.

Since the result of using `EXP_BITS + 1` is more consistent with what is
done for normals, apply this here.

[1]: rust-lang#469 (comment)
tgross35 added a commit to tgross35/rust-libm that referenced this pull request Apr 16, 2025
Discussed at [1], there was an off-by-one mistake when converting from
the loop routine to using `leading_zeros` for normalization.

Currently, using `EXP_BITS` has the effect that `ix` after the branch
has its MSB _one bit to the left_ of the implicit bit's position,
whereas a shift by `EXP_BITS + 1` ensures that the MSB is exactly at the
implicit bit's position, matching what is done for normals (where the
implicit bit is set to be explicit). This doesn't seem to have any
effect in our implementation since the failing test cases from [1]
appear to still have correct results.

Since the result of using `EXP_BITS + 1` is more consistent with what is
done for normals, apply this here.

[1]: rust-lang#469 (comment)
tgross35 added a commit to tgross35/rust-libm that referenced this pull request Apr 16, 2025
Discussed at [1], there was an off-by-one mistake when converting from
the loop routine to using `leading_zeros` for normalization.

Currently, using `EXP_BITS` has the effect that `ix` after the branch
has its MSB _one bit to the left_ of the implicit bit's position,
whereas a shift by `EXP_BITS + 1` ensures that the MSB is exactly at the
implicit bit's position, matching what is done for normals (where the
implicit bit is set to be explicit). This doesn't seem to have any
effect in our implementation since the failing test cases from [1]
appear to still have correct results.

Since the result of using `EXP_BITS + 1` is more consistent with what is
done for normals, apply this here.

[1]: rust-lang#469 (comment)
tgross35 added a commit to tgross35/rust-libm that referenced this pull request Apr 16, 2025
From discussion at [1] our loop count calculation is incorrect, causing
an issue with subnormal numbers. Add test cases for known failures.

[1]: rust-lang#469 (comment)
tgross35 added a commit to tgross35/rust-libm that referenced this pull request Apr 16, 2025
Discussed at [1], there was an off-by-one mistake when converting from
the loop routine to using `leading_zeros` for normalization.

Currently, using `EXP_BITS` has the effect that `ix` after the branch
has its MSB _one bit to the left_ of the implicit bit's position,
whereas a shift by `EXP_BITS + 1` ensures that the MSB is exactly at the
implicit bit's position, matching what is done for normals (where the
implicit bit is set to be explicit). This doesn't seem to have any
effect in our implementation since the failing test cases from [1]
appear to still have correct results.

Since the result of using `EXP_BITS + 1` is more consistent with what is
done for normals, apply this here.

[1]: rust-lang#469 (comment)
tgross35 added a commit that referenced this pull request Apr 16, 2025
From discussion at [1] our loop count calculation is incorrect, causing
an issue with subnormal numbers. Add test cases for known failures.

[1]: #469 (comment)
tgross35 added a commit that referenced this pull request Apr 16, 2025
Discussed at [1], there was an off-by-one mistake when converting from
the loop routine to using `leading_zeros` for normalization.

Currently, using `EXP_BITS` has the effect that `ix` after the branch
has its MSB _one bit to the left_ of the implicit bit's position,
whereas a shift by `EXP_BITS + 1` ensures that the MSB is exactly at the
implicit bit's position, matching what is done for normals (where the
implicit bit is set to be explicit). This doesn't seem to have any
effect in our implementation since the failing test cases from [1]
appear to still have correct results.

Since the result of using `EXP_BITS + 1` is more consistent with what is
done for normals, apply this here.

[1]: #469 (comment)
tgross35 added a commit that referenced this pull request Apr 18, 2025
tgross35 added a commit that referenced this pull request Apr 18, 2025
From discussion at [1] our loop count calculation is incorrect, causing
an issue with subnormal numbers. Add test cases for known failures.

[1]: #469 (comment)
tgross35 added a commit that referenced this pull request Apr 18, 2025
Discussed at [1], there was an off-by-one mistake when converting from
the loop routine to using `leading_zeros` for normalization.

Currently, using `EXP_BITS` has the effect that `ix` after the branch
has its MSB _one bit to the left_ of the implicit bit's position,
whereas a shift by `EXP_BITS + 1` ensures that the MSB is exactly at the
implicit bit's position, matching what is done for normals (where the
implicit bit is set to be explicit). This doesn't seem to have any
effect in our implementation since the failing test cases from [1]
appear to still have correct results.

Since the result of using `EXP_BITS + 1` is more consistent with what is
done for normals, apply this here.

[1]: #469 (comment)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants