Skip to content

invalid opcode regression in x86_64-unknown-linux-musl release builds while compiling code using generic-array #135997

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
SohumB opened this issue Jan 24, 2025 · 3 comments · Fixed by #136450
Assignees
Labels
A-mir-opt Area: MIR optimizations A-mir-opt-GVN Area: MIR opt Global Value Numbering (GVN) C-bug Category: This is a bug. I-miscompile Issue: Correct Rust code lowers to incorrect machine code O-musl Target: The musl libc P-critical Critical priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@SohumB
Copy link

SohumB commented Jan 24, 2025

Repro.

searched nightlies: from nightly-2024-11-21 to nightly-2025-01-24
regressed nightly: nightly-2025-01-10
searched commit range: a580b5c...8247594
regressed commit: b6b8361

bisected with cargo-bisect-rustc v0.6.9

Host triple: x86_64-unknown-linux-gnu
Reproduce with:

cargo bisect-rustc -vv --start=2024-11-21 --target=x86_64-unknown-linux-musl --script=run.sh

I definitely don't understand MIR, but I was curious enough to look at it, so in case it helps anyone else, here's what appears to be the relevant section (without the StorageLive/StorageDead calls)

succeeding
        _2 = copy (_1.0: Box<GenericArray<usize, U1>>);

        _3 = copy (((_2.0: Unique<GenericArray<usize, U1>>).0: NonNull<GenericArray<usize, U1>>).0: 
                    *const GenericArray<usize, U1>);

        _4 = *const [usize] from (copy _3, const 1_usize);
        _6 = NonNull::<[usize]> { pointer: copy _4 };
        _11 = copy _6 as *mut [usize] (Transmute);
        _10 = copy _11 as *const usize (PtrToPtr);

        _5 = NonNull::<usize> { pointer: move _10 };
        _9 = copy _5 as *mut usize (Transmute);
        _8 = Offset(copy _9, const 1_usize);
        _7 = move _8 as *const usize (PtrToPtr);
        drop(_1) -> [return: bb2, unwind continue];
failing
        _2 = copy (_1.0: Box<GenericArray<usize, U1>>);

        _3 = copy ((_2.0: Unique<GenericArray<usize, U1>>).0: NonNull<GenericArray<usize, U1>>) 
                    as *const GenericArray<usize, U1> (Transmute);

        _4 = *const [usize] from (copy _3, const 1_usize);
        _9 = copy _4 as *const usize (Transmute);

        _5 = NonNull::<usize> { pointer: move _9 };
        _8 = copy _4 as *mut usize (Transmute);
        _7 = Offset(copy _8, const 1_usize);
        _6 = move _7 as *const usize (PtrToPtr);
        drop(_1) -> [return: bb2, unwind continue];
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 24, 2025
@lqd
Copy link
Member

lqd commented Jan 24, 2025

(cc @scottmcm on that bisection)

@SohumB
Copy link
Author

SohumB commented Jan 24, 2025

        _5 = NonNull::<usize> { pointer: move _9 };
        _8 = copy _4 as *mut usize (Transmute);

Is this the issue? _5 is created but then never used in the failing MIR. In the succeeding MIR, it's what is copy _5'd.

@jieyouxu jieyouxu added O-musl Target: The musl libc A-mir-opt Area: MIR optimizations A-mir-opt-GVN Area: MIR opt Global Value Numbering (GVN) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-miscompile Issue: Correct Rust code lowers to incorrect machine code C-bug Category: This is a bug. labels Jan 24, 2025
@scottmcm
Copy link
Member

Hmm, the metadata checks should be keeping things from merging that *const [usize] to *const usize cast

// PtrToPtr-then-Transmute is fine so long as the pointer cast is identity:
// `*const T -> *mut T -> NonNull<T>` is fine, but we need to check for narrowing
// to skip things like `*const [i32] -> *const i32 -> NonNull<T>`.
(PtrToPtr, Transmute)
if self.pointers_have_same_metadata(inner_from, inner_to) =>
{
Some(Transmute)
}
// Similarly, for Transmute-then-PtrToPtr. Note that we need to check different
// variables for their metadata, and thus this can't merge with the previous arm.
(Transmute, PtrToPtr) if self.pointers_have_same_metadata(from, to) => {
Some(Transmute)
}

But clearly something in there is going wrong. I'll take a look.

@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 25, 2025
@Noratrieb Noratrieb added the P-critical Critical priority label Feb 1, 2025
@compiler-errors compiler-errors self-assigned this Feb 3, 2025
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 6, 2025
…thlin

Don't reset cast kind without also updating the operand in `simplify_cast` in GVN

Consider this heavily elided segment of the pre-GVN example code that was committed as a test:

```rust
          let _4: *const ();
          let _5: *const [()];
          let mut _6: *const ();
          let _7: *mut ();
          let mut _8: *const [()];
          let mut _9: std::boxed::Box<()>;
          let mut _10: *const ();
          /* ... */
          // Deref a box
          _10 = copy ((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>) as *const () (Transmute);
          _4 = copy _10;
          _6 = copy _4;
          // Inlined body of `slice::from_raw_parts`, to turn a unit pointer into a slice-of-unit pointer
          _5 = *const [()] from (copy _6, copy _11);
          _8 = copy _5;
          // Cast the raw slice-of-unit pointer back to a unit pointer
          _7 = copy _8 as *mut () (PtrToPtr);
```

A malformed optimization was changing `_7` (which casted the slice-of-unit ptr to a unit ptr) to:

```
          _7 = copy _5 as *mut () (Transmute);
```

...where `_8` was just replaced with `_5` bc of simple copy propagation, that part is not important... the CastKind changing to Transmute is the important part here.

In rust-lang#133324, two new functionalities were implemented:
* Peeking through unsized -> sized PtrToPtr casts whose operand is `AggregateKind::RawPtr`, to turn it into PtrToPtr casts of the base of the aggregate. In this case, this allows us to see that the value of `_7` is just a ptr-to-ptr cast of `_6`.
* Folding a PtrToPtr cast of an operand which is a Transmute cast into just a single Transmute, which (theoretically) allows us to treat `_7` as a transmute into `*mut ()` of the base of the cast of `_10`, which is the place projection of `((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>)`.

However, when applying those two subsequent optimizations, we must *not* update the CastKind of the final cast *unless* we also update the operand of the cast, since the operand may no longer make sense with the updated CastKind.

In this case, this is problematic because the type of `_8` is `*const [()]`, but that operand in assignment statement of `_7` does *not* get turned into something like `((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>)` -- **in other words, `try_to_operand` fails** -- because GVN only turns value nodes into locals or consts, not projections of locals. So we fail to update the operand, but we still update the CastKind to Transmute, which means we now are transmuting types of different sizes (a wide pointer and a thin pointer).

r? `@scottmcm` or `@cjgillot`

Fixes rust-lang#136361
Fixes rust-lang#135997
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 7, 2025
…thlin

Don't reset cast kind without also updating the operand in `simplify_cast` in GVN

Consider this heavily elided segment of the pre-GVN example code that was committed as a test:

```rust
          let _4: *const ();
          let _5: *const [()];
          let mut _6: *const ();
          let _7: *mut ();
          let mut _8: *const [()];
          let mut _9: std::boxed::Box<()>;
          let mut _10: *const ();
          /* ... */
          // Deref a box
          _10 = copy ((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>) as *const () (Transmute);
          _4 = copy _10;
          _6 = copy _4;
          // Inlined body of `slice::from_raw_parts`, to turn a unit pointer into a slice-of-unit pointer
          _5 = *const [()] from (copy _6, copy _11);
          _8 = copy _5;
          // Cast the raw slice-of-unit pointer back to a unit pointer
          _7 = copy _8 as *mut () (PtrToPtr);
```

A malformed optimization was changing `_7` (which casted the slice-of-unit ptr to a unit ptr) to:

```
          _7 = copy _5 as *mut () (Transmute);
```

...where `_8` was just replaced with `_5` bc of simple copy propagation, that part is not important... the CastKind changing to Transmute is the important part here.

In rust-lang#133324, two new functionalities were implemented:
* Peeking through unsized -> sized PtrToPtr casts whose operand is `AggregateKind::RawPtr`, to turn it into PtrToPtr casts of the base of the aggregate. In this case, this allows us to see that the value of `_7` is just a ptr-to-ptr cast of `_6`.
* Folding a PtrToPtr cast of an operand which is a Transmute cast into just a single Transmute, which (theoretically) allows us to treat `_7` as a transmute into `*mut ()` of the base of the cast of `_10`, which is the place projection of `((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>)`.

However, when applying those two subsequent optimizations, we must *not* update the CastKind of the final cast *unless* we also update the operand of the cast, since the operand may no longer make sense with the updated CastKind.

In this case, this is problematic because the type of `_8` is `*const [()]`, but that operand in assignment statement of `_7` does *not* get turned into something like `((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>)` -- **in other words, `try_to_operand` fails** -- because GVN only turns value nodes into locals or consts, not projections of locals. So we fail to update the operand, but we still update the CastKind to Transmute, which means we now are transmuting types of different sizes (a wide pointer and a thin pointer).

r? `@scottmcm` or `@cjgillot`

Fixes rust-lang#136361
Fixes rust-lang#135997
@bors bors closed this as completed in 550e035 Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mir-opt Area: MIR optimizations A-mir-opt-GVN Area: MIR opt Global Value Numbering (GVN) C-bug Category: This is a bug. I-miscompile Issue: Correct Rust code lowers to incorrect machine code O-musl Target: The musl libc P-critical Critical priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants