ICE when include_bytes-ing ~>1GB of data in lib.rs #103607

Closed
Heliozoa opened this issue Oct 27, 2022 · 6 comments · Fixed by #113542
Labels
C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Heliozoa
Contributor

Code

dd if=/dev/random of=file bs=1G count=1

// Cargo.toml
[package]
name = "ice"
version = "0.1.0"
edition = "2021"
// lib.rs
const _FILE: &[u8] = include_bytes!("../file");

I was not able to reproduce the crash when I moved the code to main.rs.

Including one ~750MB file did not cause a crash, but including both a ~750MB file and a ~250MB file did.

Meta

Was able to reproduce on stable and nightly
rustc --version --verbose:

rustc 1.64.0 (a55dd71d5 2022-09-19)
binary: rustc
commit-hash: a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52
commit-date: 2022-09-19
host: x86_64-unknown-linux-gnu
release: 1.64.0
LLVM version: 14.0.6

rustc +nightly --version --verbose:

rustc 1.66.0-nightly (bed4ad65b 2022-10-25)
binary: rustc
commit-hash: bed4ad65bf7a1cef39e3d66b3670189581b3b073
commit-date: 2022-10-25
host: x86_64-unknown-linux-gnu
release: 1.66.0-nightly
LLVM version: 15.0.2

Error output

When running cargo (+nightly) check

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: TryFromIntError(())', compiler/rustc_metadata/src/rmeta/table.rs:234:49
Backtrace

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: TryFromIntError(())', compiler/rustc_metadata/src/rmeta/table.rs:234:49
stack backtrace:
   0:     0x7f710456c850 - std::backtrace_rs::backtrace::libunwind::trace::hc24175774fcccc98
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   1:     0x7f710456c850 - std::backtrace_rs::backtrace::trace_unsynchronized::hddff1d85f511dc1b
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x7f710456c850 - std::sys_common::backtrace::_print_fmt::he96e84d8aca849f4
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/sys_common/backtrace.rs:65:5
   3:     0x7f710456c850 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h8c690aa67f9b9f0b
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x7f71045c87ae - core::fmt::write::h9205d5073fda8a0b
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/core/src/fmt/mod.rs:1209:17
   5:     0x7f710455cac5 - std::io::Write::write_fmt::h72ccddbdf2befca8
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/io/mod.rs:1682:15
   6:     0x7f710456c615 - std::sys_common::backtrace::_print::h2243d2fa1c5bb008
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/sys_common/backtrace.rs:47:5
   7:     0x7f710456c615 - std::sys_common::backtrace::print::hf6a50c16875f42b0
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/sys_common/backtrace.rs:34:9
   8:     0x7f710456f41f - std::panicking::default_hook::{{closure}}::h57a3846e717bd3fa
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/panicking.rs:267:22
   9:     0x7f710456f15a - std::panicking::default_hook::h1631ff080b8ff770
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/panicking.rs:286:9
  10:     0x7f710456fc28 - std::panicking::rust_panic_with_hook::h483dc93595b087d7
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/panicking.rs:688:13
  11:     0x7f710456f9c7 - std::panicking::begin_panic_handler::{{closure}}::h7aec963b30e3fcb8
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/panicking.rs:579:13
  12:     0x7f710456ccfc - std::sys_common::backtrace::__rust_end_short_backtrace::hb169843e253050fb
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/sys_common/backtrace.rs:137:18
  13:     0x7f710456f6e2 - rust_begin_unwind
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/panicking.rs:575:5
  14:     0x7f71045c5193 - core::panicking::panic_fmt::h480607ab04f30057
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/core/src/panicking.rs:65:14
  15:     0x7f71045c5703 - core::result::unwrap_failed::h126ae8e0aa9c3f20
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/core/src/result.rs:1791:5
  16:     0x7f710618f32e - <rustc_span[7a2d20772580c7f8]::hygiene::HygieneEncodeContext>::encode::<(&mut rustc_metadata[2692809bc02232ae]::rmeta::encoder::EncodeContext, &mut rustc_metadata[2692809bc02232ae]::rmeta::table::TableBuilder<u32, rustc_metadata[2692809bc02232ae]::rmeta::LazyValue<rustc_span[7a2d20772580c7f8]::hygiene::SyntaxContextData>>, &mut rustc_metadata[2692809bc02232ae]::rmeta::table::TableBuilder<rustc_span[7a2d20772580c7f8]::hygiene::ExpnIndex, rustc_metadata[2692809bc02232ae]::rmeta::LazyValue<rustc_span[7a2d20772580c7f8]::hygiene::ExpnData>>, &mut rustc_metadata[2692809bc02232ae]::rmeta::table::TableBuilder<rustc_span[7a2d20772580c7f8]::hygiene::ExpnIndex, rustc_metadata[2692809bc02232ae]::rmeta::LazyValue<rustc_span[7a2d20772580c7f8]::hygiene::ExpnHash>>), <rustc_metadata[2692809bc02232ae]::rmeta::encoder::EncodeContext>::encode_hygiene::{closure#0}, <rustc_metadata[2692809bc02232ae]::rmeta::encoder::EncodeContext>::encode_hygiene::{closure#1}>
  17:     0x7f710617611b - <rustc_metadata[2692809bc02232ae]::rmeta::encoder::EncodeContext>::encode_crate_root
  18:     0x7f7106a87f9d - rustc_metadata[2692809bc02232ae]::rmeta::encoder::encode_metadata_impl
  19:     0x7f7106a7641e - rustc_data_structures[5434cf185ef65262]::sync::join::<rustc_metadata[2692809bc02232ae]::rmeta::encoder::encode_metadata::{closure#0}, rustc_metadata[2692809bc02232ae]::rmeta::encoder::encode_metadata::{closure#1}, (), ()>
  20:     0x7f7106a762ad - rustc_metadata[2692809bc02232ae]::rmeta::encoder::encode_metadata
  21:     0x7f7106a7520e - rustc_metadata[2692809bc02232ae]::fs::encode_and_write_metadata
  22:     0x7f7106a2a303 - <rustc_interface[fd3d32691052b72b]::passes::QueryContext>::enter::<<rustc_interface[fd3d32691052b72b]::queries::Queries>::ongoing_codegen::{closure#0}::{closure#0}, core[6
  24:     0x7f7106a20512 - <rustc_interface[fd3d32691052b72b]::interface::Compiler>::enter::<rustc_driver[62b0704af81f9759]::run_compiler::{closure#1}::{closure#2}, core[69220df7bf4c31ea]::result::Result<core[69220df7bf4c31ea]::option::Option<rustc_interface[fd3d32691052b72b]::queries::Linker>, rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>>
  25:     0x7f7106a176d2 - rustc_span[7a2d20772580c7f8]::with_source_map::<core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>, rustc_interface[fd3d32691052b72b]::interface::run_compiler<core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>, rustc_driver[62b0704af81f9759]::run_compiler::{closure#1}>::{closure#0}::{closure#1}>
  26:     0x7f7106a171c9 - <scoped_tls[c5a5fd0946957a01]::ScopedKey<rustc_span[7a2d20772580c7f8]::SessionGlobals>>::set::<rustc_interface[fd3d32691052b72b]::interface::run_compiler<core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>, rustc_driver[62b0704af81f9759]::run_compiler::{closure#1}>::{closure#0}, core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>>
  27:     0x7f7106a167d8 - std[e5fb0ad32021ee83]::sys_common::backtrace::__rust_begin_short_backtrace::<rustc_interface[fd3d32691052b72b]::util::run_in_thread_pool_with_globals<rustc_interface[fd3d32691052b72b]::interface::run_compiler<core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>, rustc_driver[62b0704af81f9759]::run_compiler::{closure#1}>::{closure#0}, core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>>::{closure#0}::{closure#0}, core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>>
  28:     0x7f7106a164fc - <<std[e5fb0ad32021ee83]::thread::Builder>::spawn_unchecked_<rustc_interface[fd3d32691052b72b]::util::run_in_thread_pool_with_globals<rustc_interface[fd3d32691052b72b]::interface::run_compiler<core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>, rustc_driver[62b0704af81f9759]::run_compiler::{closure#1}>::{closure#0}, core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>>::{closure#0}::{closure#0}, core[69220df7bf4c31ea]::result::Result<(), rustc_errors[f55630e1eea6f6be]::ErrorGuaranteed>>::{closure#1} as core[69220df7bf4c31ea]::ops::function::FnOnce<()>>::call_once::{shim:vtable#0}
  29:     0x7f71083e2e33 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hab9c0585205bcace
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/alloc/src/boxed.rs:1987:9
  30:     0x7f71083e2e33 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc1b376671ff7f5ba
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/alloc/src/boxed.rs:1987:9
  31:     0x7f71083e2e33 - std::sys::unix::thread::Thread::new::thread_start::h1e626d066d8b584f
                               at /rustc/bed4ad65bf7a1cef39e3d66b3670189581b3b073/library/std/src/sys/unix/thread.rs:108:17
  32:     0x7f71043198fd - <unknown>
  33:     0x7f710439ba60 - <unknown>
  34:                0x0 - <unknown>
error: could not compile `ice`

@Heliozoa Heliozoa added C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 27, 2022
@WaffleLapkin
Member

The crash happens in the metadata encoding here:

let position = self.map_or(0, |lazy| lazy.position.get());
let position: u32 = position.try_into().unwrap();

This code converts a usize to a u32 and panics if the value doesn't fit, which is what happens here. For me the position is 4_328_556_769, which is slightly more than u32::MAX = 4_294_967_295.
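The failing conversion can be reproduced in isolation. The following is only a minimal sketch, not the compiler's code: `narrow_position` is a hypothetical helper, and it takes a `u64` rather than a `usize` so that the example also builds on 32-bit hosts.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of rustc): try to narrow a 64-bit
/// position to `u32`, returning `None` instead of panicking on overflow.
fn narrow_position(position: u64) -> Option<u32> {
    u32::try_from(position).ok()
}

fn main() {
    // The position reported in the comment above.
    let reported: u64 = 4_328_556_769;

    // It exceeds u32::MAX, so the checked conversion fails. Calling
    // `.unwrap()` on that failure is what produces the panic.
    assert!(reported > u64::from(u32::MAX));
    assert_eq!(narrow_position(reported), None);

    // A position that fits converts without loss.
    assert_eq!(narrow_position(1_000), Some(1_000));
}
```

Returning `None` here only makes the failure visible. It is not a fix, since the metadata format still needs a way to store the larger offset.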

MRE is

#![crate_type = "lib"]
const _FILE: &[u8] = include_bytes!("./file");

Create a file with dd if=/dev/random of=file bs=1G count=1, then run rustc t.rs (note that, at least for me, it takes 6 whole minutes to reach the ICE).

Honestly, I'm not sure what the proper fix is here.

@langston-barrett
Contributor

@rustbot label +S-bug-has-mcve

@rustbot rustbot added the S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue label Mar 18, 2023
@soupslurpr

soupslurpr commented Aug 27, 2023

Is there a workaround for this or some other way to include the file as bytes in the code?

@WaffleLapkin
Member

@soupslurpr not that I'm aware of, I'm afraid.

Although as a bit of an update, @saethlin is currently working on lifting the u32 restriction from rmeta: rust-lang/compiler-team#666.

(we also need to check why it takes so long to include big files and possibly try to optimize the code there, but it's another thing)

@saethlin
Member

You may be able to work around this by moving the large const or static to its own crate. That can only get you so far, but it should definitely get you close to the 4 GB limit that we currently have (that is fixed by the PR I just linked).

There's also a performance/memory issue that shows up once you get there, though. I'm now working on that too, and I'll link a PR to this issue if I come up with a fix.

@soupslurpr

OK cool, I'll just wait for it to get merged since I'm doing something else right now. Thanks

@bors bors closed this as completed in d64c845 Aug 30, 2023
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Aug 31, 2023
Adapt table sizes to the contents

This is an implementation of rust-lang/compiler-team#666

The objective of this PR is to permit the rmeta format to accommodate larger crates that need offsets larger than a `u32` can store without compromising performance for crates that do not need such range. The second commit is a number of tiny optimization opportunities I noticed while looking at perf recordings of the first commit.

The rmeta tables need to have fixed-size elements to permit lazy random access. But the size only needs to be fixed _per table_, not per element type. This PR adds another `usize` to the table header which indicates the table element size. As each element of a table is set, we keep track of the widest encoded table value, then don't bother encoding all the unused trailing bytes on each value. When decoding table elements, we copy them to a full-width array if they are not already full-width.
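The per-table width idea described above can be illustrated with a small sketch. This is not the rmeta implementation: `EncodedTable`, `significant_bytes`, and the little-endian `u64` layout are assumptions made for illustration only.

```rust
/// Bytes needed for `value` once the trailing (high-order) zero bytes of
/// its little-endian encoding are dropped. Zero needs no bytes at all.
fn significant_bytes(value: u64) -> usize {
    let bytes = value.to_le_bytes();
    bytes.len() - bytes.iter().rev().take_while(|b| **b == 0).count()
}

/// Hypothetical table: every element is stored with the same `width`,
/// chosen per table from the widest value it holds.
struct EncodedTable {
    width: usize,
    len: usize,
    data: Vec<u8>,
}

impl EncodedTable {
    fn encode(values: &[u64]) -> EncodedTable {
        // Track the widest encoded value in this table.
        let width = values
            .iter()
            .map(|v| significant_bytes(*v))
            .max()
            .unwrap_or(0);
        let mut data = Vec::with_capacity(width * values.len());
        for v in values {
            // Keep only the low `width` bytes; the rest are zero.
            data.extend_from_slice(&v.to_le_bytes()[..width]);
        }
        EncodedTable { width, len: values.len(), data }
    }

    /// Random access stays O(1) because the width is fixed per table.
    fn get(&self, index: usize) -> Option<u64> {
        if index >= self.len {
            return None;
        }
        let start = index * self.width;
        // Copy the narrow element into a full-width array before decoding.
        let mut full = [0u8; 8];
        full[..self.width].copy_from_slice(&self.data[start..start + self.width]);
        Some(u64::from_le_bytes(full))
    }
}

fn main() {
    let small = EncodedTable::encode(&[3, 260, 70_000]);
    assert_eq!(small.width, 3); // 70_000 needs three bytes
    assert_eq!(small.data.len(), 9); // instead of 3 * 8 = 24
    assert_eq!(small.get(1), Some(260));

    // An offset past u32::MAX simply makes this one table wider.
    let large = EncodedTable::encode(&[4_328_556_769]);
    assert_eq!(large.width, 5);
    assert_eq!(large.get(0), Some(4_328_556_769));
}
```

In this sketch, tables with small values pay for fewer bytes per element, while a table that needs an offset above `u32::MAX` grows to five or more bytes per element instead of failing.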

`LazyArray` needs some special treatment. Most other values that are encoded in tables are indexes or offsets, and those tend to be small so we get to drop a lot of zero bytes off the end. But `LazyArray` encodes _two_ small values in a fixed-width table element: A position of the table and the length of the table. The treatment described above could trim zero bytes off the table length, but any nonzero length shields the position bytes from the optimization. To improve this, we interleave the bytes of position and length. This change is responsible for about half of the crate metadata win on many crates.
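To see why interleaving helps, the sketch below compares the trimmed size of a concatenated (position, length) pair with an interleaved one. The function names, the 16-byte cell, and the byte order are hypothetical and are not taken from the PR.

```rust
/// Store position bytes first, then length bytes (the layout that
/// "shields" the position's zero bytes behind a nonzero length).
fn concatenate(position: u64, len: u64) -> [u8; 16] {
    let mut out = [0u8; 16];
    out[..8].copy_from_slice(&position.to_le_bytes());
    out[8..].copy_from_slice(&len.to_le_bytes());
    out
}

/// Alternate position and length bytes so that the high-order zero bytes
/// of both values end up together at the end of the cell.
fn interleave(position: u64, len: u64) -> [u8; 16] {
    let p = position.to_le_bytes();
    let l = len.to_le_bytes();
    let mut out = [0u8; 16];
    for i in 0..8 {
        out[2 * i] = p[i];
        out[2 * i + 1] = l[i];
    }
    out
}

/// Inverse of `interleave`.
fn deinterleave(cell: &[u8; 16]) -> (u64, u64) {
    let mut p = [0u8; 8];
    let mut l = [0u8; 8];
    for i in 0..8 {
        p[i] = cell[2 * i];
        l[i] = cell[2 * i + 1];
    }
    (u64::from_le_bytes(p), u64::from_le_bytes(l))
}

/// Number of bytes left after dropping trailing zero bytes.
fn trimmed_len(bytes: &[u8]) -> usize {
    bytes.len() - bytes.iter().rev().take_while(|b| **b == 0).count()
}

fn main() {
    let (position, len) = (0x1234u64, 5u64);

    // Concatenated: all 8 position bytes survive because the length
    // byte after them is nonzero.
    assert_eq!(trimmed_len(&concatenate(position, len)), 9);

    // Interleaved: only the bytes that carry information survive.
    let cell = interleave(position, len);
    assert_eq!(trimmed_len(&cell), 3);
    assert_eq!(deinterleave(&cell), (position, len));
}
```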

Fixes rust-lang/rust#112934 (probably)
Fixes rust-lang/rust#103607
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Apr 7, 2024
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 27, 2024