Closed
Labels
enhancement: New feature or request
Description
Is your feature request related to a problem or challenge?
The size of DataFusion's binary has grown significantly over the last few releases.
This likely means longer compile times as well as a larger overall binary.
| version | size of datafusion-cli binary |
|---|---|
| main at 57d1309 | 92M |
| 43.0.0 | 87M |
| 42.0.0 | 83M |
| 41.0.0 | 72M |
| 40.0.0 | 69M |
| 39.0.0 | 68M |
The sizes are measured like this (where <version> is the tag or commit from the table):
git checkout <version>
cd datafusion-cli
cargo build --release
du -h target/release/datafusion-cli
Also, people such as @g3blv have noticed that the WASM build has grown by 50%:
#9834 (comment)
Describe the solution you'd like
I would like to reduce the binary size of DataFusion if possible.
At the very least, I would like to understand where the code size comes from and offer hints about how to reduce it if needed.
Describe alternatives you've considered
A common source of code size is generic ("templated") functions: the compiler monomorphizes them, generating a separate copy of the function body for each concrete type they are used with.
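As a rough illustration of that effect (a minimal sketch, not code from DataFusion; the function names `count_words_generic` and `count_words` are hypothetical), the example below contrasts a fully generic function with the common "thin generic wrapper around a non-generic inner function" pattern. In the second form only the small conversion shim is duplicated per type, while the real work is compiled once. How much this saves in practice depends on inlining and optimization settings, so it would need to be confirmed with cargo bloat.

```rust
// Hypothetical example; none of these functions exist in DataFusion.

/// Fully generic: the compiler emits one copy of the whole body for every
/// concrete `S` this is called with (`&str`, `String`, ...).
fn count_words_generic<S: AsRef<str>>(input: S) -> usize {
    input.as_ref().split_whitespace().count()
}

/// Thin generic shim: only the `as_ref()` conversion is duplicated per type.
fn count_words<S: AsRef<str>>(input: S) -> usize {
    // Non-generic inner function, compiled exactly once.
    fn inner(input: &str) -> usize {
        input.split_whitespace().count()
    }
    inner(input.as_ref())
}

fn main() {
    // Each generic function is instantiated twice here: for `&str` and `String`.
    let a = count_words_generic("select 1 from t");
    let b = count_words_generic(String::from("select 1 from t"));
    let c = count_words("select 1 from t");
    let d = count_words(String::from("select 1 from t"));
    println!("{a} {b} {c} {d}");
}
```

Both versions return the same results; the difference is only in how much machine code is generated per instantiation.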
Here is some fascinating information from running cargo bloat -p datafusion:
File .text Size Crate Name
0.1% 0.3% 79.7KiB blake2 blake2::Blake2bVarCore::compress
0.1% 0.2% 70.7KiB blake2 blake2::Blake2sVarCore::compress
0.1% 0.2% 67.1KiB sqlparser <sqlparser::ast::Statement as core::fmt::Display>::fmt
0.1% 0.2% 61.4KiB blake3 _blake3_hash4_neon
0.1% 0.2% 56.4KiB chrono_tz <chrono_tz::timezones::Tz as chrono_tz::timezone_impl::TimeSpans>::timespans
0.1% 0.2% 44.7KiB arrow_cast <i64 as lexical_write_integer::api::ToLexical>::to_lexical
0.1% 0.1% 42.8KiB arrow_cast arrow_cast::cast::cast_with_options
0.0% 0.1% 35.9KiB rand <rand_chacha::chacha::ChaCha12Core as rand_core::block::BlockRngCore>::generate
0.0% 0.1% 34.9KiB arrow_cast lexical_parse_float::slow::parse_mantissa
0.0% 0.1% 33.1KiB arrow_cast lexical_parse_float::parse::parse_complete
0.0% 0.1% 33.1KiB arrow_cast lexical_parse_float::parse::parse_complete
0.0% 0.1% 29.0KiB regex_automata regex_automata::hybrid::search::find_fwd
0.0% 0.1% 27.6KiB blake3 blake3::portable::compress_in_place
0.0% 0.1% 27.1KiB aho_corasick aho_corasick::automaton::try_find_fwd
0.0% 0.1% 25.2KiB sqlparser <sqlparser::ast::Expr as core::fmt::Display>::fmt
0.0% 0.1% 23.8KiB datafusion_common datafusion_common::scalar::ScalarValue::iter_to_array
0.0% 0.1% 23.7KiB datafusion_common datafusion_common::scalar::ScalarValue::iter_to_array
0.0% 0.1% 23.7KiB datafusion_physical_expr datafusion_common::scalar::ScalarValue::iter_to_array
0.0% 0.1% 23.7KiB datafusion_functions_aggregate datafusion_common::scalar::ScalarValue::iter_to_array
0.0% 0.1% 22.0KiB arrow_cast <u64 as lexical_write_integer::api::ToLexical>::to_lexical
36.7% 97.4% 27.7MiB And 139272 smaller methods. Use -n N to show more.
37.7% 100.0% 28.4MiB .text section size, the file size is 75.4MiB
Additional context
No response