Skip to content

Conversation

gcanat
Copy link
Contributor

@gcanat gcanat commented Sep 23, 2024

As suggested in #137, I took inspiration from wc2 to improve wc_file_bytes.

In my benchmarks (tested with 1G utf8.txt file generated with wctool), it is a bit faster than wc2, and on par with wz, except when passing multiple files as argument because wz will parallelize computation, but I guess we don't want to pull rayon as a dependency here :)

@jgarzik
Copy link
Contributor

jgarzik commented Sep 23, 2024

Looks good at a quick glance.

The table initialization can be improved further with a static table, yes? e.g. https://www.reddit.com/r/rust/comments/1dokkh2/recipe_for_compiletime_sparse_array_initialization/

@gcanat
Copy link
Contributor Author

gcanat commented Sep 23, 2024

OK I made the change.
For what it's worth here's the result on 1G utf8.txt

tool branch time
posixutils-rs main 2.66s
posixutils-rs wc2 1.50s
wc2.c main 2.10s
wz main 1.57s
wc v8.32 15.33

@jgarzik jgarzik merged commit 1b1ab55 into rustcoreutils:main Sep 23, 2024
2 checks passed
@jgarzik
Copy link
Contributor

jgarzik commented Sep 23, 2024

Thanks, merged.

Interested in tackling wc_file_chars()? :)

@jgarzik
Copy link
Contributor

jgarzik commented Sep 23, 2024

Post-merge comment: was_space need not appear in CountInfo at all, as far as I can tell.

@gcanat
Copy link
Contributor Author

gcanat commented Sep 23, 2024

Oh yeah, I can have a look at wc_file_chars() in the coming days. And will try to handle was_space differently.

@jgarzik
Copy link
Contributor

jgarzik commented Sep 23, 2024

Oh yeah, I can have a look at wc_file_chars() in the coming days. And will try to handle was_space differently.

Thanks.

re was_space, it appears just a local, since it is not accumulated, and should not count across different files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants