Skip to content

Conversation

piotrrzysko
Copy link
Contributor

First of all, thank you for this library!

This work is not finished yet, since computing the maximum compressed length is not implemented for all schemes. So far, I've added support only for the headless compression versions of bit-packing and variable byte.

I wanted to get quick feedback before spending more time analyzing the other schemes.

Motivation for this change

  1. I'd like to remove UncompressibleInputException. Handling it adds unnecessary complexity for applications using the library. If it's thrown, it also introduces a performance penalty, since we've already wasted time trying to compress "uncompressible" data.
  2. Having each scheme estimate its own maximum size allows for more precision than the generic formula previously used in the Compressor classes: input.length + input.length / 100 + 1024

Notes

For bit-packing and variable byte, the estimation could be made even more precise if we could pass the maximum bit width of the input array to maxHeadlessCompressedLength. In my case, this value is known, but I'm not sure if that applies to other schemes or whether it's a general enough case for the library.

@lemire
Copy link
Member

lemire commented Sep 29, 2025

This looks like a good addition.

Calculating the maximum compressed length is now implemented
for all schemes except:
* Kamikaze — as stated in the Javadoc, it is not intended for production use.
* VectorFastPFOR — it does not appear to be supported for production yet,
  as it is not included in the release package.
Since we can now calculate the maximum compressed length, there is no longer
a need to handle underestimated output array lengths by throwing this exception.
@piotrrzysko
Copy link
Contributor Author

I addressed the remaining schemes except:

  • Kamikaze - as stated in the Javadoc, it is not intended for production use
  • VectorFastPFOR - it does not appear to be supported for production yet, as it is not included in the release package

@piotrrzysko piotrrzysko marked this pull request as ready for review October 1, 2025 10:16
@lemire
Copy link
Member

lemire commented Oct 1, 2025

Running tests. If they go green, I will recommend merging.

@piotrrzysko
Copy link
Contributor Author

The build is green. Could you please merge the PR?

@lemire lemire merged commit f3c52d4 into fast-pack:master Oct 3, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants