Commit 661d75a
BUG: Snappy checksum check (#2252)
# Rationale for this change
The `SnappyCodec.decompress()` method extracts the CRC32 checksum from
the compressed data **after** the data has already been truncated to
remove the checksum. It therefore reads the wrong 4 bytes for checksum
validation, so the CRC32 check fails even when the data and checksum
are valid.
**Root Cause:**
In the current implementation:
1. `data = data[0:-4]` removes the last 4 bytes (checksum) from the data
2. `checksum = data[-4:]` then tries to get the checksum from the
already-truncated data
3. As a result, `checksum` holds the last 4 bytes of the compressed
payload rather than the actual CRC32 checksum
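
The ordering problem can be reproduced with plain byte slicing, independent of snappy. The sketch below is illustrative only: `payload`, `crc`, and `framed` are hypothetical names, and the payload is a placeholder byte string, not real snappy output.

```python
import struct
import zlib

# Hypothetical stand-in for a snappy-compressed block (not real snappy output).
payload = b"compressed-bytes"
# Avro appends the 4-byte big-endian CRC32 of the *uncompressed* data.
crc = struct.pack(">I", zlib.crc32(b"original data") & 0xFFFFFFFF)
framed = payload + crc

# Buggy order: truncate first, then slice the "checksum".
data = framed[0:-4]
buggy_checksum = data[-4:]  # tail of the payload, not the CRC

# Fixed order: slice the checksum first, then truncate.
fixed_checksum = framed[-4:]
fixed_data = framed[0:-4]

print(buggy_checksum == payload[-4:])  # True
print(fixed_checksum == crc)           # True
```

With the buggy order, the comparison is made against payload bytes, so it would only pass by coincidence.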
**Solution:**
Extract the checksum **before** truncating the data:
```python
checksum = data[-4:] # store checksum before truncating data
data = data[0:-4] # remove checksum from the data
```
This ensures data integrity checks work correctly for snappy-compressed
Avro data.
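
For context, here is a minimal, self-contained sketch of a decompress routine using the corrected order. It is not the project's actual code: `zlib` is swapped in for snappy so the example runs without extra dependencies, and the helper names `compress_with_crc` and `decompress_with_crc`, as well as the `ValueError` type, are assumptions made for illustration.

```python
import struct
import zlib


def compress_with_crc(raw: bytes) -> bytes:
    """Compress and append the big-endian CRC32 of the uncompressed data.

    Assumption: zlib stands in for snappy here.
    """
    return zlib.compress(raw) + struct.pack(">I", zlib.crc32(raw) & 0xFFFFFFFF)


def decompress_with_crc(data: bytes) -> bytes:
    """Decompress a block framed by compress_with_crc and verify its CRC32."""
    checksum = data[-4:]  # store checksum before truncating data
    data = data[0:-4]     # remove checksum from the data
    uncompressed = zlib.decompress(data)
    expected = struct.pack(">I", zlib.crc32(uncompressed) & 0xFFFFFFFF)
    if expected != checksum:
        raise ValueError("Checksum failure")
    return uncompressed


framed = compress_with_crc(b"hello avro")
print(decompress_with_crc(framed))  # b'hello avro'
```

Swapping the first two lines of `decompress_with_crc` reproduces the bug: a valid block would then raise the checksum error unless the payload tail happened to match the CRC.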
# Are these changes tested?
The fix resolves the logical error in the checksum extraction order.
Existing tests should pass, and any snappy-compressed data with valid
checksums will now decompress successfully instead of failing with
"Checksum failure" errors.
The change is minimal and only reorders two existing lines of code,
making it low-risk.
# Are there any user-facing changes?
**Yes** - This is a bug fix that improves functionality:
- **Before:** Snappy-compressed Avro data would fail to decompress with
"Checksum failure" errors even when the data and checksum were valid
- **After:** Snappy-compressed Avro data with valid checksums will
decompress correctly
This fix resolves data integrity validation issues for users working
with snappy-compressed Avro files. No API changes are introduced.

1 parent 58e5ad6 commit 661d75a
1 file changed: +2 -2 lines changed