Skip to content

baszalmstra/async-compression-bzip2-truncation-issue

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

async-compression BzDecoder Truncation Issue Reproducer

This repository demonstrates a data integrity issue in the async-compression crate where BzDecoder silently accepts truncated bzip2 streams without raising errors.

The Problem

When given a truncated bzip2 stream:

  • Sync bzip2::read::BzDecoder: Correctly fails with UnexpectedEof error
  • Async async-compression::BzDecoder: Silently succeeds with 0 bytes decompressed (no error!)

This can lead to silent data corruption in applications that rely on proper error handling for data integrity.

Quick Start

# Clone the repository
git clone https://github.com/YOUR_USERNAME/async-compression-bzip2-truncation-issue.git
cd async-compression-bzip2-truncation-issue

# Run the reproducer
cargo run

Expected Output

=== Demonstrating async-compression BzDecoder truncation issue ===

Original data size: 5400 bytes
Compressed data size: 142 bytes
Truncated data size: 71 bytes (50% of compressed)

--- Test 1: Sync bzip2::read::BzDecoder ---
✗ Sync decoder failed (this is expected for truncated data)
  Error: Custom { kind: UnexpectedEof, error: "decompression not finished but EOF reached" }

--- Test 2: Async async-compression BzDecoder ---
✓ Async decoder succeeded
  Bytes decompressed: 0
  Output size: 0 bytes
  ⚠ WARNING: Decompressed 0 bytes from truncated stream!
  🔴 ISSUE: No error was raised despite receiving truncated data!

=== Summary ===
The async-compression BzDecoder silently accepts truncated bzip2 streams
and returns success with 0 bytes decompressed, rather than raising an error.
This can lead to silent data corruption in applications that rely on proper
error handling for data integrity.

What This Demonstrates

The code in src/main.rs:

  1. Creates test data and compresses it with bzip2
  2. Truncates the compressed data to 50% of its original size
  3. Tests decompression with both sync and async decoders
  4. Shows that the sync decoder correctly errors, while the async decoder silently succeeds

Impact

This is a serious data integrity issue because:

  • Silent data loss: Applications cannot detect that decompression failed
  • Cache corruption: Package managers may cache incomplete data as "valid"
  • Security concerns: Applications expecting validation through error handling will accept corrupted streams

Real-World Context

This issue was discovered while implementing async tar.bz2 extraction for the Rattler package manager. Test suites that simulate network failures by truncating streams revealed that the async implementation silently accepted corrupted data where the sync implementation correctly raised errors.

Related Issue

See the upstream issue: async-compression#XXX

Environment

  • async-compression version: 0.4.32
  • bzip2 version: 0.6.1
  • tokio version: 1.48.0
  • Rust version: 1.83+

License

This reproducer is released under MIT OR Apache-2.0 (same as async-compression).

About

No description, website, or topics provided.

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages