@sorah sorah commented Aug 10, 2023

zstream_discard_input was encoding- and character-aware when the input was a user-provided string, so it discarded len characters instead of len bytes.

Also, the rdoc for Zlib.gunzip says it is equivalent to the following code, but that code does not fail for a UTF-8 String, whereas Zlib.gunzip does.

string = %w[1f8b0800c28000000003cb48cdc9c9070086a6103605000000].pack("H*").force_encoding('UTF-8')
sio = StringIO.new(string)
gz = Zlib::GzipReader.new(sio, encoding: Encoding::ASCII_8BIT)
p gz.read #=> "hello"
gz&.close
p Zlib.gunzip(string) #=> Zlib::DataError

Reported and discovered by @eagletmt at https://twitter.com/eagletmt/status/1689692467929694209
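
To make the character/byte mismatch concrete, here is a rough Ruby-level sketch using the payload from the report. It is only an analogy: the actual fix lives in the C extension and is not shown here, and `header_len` plus the two slicing calls are assumptions chosen for illustration, not the library's code. The gzip header in this payload is 10 bytes long, and its mtime field happens to contain the bytes C2 80, which form a single valid UTF-8 character.

```ruby
require "zlib"

# Payload from the report: gzip of "hello", tagged as UTF-8 although it is binary.
string = %w[1f8b0800c28000000003cb48cdc9c9070086a6103605000000].pack("H*").force_encoding("UTF-8")

# Hypothetical name: size of the fixed gzip header (no optional fields are set here).
header_len = 10

# Character-indexed slice: C2 80 counts as one character, so skipping
# 10 characters skips more than 10 bytes and eats into the deflate data.
by_chars = string[header_len..-1]

# Byte-indexed slice: skips exactly 10 bytes regardless of the encoding tag.
by_bytes = string.byteslice(header_len, string.bytesize - header_len)

p string.bytesize                     # 25
p string.length == string.bytesize    # false: characters and bytes disagree
p by_bytes.bytesize                   # 15
p by_chars.bytesize < by_bytes.bytesize # true: the character slice dropped too much

# Possible workaround on affected versions: pass a binary (ASCII-8BIT) copy.
p Zlib.gunzip(string.b)
```

Passing `string.b` sidesteps the problem because a binary string has one character per byte, so both ways of counting agree.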

@sorah sorah merged commit c5e58bc into master Aug 10, 2023
@sorah sorah deleted the subseq branch August 10, 2023 20:12
matzbot pushed a commit to ruby/ruby that referenced this pull request Aug 10, 2023
(ruby/zlib#55)

`zstream_discard_input` was encoding- and character-aware when the input was a user-provided string, so it discarded `len` chars instead of `len` bytes.

Also, the rdoc for `Zlib.gunzip` says it is equivalent to the following code, but that code does not fail for a UTF-8 String, whereas `Zlib.gunzip` does.

```ruby
string = %w[1f8b0800c28000000003cb48cdc9c9070086a6103605000000].pack("H*").force_encoding('UTF-8')
sio = StringIO.new(string)
gz = Zlib::GzipReader.new(sio, encoding: Encoding::ASCII_8BIT)
p gz.read #=> "hello"
gz&.close
p Zlib.gunzip(string) #=> Zlib::DataError
```

Reported and discovered by eagletmt at https://twitter.com/eagletmt/status/1689692467929694209

ruby/zlib@c5e58bc62a
@sorah sorah changed the title from "Zlib.gunzip should not fail with utf-8 strings" to "Zlib.gunzip should not fail with non-ascii-8bit strings" Aug 10, 2023