Skip to content

CompressStream and DecompressStream really handles deflate content encoding #64

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ajgarlag opened this issue Oct 21, 2016 · 12 comments
Closed

Comments

@ajgarlag
Copy link
Contributor

Q A
Bug? yes
New Feature? no
Version v1.4.0

Actual Behavior

An stream encoded with deflate content-encoding cannot be read by InflateStream, but with DecompressStream.

Expected Behavior

The deflate content-encoding should be managed by InflateStream and DeflateStream while
CompressStream and DecompressStream should handle compress content-encoding.

Steps to Reproduce

<?php

require 'vendor/autoload.php';

$payload = file_get_contents('http://httpbin.org/deflate');

$innerStream1 = new spec\Http\Message\Encoding\MemoryStream($payload);
$inflateStream = new Http\Message\Encoding\InflateStream($innerStream1);
$inflatedContents = $inflateStream->getContents();
var_dump($inflatedContents !== ""); //Should be true

$innerStream2 = new spec\Http\Message\Encoding\MemoryStream($payload);
$decompressStream = new Http\Message\Encoding\DecompressStream($innerStream2);
$decompressedContents = $decompressStream->getContents();
var_dump($decompressedContents === ""); //Should be true

var_dump($inflatedContents);
var_dump($decompressedContents);

$realContents = $decompressedContents;

$innerStream3 = new spec\Http\Message\Encoding\MemoryStream($realContents);
$deflateStream = new Http\Message\Encoding\DeflateStream($innerStream3);
$deflatedContents = $deflateStream->getContents();
var_dump($deflatedContents === $payload); //Should be true

$innerStream4 = new spec\Http\Message\Encoding\MemoryStream($realContents);
$compressStream = new Http\Message\Encoding\CompressStream($innerStream4);
$compressedContents = $compressStream->getContents();
var_dump($compressedContents !== $payload); //Should be true

Possible Solutions

To avoid BC breaks, I think we should add a warning in the documentation and in the class docblocks
and rename them in 2.0.
I'm looking for and HTTP endpoint to test compress content-encoding, but I have not found anyone.

@joelwurtz
Copy link
Member

joelwurtz commented Oct 21, 2016

Are we sure this is an error from our side ? it seems that the deflate of httpbin return in fact a compress, test it with :

<?php

$payload = file_get_contents('http://httpbin.org/deflate');

var_dump(gzinflate($payload));
var_dump(gzuncompress($payload));

@ajgarlag
Copy link
Contributor Author

ajgarlag commented Oct 21, 2016

That was my first thought, so I tested it at http://www.gidnetwork.com/tools/gzip-test.php which reports that http://httpbin.org/deflate compression type is deflate.

Also, reading http://php.net/manual/en/filters.compression.php I saw that all examples related to deflate encoding use a window option with value 15 which is the value used in CompressStream and DecompressStream.

It would be useful to have another endpoint to test the deflate content encoding.

@joelwurtz
Copy link
Member

This indicate that PHP may be wrong then as here : https://github.com/php/php-src/blob/master/ext/zlib/php_zlib.h#L31

You can see that PHP_ZLIB_ENCODING_RAW is -15, and this windows size is used by the gzdeflate / gzinflate php function as see here : https://github.com/php/php-src/blob/master/ext/zlib/zlib.c#L728

However php doc say in http://php.net/manual/en/function.gzdeflate.php is RFC 1950 + 1951 so it should use the windows size of 15 and not -15 no ?

@joelwurtz
Copy link
Member

So finally, PHP is right, it's the right window as a deflate stream like said in RFC 1951 is basically a RFC 1950 withtout the headers which is created in zlib by specifing a window of -15

However if you look at httpbin here : https://github.com/Runscope/httpbin/blob/master/httpbin/filters.py#L79

They use zlib.compressObj without any parameters which use by default a window size of 15 (like described here : https://docs.python.org/2/library/zlib.html)

This window size set a header like specified in RFC 1950 (but no compliant with 1951).

Both RFC use the same compression algorithm that's why i think http://www.gidnetwork.com/tools/gzip-test.php report this as being a deflate (to test, but i think it will indicate deflate with or without the headers)

@ajgarlag
Copy link
Contributor Author

The best definition for the deflate content encoding can be located at the RFC 7230 Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing which in section 4.2.2 describes the Deflate Coding with these words:

The "deflate" coding is a "zlib" data format [RFC1950] containing a
"deflate" compressed data stream [RFC1951] that uses a combination of
the Lempel-Ziv (LZ77) compression algorithm and Huffman coding.

Note: Some non-conformant implementations send the "deflate"
compressed data without the zlib wrapper.

Another test tool that identifies http://httpbin.org/deflate as deflate content encoding clarifies that:

deflate - despite its name the zlib compression (RFC 1950) should be used (in combination with the deflate compression (RFC 1951)) as described in the RFC 2616. The implementation in the real world however seems to vary between the zlib compression and the (raw) deflate compression. Due to this confusion, gzip has positioned itself as the more reliable default method.

So I think that the right window size to use is 15, or even better, the PHP constant ZLIB_ENCODING_DEFLATE.

Anyway, as long as major HTTP servers like Apache and Nginx only support GZIP compression, I think we must deprecate CompressStream, DecompressStream, DeflateStream and InflateStream in 2.0

@joelwurtz
Copy link
Member

Yes, right, deflate is the a windows size of 15, i think i misinterpret some functions names and http content encoding.

In fact there is no "compress" support in PHP, and we should not support this IMO as there is no interest (since everybody use deflate nowadays)

@joelwurtz
Copy link
Member

Finally,

I think our Stream is right, maybe the wording is a little wrong, but Compress is RFC 1950, and Deflate is RFC 1951. The only update here could maybe be a documentation change to explain better which one to use (in fact there is no interest to use directly the Deflate / Inflate Stream).

However the DecoderPlugin is wrong :

  • It should use the DecompressStream when having deflate in Transfer/Content Encoding
  • It should not support the compress value in this header (as there is no support in PHP, and nobody use it)

@ajgarlag
Copy link
Contributor Author

IMO we must even deprecate the deflate content-encoding in 2.0 and support only gzip compression.

@dbu
Copy link
Contributor

dbu commented Oct 26, 2016 via email

@ajgarlag
Copy link
Contributor Author

I think that deflate support should be deprecated because I could not found any endpoint (apart from http://httpbin.org/deflate) that support this content-encoding.

Anyway, it's trivial to fix, so maybe we should drop support for compress only.

@dbu
Copy link
Contributor

dbu commented Oct 27, 2016

as long as we can make reasonably sure that we handle it correctly and its not a huge effort, i'd prefer to keep it in for completeness sake.

@joelwurtz
Copy link
Member

Closing this as this issue was more in the client-common and was fixed a long time ago

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants