[Discussion] What should we do about HPACK never-indexed header fields? #194
Comments
So the problem here is that it's really up to the developers to decide which headers should not be compressed for security reasons. While it is sane to assume that by default (with maybe a simple list/set that can be set by the hyper / hpack user) we should not compress Authorization or Cookie headers, really any header containing session-identifying information could be vulnerable to the attack. I've seen many a REST API, for example, using custom X- headers for storing session and authentication data. There would be no "catch all" for Hyper / HPACK as far as I can see, and since it's a library, providing that might be out of scope anyway. |
Speaking as the nghttp2 maintainer: nghttp2 uses never-indexed literals for Authorization header fields, and for Cookie header fields whose value is shorter than 20 bytes. The 20-byte cookie threshold comes from the Firefox codebase. Since we'd like to prevent exact-match guessing, it is the short, low-entropy cookies we care about here. Authorization values are typically short, so they are always encoded as never indexed. |
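The policy described in the comment above can be sketched in a few lines of Python. This is only an illustration, not nghttp2's actual code: the function name `should_never_index` and the constant `SHORT_COOKIE_LIMIT` are hypothetical, and the sketch assumes header names and values arrive as strings.

```python
# Hypothetical sketch of the policy described above; not nghttp2's real code.
SHORT_COOKIE_LIMIT = 20  # bytes, the threshold quoted in the comment


def should_never_index(name: str, value: str) -> bool:
    """Return True if the header should be sent as a never-indexed literal."""
    name = name.lower()
    if name == "authorization":
        # Authorization values are always treated as sensitive.
        return True
    if name == "cookie" and len(value.encode("utf-8")) < SHORT_COOKIE_LIMIT:
        # Only short cookies are protected under this policy.
        return True
    return False
```

Under these assumptions, `should_never_index("cookie", "sid=1")` is true while a 20-byte cookie value would still be indexed.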
Thanks @tatsuhiro-t, that's extremely valuable feedback. It's also almost exactly what I was considering doing (except for the length limitation on cookies, which is a good idea). |
This is just my 2 cents, I am a bit hesitant to support the 20 byte limit: pulling off a successful CRIME attack really comes down to how many requests can be made against the system that compresses the sensitive header. It is definitely a "brute force" guessing attack, but it is made easier by the fact that you should be able to make many simultaneous requests against the compressing host given the resources (like a large bot network). At the very very least I'd say it should be an additional option that can be changed with an API call (and in general maybe we want to extend the ability to have a maximum byte length for any header set by the user to never be indexed). I hope that makes sense? |
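One way to picture the "additional option" suggested above is a small policy object that the user configures. Everything here is an assumption for illustration: `NeverIndexPolicy`, its fields, and its defaults are hypothetical and are not part of hpack or hyper-h2.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass
class NeverIndexPolicy:
    """Hypothetical user-configurable policy; not an hpack or hyper-h2 API."""

    # Headers that are never indexed, whatever their length.
    always: FrozenSet[str] = frozenset({"authorization"})
    # Headers that are never indexed when the value is shorter than the limit.
    max_length: Dict[str, int] = field(default_factory=lambda: {"cookie": 20})

    def is_sensitive(self, name: str, value: str) -> bool:
        name = name.lower()
        if name in self.always:
            return True
        limit = self.max_length.get(name)
        return limit is not None and len(value) < limit


# A stricter configuration: protect every cookie and a custom session header.
strict = NeverIndexPolicy(
    always=frozenset({"authorization", "cookie", "x-session-token"}),
    max_length={},
)
```

With a design like this, the 20-byte threshold becomes a default rather than a fixed rule, and custom `X-` session headers can be covered by the user.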
For mitmproxy (or middleboxes in general), if we would like to follow the
|
@mhils @Lukasa Maybe what HPACK and eventually Hyper emits in terms of headers could also include an element specifying that it is not to be indexed? IIRC the hpack decoder takes care of determining which headers from the sender were not indexed but doesn't actually make that information available in any meaningful way. @Lukasa do you think it is something the table object could / should track? |
The table object shouldn't track it: that's part of the purpose of the never indexed headers, they shouldn't reside in the compression tables in any way. I'm beginning to wonder if we should transition to a "richer" representation of headers when generating them from HPACK, and then when working with them in hyper-h2. For example, we could use namedtuples that contain a value indicating whether they're indexed and emit those from HPACK and then work with them inside hyper-h2. That would allow us to attach richer metadata to each header (for now just the never-index flag) and provide the user with objects they can hook into to flag that appropriately. The way I see it, we can go a few different ways with this:
Of the set of ideas I think I prefer 4, even though it requires the most work, because it enables us to keep most of the current behaviour the same and provides the nicest API for checking whether headers are safe to index. It also makes mitmproxy's job easier (it can pass the entire header block back into hyper-h2 and just automatically get the right behaviour), and allows us to provide a declarative API to consumers that want to mark headers as never indexed ("instantiate this special kind of tuple!"). Thoughts? |
@Lukasa If I understand correctly, would we need to rework the interface for the encoder/decoder, or just the emissions of the decoder (HPACK), for option 4? I agree it sounds like the better solution. |
The encoder would need to be reworked too, but it's fairly minor: we can enhance the encoder to replace three-tuples with these special tuples. Shouldn't take long at all. =) |
So, the tuples would be implemented roughly like this:

    class HeaderTuple(tuple):
        __slots__ = ()
        indexable = True

        def __new__(_cls, *args):
            return tuple.__new__(_cls, args)

    class NeverIndexedHeaderTuple(HeaderTuple):
        __slots__ = ()
        indexable = False

This has a nice side-effect: because indexable isn't mutable (as no tuple fields are mutable), it becomes very difficult to accidentally break these header tuples. They also unpack exactly as one would expect, and altogether just behave properly. |
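A short usage sketch of the two classes from the comment above, showing the unpacking and the read-only flag it describes. The classes are repeated so the snippet runs on its own; the header value is made up for the example.

```python
class HeaderTuple(tuple):
    __slots__ = ()
    indexable = True

    def __new__(cls, *args):
        return tuple.__new__(cls, args)


class NeverIndexedHeaderTuple(HeaderTuple):
    __slots__ = ()
    indexable = False


header = NeverIndexedHeaderTuple("authorization", "Basic dXNlcjpwYXNz")

# Unpacks and compares like a plain two-tuple.
name, value = header
assert header == ("authorization", "Basic dXNlcjpwYXNz")
assert header.indexable is False

# With empty __slots__ there is no instance __dict__, so the flag cannot be
# overridden on an individual header.
try:
    header.indexable = True
except AttributeError:
    rejected = True
else:
    rejected = False
```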
Do you really need to define |
@sigmavirus24 To get it to behave like a tuple you do. The tuple constructor doesn't actually take multiple arguments, it takes a single iterable. That feels really awkward when what you want to write is |
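The point about the tuple constructor can be checked directly. In this sketch, `Passthrough` and `Packed` are hypothetical names used only to contrast a subclass that inherits tuple's constructor with one that packs its positional arguments.

```python
# tuple() accepts at most one argument, and it must be an iterable.
try:
    tuple("name", "value")
except TypeError:
    plain_tuple_rejects_two_args = True
else:
    plain_tuple_rejects_two_args = False


class Passthrough(tuple):
    """Hypothetical subclass that inherits tuple's constructor unchanged."""
    __slots__ = ()


class Packed(tuple):
    """Hypothetical subclass that packs positional arguments itself."""
    __slots__ = ()

    def __new__(cls, *args):
        return tuple.__new__(cls, args)


awkward = Passthrough(("name", "value"))  # needs an extra pair of parentheses
natural = Packed("name", "value")         # reads like an ordinary call
```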
Ok, hpack v2.2.0 has just been shipped with the brand new tuple classes. We can plumb support through hyper-h2 now. |
This isn't done yet, silly GitHub. |
This is a complex issue and I may fork it out into multiple places once we've identified work items (if any), but I'd like to briefly talk about the problem.
RFC 7541 (the HPACK specification) provides support for what it calls "never indexed" header fields (§ 6.2.3). These fields have certain restrictions, which exist to serve one specific goal:
The core reasoning is discussed at length in RFC 7541 § 7.1, but can be summarised as follows. It is possible for attackers to mount attacks similar to the CRIME attack against the HPACK compression algorithm state. Put another way, if the attacker is capable of getting any entity that emits privacy-sensitive headers to emit headers of their own construction, they are potentially able to use the size of the responses to probe the compression state of the endpoint. That can expose users to the risk of having their credentials stolen: obviously very bad.
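To make the probing idea concrete, here is a deliberately simplified toy model. It is not the HPACK algorithm: it assumes a header already in the table costs one byte and anything else costs a literal, and it ignores eviction, the static table, and Huffman coding. `ToyCompressor` and `probe` are hypothetical names.

```python
class ToyCompressor:
    """Toy stand-in for an HPACK encoder; NOT the real algorithm."""

    def __init__(self):
        self.table = set()

    def encoded_size(self, name, value, never_index=False):
        if (name, value) in self.table:
            return 1  # cheap reference to an existing table entry
        if not never_index:
            self.table.add((name, value))
        return 2 + len(name) + len(value)  # literal: cost grows with the text


def probe(compressor, name, guesses):
    """Return the guesses whose encoded size reveals a table hit."""
    return [g for g in guesses if compressor.encoded_size(name, g) == 1]


guesses = ["sid=aaaa", "sid=s3cr", "sid=zzzz"]

# Victim's cookie is indexed: the matching guess encodes to a single byte.
leaky = ToyCompressor()
leaky.encoded_size("cookie", "sid=s3cr")
leaked = probe(leaky, "cookie", guesses)

# Victim's cookie is never indexed: no guess can hit it in the table.
safe = ToyCompressor()
safe.encoded_size("cookie", "sid=s3cr", never_index=True)
not_leaked = probe(safe, "cookie", guesses)
```

In this model the attacker learns the secret only when the whole value matches, which is why short values with little entropy are the most exposed.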
RFC 7541 points out that
The cases that worry me here are:
Happily, RFC 7541's "never indexed" literals exist to solve this problem. These header fields are sent in their literal form with one extra caveat: intermediaries MUST NOT translate them to any other form. That means that they never get added to the compression context of any HTTP/2 box in the network.
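For reference, the wire form of such a field is small enough to build by hand. The sketch below encodes a never-indexed literal with a new name and no Huffman coding, following RFC 7541 § 5.1 (integers), § 5.2 (strings) and § 6.2.3. The function names are hypothetical and are not the hpack library's API; the sample header is the `password: secret` field used in the RFC's appendix C.2.3.

```python
def encode_integer(value: int, prefix_bits: int) -> bytearray:
    """Encode an integer with an N-bit prefix (RFC 7541, section 5.1)."""
    max_prefix = (1 << prefix_bits) - 1
    if value < max_prefix:
        return bytearray([value])
    out = bytearray([max_prefix])
    value -= max_prefix
    while value >= 128:
        out.append((value % 128) + 128)  # continuation bit set
        value //= 128
    out.append(value)
    return out


def encode_string(data: bytes) -> bytes:
    """Encode a string literal without Huffman coding (H bit = 0)."""
    return bytes(encode_integer(len(data), 7)) + data


def never_indexed_literal(name: bytes, value: bytes) -> bytes:
    """Never-indexed literal with a new name (RFC 7541, section 6.2.3).

    The first byte is the pattern 0001 followed by a 4-bit name index of 0,
    which means the name follows as a string literal.
    """
    return b"\x10" + encode_string(name) + encode_string(value)


encoded = never_indexed_literal(b"password", b"secret")
```

The leading `0001` bit pattern is what tells intermediaries that the field must be re-encoded in the same never-indexed form.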
The purpose of this thread is to work out what hyper-h2 should do about this. The Python HPACK library has support for emitting headers in this form (since 1.1.0), and handles receiving them appropriately.
I have two questions:
I'd like to solicit answers to those questions from some people. I'm explicitly tagging the Hyper devs (@python-hyper/core), the mitmproxy devs (@Kriechi, @mhils), and some other people who care about this sort of thing (@jimcarreer, @bagder, @tatsuhiro-t) to get your ideas about this.