-
-
Notifications
You must be signed in to change notification settings - Fork 33.7k
bpo-11671: add header validation from http.client to wsgiref.headers.Headers #15299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…, and validate only when inserting a header
Lib/wsgiref/headers.py
Outdated
| if not _is_legal_header_name(header.encode('ascii')): | ||
| raise ValueError('Invalid header name %r' % (header,)) | ||
| if _is_illegal_header_value(value.encode('ascii')): | ||
| raise ValueError('Invalid header value %r' % (value,)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to initialize self.headers to an empty list and then call "for header, value in headers: self.add_header(header, value)". Then remove self._convert_string_type(k) calls below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this. Why would we want to remove self._convert_string_type(k) calls though?
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
|
I have made the requested changes; please review again |
|
Thanks for making the requested changes! @vstinner: please review the changes made to this pull request. |
| ('ValidName', 'InvalidValue\r'), | ||
| ('ValidName', 'InvalidValue\n'), | ||
| (b'InvalidName', 'ValidValue', AssertionError), | ||
| ('ValidName', b'InvalidValue', AssertionError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to get a TypeError rather than an AssertionError.
| ('ValidName', b'InvalidValue', AssertionError) | ||
| ) | ||
|
|
||
| def test_add_header_validation(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to merge the two tests a single test, to factorize code.
I suggest to remove the b'InvalidName' and b'InvalidValue' from validation_cases. Maybe create a second list, maybe put it directly in the test method to avoid with self.assertRaises(exception[0]) if len(exception) else self.assertRaisesRegex(ValueError, 'Invalid header'): complexity.
| '\r\n' | ||
| ) | ||
|
|
||
| validation_cases = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment mentioning bpo-11671.
Rather than an hardcoded list, why not building such list using a few invalid characters. For example for header names:
invalid_names = []
invalid_name_chars = ['\r', '\n', '\r\n', ':']
for invalid_chars in invalid_name_chars:
invalid_names.append(f'{invalid_chars}InvalidName')
invalid_names.append(f'Invalid{invalid_chars}Name')
invalid_names.append(f'InvalidName{invalid_chars}')
invalid_names.append(' InvalidName')
Are whitespaces (" ") and colon (":") only invalid at the beginning of the name?
| v = self._convert_string_type(v) | ||
| parts.append(_formatparam(k.replace('_', '-'), v)) | ||
| self._headers.append((self._convert_string_type(_name), "; ".join(parts))) | ||
| header = self._convert_string_type(_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename the variable to header_name (or maybe name, I didn't check if it's already used)?
| value = "; ".join(parts) | ||
| if not _is_legal_header_name(header.encode('ascii')): | ||
| raise ValueError('Invalid header name %r' % (header,)) | ||
| if _is_illegal_header_value(value.encode('ascii')): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http/client.py encodes the value to Latin1, not ASCII
| ('ValidName', 'InvalidValue\n'), | ||
| (b'InvalidName', 'ValidValue', AssertionError), | ||
| ('ValidName', b'InvalidValue', AssertionError) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there already tests for header names which cannot be encoded to ASCII? And tests for values not encodable to Latin1?
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
|
@epicfaace, please address the code review comments. Thank you! |
|
@epicfaace ping |
Adds header validation to disallow invalid characters in header names and values. This also fixes https://bugs.python.org/issue28778
I based this off the initial patch in bpo-11671, with a few changes:
I know that http.client's regex (added in bpo-22928, a112a8a) is not RFC compliant, but it seemed better to at least make both libraries have the same behavior with regards to header validation.
https://bugs.python.org/issue11671