Skip to content

Conversation

@epicfaace
Copy link
Contributor

@epicfaace epicfaace commented Aug 15, 2019

Adds header validation to disallow invalid characters in header names and values. This also fixes https://bugs.python.org/issue28778

I based this off the initial patch in bpo-11671, with a few changes:

  • only check whether headers are valid upon class initialization or calling add_header
  • use regex's from http.client, so that it disallows spaces in header names too

I know that http.client's regex (added in bpo-22928, a112a8a) is not RFC compliant, but it seemed better to at least make both libraries have the same behavior with regards to header validation.

https://bugs.python.org/issue11671

if not _is_legal_header_name(header.encode('ascii')):
raise ValueError('Invalid header name %r' % (header,))
if _is_illegal_header_value(value.encode('ascii')):
raise ValueError('Invalid header value %r' % (value,))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to initialize self.headers to an empty list and then call "for header, value in headers: self.add_header(header, value)". Then remove self._convert_string_type(k) calls below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this. Why would we want to remove self._convert_string_type(k) calls though?

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@epicfaace
Copy link
Contributor Author

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

('ValidName', 'InvalidValue\r'),
('ValidName', 'InvalidValue\n'),
(b'InvalidName', 'ValidValue', AssertionError),
('ValidName', b'InvalidValue', AssertionError)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to get a TypeError rather than an AssertionError.

('ValidName', b'InvalidValue', AssertionError)
)

def test_add_header_validation(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to merge the two tests a single test, to factorize code.

I suggest to remove the b'InvalidName' and b'InvalidValue' from validation_cases. Maybe create a second list, maybe put it directly in the test method to avoid with self.assertRaises(exception[0]) if len(exception) else self.assertRaisesRegex(ValueError, 'Invalid header'): complexity.

'\r\n'
)

validation_cases = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment mentioning bpo-11671.

Rather than an hardcoded list, why not building such list using a few invalid characters. For example for header names:

invalid_names = []
invalid_name_chars = ['\r', '\n', '\r\n', ':']
for invalid_chars in invalid_name_chars:
  invalid_names.append(f'{invalid_chars}InvalidName')
  invalid_names.append(f'Invalid{invalid_chars}Name')
  invalid_names.append(f'InvalidName{invalid_chars}')
invalid_names.append(' InvalidName')

Are whitespaces (" ") and colon (":") only invalid at the beginning of the name?

v = self._convert_string_type(v)
parts.append(_formatparam(k.replace('_', '-'), v))
self._headers.append((self._convert_string_type(_name), "; ".join(parts)))
header = self._convert_string_type(_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename the variable to header_name (or maybe name, I didn't check if it's already used)?

value = "; ".join(parts)
if not _is_legal_header_name(header.encode('ascii')):
raise ValueError('Invalid header name %r' % (header,))
if _is_illegal_header_value(value.encode('ascii')):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http/client.py encodes the value to Latin1, not ASCII

('ValidName', 'InvalidValue\n'),
(b'InvalidName', 'ValidValue', AssertionError),
('ValidName', b'InvalidValue', AssertionError)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there already tests for header names which cannot be encoded to ASCII? And tests for values not encodable to Latin1?

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@csabella
Copy link
Contributor

csabella commented Feb 4, 2020

@epicfaace, please address the code review comments. Thank you!

@csabella
Copy link
Contributor

@epicfaace ping

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants