Commit 97f43c0

#15160: Extend the new email parser to handle MIME headers.
This code passes all the same tests that the existing RFC mime header parser passes, plus a bunch of additional ones. There are a couple of commented out tests where there are issues with the folding. The folding doesn't normally get invoked for headers parsed from source, and the cases are marginal anyway (headers with invalid binary data) so I'm not worried about them, but will fix them after the beta. There are things that can be done to make this API even more convenient, but I think this is a solid foundation worth having. And the parser is a full RFC parser, so it handles cases that the current parser doesn't. (There are also probably cases where it fails when the current parser doesn't, but I haven't found them yet ;) Oh, yeah, and there are some really ugly bits in the parser for handling some 'postel' cases that are unfortunately common. I hope/plan to eventually refactor a lot of the code in the parser, which should reduce the line count...but there is no escaping the fact that the error recovery is a welter of special cases.
1 parent 49c15d4 commit 97f43c0

6 files changed: +1918 -34 lines changed

Doc/library/email.headerregistry.rst

Lines changed: 70 additions & 1 deletion
@@ -234,11 +234,80 @@ headers.
       result in a :exc:`ValueError`.
 
 
-Each of the above classes also has a ``Unique`` variant (for example,
+Many of the above classes also have a ``Unique`` variant (for example,
 ``UniqueUnstructuredHeader``). The only difference is that in the ``Unique``
 variant, :attr:`~.BaseHeader.max_count` is set to 1.
 
 
+.. class:: MIMEVersionHeader
+
+   There is really only one valid value for the :mailheader:`MIME-Version`
+   header, and that is ``1.0``. For future proofing, this header class
+   supports other valid version numbers. If a version number has a valid value
+   per :rfc:`2045`, then the header object will have non-``None`` values for
+   the following attributes:
+
+   .. attribute:: version
+
+      The version number as a string, with any whitespace and/or comments
+      removed.
+
+   .. attribute:: major
+
+      The major version number as an integer.
+
+   .. attribute:: minor
+
+      The minor version number as an integer.
+
+
+.. class:: ParameterizedMIMEHeader
+
+   MIME headers all start with the prefix 'Content-'. Each specific header has
+   a certain value, described under the class for that header. Some can
+   also take a list of supplemental parameters, which have a common format.
+   This class serves as a base for all the MIME headers that take parameters.
+
+   .. attribute:: params
+
+      A dictionary mapping parameter names to parameter values.
+
+
+.. class:: ContentTypeHeader
+
+   A :class:`ParameterizedMIMEHeader` class that handles the
+   :mailheader:`Content-Type` header.
+
+   .. attribute:: content_type
+
+      The content type string, in the form ``maintype/subtype``.
+
+   .. attribute:: maintype
+
+   .. attribute:: subtype
+
+
+.. class:: ContentDispositionHeader
+
+   A :class:`ParameterizedMIMEHeader` class that handles the
+   :mailheader:`Content-Disposition` header.
+
+   .. attribute:: content_disposition
+
+      ``inline`` and ``attachment`` are the only valid values in common use.
+
+
+.. class:: ContentTransferEncodingHeader
+
+   Handles the :mailheader:`Content-Transfer-Encoding` header.
+
+   .. attribute:: cte
+
+      Valid values are ``7bit``, ``8bit``, ``base64``, and
+      ``quoted-printable``. See :rfc:`2045` for more information.
+
+
+
 .. class:: HeaderRegistry(base_class=BaseHeader, \
                           default_class=UnstructuredHeader, \
                           use_default_map=True)
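
As a rough illustration of the ``MIMEVersionHeader`` and ``ContentTypeHeader`` attributes documented in this diff, the sketch below builds header objects through a default ``HeaderRegistry``. It assumes the default header map routes ``MIME-Version`` and ``Content-Type`` to these classes, as it does in released versions of the standard library; the sample header values are made up for the example.

```python
from email.headerregistry import HeaderRegistry

# A registry with the default header map; calling it with a header name and
# a value returns an instance of the specialized class for that header.
registry = HeaderRegistry()

# MIME-Version: the comment in parentheses is dropped from .version.
mime_version = registry("MIME-Version", "1.0 (produced by MetaSend Vx.x)")
print(mime_version.version)                    # version string, comment removed
print(mime_version.major, mime_version.minor)  # integer components

# Content-Type: value plus supplemental parameters.
content_type = registry("Content-Type", 'text/plain; charset="utf-8"')
print(content_type.content_type)               # "maintype/subtype" form
print(content_type.maintype, content_type.subtype)
print(dict(content_type.params))               # parameter name -> value
```

The ``params`` mapping is copied into a plain ``dict`` here only for printing; the header object itself is meant to be treated as read-only.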

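A similar sketch for the ``Content-Disposition`` and ``Content-Transfer-Encoding`` headers. It assumes the disposition attribute is spelled ``content_disposition`` (a hyphen is not valid in a Python attribute name) and that the default registry maps both header names to the classes described in the diff; the file name is hypothetical.

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()

# Content-Disposition takes parameters, so it also exposes .params.
disposition = registry(
    "Content-Disposition", 'attachment; filename="report.pdf"'
)
print(disposition.content_disposition)   # "inline" or "attachment" in practice
print(disposition.params["filename"])    # hypothetical file name from above

# Content-Transfer-Encoding has a single token value and no parameters.
encoding = registry("Content-Transfer-Encoding", "base64")
print(encoding.cte)
```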