
Commit ce9d708

[3.13] gh-131535: Fix stale example in html.parser docs, make examples doctests (GH-131551) (GH-133587)

(cherry picked from commit ee76e36)

Co-authored-by: Brian Schubert <[email protected]>
1 parent ac99d7e commit ce9d708


1 file changed, +37 -14 lines changed


Doc/library/html.parser.rst

@@ -43,7 +43,9 @@ Example HTML Parser Application

 As a basic example, below is a simple HTML parser that uses the
 :class:`HTMLParser` class to print out start tags, end tags, and data
-as they are encountered::
+as they are encountered:
+
+.. testcode::

    from html.parser import HTMLParser

@@ -63,7 +65,7 @@ as they are encountered::

 The output will then be:

-.. code-block:: none
+.. testoutput::

    Encountered a start tag: html
    Encountered a start tag: head
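
The two hunks above only show the edges of the basic example, so the following is a minimal sketch of the same idea rather than the text from the documentation. It assumes a hypothetical `EventRecorder` subclass that stores events in a list instead of printing them, which makes the start tag, end tag, and data callbacks easy to inspect.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records parser events instead of printing them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag; attrs is a list of (name, value) pairs.
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        # Called once per closing tag.
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Called with the text found between tags.
        self.events.append(("data", data))


recorder = EventRecorder()
recorder.feed("<html><head><title>Test</title></head></html>")
recorder.close()  # flush anything the parser may still be buffering

start_tags = [value for kind, value in recorder.events if kind == "start"]
end_tags = [value for kind, value in recorder.events if kind == "end"]
```

Recording events rather than printing them is only a convenience for inspection; the callbacks are the same ones the documented example overrides.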
@@ -230,7 +232,9 @@ Examples
 --------

 The following class implements a parser that will be used to illustrate more
-examples::
+examples:
+
+.. testcode::

    from html.parser import HTMLParser
    from html.entities import name2codepoint
@@ -266,13 +270,17 @@ examples::

    parser = MyHTMLParser()

-Parsing a doctype::
+Parsing a doctype:
+
+.. doctest::

    >>> parser.feed('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    ...             '"http://www.w3.org/TR/html4/strict.dtd">')
    Decl     : DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"

-Parsing an element with a few attributes and a title::
+Parsing an element with a few attributes and a title:
+
+.. doctest::

    >>> parser.feed('<img src="python-logo.png" alt="The Python logo">')
    Start tag: img
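
Turning the literal blocks into `.. doctest::` directives means the printed output is compared against what the code actually produces when the docs are tested. The sketch below illustrates that kind of check with the standard `doctest` module directly, outside Sphinx. The `CommentPrinter` class and the `DOC_EXAMPLE` string are hypothetical names made up for this illustration, and how closely the Sphinx extension mirrors this flow is an assumption.

```python
import doctest

# A small interactive session, written the way it would appear in the docs.
DOC_EXAMPLE = """
>>> from html.parser import HTMLParser
>>> class CommentPrinter(HTMLParser):
...     def handle_comment(self, data):
...         print("Comment  :", data)
...
>>> CommentPrinter().feed('<!--a comment-->')
Comment  : a comment
"""

# Parse the session into a DocTest object and run it.
test = doctest.DocTestParser().get_doctest(
    DOC_EXAMPLE, {}, "comment_example", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # TestResults(failed=..., attempted=...)
```

Because the comparison is textual, details such as leading or trailing spaces in the expected output matter, which may be why some inputs in this commit were adjusted.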
@@ -285,7 +293,9 @@ Parsing an element with a few attributes and a title::
    End tag  : h1

 The content of ``script`` and ``style`` elements is returned as is, without
-further parsing::
+further parsing:
+
+.. doctest::

    >>> parser.feed('<style type="text/css">#python { color: green }</style>')
    Start tag: style
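
To make "returned as is, without further parsing" concrete, here is a small sketch assuming a hypothetical `DataCollector` subclass. It feeds a `script` element whose body contains markup and checks that the inner `<strong>` never shows up as a start tag. The data is joined before inspection because the number of `handle_data` calls is not something the sketch relies on.

```python
from html.parser import HTMLParser


class DataCollector(HTMLParser):
    """Hypothetical helper: collects start tags and text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.data = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.data.append(data)


collector = DataCollector()
collector.feed('<script>alert("<strong>hello!</strong>");</script>')
collector.close()

# Join the pieces so the result does not depend on how the text was split.
script_text = "".join(collector.data)
```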
@@ -300,35 +310,48 @@ further parsing::
    Data     : alert("<strong>hello!</strong>");
    End tag  : script

-Parsing comments::
+Parsing comments:
+
+.. doctest::

-   >>> parser.feed('<!-- a comment -->'
+   >>> parser.feed('<!--a comment-->'
    ...             '<!--[if IE 9]>IE-specific content<![endif]-->')
-   Comment  :  a comment
+   Comment  : a comment
    Comment  : [if IE 9]>IE-specific content<![endif]

 Parsing named and numeric character references and converting them to the
-correct char (note: these 3 references are all equivalent to ``'>'``)::
+correct char (note: these 3 references are all equivalent to ``'>'``):

+.. doctest::
+
+   >>> parser = MyHTMLParser()
+   >>> parser.feed('&gt;&#62;&#x3E;')
+   Data     : >>>
+
+   >>> parser = MyHTMLParser(convert_charrefs=False)
    >>> parser.feed('&gt;&#62;&#x3E;')
    Named ent: >
    Num ent  : >
    Num ent  : >

 Feeding incomplete chunks to :meth:`~HTMLParser.feed` works, but
 :meth:`~HTMLParser.handle_data` might be called more than once
-(unless *convert_charrefs* is set to ``True``)::
+(unless *convert_charrefs* is set to ``True``):

-   >>> for chunk in ['<sp', 'an>buff', 'ered ', 'text</s', 'pan>']:
+.. doctest::
+
+   >>> for chunk in ['<sp', 'an>buff', 'ered', ' text</s', 'pan>']:
    ...     parser.feed(chunk)
    ...
    Start tag: span
    Data     : buff
    Data     : ered
-   Data     : text
+   Data     :  text
    End tag  : span

-Parsing invalid HTML (e.g. unquoted attributes) also works::
+Parsing invalid HTML (e.g. unquoted attributes) also works:
+
+.. doctest::

    >>> parser.feed('<p><a class=link href=#main>tag soup</p ></a>')
    Start tag: p
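
The character reference and chunked-feed examples in this hunk can be sketched together as follows. `CharrefCollector` and `run` are hypothetical names introduced for illustration. The sketch records events instead of printing them and makes no claim about how many `handle_data` calls a chunked feed produces, only that the pieces add up to the original text.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser


class CharrefCollector(HTMLParser):
    """Hypothetical helper: records data and character reference events."""

    def __init__(self, *, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Only called when convert_charrefs is False, e.g. name == "gt".
        self.events.append(("entityref", chr(name2codepoint[name])))

    def handle_charref(self, name):
        # Only called when convert_charrefs is False; name is "62" or "x3E".
        if name.startswith(("x", "X")):
            char = chr(int(name[1:], 16))
        else:
            char = chr(int(name))
        self.events.append(("charref", char))


def run(markup, **kwargs):
    """Feed markup to a fresh collector and return the recorded events."""
    parser = CharrefCollector(**kwargs)
    parser.feed(markup)
    parser.close()
    return parser.events


# Default behaviour: references are converted and delivered as data.
converted = run("&gt;&#62;&#x3E;")
# With convert_charrefs=False the dedicated handlers are called instead.
raw = run("&gt;&#62;&#x3E;", convert_charrefs=False)

# Feeding in chunks: the text may arrive in several pieces.
chunked = CharrefCollector()
for chunk in ["<sp", "an>buff", "ered", " text</s", "pan>"]:
    chunked.feed(chunk)
chunked.close()
chunked_text = "".join(v for kind, v in chunked.events if kind == "data")
```

Joining the data pieces before using them is one way to stay independent of how the parser happens to split text across `feed` calls.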
