@@ -43,7 +43,9 @@ Example HTML Parser Application

 As a basic example, below is a simple HTML parser that uses the
 :class:`HTMLParser` class to print out start tags, end tags, and data
-as they are encountered::
+as they are encountered:
+
+.. testcode::

    from html.parser import HTMLParser

@@ -63,7 +65,7 @@ as they are encountered::

 The output will then be:

-.. code-block:: none
+.. testoutput::

    Encountered a start tag: html
    Encountered a start tag: head
@@ -230,7 +232,9 @@ Examples
 --------

 The following class implements a parser that will be used to illustrate more
-examples::
+examples:
+
+.. testcode::

    from html.parser import HTMLParser
    from html.entities import name2codepoint
@@ -266,13 +270,17 @@ examples::

    parser = MyHTMLParser()

-Parsing a doctype::
+Parsing a doctype:
+
+.. doctest::

    >>> parser.feed('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    ...             '"http://www.w3.org/TR/html4/strict.dtd">')
    Decl     : DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"

-Parsing an element with a few attributes and a title::
+Parsing an element with a few attributes and a title:
+
+.. doctest::

    >>> parser.feed('<img src="python-logo.png" alt="The Python logo">')
    Start tag: img
@@ -285,7 +293,9 @@ Parsing an element with a few attributes and a title::
    End tag  : h1

 The content of ``script`` and ``style`` elements is returned as is, without
-further parsing::
+further parsing:
+
+.. doctest::

    >>> parser.feed('<style type="text/css">#python { color: green }</style>')
    Start tag: style
@@ -300,35 +310,48 @@ further parsing::
    Data     : alert("<strong>hello!</strong>");
    End tag  : script

-Parsing comments::
+Parsing comments:
+
+.. doctest::

-   >>> parser.feed('<!-- a comment -->'
+   >>> parser.feed('<!--a comment-->'
    ...             '<!--[if IE 9]>IE-specific content<![endif]-->')
-   Comment  :  a comment
+   Comment  : a comment
    Comment  : [if IE 9]>IE-specific content<![endif]

 Parsing named and numeric character references and converting them to the
-correct char (note: these 3 references are all equivalent to ``'>'``)::
+correct char (note: these 3 references are all equivalent to ``'>'``):

+.. doctest::
+
+   >>> parser = MyHTMLParser()
+   >>> parser.feed('&gt;&#62;&#x3E;')
+   Data     : >>>
+
+   >>> parser = MyHTMLParser(convert_charrefs=False)
    >>> parser.feed('&gt;&#62;&#x3E;')
    Named ent: >
    Num ent  : >
    Num ent  : >

 Feeding incomplete chunks to :meth:`~HTMLParser.feed` works, but
 :meth:`~HTMLParser.handle_data` might be called more than once
-(unless *convert_charrefs* is set to ``True``)::
+(unless *convert_charrefs* is set to ``True``):

-   >>> for chunk in ['<sp', 'an>buff', 'ered ', 'text</s', 'pan>']:
+.. doctest::
+
+   >>> for chunk in ['<sp', 'an>buff', 'ered', ' text</s', 'pan>']:
    ...     parser.feed(chunk)
    ...
    Start tag: span
    Data     : buff
    Data     : ered
-   Data     : text
+   Data     :  text
    End tag  : span

-Parsing invalid HTML (e.g. unquoted attributes) also works::
+Parsing invalid HTML (e.g. unquoted attributes) also works:
+
+.. doctest::

    >>> parser.feed('<p><a class=link href=#main>tag soup</p ></a>')
    Start tag: p
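One way to sanity-check the two character-reference doctests outside Sphinx is sketched below. `RecordingParser` and `parse` are hypothetical helpers, not part of this patch or of `html.parser`. The sketch assumes only the documented `convert_charrefs` keyword and the `handle_data`, `handle_entityref`, and `handle_charref` hooks.

```python
from html.parser import HTMLParser


class RecordingParser(HTMLParser):
    """Hypothetical helper: records handler calls instead of printing them."""

    def __init__(self, *, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Only called when convert_charrefs is False.
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        # Only called when convert_charrefs is False.
        self.events.append(("charref", name))


def parse(markup, *, convert_charrefs):
    """Feed the markup to a fresh parser and return the recorded events."""
    parser = RecordingParser(convert_charrefs=convert_charrefs)
    parser.feed(markup)
    parser.close()  # flush any text the parser is still buffering
    return parser.events


# The same three references used in the doctest above.
converted = parse("&gt;&#62;&#x3E;", convert_charrefs=True)
raw = parse("&gt;&#62;&#x3E;", convert_charrefs=False)

print(converted)
print(raw)
```

Recording events rather than printing them keeps the check independent of the column alignment used in the doctest output.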