@@ -296,9 +296,9 @@ that will be used if the ``filter`` argument is missing or ``None``.

If both the argument and attribute are ``None``:

- * In Python 3.12-3.13, a ``DeprecationWarning`` will be emitted and
- extraction will use the ``'fully_trusted'`` filter.
- * In Python 3.14+, it will use the ``'data'`` filter.
+ * In Python 3.12-3.13, a ``DeprecationWarning`` will be emitted and
+ extraction will use the ``'fully_trusted'`` filter.
+ * In Python 3.14+, it will use the ``'data'`` filter.

Applications and system integrators may wish to change ``extraction_filter``
of the ``TarFile`` class itself to set a global default.
@@ -343,7 +343,7 @@ New docs will tell users to consider:
* checking that filenames have expected extensions (discouraging files that
execute when you “click on them”, or extension-less files like Windows
special device names),
- * limiting the number of extracted, files total size of extracted data,
+ * limiting the number of extracted files, total size of extracted data,
and size of individual files,
* checking for files that would be shadowed on case-insensitive filesystems.

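The limits named in the list above (number of files, total size, size of individual files) could be enforced by a custom filter. The following is a minimal sketch, not part of the PEP: the names ``LimitingFilter`` and ``ExtractionLimitExceeded`` and the default limits are hypothetical. It assumes only the filter calling convention the PEP describes, a callable that takes a ``TarInfo`` and the destination path and returns the member to extract.

```python
import tarfile


class ExtractionLimitExceeded(Exception):
    """Hypothetical error raised when an archive exceeds a configured limit."""


class LimitingFilter:
    """Hypothetical stateful filter; the default limits are illustrative only."""

    def __init__(self, max_files=10_000, max_total_size=1 << 30,
                 max_file_size=1 << 28):
        self.max_files = max_files
        self.max_total_size = max_total_size
        self.max_file_size = max_file_size
        self.files_seen = 0
        self.bytes_seen = 0

    def __call__(self, member, dest_path):
        # Count every member, whatever its type.
        self.files_seen += 1
        if self.files_seen > self.max_files:
            raise ExtractionLimitExceeded("too many members in archive")
        # member.size is the size recorded in the tar header.
        if member.size > self.max_file_size:
            raise ExtractionLimitExceeded(f"{member.name!r} is too large")
        self.bytes_seen += member.size
        if self.bytes_seen > self.max_total_size:
            raise ExtractionLimitExceeded("archive contents are too large")
        return member
```

On a Python version that supports the ``filter`` argument, an instance would be passed as ``my_tarfile.extractall(path, filter=LimitingFilter())``. Such a filter only adds limits; it would probably be combined with the checks of the ``'data'`` filter rather than replace them.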
@@ -385,11 +385,12 @@ tarfile CLI
-----------

The CLI (``python -m tarfile``) will gain a ``--filter`` option
- that will take the nams of one of the provided default filters.
+ that will take the name of one of the provided default filters.
It won't be possible to specify a custom filter function.

If ``--filter`` is not given, the CLI will use the default filter
- (``'legacy_warning'`` for a deprecation period, then ``'data'``).
+ (``'fully_trusted'`` with a deprecation warning now, and ``'data'`` from
+ Python 3.14 on).

There will be no short option. (``-f`` would be confusingly similar to
the filename option of GNU ``tar``.)
@@ -417,7 +418,9 @@ gains a ``filter`` argument, if it ever does).

If ``filter`` is not specified (or left as ``None``), it won't be passed
on, so extracting a tarball will use the default filter
- (``'legacy_warning'`` for a deprecation period, then ``'data'``).
+ (``'fully_trusted'`` with a deprecation warning now, and ``'data'`` from
+ Python 3.14 on).
+

Complex filters
---------------
@@ -433,6 +436,14 @@ For example, with a hypothetical ``StatefulFilter`` users would write::

A simple ``StatefulFilter`` example will be added to the docs.

+ .. note::
+
+ The need for stateful filters is a reason against allowing
+ registration of custom filter names in addition to ``'fully_trusted'``,
+ ``'tar'`` and ``'data'``.
+ With such a mechanism, API for (at least) set-up and tear-down would need
+ to be set in stone.
+

Backwards Compatibility
=======================
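To illustrate the set-up and tear-down that the note in the hunk above refers to, here is one possible shape for a stateful filter used as a context manager. This is a sketch under assumptions, not the example promised for the docs: the class name ``CaseCollisionFilter`` and its skip-on-collision policy are hypothetical. It relies only on the PEP's convention that a filter returning ``None`` causes the member to be skipped.

```python
import tarfile


class CaseCollisionFilter:
    """Hypothetical stateful filter that skips members whose names differ
    only by case from a member seen earlier in the same extraction."""

    def __init__(self):
        self._seen = {}

    def __enter__(self):
        # Set-up: start every extraction with empty state.
        self._seen = {}
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Tear-down: drop the state; do not swallow exceptions.
        self._seen.clear()
        return False

    def __call__(self, member, dest_path):
        key = member.name.casefold()
        first_name = self._seen.setdefault(key, member.name)
        if first_name != member.name:
            # Would shadow an earlier member on a case-insensitive filesystem.
            return None
        return member
```

Usage would follow the pattern the PEP shows for its hypothetical ``StatefulFilter``: enter the context manager, then pass the resulting object as the ``filter`` argument of ``extractall``.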
@@ -564,10 +575,44 @@ Feature-wise, *tar format* and *UNIX-like filesystem* are essentially
equivalent, so ``tar`` is a good name.


- Open Issues
- ===========
+ Possible Further Work
+ =====================

- How far should this be backported?
+ Adding filters to zipfile and shutil.unpack_archive
+ ---------------------------------------------------
+
+ For consistency, :external+py3.11:mod:`zipfile` and
+ :external+py3.11:func:`shutil.unpack_archive` could gain support
+ for a ``filter`` argument.
+ However, this would require research that this PEP's author can't promise
+ for Python 3.12.
+
+ Filters for ``zipfile`` would probably not help security.
+ Zip is used primarily for cross-platform data bundles, and correspondingly,
+ :external+py3.11:meth:`ZipFile.extract <zipfile.ZipFile.extract>`'s defaults
+ are already similar to what a ``'data'`` filter would do.
+ A ``'fully_trusted'`` filter, which would *newly allow* absolute paths and
+ ``..`` path components, might not be useful for much except
+ a unified ``unpack_archive`` API.
+
+ Filters should be useful for use cases other than security, but those
+ would usually need custom filter functions, and those would need API that works
+ with both :external+py3.11:class:`~tarfile.TarInfo` and
+ :external+py3.11:class:`~zipfile.ZipInfo`.
+ That is *definitely* out of scope of this PEP.
+
+ If only this PEP is implemented and nothing changes for ``zipfile``,
+ the effect for callers of ``unpack_archive`` is that the default
+ for *tar* files is changing from ``'fully_trusted'`` to
+ the more appropriate ``'data'``.
+ In the interim period, Python 3.12-3.13 will emit ``DeprecationWarning``.
+ That's annoying, but there are several ways to handle it: e.g. add a
+ ``filter`` argument conditionally, set ``TarFile.extraction_filter``
+ globally, or ignore/suppress the warning until Python 3.14.
+
+ Also, since many calls to ``unpack_archive`` are likely to be unsafe,
+ there's hope that the ``DeprecationWarning`` will often turn out to be
+ a helpful hint to review affected code.


Thanks
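One of the ways of handling the interim ``DeprecationWarning`` mentioned in the added section, passing a ``filter`` argument conditionally, might look like the sketch below. The helper name ``extract_data_only`` is hypothetical, and the sketch assumes that the presence of ``tarfile.data_filter`` is an adequate feature test for support of the ``filter`` argument.

```python
import tarfile


def extract_data_only(archive_path, dest_dir):
    """Hypothetical helper: extract with the 'data' filter where supported."""
    kwargs = {}
    if hasattr(tarfile, "data_filter"):
        # The interpreter knows about extraction filters, so ask for the
        # 'data' filter explicitly; this also avoids the DeprecationWarning.
        kwargs["filter"] = "data"
    # On interpreters without filter support this falls back to the old,
    # unfiltered behaviour, which is only acceptable for trusted archives.
    with tarfile.open(archive_path) as tar:
        tar.extractall(dest_dir, **kwargs)
```

The global alternative named in the same paragraph would instead assign a default once, for example ``tarfile.TarFile.extraction_filter = staticmethod(tarfile.data_filter)``, which affects every caller in the process.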