Commit 0a84bb5

Update documentation to reflect the primacy of 'files()' and remove the caveats about traversability. Add new caveats about limited support for resource readers.
1 parent 68c7db1 commit 0a84bb5

4 files changed, with 88 additions and 81 deletions

importlib_resources/docs/changelog.rst

Lines changed: 5 additions & 3 deletions
@@ -2,12 +2,14 @@
22
importlib_resources NEWS
33
==========================
44

5-
1.1.0 (2020-01-19)
5+
1.1.0 (2020-02-16)
66
==================
77
* Add support for retrieving resources from subdirectories of packages
8-
through the new ``get()`` function, which returns a ``Traversable``
8+
through the new ``files()`` function, which returns a ``Traversable``
99
object with ``joinpath`` and ``read_*`` interfaces matching those
10-
of ``pathlib.Path`` objects.
10+
of ``pathlib.Path`` objects. This new function supersedes all of the
11+
previous functionality as it provides a more general-purpose access
12+
to a package's resources.
1113

1214
1.0.2 (2018-11-01)
1315
==================
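
Note on the changelog entry above: the ``files()`` behaviour it describes can be sketched as follows. This is a minimal illustration, not code from the commit. It assumes Python 3.9+ and uses the standard library ``importlib.resources.files()`` in place of the backport, and the package name ``demo_pkg`` and its files are hypothetical, created in a temporary directory only for the demonstration.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. "demo_pkg" and its contents are
# hypothetical names used only for this sketch.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
(pkg / "data").mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "data" / "greeting.txt").write_bytes(b"hello")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# files() returns a Traversable rooted at the package directory.
base = importlib.resources.files("demo_pkg")

# joinpath() descends into a subdirectory that is not itself a package.
resource = base.joinpath("data").joinpath("greeting.txt")
text = resource.read_text(encoding="utf-8")

# The "/" operator is an equivalent spelling of joinpath().
raw = (base / "data" / "greeting.txt").read_bytes()
```

Here ``data/`` has no ``__init__.py``, which is the subdirectory case the older functions could not reach.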

importlib_resources/docs/index.rst

Lines changed: 9 additions & 10 deletions
@@ -7,17 +7,16 @@ in Python packages. It provides functionality similar to ``pkg_resources``
77
`Basic Resource Access`_ API, but without all of the overhead and performance
88
problems of ``pkg_resources``.
99

10-
In our terminology, a *resource* is a file that is located within an
11-
importable `Python package`_. Resources can live on the file system, in a zip
12-
file, or in any place that has a loader_ supporting the appropriate API for
13-
reading resources. Directories are not resources.
14-
15-
``importlib_resources`` is a backport of Python 3.7's standard library
16-
`importlib.resources`_ module for Python 2.7, and 3.4 through 3.6. Users of
17-
Python 3.7 and beyond are encouraged to use the standard library module, and
18-
in fact for these versions, ``importlib_resources`` just shadows that module.
10+
In our terminology, a *resource* is a file tree that is located within an
11+
importable `Python package`_. Resources can live on the file system or in a
12+
zip file, with limited support for any loader_ that supports the appropriate API for
13+
reading resources.
14+
15+
``importlib_resources`` is a backport of Python 3.9's standard library
16+
`importlib.resources`_ module for Python 2.7, and 3.5 through 3.8. Users of
17+
Python 3.9 and beyond are encouraged to use the standard library module.
1918
Developers looking for detailed API descriptions should refer to the Python
20-
3.7 standard library documentation.
19+
3.9 standard library documentation.
2120

2221
The documentation here includes a general :ref:`usage <using>` guide and a
2322
:ref:`migration <migration>` guide for projects that want to adopt

importlib_resources/docs/migration.rst

Lines changed: 26 additions & 35 deletions
@@ -17,11 +17,6 @@ access`_ APIs:
1717
* ``pkg_resources.resource_listdir()``
1818
* ``pkg_resources.resource_isdir()``
1919

20-
Keep in mind that ``pkg_resources`` defines *resources* to include
21-
directories. ``importlib_resources`` does not treat directories as resources;
22-
since only files are allowed as resources, file names in the
23-
``importlib_resources`` API may *not* include path separators (e.g. slashes).
24-
2520

2621
pkg_resources.resource_filename()
2722
=================================
@@ -34,9 +29,11 @@ that ``pkg_resources()`` also *implicitly* cleans up this temporary file,
3429
without control over its lifetime by the programmer.
3530

3631
``importlib_resources`` takes a different approach. Its equivalent API is the
37-
``path()`` function, which returns a context manager providing a
38-
:py:class:`pathlib.Path` object. This means users have both the flexibility
39-
and responsibility to manage the lifetime of the temporary file. Note though
32+
``files()`` function, which returns a ``Traversable`` object implementing a
33+
subset of the
34+
:py:class:`pathlib.Path` interface suitable for reading the contents. A
35+
separate wrapper, ``as_file()``, creates a temporary file on the system in a
36+
context whose lifetime is managed by the user. Note though
4037
that if the resource is *already* on the file system, ``importlib_resources``
4138
still returns a context manager, but nothing needs to get cleaned up.
4239

@@ -46,7 +43,8 @@ Here's an example from ``pkg_resources()``::
4643

4744
The best way to convert this is with the following idiom::
4845

49-
with importlib_resources.path('my.package', 'resource.dat') as path:
46+
ref = importlib_resources.files('my.package') / 'resource.dat'
47+
with importlib_resources.trees.as_file(ref) as path:
5048
# Do something with path. After the with-statement exits, any
5149
# temporary file created will be immediately cleaned up.
5250

@@ -56,8 +54,9 @@ to stick around for a while? One way of doing this is to use an
5654

5755
from contextlib import ExitStack
5856
file_manager = ExitStack()
57+
ref = importlib_resources.files('my.package') / 'resource.dat'
5958
path = file_manager.enter_context(
60-
importlib_resources.path('my.package', 'resource.dat'))
59+
importlib_resources.trees.as_file(ref))
6160

6261
Now ``path`` will continue to exist until you explicitly call
6362
``file_manager.close()``. What if you want the file to exist until the
@@ -67,8 +66,9 @@ process exits, or you can't pass ``file_manager`` around in your code? Use an
6766
import atexit
6867
file_manager = ExitStack()
6968
atexit.register(file_manager.close)
69+
ref = importlib_resources.files('my.package') / 'resource.dat'
7070
path = file_manager.enter_context(
71-
importlib_resources.path('my.package', 'resource.dat'))
71+
importlib_resources.trees.as_file(ref))
7272

7373
Assuming your Python interpreter exits gracefully, the temporary file will be
7474
cleaned up when Python exits.
@@ -86,7 +86,8 @@ bytes. E.g.::
8686

8787
The equivalent code in ``importlib_resources`` is pretty straightforward::
8888

89-
with importlib_resources.open_binary('my.package', 'resource.dat') as fp:
89+
ref = importlib_resources.files('my.package').joinpath('resource.dat')
90+
with ref.open('rb') as fp:
9091
my_bytes = fp.read()
9192

9293

@@ -103,7 +104,8 @@ following example is often written for clarity as::
103104

104105
This can be easily rewritten like so::
105106

106-
contents = importlib_resources.read_binary('my.package', 'resource.dat')
107+
ref = importlib_resources.files('my.package').joinpath('resource.dat')
108+
contents = ref.read_bytes()
107109

108110

109111
pkg_resources.resource_listdir()
@@ -117,19 +119,18 @@ but it does not recurse into subdirectories, e.g.::
117119

118120
This is easily rewritten using the following idiom::
119121

120-
for entry in importlib_resources.contents('my.package.subpackage'):
121-
print(entry)
122+
for entry in importlib_resources.files('my.package.subpackage').iterdir():
123+
print(entry.name)
122124

123125
Note:
124126

125-
* ``pkg_resources`` does not require ``subpackage`` to be a Python package,
126-
but ``importlib_resources`` does.
127-
* ``importlib_resources.contents()`` returns an iterator, not a concrete
128-
sequence.
127+
* ``Traversable.iterdir()`` returns *all* the entries in the
128+
subpackage, i.e. both resources (files) and non-resources (directories).
129+
* ``Traversable.iterdir()`` returns additional traversable objects, which if
130+
directories can also be iterated over (recursively).
131+
* ``Traversable.iterdir()``, like ``pathlib.Path.iterdir()``, returns an iterator, not a
132+
concrete sequence.
129133
* The order in which the elements are returned is undefined.
130-
* ``importlib_resources.contents()`` returns *all* the entries in the
131-
subpackage, i.e. both resources (files) and non-resources (directories). As
132-
with ``pkg_resources.listdir()`` it does not recurse.
133134

134135

135136
pkg_resources.resource_isdir()
@@ -141,20 +142,10 @@ a package is a directory or not::
141142
if pkg_resources.resource_isdir('my.package', 'resource'):
142143
print('A directory')
143144

144-
Because ``importlib_resources`` explicitly does not define directories as
145-
resources, there's no direct equivalent. However, you can ask whether a
146-
particular resource exists inside a package, and since directories are not
147-
resources you can infer whether the resource is a directory or a file. Here
148-
is a way to do that::
145+
The ``importlib_resources`` equivalent is straightforward::
149146

150-
from importlib_resources import contents, is_resource
151-
if 'resource' in contents('my.package') and \
152-
not is_resource('my.package', 'resource'):
153-
print('It must be a directory')
154-
155-
The reason you have to do it this way and not just call
156-
``not is_resource('my.package', 'resource')`` is because this conditional will
157-
also return False when ``resource`` is not an entry in ``my.package``.
147+
if importlib_resources.files('my.package').joinpath('resource').is_dir():
148+
print('A directory')
158149

159150

160151
.. _`basic resource access`: http://setuptools.readthedocs.io/en/latest/pkg_resources.html#basic-resource-access
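
Note on the ``resource_listdir()`` and ``resource_isdir()`` sections above: because ``iterdir()`` yields further traversable objects, a recursive listing can be sketched with ``is_dir()``. This is an illustration under stated assumptions rather than part of the commit. It uses the standard library ``importlib.resources`` (Python 3.9+) instead of the backport, and the helper ``walk()`` and the package ``walk_demo_pkg`` are hypothetical names.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

# Throwaway package with one nested, non-package subdirectory.
# "walk_demo_pkg" is a hypothetical name used only for this sketch.
root = Path(tempfile.mkdtemp())
pkg = root / "walk_demo_pkg"
(pkg / "sub").mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "top.dat").write_bytes(b"\x00")
(pkg / "sub" / "nested.dat").write_bytes(b"\x01")
sys.path.insert(0, str(root))
importlib.invalidate_caches()


def walk(node, prefix=""):
    """Yield slash-joined names of every file below a Traversable."""
    for entry in node.iterdir():
        name = prefix + entry.name
        if entry.is_dir():
            # Directories are traversable too, so recurse into them.
            yield from walk(entry, name + "/")
        else:
            yield name


# iterdir() order is undefined, so sort before relying on the result.
found = sorted(walk(importlib.resources.files("walk_demo_pkg")))
```

The listing may also contain interpreter byproducts such as ``__pycache__`` entries, so callers that need only data files should filter the result.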

importlib_resources/docs/using.rst

Lines changed: 48 additions & 33 deletions
@@ -23,6 +23,8 @@ If you have a file system layout such as::
2323
one/
2424
__init__.py
2525
resource1.txt
26+
resources1/
27+
resource1.1.txt
2628
two/
2729
__init__.py
2830
resource2.txt
@@ -41,27 +43,40 @@ Each import statement gives you a Python *module* corresponding to the
4143
packages since packages are just special module instances that have an
4244
additional attribute, namely a ``__path__`` [#fn2]_.
4345

44-
In this analogy then, resources are just files within a package directory, so
46+
In this analogy then, resources are just files or directories contained in a
47+
package directory, so
4548
``data/one/resource1.txt`` and ``data/two/resource2.txt`` are both resources,
46-
as are the ``__init__.py`` files in all the directories. However the package
47-
directories themselves are *not* resources; anything that contains other
48-
things (i.e. directories) are not themselves resources.
49-
50-
Resources are always accessed relative to the package that they live in. You
51-
cannot access a resource within a subdirectory inside a package. This means
52-
that ``resource1.txt`` is a resource within the ``data.one`` package, but
53-
neither ``resource2.txt`` nor ``two/resource2.txt`` are resources within the
54-
``data`` package. If a directory isn't a package, it can't be imported and
55-
thus can't contain resources.
56-
57-
Even when this hierarchical structure isn't represented by physical files and
58-
directories, the model still holds. So zip files can contain packages and
59-
resources, as could databases or other storage medium. In fact, while
60-
``importlib_resources`` supports physical file systems and zip files by
61-
default, anything that can be loaded with a Python import system `loader`_ can
62-
provide resources, as long as the loader implements the `ResourceReader`_
63-
abstract base class.
49+
as are the ``__init__.py`` files in all the directories.
6450

51+
Resources are always accessed relative to the package that they live in.
52+
``resource1.txt`` and ``resources1/resource1.1.txt`` are resources within
53+
the ``data.one`` package, and
54+
``two/resource2.txt`` is a resource within the
55+
``data`` package.
56+
57+
58+
Caveats
59+
=======
60+
61+
Subdirectory Access
62+
-------------------
63+
64+
Prior to importlib_resources 1.1 and the ``files()`` API, resources that were
65+
not direct descendants of a package's folder were inaccessible through the
66+
API, so in the example above ``resources1/resource1.1.txt`` is not a resource of
67+
the ``data.one`` package and ``two/resource2.txt`` is not a resource of the
68+
``data`` package. Therefore, if subdirectory access is required, use the
69+
``files()`` API.
70+
71+
Resource Reader Support
72+
-----------------------
73+
74+
Because resource readers cannot access files beyond the direct
75+
descendants of a package, the ``files()`` API does not rely
76+
on the importlib ResourceReader interface and thus only supports resources
77+
exposed by the built-in path and zipfile loaders. If support for arbitrary
78+
resource readers is required, the other API functions still support loading
79+
those resources.
6580

6681
Example
6782
=======
@@ -93,25 +108,22 @@ This requires you to make Python packages of both ``email/tests`` and
93108
``email/tests/data``, by placing an empty ``__init__.py`` file in each of
94109
those directories.
95110

96-
**This is a requirement for importlib_resources too!**
97-
98111
The problem with the ``pkg_resources`` approach is that, depending on the
99-
structure of your package, ``pkg_resources`` can be very inefficient even to
100-
just import. ``pkg_resources`` is a sort of grab-bag of APIs and
101-
functionalities, and to support all of this, it sometimes has to do a ton of
102-
work at import time, e.g. to scan every package on your ``sys.path``. This
112+
packages in your environment, ``pkg_resources`` can be expensive
113+
just to import. This behavior
103114
can have a serious negative impact on things like command line startup time
104115
for Python-implemented commands.
105116

106-
``importlib_resources`` solves this by being built entirely on the back of the
117+
``importlib_resources`` solves this performance challenge by being built
118+
entirely on the back of the
107119
stdlib :py:mod:`importlib`. By taking advantage of all the efficiencies in
108120
Python's import system, and the fact that it's built into Python, using
109121
``importlib_resources`` can be much more performant. The equivalent code
110122
using ``importlib_resources`` would look like::
111123

112-
from importlib_resources import read_text
124+
from importlib_resources import files
113125
# Reads contents with UTF-8 encoding and returns str.
114-
eml = read_text('email.tests.data', 'message.eml')
126+
eml = files('email.tests.data').joinpath('message.eml').read_text()
115127

116128

117129
Packages or package names
@@ -124,7 +136,7 @@ passed in, it must name an importable Python package, and this is first
124136
imported. Thus the above example could also be written as::
125137

126138
import email.tests.data
127-
eml = read_text(email.tests.data, 'message.eml')
139+
eml = files(email.tests.data).joinpath('message.eml').read_text()
128140

129141

130142
File system or zip file
@@ -143,8 +155,11 @@ to this temporary file as a :py:class:`pathlib.Path` object. In order to
143155
properly clean up this temporary file, what's actually returned is a context
144156
manager that you can use in a ``with``-statement::
145157

146-
from importlib_resources import path
147-
with path(email.tests.data, 'message.eml') as eml:
158+
from importlib_resources import files
159+
from importlib_resources.trees import as_file
160+
161+
source = files(email.tests.data).joinpath('message.eml')
162+
with as_file(source) as eml:
148163
third_party_api_requiring_file_system_path(eml)
149164

150165
You can use all the standard :py:mod:`contextlib` APIs to manage this context
@@ -162,12 +177,12 @@ manager.
162177
**No**::
163178

164179
sys.path.append('relative/path/to/foo.whl')
165-
resource_bytes('foo/data.dat') # This will fail!
180+
files('foo') # This will fail!
166181

167182
**Yes**::
168183

169184
sys.path.append(os.path.abspath('relative/path/to/foo.whl'))
170-
resource_bytes('foo/data.dat')
185+
files('foo')
171186

172187
Both relative and absolute paths work for Python 3.7 and newer.
173188
