Update documentation to better reflect how PyArrow interoperability works #634
Merged 3 commits on May 14, 2021
21 changes: 14 additions & 7 deletions docs/source/features.rst
@@ -94,13 +94,20 @@ object.
 PyArrow integration
 -------------------

-`pyarrow`_ has its own internal idea of what a file-system is (``pyarrow.filesystem.FileSystem``),
+`pyarrow`_ has its own internal idea of what a file-system is (``pyarrow.fs.FileSystem``),
 and some functions, particularly the loading of parquet, require that the target be compatible.
-As it happens, the design of the file-system interface in ``pyarrow`` *is* compatible with `fsspec`
-(this is not by accident). Therefore at import time, ``fsspec`` checks for the existence of
-``pyarrow``, and, if found, adds it to the superclasses of the spec base-class. In this manner,
-all ``fsspec``-derived file-systems are also pyarrow file-systems, and can be used by pyarrow
-functions.
+As it happens, the design of the file-system interface in ``pyarrow`` *is* compatible with ``fsspec``
+(this is not by accident).
+
+At import time, ``fsspec`` checks for the existence of ``pyarrow``, and, if ``pyarrow < 2.0`` is
+found, adds its base filesystem to the superclasses of the spec base-class.
+For ``pyarrow >= 2.0``, ``fsspec`` file systems can simply be passed to ``pyarrow`` functions
+that expect ``pyarrow`` filesystems, and ``pyarrow`` `will automatically wrap them
+<https://arrow.apache.org/docs/python/filesystems.html#using-fsspec-compatible-filesystems>`_.
+
+In this manner, all ``fsspec``-derived file-systems are also ``pyarrow`` file-systems, and can be used
+by ``pyarrow`` functions.
+

 .. _pyarrow: https://arrow.apache.org/docs/python/

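The ``pyarrow >= 2.0`` behaviour described in this hunk can be illustrated with a short sketch. It assumes that both ``fsspec`` and ``pyarrow >= 2.0`` are installed, uses ``fsspec``'s in-memory filesystem, and the file name ``/example.parquet`` and the helper ``roundtrip_via_fsspec`` are hypothetical names chosen for illustration. The explicit ``PyFileSystem(FSSpecHandler(fs))`` form is the wrapping that the linked Arrow documentation describes; passing the ``fsspec`` object directly should be equivalent.

```python
"""Sketch: handing an fsspec filesystem to pyarrow (assumes pyarrow >= 2.0)."""


def roundtrip_via_fsspec(path="/example.parquet"):
    """Write and read a tiny table through fsspec's in-memory filesystem.

    Returns the table as a dict, or None when fsspec/pyarrow are missing.
    The path and this helper's name are illustrative, not part of either library.
    """
    try:
        import fsspec
        import pyarrow as pa
        import pyarrow.parquet as pq
        from pyarrow.fs import FSSpecHandler, PyFileSystem
    except ImportError:
        return None

    fs = fsspec.filesystem("memory")
    table = pa.table({"x": [1, 2, 3]})

    # Write through an ordinary fsspec file object.
    with fs.open(path, "wb") as f:
        pq.write_table(table, f)

    # Implicit route: pass the fsspec filesystem and let pyarrow wrap it.
    implicit = pq.read_table(path, filesystem=fs)

    # Explicit route: do the wrapping by hand.
    wrapped = PyFileSystem(FSSpecHandler(fs))
    explicit = pq.read_table(path, filesystem=wrapped)

    # Both routes should yield the same data.
    assert implicit.equals(explicit)
    return implicit.to_pydict()
```

The helper returns ``None`` rather than raising when the libraries are absent, so the sketch can be imported in environments without ``pyarrow``.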
@@ -377,7 +384,7 @@ Obviously, you should only define default values that are appropriate for
 a given file system implementation. INI files only support string values.

 Alternatively, you can provide overrides with environment variables of
-the style "FSSPEC_{protocol}_{kwargname}=value".
+the style ``FSSPEC_{protocol}_{kwargname}=value``.

 Configuration is determined in the following order, with later items winning:

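
The ``FSSPEC_{protocol}_{kwargname}=value`` naming style touched by the second hunk can be made concrete with a small parser. This is an illustrative sketch only, not ``fsspec``'s own implementation: the function name ``overrides_from_env`` is hypothetical, and lower-casing the protocol and keyword names is an assumption made here for readability. Values are kept as strings, since environment variables carry nothing else.

```python
import os


def overrides_from_env(environ=None, prefix="FSSPEC_"):
    """Collect FSSPEC_{protocol}_{kwargname}=value overrides into a nested dict.

    Illustrative only: lower-casing of names is an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    conf = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue
        # Split the remainder at the first underscore: protocol, then kwarg name.
        protocol, sep, kwarg = key[len(prefix):].partition("_")
        if not sep or not protocol or not kwarg:
            continue  # not of the three-part form, ignore it
        conf.setdefault(protocol.lower(), {})[kwarg.lower()] = value
    return conf
```

For example, an environment containing ``FSSPEC_S3_ANON=true`` would produce ``{"s3": {"anon": "true"}}`` under these assumptions; any conversion from the string ``"true"`` to a boolean is left to the consumer.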