feat(gist): fsspec file system for GitHub gists (resolves #888) #1791
Thanks for providing! I haven't had a chance to look yet, but I will soon :) |
Most welcome, no worries! 😃 |
Quick suggestion: it would be good to allow bundling the gist ID with the URL, like github: allows. This would require extracting kwargs from the URL. |
As I read through, I found that you answered almost all of my questions elsewhere in the code :)
I don't think there's a good way to test this thoroughly, but at least we can reasonably expect gist to be available whenever GHA is running.
Please ping me when I should have another look |
Thanks for reviewing, Martin. I got sidetracked in a CI-fixing rabbit hole this week, but I've thankfully emerged and can return to this!
Will do 🫡 |
I just noticed this is still stalled. Please ask for help if you need it. |
Oh snap I’m sorry, yeah let me take a look… |
Force-pushed from 6d5f82d to a10d426.
Updated now with the ability to specify just a single file (and ran the linter; sorry I missed that last time). Some TODOs:
|
Force-pushed from a10d426 to ba2b962.
A little test that the URL is parsed as expected:

    import pytest

    from fsspec.implementations.gist import GistFileSystem


    @pytest.mark.parametrize(
        "gist_id,sha,file,token,user",
        [
            ("my-gist-id-12345", "sha_hash_a0b1", "a_file.txt", "secret_token", "my-user"),
            ("my-gist-id-12345", "sha_hash_a0b1", "a_file.txt", "secret_token", ""),
            ("my-gist-id-12345", None, "a_file.txt", "secret_token", "my-user"),  # No SHA
        ],
    )
    def test_gist_url_parse(gist_id, sha, file, token, user):
        # Build the URL with or without the optional SHA path segment
        if sha:
            fmt_str = f"gist://{user}:{token}@{gist_id}/{sha}/{file}"
        else:
            fmt_str = f"gist://{user}:{token}@{gist_id}/{file}"
        parsed = GistFileSystem._get_kwargs_from_urls(fmt_str)
        expected = {"gist_id": gist_id, "token": token}
        if user:  # Only include username if it's not empty
            expected["username"] = user
        if sha:  # Only include SHA if it's specified
            expected["sha"] = sha
        assert parsed == expected
|
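For readers following along, the URL layout this test exercises (`gist://user:token@gist_id[/sha]/file`) can be illustrated with a small standalone parser. This is only a sketch using the standard library; `parse_gist_url` is a hypothetical name and this is not necessarily how `GistFileSystem._get_kwargs_from_urls` is implemented in the PR.

```python
def parse_gist_url(url: str) -> dict:
    """Split a gist:// URL into keyword arguments (illustrative sketch only)."""
    prefix = "gist://"
    if not url.startswith(prefix):
        raise ValueError(f"not a gist URL: {url!r}")
    rest = url[len(prefix):]

    # Optional "user:token@" credentials come before the gist ID.
    creds, sep, location = rest.rpartition("@")
    kwargs = {}
    if sep:
        user, _, token = creds.partition(":")
        if user:  # skip an empty username, as the test above expects
            kwargs["username"] = user
        if token:
            kwargs["token"] = token

    # Remaining path is gist_id[/sha]/file; three segments means a SHA is present.
    parts = location.split("/")
    kwargs["gist_id"] = parts[0]
    if len(parts) == 3:
        kwargs["sha"] = parts[1]
    return kwargs
```

As in the test, the file name itself is not part of the returned kwargs; only the gist ID, the credentials, and the optional SHA are.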
Force-pushed from ba2b962 to e404148.
Force-pushed from e404148 to be6d20d.
Cool, all done. A "round trip" might be nice too |
Checks passed; 3.10 failed with an intermittent HTTP error from conda repodata (I don't have the ability to re-run it). LGTM |
I like it! Let's put it in, and see if the public has feedback once using it. |
This PR introduces a new filesystem backend, GistFileSystem, which allows read-only access to files within a single GitHub Gist (as suggested in #888). I'd find this really useful in combination with Universal Pathlib (also an fsspec project)!
Similar to GithubFileSystem but simplified for a single gist.
Users can do:
For a private gist, the same but also passing username and token args.
Implements the core methods (ls, _open, cat, invalidate_cache), read-only impl
Updates docs/source/api.rst.
Example usage
Below is a short snippet showing how to retrieve files from a public gist:
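The original snippet did not survive on this page, so what follows is a hedged reconstruction rather than the author's exact example. It assumes the backend is registered under the `gist` protocol and accepts `gist_id`, `username` and `token` keyword arguments, as the description and URL test suggest. `gist_fs_kwargs` and `read_gist_file` are hypothetical helper names introduced here for illustration.

```python
def gist_fs_kwargs(gist_id, username=None, token=None):
    """Build keyword arguments for the gist filesystem (hypothetical helper)."""
    kwargs = {"gist_id": gist_id}
    # Credentials are only needed for private gists.
    if username and token:
        kwargs.update(username=username, token=token)
    return kwargs


def read_gist_file(gist_id, filename, **credentials):
    """List a gist's files and return the bytes of one of them."""
    import fsspec  # imported lazily so the helper above works without fsspec

    # Assumes the "gist" protocol name; requires network access when called.
    fs = fsspec.filesystem("gist", **gist_fs_kwargs(gist_id, **credentials))
    print(fs.ls("", detail=False))  # names of the files in the gist
    return fs.cat(filename)
```

Calling `read_gist_file("<gist-id>", "a_file.txt")` with a real gist ID would fetch that file from a public gist; for a private gist, pass `username=` and `token=` as well.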