Skip to content

Treat RefResolver cache args as decorators. #333

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

avian2
Copy link

@avian2 avian2 commented May 8, 2017

Currently, RefResolver class expects urljoin_cache and remote_cache arguments to be functions that in turn call urljoin and RefResolver.resolve_from_url methods.

This makes it inconvenient to provide a custom cache implementation: in order to pass it to the constructor, you need to make a function that calls the resolve_from_url method of a RefResolver instance before the said instance exists. This is possible, but usually ugly.

With this change, the constructor treats these arguments as decorator objects. For instance, this makes functools.lru_cache(16) an acceptable value for remote_cache. Existing docstrings already seem to suggest that this should be possible.

I'm not sure how much the urljoin_cache and remote_cache arguments are used in the wild. This change obviously breaks compatibility with existing code that uses them.

This makes it more convenient to provide custom cache implementations.
avian2 added a commit to avian2/jsonmerge that referenced this pull request May 9, 2017
urljoin_cache_dec and remote_cache_dec can be used to provide caching
decorators, while old urljoin_cache and remote_cache can still be used to
directly provide callables.
@avian2
Copy link
Author

avian2 commented May 17, 2017

I've made this change backwards compatible.

New urljoin_cache_dec and remote_cache_dec arguments can now be used to provide caching decorators. Old urljoin_cache and remote_cache still accept pre-decorated functions.

@Julian
Copy link
Member

Julian commented May 17, 2017

Hey -- sorry, I was sure I responded here with something about backwards compatibility but apparently I managed to not actually do that :/

Yeah! Backwards compatible is definitely necessary here -- I can see the pain, and will have a look at the changes, but yeah we'd need to preserve whatever's here regardless.

Will try to post a more useful set of comments, I'm actually travelling so will have to find a bit of time to have a look, but really appreciated certainly.

@Julian
Copy link
Member

Julian commented May 20, 2017

So as I'm getting ready for Draft 6, I'm noticing some more things wrong with the RefResolver interface which is making me think this needs a bit of thought before just doing.

Specifically, resolve_fragment is kind of the wrong interface, because you can change scope while you're resolving, and it will affect refs within the thing you resolve to.

There's also still just a ton of stupid methods on RefResolver -- having the *_scope methods be public really bothers me.

I think this needs a bit of rethinking overall rather than patching things yet again, it's kind of a mess already :/

Thoughts / suggestions welcome of course!

@handrews
Copy link

@Julian I actually wrote an id-processor / reference resolver in Python at one point, and then decided that when I eventually get to building a python hyper-schema client it would probably make more sense to use your validator as a base anyway. I can dig it up if you want- may not be suitable for direct use but it might be nice to compare implementation hazards.

@Julian
Copy link
Member

Julian commented May 20, 2017

@handrews that'd definitely be appreciated!

My inclination is actually to move ref resolution into a new external package, with a correct interface, and to deprecate everything here, after reimplementing it on top of it.

@handrews
Copy link

I'll look into it on Monday. It's not too far off from being able to be its own library, although whether it's the API you'd want I don't know (I don't actually remember the exact state it's in- I last touched it around December, I think).

@avian2
Copy link
Author

avian2 commented May 21, 2017

@Julian I find current resolve_fragment useful in jsonmerge to resolve pointers in the scope of different subparts of the document (arrayMergeById, idRef option here, code).

While on topic of reworking the RefResolver interface. My other wish would be to have an easy way to determine if a reference is internal or external before resolving it. jsonmerge often doesn't need to follow external references. Also, arbitrary HTTP requests might be a security issue for some people (similar to how XML external entities can be).

@Julian
Copy link
Member

Julian commented May 23, 2017

@Julian I find current resolve_fragment useful in jsonmerge to resolve pointers in the scope of different subparts of the document (arrayMergeById, idRef option here, code).

Makes sense, will try to keep this use case in mind as well, we can probably come up with something that suits both that's still cleaner than what we've got here.

While on topic of reworking the RefResolver interface. My other wish would be to have an easy way to determine if a reference is internal or external before resolving it. jsonmerge often doesn't need to follow external references. Also, arbitrary HTTP requests might be a security issue for some people (similar to how XML external entities can be).

Would like to hear more about this. In the current way of dealing with things, this is the difference between resolve_from_url or not -- I've seen people who don't want to resolve remote refs override that method and make it raise an exception (even though subclassing isn't supported in jsonschema), so yeah it seems like there should be a reasonable API for swapping that method out.

@avian2
Copy link
Author

avian2 commented May 24, 2017

Would like to hear more about this. In the current way of dealing with things, this is the difference between resolve_from_url or not -- I've seen people who don't want to resolve remote refs override that method and make it raise an exception

I've made myself a is_remote_ref() method here that seems to work.

Raising an exception from resolve_remote (or resolve_from_url) was awkward for me because with code like this:

with resolver.resolving(ref) as resolved:
    ...

It's AFAIK impossible to cleanly catch and handle an exception that originates from resolving() context manager and not at the same time catch exceptions from the ... block with-in. For example, approaches discussed here don't work because in the contrast to open() the exception here only gets raised on __enter__() call.

@Julian
Copy link
Member

Julian commented May 24, 2017

contextlib.ExitStack is the way to do that pretty much. But yeah context managers are a bit painful in these ways. The duplication there is part of the interface bloat I was referring to.

@avian2
Copy link
Author

avian2 commented Mar 23, 2018

I updated this pull request for the current master branch.

As far as I can see, the remaining Sphinx "broken link" errors that are still thrown by Travis don't have anything to do with my changes.

@Julian
Copy link
Member

Julian commented Apr 3, 2018

@avian2 hrm.. will have a look to see why Travis is complaining. Thanks for updating this, I still don't quite know how I feel about it :/, but it's been sitting here long enough that I'm starting to lean towards merging it despite my desire to revamp the whole interface.

@codecov
Copy link

codecov bot commented Jan 30, 2019

Codecov Report

Merging #333 into master will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     python-jsonschema/jsonschema#333      +/-   ##
==========================================
+ Coverage   96.31%   96.33%   +0.02%     
==========================================
  Files          19       19              
  Lines        2386     2403      +17     
  Branches      312      314       +2     
==========================================
+ Hits         2298     2315      +17     
  Misses         75       75              
  Partials       13       13              

Julian added a commit that referenced this pull request Apr 25, 2020
d17e1ba2 [242] Add pattern tests for numbers, objects, arrays, and null
a5302a46 [329] Add tests for uniqueItems=false
0bf6358f [166] Add null-in-enum test
807591f0 Merge pull request #340 from karenetheridge/ether/markdown-conflict-markers
e4c82f21 disambiguate underline from git conflict marker
c4d3e0f9 Merge pull request #333 from zhxiaogg/tests-for-duration-format
618b30dc not all duration impl support year and month
926d0b3d tests for duration format
22eaffcd Merge pull request #331 from jdanyow/patch-2
b2e553e9 Merge pull request #330 from jdanyow/patch-1
9db34a17 remove unused constant
cbf4654d fix typo in ref test description
44517887 Merge pull request #326 from awwright/master
0da2aaa1 Backport nested allOf/anyOf/oneOf tests
34a2dfa3 Add nested allOf, oneOf tests

git-subtree-dir: json
git-subtree-split: d17e1ba218bbd023d45182e014c0ce341bb36b16
@Julian Julian closed this Jul 15, 2020
@avian2
Copy link
Author

avian2 commented Jul 16, 2020

Was this PR closed by mistake? I understand if you don't want to merge this. I'm just asking because the message in the squashed commit that's linked above makes it look like it includes my changes, but it doesn't actually. The GitHub issue # numbers don't match the repo names.

@Julian
Copy link
Member

Julian commented Jul 16, 2020

Oy. Hi @avian2 It was indeed closed by mistake though not from the push (which yeah mingles issue #s across repos but it's a known issue), probably just me somehow misclicking while I wasn't paying attention.

Going to reopen but yeah... somehow in all this time I still don't feel comfortable merging this, so to be honest I may re-close (hopefully intentionally) and just say we need to hit python-jsonschema/referencing#3.

@Julian Julian reopened this Jul 16, 2020
@Julian Julian closed this Oct 20, 2020
@Julian Julian deleted the branch python-jsonschema:master October 20, 2020 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants