-
Notifications
You must be signed in to change notification settings - Fork 577
Remove requests #1175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove requests #1175
Conversation
| res = urlopen(Request(self.query_endpoint + qsa, headers=args["headers"])) | ||
| elif self.method == "POST": | ||
| args["headers"].update({"Content-Type": "application/sparql-query"}) | ||
| args["data"] = params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we need params in args["data"] anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only params allowed for a POST SPARQL query is the query itself, as per the spec, so we don't allow any others. Auth params for secure endpoints I've catered for with a separate auth param.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need to submit default-graph-uri and named-graph-uri parameters. So the params is still required.
Also I think we should support both application/x-www-form-urlencoded and application/sparql-query content-types. But I'm unsure how we would determine which content-type is supported by the store we are speaking with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The params attribute is still required or at least the information about the default-graph.
If you consider the following:
Load some context aware quad store (e.g. the QuitStore) with the following data:
graph <http://example.org/> {
<http://example.org/ExampleInstance> a <http://example.org/Example>
}
graph <http://othergraph.org/> {
<http://example.org/OtherInstance> a <http://example.org/Example>
}The query endpoint is assumed to be available at http://localhost:5000/sparql
class SPARQLStoreQuitStoreTestCase(unittest.TestCase):
store_name = "SPARQLStore"
path = "http://localhost:5000/sparql"
create = False
def setUp(self):
store = SPARQLStore(query_endpoint=self.path, method="POST")
self.conjunctivegraph = ConjunctiveGraph(store=store)
def tearDown(self):
self.conjunctivegraph.close()
def test_Query(self):
query = "select distinct ?inst where {?inst a <http://example.org/Example>}"
graph = self.conjunctivegraph.get_context(URIRef("http://example.org/"))
res = graph.query(query, initNs={})
assert len(res) == 1, len(res)
for i in res:
assert type(i[0]) == URIRef, i[0].n3()
assert i[0] == URIRef("http://example.org/ExampleInstance"), i[0].n3()The query is executed as POST request but does not convey the information about the default graph.
This issue is a combination of removing this line and sending the wrong Content-Type.
Actually, what the request was doing before this change and the merge of #1022 was sending a proper POST request with a application/x-www-form-urlencoded content type according to the SPARQL 1.1 Protocol (query via URL-encoded POST; https://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/#query-operation).
|
|
||
| res.raise_for_status() | ||
|
|
||
| def close(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we remove close() method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the Store is actually stateless with each operation and there's nothing to close. I originally dropped open() too but it's needed if the Store is called by Graph("SPARQLStore") as opposed to SPARQLStore()
|
@nicholascar |
|
Me being lazy. I saw Python 3.5 failures and wanted to jump over those! I thought we agreed to drop 3.5 so I was really just testing that out. Also, since this PR is all about dropping dependencies, I wanted to reduce the encumbrances as much as possible. Since there is a remaining Python 3.6 error, I'm going to have to address that in further edits here so could reintroduce Python 3.5 if you like. We did agree to drop Python 3.5 right? |
|
Weird that it fails tests here as it passed everything locally on Python 3.6: I ran on that version in particular to test. I think it's just a naming collision with |
|
Yes, we did agree to drop python 3.5 for RDFLib v6.0.0, and have a minimum 3.6. Just wasn't expecting that change to come as part of the "Remove requests" PR. But it should be fine. |
|
Regarding the single failure here, the message is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is hard for me to understand the complete pull request, as it mixes a lot of different things. Would it be possible for you to split this into several unrelated pull-requests?
| res = urlopen(Request(self.query_endpoint + qsa, headers=args["headers"])) | ||
| elif self.method == "POST": | ||
| args["headers"].update({"Content-Type": "application/sparql-query"}) | ||
| args["data"] = params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need to submit default-graph-uri and named-graph-uri parameters. So the params is still required.
Also I think we should support both application/x-www-form-urlencoded and application/sparql-query content-types. But I'm unsure how we would determine which content-type is supported by the store we are speaking with.
| def predicate_objects(self, subject=None): | ||
| """A generator of (predicate, object) tuples for the given subject""" | ||
| for t, c in self.triples((subject, None, None)): | ||
| yield t[1], t[2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need these methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all just convenience methods that are present in the normal Graph() interface and they are commonly used, so I thought they could just be present in the SPARQLStore too. This store returns quads, not triples, so I had to copy these methods into the store's own list of methods to re-implement them, else they would just break when someone called them.
This might show that there are to many open connections in to short time: https://stackoverflow.com/questions/11981933/python-urllib2-cannot-assign-requested-address?noredirect=1 This could be a limitation on the travis machine. Maybe we can also perform the tests with some local endpoint. Which might not be related to that issue. |
When I perform the test locally, they all pass so I do think it is a Travis limitation. |
OK, let me think about that! I might have missed something there. |
Not really: the focus here was just to get rid of requests and requests is only use in 2 places in the entire codebase: a couple of tests and by SPARQLStore. This PR's really only just removing those references and then patching SPARQLStore to ensure that everything works. The only optional things not about that are the convenience methods such as Oh, and some very tiny updated to metadata such as adding us to CONTRIBUTORS.rst, but those are really tiny one-liners. |
|
Open/Close to trigger Travis to see if we can overcome the port hogging test failure |
|
Here's a demo fo from rdflib.plugins.stores.sparqlstore import SPARQLStore
ENDPOINT = "http://cgi.surroundaustralia.com:7200/repositories/cgi-vocs"
NAMED_GRAPH = "http://resource.geosciml.org/classifierscheme/cgi/2016.01/classification-method-used"
# NG in query, Graph then SPARQLStore
q = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT (COUNT(?s) AS ?c)
WHERE {{
GRAPH <{}> {{
?s a skos:Concept .
}}
}}
""".format(NAMED_GRAPH)
g = Graph("SPARQLStore")
g.open(ENDPOINT)
for r in g.query(q):
assert int(r['c']) == 14
st = SPARQLStore(query_endpoint=ENDPOINT)
for r in st.query(q):
assert int(r['c']) == 14
# NG separate from query, Graph then SPARQLStore
q2 = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT (COUNT(?s) AS ?c)
WHERE {
?s a skos:Concept .
}
"""
g = Graph("SPARQLStore", identifier=NAMED_GRAPH)
g.open(ENDPOINT)
for r in g.query(q2):
assert int(r['c']) == 14
st = SPARQLStore(query_endpoint=ENDPOINT)
for r in st.query(q2, queryGraph=NAMED_GRAPH):
assert int(r['c']) == 14 |
|
Are you sure it is executed via POST? |
No, there seems to be no logic that allows SPARQLStore to ever use POST in SparqlConnector. SPARQLUpdateStore has a I will write some POST tests next. I think there were quite a few combinations of things not tested in the original SPARQLStore code but fixing them all wasn't the purpose of this PR, just to remove requests! I've had to fix lots of things to get it all working without requests though. |
|
@ashleysommer @white-gecko please can you approve this PR? The single error in Travis above is, again There are an endless number of things that I could keep fixing in SPARQLStore but that's not the goal of this PR - just the removal of requests. Actually I'm using SPARQLStore in a production system now so will keep working on it. If this PR can be merged, I'll use |
|
@nicholascar If you're confident the POST issue raised by @white-gecko above isn't related to these changes (and it was like that already when using |
Yes, I think we are going to have to better mock up things like the SPARQL endpoints that are needed for some of the tests so things like Fuseki aren't needed to be bundled into the tests.
Yes I am confident: that's a different thing and part of the incomplete or now slightly broken SPARQLStore implementation that needs love and attention. I'd like to raise a bunch of Issues for SPARQLStore and then fix them with some PRs straight after this one which I'll discuss this arvo. Having this PR through will set a base layer for me to test behaviour with. I can think of a whole bunch more tests like the ones for GET I implemented but for POST and also things like transactions testing etc. I'm concerned that transaction rollback etc isn't actually working but we don't normally notice since SPARQL Services that I use are highly available and thus don't often disappear mid way through a session. |
|
@nicholascar |
| return self._session.__dict__[k] | ||
| if auth is not None: | ||
| assert type(auth) == tuple, "auth must be a tuple" | ||
| assert len(auth) == 2, "auth must be a tuple (user, password)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my eyes it is no good practice to have assert in production code, so it should not be in a library.
The Python reference states:
The current code generator emits no code for an assert statement when optimization is requested at compile time.
https://docs.python.org/3/reference/simple_stmts.html#assert
So I think it should be replaced by if and raise and exception.
| res = urlopen(Request(self.query_endpoint + qsa, headers=args["headers"])) | ||
| elif self.method == "POST": | ||
| args["headers"].update({"Content-Type": "application/sparql-query"}) | ||
| args["data"] = params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The params attribute is still required or at least the information about the default-graph.
If you consider the following:
Load some context aware quad store (e.g. the QuitStore) with the following data:
graph <http://example.org/> {
<http://example.org/ExampleInstance> a <http://example.org/Example>
}
graph <http://othergraph.org/> {
<http://example.org/OtherInstance> a <http://example.org/Example>
}The query endpoint is assumed to be available at http://localhost:5000/sparql
class SPARQLStoreQuitStoreTestCase(unittest.TestCase):
store_name = "SPARQLStore"
path = "http://localhost:5000/sparql"
create = False
def setUp(self):
store = SPARQLStore(query_endpoint=self.path, method="POST")
self.conjunctivegraph = ConjunctiveGraph(store=store)
def tearDown(self):
self.conjunctivegraph.close()
def test_Query(self):
query = "select distinct ?inst where {?inst a <http://example.org/Example>}"
graph = self.conjunctivegraph.get_context(URIRef("http://example.org/"))
res = graph.query(query, initNs={})
assert len(res) == 1, len(res)
for i in res:
assert type(i[0]) == URIRef, i[0].n3()
assert i[0] == URIRef("http://example.org/ExampleInstance"), i[0].n3()The query is executed as POST request but does not convey the information about the default graph.
This issue is a combination of removing this line and sending the wrong Content-Type.
Actually, what the request was doing before this change and the merge of #1022 was sending a proper POST request with a application/x-www-form-urlencoded content type according to the SPARQL 1.1 Protocol (query via URL-encoded POST; https://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/#query-operation).
|
Are you sure the |
Removes requests from RDFlib requirements entirely to
Tidies requirements.txt & requirements-dev.txt
Closes #1122