Description
Hello-
I am using RDFLib 4.2.2, and I'm having an issue using the SPARQLUpdateStore to query a SPARQL endpoint over HTTP. This seems related to issue #755, but I would still appreciate clarification. There's a lot here, and the ultimate conclusion may be "Don't use SPARQLUpdateStore", so please feel free to skip to the end where I ask my questions.
Problem
The endpoint I am hitting expects both queries and updates to be submitted via POST. Included in the message body are parameters for 'email' and 'password', along with a 3rd parameter for the query or update body. I've used SPARQLWrapper for this sort of operation in the past, and the following process completes with the expected results:
from SPARQLWrapper import SPARQLWrapper, POST
sw = SPARQLWrapper('http://endpoint/query', 'http://endpoint/update')
sw.addParameter('email', 'sdmccaul@me.com')
sw.addParameter('password', 'secret123')
sw.setQuery("DESCRIBE <http://example.com/foo>")
sw.setMethod(POST)
results = sw.queryAndConvert()
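For comparison, here is a minimal sketch of the form-encoded POST body I expect the endpoint to receive, built with the standard library only. The helper name build_body and the address are placeholders of mine. SPARQLWrapper also appends its own format parameters and escapes a few characters differently, so this only approximates what it actually sends.

```python
from urllib.parse import parse_qs, urlencode


def build_body(email, password, query):
    """Form-encode the credentials and the query as one POST body."""
    fields = {"email": email, "password": password, "query": query}
    return urlencode(fields).encode("utf-8")


# Placeholder credentials, not real ones.
body = build_body("user@example.com", "secret123",
                  "DESCRIBE <http://example.com/foo>")

# Decoding the body again recovers all three fields.
decoded = parse_qs(body.decode("utf-8"))
```

The point is that all three fields travel in the same request body, so losing the first two leaves a request the endpoint has no way to authenticate.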
Using the Store interface promises to remove one more import/dependency, so I'd prefer to use that. The first issue I run into is that the SPARQLStore and SPARQLUpdateStore classes in 4.2.2 differ from their current state on the master branch. This is noted in pull request #744; namely, the removal of the SPARQLWrapper dependency. For me, this is actually a non-issue, since the older version of SPARQLStore inherits from SPARQLWrapper, and so I should be able to use the same process. All the following works as expected:
from rdflib.plugins.stores.sparqlstore import SPARQLUpdateStore
from SPARQLWrapper import POST
su = SPARQLUpdateStore('http://endpoint/query', 'http://endpoint/update')
su.addParameter('email', 'sdmccaul@me.com')
su.addParameter('password', 'secret123')
su.setQuery("DESCRIBE <http://example.com/foo>")
su.setMethod(POST)
Here is where my real issue begins. Calling su.queryAndConvert() fails with the following (I'll provide full tracebacks at the end):
TypeError: query() missing 1 required positional argument: 'query'
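A stripped-down reproduction of this error, using hypothetical stand-in classes I wrote for illustration (not the real SPARQLWrapper or SPARQLStore code) and assuming the only relevant detail is how queryAndConvert() invokes query():

```python
class Wrapper:
    """Stand-in for SPARQLWrapper: the query text lives on the object."""

    def __init__(self):
        self.queryString = None

    def setQuery(self, query):
        self.queryString = query

    def query(self):
        return "ran: " + self.queryString

    def queryAndConvert(self):
        # Calls self.query() with no arguments, as in the traceback.
        return self.query()


class Store(Wrapper):
    """Stand-in for SPARQLStore: query() now requires the query text."""

    def query(self, query):
        self.setQuery(query)
        return Wrapper.query(self)


store = Store()
store.setQuery("DESCRIBE <http://example.com/foo>")

try:
    store.queryAndConvert()
    error = None
except TypeError as exc:
    error = str(exc)  # complains about the missing 'query' argument

# Passing the stored text explicitly goes through the override.
result = store.query(store.queryString)
```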
SPARQLStore overrides the query method with a different signature: it expects the query string to be passed in as an argument, rather than reading it from the object as SPARQLWrapper does. That is easily fixed by calling su.query(su.queryString) instead. That call, however, raises the same exception reported in #755, which is the main knot I'm dealing with:
HTTPError: HTTP Error 403: Forbidden
After some fiddling, I discover that, prior to issuing any queries, both the Wrapper and Store objects produce the same request data when the _createRequest method is called:
In [5]: sw._createRequest().data
Out[5]: b'email=sdmccaul%40me.com&password=secret123&query=DESCRIBE+%3Chttp%3A//example.com/foo%3E&format=xml&output=xml&results=xml'
In [6]: su._createRequest().data
Out[6]: b'email=sdmccaul%40me.com&password=secret123&query=DESCRIBE+%3Chttp%3A//example.com/foo%3E&format=xml&output=xml&results=xml'
After the su.query(su.queryString) call fails, the request data is completely empty:
In [8]: su._createRequest().data
Using IPython's introspection, I believe I have narrowed this down to the following section of SPARQLStore.query:
def query(self, query,
          initNs={},
          initBindings={},
          queryGraph=None,
          DEBUG=False):
    self.debug = DEBUG
    ....
    self.resetQuery()
    self.setMethod(self.query_method)
    if self._is_contextual(queryGraph):
        self.addParameter("default-graph-uri", queryGraph)
    self.timeout = self._timeout
    self.setQuery(query)
    with contextlib.closing(SPARQLWrapper.query(self).response) as res:
        return Result.parse(res)
It appears that the resetQuery call is dropping all of the parameters I have set prior to issuing the request, which would explain why the query is returning a 403 Forbidden. This is where I'm stuck now, so finally:
Questions
- Am I using SPARQLUpdateStore correctly?
- Is this exception unavoidable, given my need to embed parameters in the POST body?
- Should I revert to using a SPARQLWrapper instance, or are there any other options for querying a remote SPARQL endpoint?
- When will I be able to pip install the version that's currently on master?
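In case it helps frame the second question, here is a rough sketch of a workaround I am considering: a mixin that re-adds a fixed set of parameters every time resetQuery() runs. FakeWrapper, StickyParametersMixin, StickyStore and sticky_parameters are all names I made up. FakeWrapper is only a stand-in that mimics the reset behaviour I suspect, and I have not verified that the mixin composes cleanly with the real SPARQLUpdateStore in 4.2.2.

```python
class FakeWrapper:
    """Stand-in for the wrapper base class: resetQuery() clears parameters."""

    def __init__(self):
        self.resetQuery()

    def resetQuery(self):
        self.parameters = {}

    def addParameter(self, name, value):
        self.parameters.setdefault(name, []).append(value)


class StickyParametersMixin:
    """Re-add a fixed set of parameters after every resetQuery() call."""

    # Class-level default so resetQuery() is safe during the base __init__.
    sticky_parameters = ()

    def resetQuery(self):
        super().resetQuery()
        for name, value in self.sticky_parameters:
            self.addParameter(name, value)


class StickyStore(StickyParametersMixin, FakeWrapper):
    # Placeholder credentials, not real ones.
    sticky_parameters = (("email", "user@example.com"),
                         ("password", "secret123"))


# Without the mixin, parameters set by hand do not survive a reset.
plain = FakeWrapper()
plain.addParameter("email", "user@example.com")
plain.resetQuery()
lost = plain.parameters

# With the mixin, the credentials are restored after each reset.
sticky = StickyStore()
sticky.resetQuery()
kept = sticky.parameters
```

If something like this is reasonable, the real version would presumably list the mixin ahead of SPARQLUpdateStore in the base classes, but I would rather not rely on overriding internals if there is a supported way to do it.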
Thanks and keep up the great work-
Steve
Full tracebacks
su.queryAndConvert()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-b62d36e78e8c> in <module>
----> 1 results = su.queryAndConvert()
~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/SPARQLWrapper/Wrapper.py in queryAndConvert(self)
931 @return: the converted query result. See the conversion methods for more details.
932 """
--> 933 res = self.query()
934 return res.convert()
935
~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/rdflib/plugins/stores/sparqlstore.py in query(self, *args, **kwargs)
614 if not self.autocommit:
615 self.commit()
--> 616 return SPARQLStore.query(self, *args, **kwargs)
617
618 def triples(self, *args, **kwargs):
TypeError: query() missing 1 required positional argument: 'query'
su.query(su.queryString)
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
<ipython-input-8-dfa61e804cee> in <module>
----> 1 results = su.query(su.queryString)
~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/rdflib/plugins/stores/sparqlstore.py in query(self, *args, **kwargs)
614 if not self.autocommit:
615 self.commit()
--> 616 return SPARQLStore.query(self, *args, **kwargs)
617
618 def triples(self, *args, **kwargs):
~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/rdflib/plugins/stores/sparqlstore.py in query(self, query, initNs, initBindings, queryGraph, DEBUG)
317 self.setQuery(query)
318
--> 319 with contextlib.closing(SPARQLWrapper.query(self).response) as res:
320 return Result.parse(res)
321
~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/SPARQLWrapper/Wrapper.py in query(self)
925 @rtype: L{QueryResult} instance
926 """
--> 927 return QueryResult(self._query())
928
929 def queryAndConvert(self):
~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/SPARQLWrapper/Wrapper.py in _query(self)
905 raise EndPointInternalError(e.read())
906 else:
--> 907 raise e
908
909 def query(self):
~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/SPARQLWrapper/Wrapper.py in _query(self)
891 response = urlopener(request, timeout=self.timeout)
892 else:
--> 893 response = urlopener(request)
894 return response, self.returnFormat
895 except urllib.error.HTTPError as e:
/usr/lib/python3.6/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
221 else:
222 opener = _opener
--> 223 return opener.open(url, data, timeout)
224
225 def install_opener(opener):
/usr/lib/python3.6/urllib/request.py in open(self, fullurl, data, timeout)
530 for processor in self.process_response.get(protocol, []):
531 meth = getattr(processor, meth_name)
--> 532 response = meth(req, response)
533
534 return response
/usr/lib/python3.6/urllib/request.py in http_response(self, request, response)
640 if not (200 <= code < 300):
641 response = self.parent.error(
--> 642 'http', request, response, code, msg, hdrs)
643
644 return response
/usr/lib/python3.6/urllib/request.py in error(self, proto, *args)
568 if http_err:
569 args = (dict, 'default', 'http_error_default') + orig_args
--> 570 return self._call_chain(*args)
571
572 # XXX probably also want an abstract factory that knows when it makes
/usr/lib/python3.6/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
502 for handler in handlers:
503 func = getattr(handler, meth_name)
--> 504 result = func(*args)
505 if result is not None:
506 return result
/usr/lib/python3.6/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
648 class HTTPDefaultErrorHandler(BaseHandler):
649 def http_error_default(self, req, fp, code, msg, hdrs):
--> 650 raise HTTPError(req.full_url, code, msg, hdrs, fp)
651
652 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 403: Forbidden