Issuing POST request with credentials via SPARQLStore #921

@stevemn

Hello-

I am using RDFLib 4.2.2, and I'm having an issue using the SPARQLUpdateStore to query a SPARQL endpoint over HTTP. This seems related to issue #755, but I would still appreciate clarification. There's a lot here, and the ultimate conclusion may be "Don't use SPARQLUpdateStore", so please feel free to skip to the end where I ask my questions.

Problem

The endpoint I am hitting expects both queries and updates to be submitted via POST. The message body carries 'email' and 'password' parameters, along with a third parameter holding the query or update text. I've used SPARQLWrapper for this sort of operation in the past, and the following process completes with the expected results:

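from SPARQLWrapper import SPARQLWrapper, POST

# Query endpoint first, update endpoint second
sw = SPARQLWrapper('http://endpoint/query', 'http://endpoint/update')
# Credentials are sent as ordinary request parameters
sw.addParameter('email', 'sdmccaul@me.com')
sw.addParameter('password', 'secret123')
sw.setQuery("DESCRIBE <http://example.com/foo>")
# POST places the parameters and the query in the message body
sw.setMethod(POST)
results = sw.queryAndConvert()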
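
For reference, the request shape described above (credentials and query sent together as form fields in a POST body) can be sketched with the standard library alone. This is only an illustration: the placeholder credentials and the helper name `build_sparql_post` are assumptions, and the request is built but never sent.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_sparql_post(endpoint, email, password, query):
    """Build (but do not send) a form-encoded POST carrying credentials and a query.

    Hypothetical helper for illustration; not part of RDFLib or SPARQLWrapper.
    """
    body = urlencode({
        "email": email,
        "password": password,
        "query": query,
    }).encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_sparql_post(
    "http://endpoint/query",               # placeholder URL, as in the example above
    "user@example.com",                    # placeholder credentials
    "secret123",
    "DESCRIBE <http://example.com/foo>",
)

# Decode the body again to confirm all three fields survive the round trip.
fields = parse_qs(request.data.decode("ascii"))
print(request.get_method(), sorted(fields))
```

The exact percent-encoding may differ from what SPARQLWrapper emits (for example in how `/` is escaped), so the raw bytes should not be expected to match byte for byte.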

Using the Store interface promises to remove one more import/dependency, so I'd prefer to use that. The first issue I run into is that the SPARQLStore and SPARQLUpdateStore classes in 4.2.2 differ from their current state on the master branch. This is noted in pull request #744; namely, the removal of the SPARQLWrapper dependency. For me, this is actually a non-issue, since the older version of SPARQLStore inherits from SPARQLWrapper, and so I should be able to use the same process. All the following works as expected:

from rdflib.plugins.stores.sparqlstore import SPARQLUpdateStore
from SPARQLWrapper import POST

# In 4.2.2 the store inherits from SPARQLWrapper, so the same calls apply
su = SPARQLUpdateStore('http://endpoint/query', 'http://endpoint/update')
su.addParameter('email', 'sdmccaul@me.com')
su.addParameter('password', 'secret123')
su.setQuery("DESCRIBE <http://example.com/foo>")
su.setMethod(POST)

Here is where my real issue begins. Calling su.queryAndConvert() fails with the following (I'll provide full tracebacks at the end):

TypeError: query() missing 1 required positional argument: 'query'

SPARQLStore overrides the query method, and the signature has changed: the function expects the query string to be passed in as an argument, rather than reading it from the object as SPARQLWrapper does. Easily fixed; I now call su.query(su.queryString). This time I get the following, which is the same exception as in #755 and the main knot I'm dealing with:

HTTPError: HTTP Error 403: Forbidden
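
The earlier TypeError can be reproduced without either library. The sketch below uses two hypothetical stand-in classes (`BaseWrapper` and `OverridingStore`, not the real SPARQLWrapper or SPARQLStore) to show how an inherited convenience method breaks once a subclass adds a required argument to the method it calls.

```python
class BaseWrapper:
    """Stand-in for a wrapper that keeps the query string on the object."""

    def __init__(self):
        self.query_string = None

    def set_query(self, query):
        self.query_string = query

    def query(self):
        # Reads the query from instance state; takes no arguments.
        return "ran: " + self.query_string

    def query_and_convert(self):
        # Calls self.query() with no arguments, as the base class expects.
        return self.query()


class OverridingStore(BaseWrapper):
    """Stand-in for a store whose query() now requires the query text."""

    def query(self, query):
        self.set_query(query)
        return BaseWrapper.query(self)


store = OverridingStore()

# Passing the query explicitly matches the new signature and works.
direct = store.query("DESCRIBE <http://example.com/foo>")

# The inherited helper still calls query() with no arguments and fails.
try:
    store.query_and_convert()
    error = None
except TypeError as exc:
    error = exc

print(direct)
print(error)
```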

After some fiddling, I discover that, prior to issuing any queries, both the Wrapper and Store objects produce the same request data when the _createRequest method is called:

In [5]: sw._createRequest().data                                                               
Out[5]: b'email=sdmccaul%40me.com&password=secret123&query=DESCRIBE+%3Chttp%3A//example.com/foo%3E&format=xml&output=xml&results=xml'

In [6]: su._createRequest().data                                                               
Out[6]: b'email=sdmccaul%40me.com&password=secret123&query=DESCRIBE+%3Chttp%3A//example.com/foo%3E&format=xml&output=xml&results=xml'

Once the su.query(su.queryString) call fails, the request data is completely empty:

In [8]: su._createRequest().data

Using IPython's introspection, I believe I have narrowed this down to the following section of SPARQLStore.query:

def query(self, query,
          initNs={},
          initBindings={},
          queryGraph=None,
          DEBUG=False):
    self.debug = DEBUG
    ....

    self.resetQuery()
    self.setMethod(self.query_method)
    if self._is_contextual(queryGraph):
        self.addParameter("default-graph-uri", queryGraph)
    self.timeout = self._timeout
    self.setQuery(query)

    with contextlib.closing(SPARQLWrapper.query(self).response) as res:
        return Result.parse(res)

It appears that the resetQuery call is dropping all of the parameters I have set prior to issuing the request, which would explain why the query is returning a 403 Forbidden. This is where I'm stuck now, so finally:

Questions

  • Am I using SPARQLUpdateStore correctly?
  • Is this exception unavoidable, given my need to embed parameters in the POST body?
  • Should I revert to using a SPARQLWrapper instance, or are there any other options for querying a remote SPARQL endpoint?
  • When will I be able to pip install the version that's up on master?
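
One possible way around the parameter loss described above, offered as a minimal sketch rather than a tested fix: re-apply the credentials every time the reset runs. `PersistentParamsMixin` and `StandInStore` are hypothetical names; the stand-in mimics only the reset-then-query behaviour seen in the excerpt and is not the real store.

```python
class PersistentParamsMixin:
    """Re-apply a fixed set of parameters every time resetQuery() runs."""

    def set_persistent_parameters(self, **params):
        self._persistent_params = dict(params)
        self._apply_persistent_params()

    def _apply_persistent_params(self):
        # getattr guard: resetQuery() may run before any parameters are registered.
        for name, value in getattr(self, "_persistent_params", {}).items():
            self.addParameter(name, value)

    def resetQuery(self):
        super().resetQuery()
        self._apply_persistent_params()


class StandInStore:
    """Hypothetical stand-in that mimics only the reset-then-query behaviour."""

    def __init__(self):
        self.resetQuery()

    def resetQuery(self):
        self.parameters = {}
        self.queryString = None

    def addParameter(self, name, value):
        self.parameters[name] = value

    def setQuery(self, query):
        self.queryString = query

    def query(self, query):
        self.resetQuery()  # the step that wipes caller-supplied parameters
        self.setQuery(query)
        return dict(self.parameters, query=self.queryString)


class PersistentStore(PersistentParamsMixin, StandInStore):
    pass


# Without the mixin, parameters added before query() are lost.
plain = StandInStore()
plain.addParameter("email", "user@example.com")
lost = plain.query("DESCRIBE <http://example.com/foo>")

# With the mixin, the registered parameters are restored after each reset.
patched = PersistentStore()
patched.set_persistent_parameters(email="user@example.com", password="secret123")
kept = patched.query("DESCRIBE <http://example.com/foo>")

print(lost)
print(kept)
```

Whether the same mixin would work when combined with the 4.2.2 SPARQLUpdateStore is untested here; it assumes that store routes every reset through an overridable resetQuery() method, as the excerpt above suggests.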

Thanks and keep up the great work-
Steve

Full tracebacks
su.queryAndConvert()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-b62d36e78e8c> in <module>
----> 1 results = su.queryAndConvert()

~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/SPARQLWrapper/Wrapper.py in queryAndConvert(self)
    931         @return: the converted query result. See the conversion methods for more details.
    932         """
--> 933         res = self.query()
    934         return res.convert()
    935 

~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/rdflib/plugins/stores/sparqlstore.py in query(self, *args, **kwargs)
    614         if not self.autocommit:
    615             self.commit()
--> 616         return SPARQLStore.query(self, *args, **kwargs)
    617 
    618     def triples(self, *args, **kwargs):

TypeError: query() missing 1 required positional argument: 'query'

su.query(su.queryString)

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-8-dfa61e804cee> in <module>
----> 1 results = su.query(su.queryString)

~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/rdflib/plugins/stores/sparqlstore.py in query(self, *args, **kwargs)
    614         if not self.autocommit:
    615             self.commit()
--> 616         return SPARQLStore.query(self, *args, **kwargs)
    617 
    618     def triples(self, *args, **kwargs):

~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/rdflib/plugins/stores/sparqlstore.py in query(self, query, initNs, initBindings, queryGraph, DEBUG)
    317         self.setQuery(query)
    318 
--> 319         with contextlib.closing(SPARQLWrapper.query(self).response) as res:
    320             return Result.parse(res)
    321 

~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/SPARQLWrapper/Wrapper.py in query(self)
    925             @rtype: L{QueryResult} instance
    926         """
--> 927         return QueryResult(self._query())
    928 
    929     def queryAndConvert(self):

~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/SPARQLWrapper/Wrapper.py in _query(self)
    905                 raise EndPointInternalError(e.read())
    906             else:
--> 907                 raise e
    908 
    909     def query(self):

~/dev/work/rab-trax/.direnv/python-3.6.8/lib/python3.6/site-packages/SPARQLWrapper/Wrapper.py in _query(self)
    891                 response = urlopener(request, timeout=self.timeout)
    892             else:
--> 893                 response = urlopener(request)
    894             return response, self.returnFormat
    895         except urllib.error.HTTPError as e:

/usr/lib/python3.6/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224 
    225 def install_opener(opener):

/usr/lib/python3.6/urllib/request.py in open(self, fullurl, data, timeout)
    530         for processor in self.process_response.get(protocol, []):
    531             meth = getattr(processor, meth_name)
--> 532             response = meth(req, response)
    533 
    534         return response

/usr/lib/python3.6/urllib/request.py in http_response(self, request, response)
    640         if not (200 <= code < 300):
    641             response = self.parent.error(
--> 642                 'http', request, response, code, msg, hdrs)
    643 
    644         return response

/usr/lib/python3.6/urllib/request.py in error(self, proto, *args)
    568         if http_err:
    569             args = (dict, 'default', 'http_error_default') + orig_args
--> 570             return self._call_chain(*args)
    571 
    572 # XXX probably also want an abstract factory that knows when it makes

/usr/lib/python3.6/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    502         for handler in handlers:
    503             func = getattr(handler, meth_name)
--> 504             result = func(*args)
    505             if result is not None:
    506                 return result

/usr/lib/python3.6/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    648 class HTTPDefaultErrorHandler(BaseHandler):
    649     def http_error_default(self, req, fp, code, msg, hdrs):
--> 650         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    651 
    652 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Forbidden
