-prestoclient
+PrestoClient
 ============
 
 PrestoClient implements a Python class to communicate with a Presto server.
@@ -8,7 +8,7 @@ Hadoop HDFS servers (http://hadoop.apache.org/).
 Presto uses SQL as its query language. Presto is an alternative for
 Hadoop-Hive.
 
-PrestoClient was developed using Presto 0.52 and tested on Presto 0.52 and 0.54.
+PrestoClient was developed using Presto 0.52 and tested on Presto 0.52 and 0.54. The Python version used is 2.7.6.
 
 You can use this class with this sample code:
 
@@ -20,43 +20,52 @@ You can use this class with this sample code:
     presto = prestoclient.PrestoClient("localhost")
 
     if not presto.startquery(sql):
-        print presto.getlasterrormessage()
+        print "Error: ", presto.getlasterrormessage()
     else:
-        presto.waituntilfinished(True) # Remove True parameter to skip printing status messages
-        print "Columns: ", presto.getcolumns()
-        print "Datalength: ", presto.getdatalength(), " Data: ", presto.getdata()
+        presto.waituntilfinished(True) # Remove True parameter to skip printing status messages
+
+        # We're done now, so let's show the results
+        print "Columns: ", presto.getcolumns()
+        if presto.getstatus() == "FAILED": print "Error : ", presto.getlasterrormessage()
+        if presto.getdata(): print "Datalength: ", presto.getnumberofdatarows(), " Data: ", presto.getdata()
 
 
 Presto client protocol
 ----------------------
-
-The communication protocol used between Presto clients and servers is not documented. It seems to
+The communication protocol used between Presto clients and servers is not documented yet. It seems to
 be as follows:
 
-Client sends http POST request. Headerinformation should include: X-Presto-Catalog, X-Presto-Source,
-X-Presto-Schema, User-Agent, X-Presto-User. The body of the request should contain the sql statement.
-The server responds by returning JSON data. This data should contain 2 uri's. One giving the link
-to get more information about the query execution (infoUri) and the other one giving the link to fetch
-the next packet of data (nextUri).
-
-The client should send GET requests to the server (header: X-Presto-Source, User-Agent, X-Presto-User.
-Body: empty) following the nextUri link from the last response
-until the server response does not give any more nextUri links. The server response also contains a
-'state' variable. When there is no nextUri the state should be one of: FINISHED, FAILED or CANCELED.
-Each response by the server to a 'nextUri' may contain information about the columns returned by the
-query and all- or part of the querydata.
-
-The server response may contain a variable with the uri to cancel the query (partialCancelUri). The
-client may issue a DELETE request to the server using this link.
-The Presto server will retain information about finished queries for 15 minutes. When a client does
-not respond to the server (by following the nextUri links) the server will cancel these 'dead' queries
-after 5 minutes. These timeouts are hardcoded in the Presto server source code.
+The client sends an HTTP POST request to the Presto server at "/v1/statement". Header information should
+include: X-Presto-Catalog, X-Presto-Source, X-Presto-Schema, User-Agent, X-Presto-User. The body of the
+request should contain the SQL statement. The server responds by returning JSON data (HTTP status code 200).
+This reply may contain up to 3 URIs: one giving the link to get more information about the query execution
+('infoUri'), another giving the link to fetch the next packet of data ('nextUri') and one with the URI to
+cancel the query ('partialCancelUri').
+
+The client should send GET requests to the server (Header: X-Presto-Source, User-Agent, X-Presto-User.
+Body: empty) following the 'nextUri' link from the previous response until the server's response no longer
+contains a 'nextUri' link. When there is no 'nextUri' the query is finished. If the last response
+from the server included an error section ('error') the query failed, otherwise the query succeeded. If
+the HTTP status of the server response is anything other than 200 with Content-Type application/json, the
+query should also be considered failed. A 503 HTTP response means that the server is (too) busy. Retry the
+request after waiting at least 50ms.
+The server response may contain a 'state' variable. This is for informational purposes only (may be subject
+to change in future implementations).
+Each response by the server to a 'nextUri' may contain information about the columns returned by the query
+and all or part of the query data. If the response contains a data section the columns section will always
+be available.
+
+The server response may contain a variable with the URI to cancel the query ('partialCancelUri'). The client
+may issue a DELETE request to the server using this link. The response HTTP status code is 204.
+
+The Presto server will retain information about finished queries for 15 minutes. When a client does not
+respond to the server (by following the 'nextUri' links) the server will cancel these 'dead' queries after
+5 minutes. These timeouts are hardcoded in the Presto server source code.
 
 ToDo
 ----
-- Make the PrestoClient class re-usable. Currently you can only start one query per instance of
-  this class.
-
+- Enable PrestoClient to handle multiple running queries simultaneously. Currently you can only run one query per instance of this class.
+- Add support for https connections
 - Add support for insert/update queries (if and when Presto server supports this).
 
 Availability
@@ -80,4 +89,3 @@ distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-
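Review note: the "Presto client protocol" section added in this diff describes a request loop that may be easier to follow as code. The sketch below is not taken from PrestoClient itself. It is a minimal Python 3 illustration (the README sample targets Python 2.7) of the steps as the README states them: POST to "/v1/statement", follow 'nextUri' with GET, retry on 503, and treat a final 'error' section as failure. The `send` callable, the function names, the `max_retries` limit and the JSON key names "columns" and "data" are assumptions made for illustration. The README names the sections but not their keys, and it does not say which headers a DELETE request needs.

```python
import json
import time

# Headers the README lists for the follow-up GET requests.
GET_HEADER_NAMES = ("X-Presto-Source", "User-Agent", "X-Presto-User")


def run_query(send, base_url, sql, headers, retry_delay=0.05, max_retries=10):
    """Run one query and return (columns, rows); raise RuntimeError on failure.

    `send(method, url, headers, body)` is a hypothetical transport callable
    that returns (http_status, content_type, body_text).
    """
    get_headers = {k: v for k, v in headers.items() if k in GET_HEADER_NAMES}
    method, url, body, req_headers = "POST", base_url + "/v1/statement", sql, headers
    columns, rows, retries = None, [], 0
    while True:
        status, content_type, text = send(method, url, req_headers, body)
        if status == 503 and retries < max_retries:
            # Server is (too) busy: wait at least 50 ms, then repeat the request.
            retries += 1
            time.sleep(retry_delay)
            continue
        if status != 200 or not (content_type or "").startswith("application/json"):
            raise RuntimeError("query failed: HTTP status %s" % status)
        retries = 0
        reply = json.loads(text)
        columns = reply.get("columns", columns)  # assumed key name
        rows.extend(reply.get("data", []))       # assumed key name
        next_uri = reply.get("nextUri")
        if not next_uri:
            # No 'nextUri': the query is finished; an 'error' section means it failed.
            if "error" in reply:
                raise RuntimeError("query failed: %s" % (reply["error"],))
            return columns, rows
        method, url, body, req_headers = "GET", next_uri, None, get_headers


def cancel_query(send, reply, headers):
    """Send DELETE to 'partialCancelUri' if present; True when the server answers 204."""
    cancel_uri = reply.get("partialCancelUri")
    if not cancel_uri:
        return False
    status, _, _ = send("DELETE", cancel_uri, headers, None)
    return status == 204
```

Passing the transport in as a callable keeps the loop independent of any HTTP library, so it can be exercised with a scripted fake before being wired to a real connection. The 'state' field is deliberately ignored, since the README calls it informational only.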