Commit 5f824f3

author
Daniel Bauer
committed
incorporated comments.
1 parent 4686ebe commit 5f824f3

1 file changed

+13
-16
lines changed

RFC-0009/RFC-0009-binary-exchange.md

Lines changed: 13 additions & 16 deletions
@@ -1,7 +1,4 @@
-# **RFC0x for Presto**
-
-
-## Replacing HTTP Exchange with Binary Exchange Protocol
+# Add support for Binary Exchange Protocol
 
 Proposers
 * Daniel Bauer ([email protected])
@@ -18,7 +15,7 @@ Above protocol enhancement is integrated into the proposed binary exchange proto
 
 The binary exchange protocol (BinX) is an alternative for the existing HTTP-based exchange protocol that
 runs between Prestissimo worker nodes. It offers the same functionality and API
-but uses binary encoding that can be more efficiently parsed than HTTP nessages.
+but uses binary encoding that can be more efficiently parsed than HTTP messages.
 This translates into a performance benefit for exchange-intensive queries.
 BinX does not replace the control protocol that runs between the coordinator and the
 worker nodes. The control protocol continues to use HTTP.
@@ -33,7 +30,7 @@ is more complex than decoding binary encoded messages.
 
 ### Goals
 
-The proposal is to use a binary exchange protocol as a light-weight alternative to the existinig HTTP exchange protocol.
+The proposal is to use a binary exchange protocol as a light-weight alternative to the existing HTTP exchange protocol.
 As a prototypical implementation shows that such a protocol reduces query run-time of exchange heavy queries by
 20% to 30%.
 
@@ -143,8 +140,9 @@ with the HTTP exchange.
 
 #### Implementation Notes
 
-The BinX server uses Wangle. It consists of the following components that are implemented in
-the file `BinaryExchangeServer.h`:
+Like Proxygen, the BinX server uses Wangle as its underlying networking library.
+The BinX server is implemented in the file `BinaryExchangeServer.h` and consists of
+several components:
 
 * The `BinaryExchangeServer` is a controller for starting and stopping the Wangle protocol stack.
 It takes the port number, the IO thread pool and the CPU thread pool as construction parameters.
@@ -159,9 +157,8 @@ service implementation on top of the stack.
 The results from the TaskManager are packaged into replies and sent back to the requesting BinX exchange source.
 This exchange service follows the design of the existing `TaskResource` service.
 
-The `TaskManagerStub` class is an implementation detail that enables the BinX server to interact with
-a mock TaskManager implementation. This is used in the unit tests and allows to test the BinX server
-implementation along with the BinX exchange source implementation.
+All of above components are templated to allow for different TaskManager implementations. In the production code,
+the Prestissimo TaskManager is used while for unit testing, a mock task manager is deployed.
 
 ### Binary Exchange Source and Binary Exchange Client
 
@@ -175,7 +172,7 @@ The `PrestoServer` registers a factory method for creating exchange sources. Thi
 such that `BinaryExchangeSource`s are created instead of HTTP exchanges when enabled by configuration.
 One exception are connections to the
 Presto coordinator that always uses the HTTP based exchange protocol. In a Kubernetes environment with its virtual
-networking, it is unfortunately not straight forward to detect whether the target host is the Presto connector
+networking, it is unfortunately not straight forward to detect whether the target host is the Presto coordinator
 since the connector's service IP used in the Presto configuration doesn't correspond to the IP address used by the
 pod running the coordinator. In order to circumvent this problem, a helper class called `CoordinatorInfoResolver`
 uses the node status endpoint of the coordinator to retrieve the coordinator's IP address. Using this address
@@ -237,17 +234,17 @@ the additional complexity.
 - There is one additional configuration option to enable BinX. Otherwise, there is no impact on session parameters, no API changes
 and no changes to SQL.
 
-- If we are changing behaviour how will we phase out the older behaviour?
+- If we are changing behavior how will we phase out the older behavior?
 
 - The HTTP stack is still required for the control message. The cost of keeping the HttpExchangeSource is minimal.
 
 - If we need special migration tools, describe them here.
 
 - No tools required.
 
-- When will we remove the existing behaviour, if applicable.
+- When will we remove the existing behavior, if applicable.
 
-- Existing behaviour will remain as the default option.
+- Existing behavior will remain as the default option.
 
 - How should this feature be taught to new and existing users? Basically mention if documentation changes/new blog are needed?
 
@@ -261,5 +258,5 @@ the additional complexity.
 
 Test plan involves running performance measurements using TPC-DS and TPC-H benchmarks that compare the performance of HTTP versus BinX.
 
-The TPC-DS benchmark test has been conducted using a dataset with scale factor 1000 on an on-prem cluster with 8 nodes. The results
+The TPC-DS benchmark test has been conducted using a dataset with scale factor 1000 on an on-premise cluster with 8 nodes. The results
 for this 1TB dataset have shown that overall runtime for the 99 queries was ~56 minutes when using HTTP compared to ~43 minutes for BinX.
