Commit d706ee3

Add markdown
1 parent 2dd1db1 commit d706ee3

1 file changed: +44 -0 lines changed

examples/localhost.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# Localhost workers example

This example executes a SQL query in a distributed context.

For this example to work, first spawn some localhost workers using the `localhost_worker.rs` example.

### Spawning the workers

In two different terminals, spawn two ArrowFlightEndpoints:

```shell
cargo run --example localhost_worker -- 8080 --cluster-ports 8080,8081
```

```shell
cargo run --example localhost_worker -- 8081 --cluster-ports 8080,8081
```

- The positional numeric argument is the port on which each Arrow Flight endpoint listens
- The `--cluster-ports` parameter tells the Arrow Flight endpoint the ports of all the localhost workers available in the cluster

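The two worker commands above can also be started from a single terminal. The following is a minimal sketch, assuming a POSIX shell; the `spawn_workers` function and the `CARGO` override variable are hypothetical helpers introduced here for illustration and are not part of the example code.

```shell
# Hypothetical helper: launch one localhost worker per port in the background.
# $1 is the comma-separated port list, passed unchanged as --cluster-ports.
# CARGO can be overridden (for example CARGO=echo) to preview the commands.
spawn_workers() {
  cluster_ports="$1"
  for port in $(printf '%s' "$cluster_ports" | tr ',' ' '); do
    "${CARGO:-cargo}" run --example localhost_worker -- "$port" --cluster-ports "$cluster_ports" &
  done
  # Block until every background worker has exited (for example after Ctrl-C).
  wait
}
```

Calling `spawn_workers 8080,8081` should start the same two workers as the commands above, with their output interleaved in one terminal.
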
### Issuing a distributed SQL query

With the workers running, DataFusion queries can be issued using them as part of the cluster:

```shell
cargo run --example localhost_run -- 'SELECT count(*), "MinTemp" FROM weather GROUP BY "MinTemp"' --cluster-ports 8080,8081
```

The head stage is executed locally, in the same process as the `cargo run` command, while further stages are
delegated to the workers running on ports 8080 and 8081.

Additionally, the `--explain` flag can be passed to render the distributed plan:

```shell
cargo run --example localhost_run -- 'SELECT count(*), "MinTemp" FROM weather GROUP BY "MinTemp"' --cluster-ports 8080,8081 --explain
```

### Available tables

Two tables are available in this example:

- `flights_1m`: Flight data with 1 million rows
- `weather`: Small dataset of weather data
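
As a quick check that both tables are reachable through the cluster, a row count can be issued against each one. This is a minimal sketch, assuming the workers from the earlier section are listening on ports 8080 and 8081; the `count_rows` function and the `CARGO` override variable are hypothetical helpers, and the query uses only `count(*)` so it makes no assumption about column names.

```shell
# Hypothetical helper: count the rows of each available table through the cluster.
# CARGO can be overridden (for example CARGO=echo) to preview the commands.
count_rows() {
  for table in flights_1m weather; do
    "${CARGO:-cargo}" run --example localhost_run -- "SELECT count(*) FROM $table" --cluster-ports 8080,8081
  done
}
```

Running `count_rows` issues one `localhost_run` invocation per table, in the same form as the query commands shown earlier.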
