|
1 | | -# Setup |
| 1 | +# Set up a local K8s cluster for Presto + CLP |
2 | 2 |
|
3 | | -## Dependency Installation |
4 | | - |
5 | | -### Install docker |
| 3 | +## Install docker |
6 | 4 |
|
7 | 5 | Follow the guide here: [docker] |
8 | 6 |
|
9 | | -### Install kubectl |
| 7 | +## Install kubectl |
10 | 8 |
|
11 | 9 | `kubectl` is the command-line tool for interacting with Kubernetes clusters. You will use it to |
12 | 10 | manage and inspect your k3d cluster. |
13 | 11 |
|
14 | 12 | Follow the guide here: [kubectl] |
15 | 13 |
|
16 | | -### Install k3d |
| 14 | +## Install k3d |
17 | 15 |
|
18 | 16 | k3d is a lightweight wrapper to run k3s (Rancher Labs' minimal Kubernetes distribution) in Docker.
19 | 17 |
|
20 | 18 | Follow the guide here: [k3d] |
21 | 19 |
|
22 | | -### Runbook |
23 | | - |
24 | | -Import the images and apply them to the cluster: |
25 | | -```shell |
26 | | -# Create a cluster called presto |
27 | | -# There should be two directories under /path/to/host/configs/: |
28 | | -# - /path/to/host/configs/etc-coordinator: |
29 | | -# - /path/to/host/configs/etc-coordinator/catalog/clp.properties |
30 | | -# - /path/to/host/configs/etc-coordinator/config.properties |
31 | | -# - /path/to/host/configs/etc-coordinator/jvm.config |
32 | | -# - /path/to/host/configs/etc-coordinator/log.properties |
33 | | -# - /path/to/host/configs/etc-coordinator/metadata-filter.json |
34 | | -# - /path/to/host/configs/etc-coordinator/node.properties |
35 | | -# |
36 | | -# - /path/to/host/configs/etc-worker: |
37 | | -# - /path/to/host/configs/etc-worker/catalog/clp.properties |
38 | | -# - /path/to/host/configs/etc-worker/config.properties |
39 | | -# - /path/to/host/configs/etc-worker/node.properties |
40 | | -# - /path/to/host/configs/etc-worker/velox.properties |
41 | | -# |
42 | | -# There will be examples of these config files in the following sections |
43 | | -k3d cluster create presto --servers 1 --agents 1 -v /path/to/host/configs/:/configs |
44 | | -# Load the coordinator image |
45 | | -k3d image import ghcr.io/y-scope/yscope-presto-with-clp-connector-coordinator:latest -c presto |
46 | | -# Load the worker image |
47 | | -k3d image import ghcr.io/y-scope/yscope-presto-with-clp-connector-worker:latest -c presto |
48 | | -# Launch the container |
49 | | -kubectl apply -f coordinator.yaml worker.yaml |
50 | | -``` |
| 20 | +## Install Helm |
51 | 21 |
|
52 | | -To do a sanity check: |
53 | | -```shell |
54 | | -kubectl port-forward svc/presto-coordinator 8080:8080 |
55 | | -# Check is coordinator alive |
56 | | -curl -X GET http://coordinator:8080/v1/info |
57 | | -# Check is worker connected to the coordinator |
58 | | -curl -X GET http://coordinator:8080/v1/nodes |
59 | | -``` |
| 22 | +Helm is the package manager for Kubernetes. |
| 23 | + |
| 24 | +Follow the guide here: [helm] |
| 25 | + |
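Before continuing, it can help to confirm that the four tools installed above are actually on your `PATH`. The following is a minimal sketch, not part of the project; the helper name `missing_tools` is hypothetical.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools installed in the sections above.
missing=$(missing_tools docker kubectl k3d helm)
if [ -n "$missing" ]; then
  # $missing is intentionally unquoted so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools are installed."
fi
```

If any tool is reported missing, revisit the corresponding install guide before creating the cluster.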
| 26 | +# Launch clp-package |
| 27 | +1. Download the clp-package for testing from the release page [clp-json-v0.4.0]. A demo dataset, `mongod-256MB-presto-clp.log.tar.gz`, is also provided.
60 | 28 |
|
61 | | -# Example Configs for Coordinator |
| 29 | +2. Untar it. |
62 | 30 |
|
63 | | -Example of k8s image YAML: |
| 31 | +3. Replace the contents of `etc/clp-config.yml` with the following, substituting `${REPLACE_IP}` with the IP address of the host running the clp-package:
64 | 32 | ```yaml |
65 | | -apiVersion: v1 |
66 | | -kind: Pod |
67 | | -metadata: |
68 | | - labels: |
69 | | - app: coordinator |
70 | | - name: coordinator |
71 | | -spec: |
72 | | - containers: |
73 | | - - name: coordinator |
74 | | - image: ghcr.io/y-scope/yscope-0.293-coordinator:latest |
75 | | - imagePullPolicy: Never |
76 | | - volumeMounts: |
77 | | - - name: "coordinator-config" |
78 | | - mountPath: "/opt/presto-server/etc" |
79 | | - volumes: |
80 | | - - name: "coordinator-config" |
81 | | - hostPath: |
82 | | - path: "/configs/etc-coordinator" |
83 | | ---- |
84 | | -apiVersion: v1 |
85 | | -kind: Service |
86 | | -metadata: |
87 | | - name: coordinator |
88 | | - labels: |
89 | | - app: coordinator |
90 | | -spec: |
91 | | - type: NodePort |
92 | | - ports: |
93 | | - - port: 8080 |
94 | | - nodePort: 30000 |
95 | | - name: "8080" |
96 | | - selector: |
97 | | - app: coordinator |
| 33 | +package: |
| 34 | + storage_engine: "clp-s" |
| 35 | +database: |
| 36 | + type: "mariadb" |
| 37 | + host: "${REPLACE_IP}" |
| 38 | + port: 6001 |
| 39 | + name: "clp-db" |
| 40 | +query_scheduler: |
| 41 | + host: "${REPLACE_IP}" |
| 42 | + port: 6002 |
| 43 | + jobs_poll_delay: 0.1 |
| 44 | + num_archives_to_search_per_sub_job: 16 |
| 45 | + logging_level: "INFO" |
| 46 | +queue: |
| 47 | + host: "${REPLACE_IP}" |
| 48 | + port: 6003 |
| 49 | +redis: |
| 50 | + host: "${REPLACE_IP}" |
| 51 | + port: 6004 |
| 52 | + query_backend_database: 0 |
| 53 | + compression_backend_database: 1 |
| 54 | +reducer: |
| 55 | + host: "${REPLACE_IP}" |
| 56 | + base_port: 6100 |
| 57 | + logging_level: "INFO" |
| 58 | + upsert_interval: 100 |
| 59 | +results_cache: |
| 60 | + host: "${REPLACE_IP}" |
| 61 | + port: 6005 |
| 62 | + db_name: "clp-query-results" |
| 63 | + stream_collection_name: "stream-files" |
| 64 | +webui: |
| 65 | + host: "localhost" |
| 66 | + port: 6000 |
| 67 | + logging_level: "INFO" |
| 68 | +log_viewer_webui: |
| 69 | + host: "localhost" |
| 70 | + port: 6006 |
98 | 71 | ``` |
99 | 72 |
|
100 | | -Example of `/path/to/host/configs/etc-coordinator/catalog/clp.properties` (need to update `clp.metadata-db-*` fields): |
101 | | -``` |
102 | | -connector.name=clp |
103 | | -clp.metadata-provider-type=mysql |
104 | | -clp.metadata-db-url=jdbc:mysql://REPLACE_ME |
105 | | -clp.metadata-db-name=REPLACE_ME |
106 | | -clp.metadata-db-user=REPLACE_ME |
107 | | -clp.metadata-db-password=REPLACE_ME |
108 | | -clp.metadata-table-prefix=clp_ |
109 | | -clp.split-provider-type=mysql |
110 | | -clp.metadata-filter-config=$(pwd)/etc-coordinator/metadata-filter.json |
| 73 | +4. Launch: |
| 74 | +```bash |
| 75 | +# You probably want to run this in a Python 3.11 environment
| 76 | +sbin/start-clp.sh |
111 | 77 | ``` |
112 | 78 |
|
113 | | -Example of `/path/to/host/configs/etc-coordinator/config.properties`: |
114 | | -``` |
115 | | -coordinator=true |
116 | | -node-scheduler.include-coordinator=false |
117 | | -http-server.http.port=8080 |
118 | | -query.max-memory=1GB |
119 | | -query.max-memory-per-node=1GB |
120 | | -discovery-server.enabled=true |
121 | | -discovery.uri=http://localhost:8080 |
122 | | -optimizer.optimize-hash-generation=false |
123 | | -regex-library=RE2J |
124 | | -use-alternative-function-signatures=true |
125 | | -inline-sql-functions=false |
126 | | -nested-data-serialization-enabled=false |
127 | | -native-execution-enabled=true |
| 79 | +5. Compress: |
| 80 | +```bash |
| 81 | +# You can also use your own dataset |
| 82 | +sbin/compress.sh --timestamp-key 't.dollar_sign_date' datasets/mongod-256MB-processed.log |
128 | 83 | ``` |
129 | 84 |
|
130 | | -Example of `/path/to/host/configs/etc-coordinator/jvm.config`: |
131 | | -``` |
132 | | --server |
133 | | --Xmx4G |
134 | | --XX:+UseG1GC |
135 | | --XX:G1HeapRegionSize=32M |
136 | | --XX:+UseGCOverheadLimit |
137 | | --XX:+ExplicitGCInvokesConcurrent |
138 | | --XX:+HeapDumpOnOutOfMemoryError |
139 | | --XX:+ExitOnOutOfMemoryError |
140 | | --Djdk.attach.allowAttachSelf=true |
141 | | -``` |
| 85 | +6. Use a JetBrains IDE to connect to the database. The database is `clp-db`, the user is `clp-user`, and the password is in `etc/credential.yml`. Then set the `archive_storage_directory` field in the `clp_datasets` table to `/var/data/archives/default` and submit the change.
142 | 86 |
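If you prefer the command line to an IDE, the change in step 6 can be sketched as a single SQL statement. This is an illustration under assumptions: the helper `build_update_sql` and the variable `DB_HOST` are hypothetical, the `mysql` client must be installed separately, and the statement has no `WHERE` clause because it assumes `clp_datasets` holds only the default dataset.

```shell
# Build the UPDATE statement described in step 6.
# Assumption: clp_datasets holds only the default dataset, so every row is
# updated; add a WHERE clause if your table has several rows.
build_update_sql() {
  printf "UPDATE clp_datasets SET archive_storage_directory = '%s';\n" "$1"
}

sql=$(build_update_sql /var/data/archives/default)
echo "$sql"

# To apply it, run the following with DB_HOST set to the IP address used in
# etc/clp-config.yml. The client prompts for the password:
# mysql --host="$DB_HOST" --port=6001 --user=clp-user --password clp-db --execute="$sql"
```

Printing the statement first lets you review it before touching the database.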
|
143 | | -Example of `/path/to/host/configs/etc-coordinator/log.properties`: |
144 | | -``` |
145 | | -com.facebook.presto=DEBUG |
| 87 | +# Create k8s Cluster |
| 88 | +Create a local k8s cluster with the clp-package archives directory mounted as a volume:
| 89 | +```bash |
| 90 | +# Replace ~/clp-json-x86_64-v0.4.0/var/data/archives with the correct path
| 91 | +k3d cluster create yscope --servers 1 --agents 1 -v "$(readlink -f ~/clp-json-x86_64-v0.4.0/var/data/archives)":/var/data/archives
146 | 92 | ``` |
147 | 93 |
|
148 | | -Example of `/path/to/host/configs/etc-coordinator/metadata-filter.json`: |
149 | | -``` |
150 | | -{ |
151 | | - "clp.default": [ |
152 | | - { |
153 | | - "filterName": "msg.timestamp", |
154 | | - "rangeMapping": { |
155 | | - "lowerBound": "begin_timestamp", |
156 | | - "upperBound": "end_timestamp" |
157 | | - }, |
158 | | - "required": true |
159 | | - } |
160 | | - ] |
161 | | -} |
162 | | -``` |
| 94 | +# Working with the Helm chart
| 95 | +## Install |
| 96 | +In `yscope-k8s/templates/presto/presto-coordinator-config.yaml`, replace `${REPLACE_IP}` in `clp.metadata-db-url=jdbc:mysql://${REPLACE_IP}:6001` with the IP address of the host running the clp-package (that is, the same IP address you configured in `etc/clp-config.yml`).
163 | 97 |
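The placeholder substitution can also be scripted. The sketch below assumes the placeholder appears literally as `${REPLACE_IP}`; the helper name `replace_ip` and the address `192.0.2.10` are illustrative only.

```shell
# Replace every literal ${REPLACE_IP} in a file with the given IP address.
# Goes through a temporary file instead of `sed -i`, whose syntax differs
# between GNU and BSD sed; writing back with cat keeps the file's permissions.
replace_ip() {
  file=$1
  ip=$2
  tmp=$(mktemp)
  sed 's/[$]{REPLACE_IP}/'"$ip"'/g' "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

config=yscope-k8s/templates/presto/presto-coordinator-config.yaml
host_ip=192.0.2.10  # placeholder; use your host's actual IP address
if [ -f "$config" ]; then
  replace_ip "$config" "$host_ip"
fi
```

The same helper could be pointed at `etc/clp-config.yml` in the clp-package, since that file uses the same placeholder.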
|
164 | | -Example of `/path/to/host/configs/etc-coordinator/node.properties`: |
165 | | -``` |
166 | | -node.environment=production |
167 | | -node.id=coordinator |
168 | | -``` |
| 98 | +```bash |
| 99 | +cd yscope-k8s |
169 | 100 |
|
170 | | -# Example Configs for Worker |
| 101 | +helm template . |
171 | 102 |
|
172 | | -Example of k8s image YAML: |
173 | | -🚧This is still in progress. |
174 | | -```yaml |
175 | | -apiVersion: v1 |
176 | | -kind: Pod |
177 | | -metadata: |
178 | | - labels: |
179 | | - app: worker |
180 | | - name: worker |
181 | | -spec: |
182 | | - containers: |
183 | | - - name: worker |
184 | | - image: ubuntu:22.04 |
185 | | - # imagePullPolicy: Never |
186 | | - command: |
187 | | - - /bin/bash |
188 | | - - -c |
189 | | - args: |
190 | | - - sleep infinity |
| 103 | +helm install demo . |
191 | 104 | ``` |
192 | 105 |
|
193 | | -Example of `/path/to/host/configs/etc-worker/catalog/clp.properties`: |
194 | | -``` |
195 | | -connector.name=clp |
| 106 | +## Use the CLI
| 107 | +Once all pods are in the "Running" state (check with `kubectl get pods`), forward the coordinator port:
| 108 | +```bash |
| 109 | +kubectl port-forward service/presto-coordinator 8080:8080 |
196 | 110 | ``` |
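With the port-forward running, one way to confirm the coordinator is reachable is to poll its `/v1/info` endpoint until it responds. The helper `retry` below is a hypothetical convenience, not something shipped with the chart.

```shell
# Run a command until it succeeds, up to max_attempts times,
# sleeping delay seconds between attempts. Returns 1 if every attempt fails.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage, with the port-forward running in another terminal:
# retry 30 2 curl --silent --fail --output /dev/null http://localhost:8080/v1/info
```

The attempt count and delay are arbitrary; tune them to how long your pods take to start.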
197 | 111 |
|
198 | | -Example of `/path/to/host/configs/etc-worker/config.properties` (need to replace the `presto.version` to make it the same as coordinator`s): |
199 | | -``` |
200 | | -discovery.uri=http://127.0.0.1:8080 |
201 | | -presto.version=REPLACE_ME |
202 | | -http-server.http.port=7777 |
203 | | -shutdown-onset-sec=1 |
204 | | -register-test-functions=false |
205 | | -runtime-metrics-collection-enabled=false |
| 112 | +You can then further forward port 8080 to your local machine to access Presto's web UI, e.g., at http://localhost:8080.
| 113 | + |
| 114 | +To use presto-cli: |
| 115 | +```bash |
| 116 | +./presto-cli-0.293-executable.jar --catalog clp --schema default --server localhost:8080 |
206 | 117 | ``` |
207 | 118 |
|
208 | | -Example of `/path/to/host/configs/etc-worker/node.properties`: |
| 119 | +Example query: |
209 | 120 | ``` |
210 | | -node.environment=production |
211 | | -node.internal-address=127.0.0.1 |
212 | | -node.location=testing-location |
213 | | -node.id=worker-1 |
| 121 | +SELECT * FROM default LIMIT 1; |
214 | 122 | ``` |
215 | 123 |
|
216 | | -Example of `/path/to/host/configs/etc-worker/velox.properties`: |
| 124 | +## Uninstall |
| 125 | +```bash |
| 126 | +helm uninstall demo |
217 | 127 | ``` |
218 | | -mutable-config=true |
| 128 | + |
| 129 | +# Delete k8s Cluster |
| 130 | +```bash |
| 131 | +k3d cluster delete yscope |
219 | 132 | ``` |
220 | 133 |
|
| 134 | + |
| 135 | +[clp-json-v0.4.0]: https://github.com/y-scope/clp/releases/tag/v0.4.0 |
221 | 136 | [docker]: https://docs.docker.com/engine/install |
222 | 137 | [k3d]: https://k3d.io/stable/#installation |
223 | | -[kubectl]: https://kubernetes.io/docs/tasks/tools/#kubectl |
| 138 | +[kubectl]: https://kubernetes.io/docs/tasks/tools/#kubectl |
| 139 | +[helm]: https://helm.sh/docs/intro/install/ |