Add documentation for deploying in Cluster mode using K8s by aucahuasi · Pull Request #54 · graphistry/graphistry-cli

aucahuasi · 2025-03-19T02:52:22Z

Improve the cluster documentation by adding a new page for k8s.
Improve the organization of the cluster documentation by adding a folder for common topics.
Improve the telemetry documentation by adding an overview.

DataBoyTX

Just a couple of comments to consider, but looks great overall Percy!

DataBoyTX · 2025-03-19T23:17:46Z

docs/install/cluster/index.rst

+**Note**: *This deployment configuration is currently **experimental** and subject to future updates.*
+
+
+In this installation, both the **Leader** and **Follower** nodes can ingest datasets and files, with all nodes accessing the same **PostgreSQL** instance on the **Leader** node. As a result, **Follower** nodes can also perform data uploads, ensuring that both **Leader** and **Follower** nodes have equal access to dataset ingestion and visualization.


"ensuring that both Leader and Follower nodes have equal access to dataset ingestion and visualization" --> "allowing both Leader and Follower nodes to ingest datasets and visualize data"

Thanks Thomas, done!

DataBoyTX · 2025-03-19T23:36:02Z

docs/telemetry/kubernetes.md

+7. **`telemetryStack.grafana.GF_SERVER_ROOT_URL`** and **`telemetryStack.grafana.GF_SERVER_SERVE_FROM_SUB_PATH`**: These settings are used to configure Grafana, especially when it's deployed behind a reverse proxy or using an ingress controller.
+  - **`telemetryStack.grafana.GF_SERVER_ROOT_URL`** defines the root URL for accessing Grafana (e.g., `/grafana`).
+  - **`telemetryStack.grafana.GF_SERVER_SERVE_FROM_SUB_PATH`** should be set to `true` if Grafana is accessed from a sub-path (e.g., `/grafana`) behind a reverse proxy or ingress.
+8. **`telemetryStack.dcgmExporter.DCGM_EXPORTER_CLOCK_EVENTS_COUNT_WINDOW_SIZE`**: This environment variable is used when `OTEL_CLOUD_MODE` is set to `true`, and the `dcgm-exporter` is deployed to export GPU metrics to Prometheus. It controls the frequency of GPU sampling to gather metrics. The value `1000` represents the window size for counting clock events on the GPU.
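For reference, the two Grafana settings in item 7 could be sketched as a values fragment like the one below. This is a minimal sketch, not taken from the chart itself: the nested layout is an assumption inferred from the dotted key names, and `/grafana` is simply the example value quoted in the text.

```yaml
# Hypothetical values.yaml excerpt; the nesting is assumed from the dotted key names.
telemetryStack:
  grafana:
    # Root URL for accessing Grafana ("/grafana" is the example given in the docs).
    GF_SERVER_ROOT_URL: "/grafana"
    # Set to true when Grafana is reached from a sub-path behind a reverse proxy or ingress.
    GF_SERVER_SERVE_FROM_SUB_PATH: true
```

If Grafana is served from the domain root instead of a sub-path, the second setting would presumably stay `false`.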


telemetryStack.dcgmExporter.DCGM_EXPORTER_CLOCK_EVENTS_COUNT_WINDOW_SIZE:
This environment variable controls the GPU metric sampling resolution for dcgm-exporter, which exports GPU telemetry to Prometheus. It defines the window size (in milliseconds) for counting clock events on the GPU.

A smaller value (e.g., 500) results in higher-resolution telemetry with more frequent GPU metric updates.

A larger value (e.g., 2000) reduces the data rate but lowers monitoring overhead.
This setting applies regardless of OTEL_CLOUD_MODE and affects both local and cloud-based telemetry setups.
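As an illustration of the setting described above, a values fragment might look like the following. It is only a sketch: the nested layout is an assumption inferred from the dotted key name, and `1000` is the value mentioned in the text under review.

```yaml
# Hypothetical values.yaml excerpt; the nesting is assumed from the dotted key name.
telemetryStack:
  dcgmExporter:
    # Window size in milliseconds for counting GPU clock events.
    # Per the comment above: a smaller value (e.g., 500) gives higher-resolution telemetry,
    # a larger value (e.g., 2000) reduces the data rate and monitoring overhead.
    DCGM_EXPORTER_CLOCK_EVENTS_COUNT_WINDOW_SIZE: 1000
```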

Thanks Thomas, done!

document cluster deployment mode for k8s and improve telemetry docs

0d9f2f1

aucahuasi requested a review from DataBoyTX March 19, 2025 02:52

aucahuasi self-assigned this Mar 19, 2025

DataBoyTX approved these changes Mar 19, 2025

aucahuasi added 2 commits March 21, 2025 11:39

improve wording

55f50be

remove tmp file

3d11e59

aucahuasi merged commit 27c2068 into master Mar 21, 2025
5 checks passed

aucahuasi merged 3 commits into master from dev/cluster-mode-k8s
