Commit 1c668ef

fix: metric config (#59)
1 parent 5fbc6c7 commit 1c668ef

2 files changed: +16 -13 lines changed

metric_monitor/REMOTE_WRITE_WITH_THANOS.md

Lines changed: 8 additions & 4 deletions
@@ -45,7 +45,11 @@ As shown in the new architecture, the monitoring system consists of the followin
 ### Step 1: Set up TRON and Prometheus services
 Run the below command to start a java-tron FullNode, node exporter and Prometheus services:
 ```sh
-docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d # Start all
+
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d tron-node # Start tron-node only
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d node-exporter # Start node-exporter only
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d prometheus # Start prometheus only
 ```

 You can verify the Prometheus service status and monitor targets by accessing `http://[host_IP]:9090/` in your browser. Alternatively, use `docker logs -f prometheus` to view the Prometheus service logs.
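
The browser check described in this hunk can also be scripted against the Prometheus HTTP API. The sketch below assumes `curl` is installed and that Prometheus listens on port 9090 as stated above; `PROM_HOST`, `PROM_PORT`, `prom_url`, and `check_prometheus` are hypothetical names introduced only for illustration.

```sh
#!/bin/sh
# Hypothetical settings: PROM_HOST is an assumption, set it to your host IP.
PROM_HOST="${PROM_HOST:-localhost}"
PROM_PORT=9090

# Build a full URL for a given Prometheus HTTP path.
prom_url() {
  printf 'http://%s:%s%s\n' "$PROM_HOST" "$PROM_PORT" "$1"
}

# Query the health endpoint, then list the active scrape targets.
# Returns non-zero if Prometheus is unreachable or answers with an HTTP error.
check_prometheus() {
  curl -fsS "$(prom_url /-/healthy)" &&
    curl -fsS "$(prom_url '/api/v1/targets?state=active')"
}
```

Calling `check_prometheus` once the containers are up should print a JSON document in which each active target carries a `health` field; a target that is being scraped successfully reports `up`.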
@@ -108,7 +112,7 @@ remote_write:
 <img src="../images/metric_push_external_label.png" alt="Alt Text" width="680" >

 - For `scrape_configs`:
-- The `scrape_interval` defines the frequency at which Prometheus collects metrics. While configured for 1-second intervals to enable real-time monitoring, this setting can be customized according to your specific monitoring needs. Keep in mind that decreasing the interval will increase the service load, as metrics are collected each time the HTTP request triggered.
+- The `scrape_interval` defines the frequency at which Prometheus collects metrics. While configured for 3-second intervals to enable real-time monitoring, this setting can be customized according to your specific monitoring needs. Keep in mind that decreasing the interval will increase the service load, as metrics are collected each time the HTTP request triggered.
 - The `targets` field specifies the java-tron services or other monitoring targets via their IP addresses and ports. Prometheus actively scrapes metrics from these defined endpoints.
 - The `labels` section contains key-value pairs that uniquely identify each target within Prometheus. These labels enable powerful filtering capabilities in Grafana dashboards - for example, you can filter metrics using expressions like `{group="group-tron"}`.

@@ -119,7 +123,7 @@ remote_write:
 ##### 2. Storage configurations
 - The volumes command `../prometheus_data:/prometheus` mounts a local directory used by Prometheus to store metrics data.
 - Even when using Prometheus with remote-write, metrics data is still temporarily stored locally.
-- The `--storage.tsdb.retention.time=7d` flag defines how long metrics data is retained. In this case, Prometheus automatically purges data older than 7 days. For a java-tron(v4.7.6+) FullNode, each metric request returns approximately 9KB of raw data. With a `scrape_interval` of 1 second and TSDB compression, **a single java-tron FullNode service requires about 2GB of Prometheus storage with 7 days of retention**.
+- The `--storage.tsdb.retention.time=7d` flag defines how long metrics data is retained. In this case, Prometheus automatically purges data older than 7 days. For a java-tron(v4.7.6+) FullNode, each metric request returns approximately 9KB of raw data. With a `scrape_interval` of 3 second and TSDB compression, **a single java-tron FullNode service requires about 700MB of Prometheus storage with 7 days of retention**.
 - The `--storage.tsdb.max-block-duration=30m` flag defines the maximum duration for generating TSDB blocks locally. With this setting, Prometheus will create new TSDB blocks at intervals no longer than 30 minutes, ensuring regular data persistence and efficient storage management.
 - Other storage flags can be found in the [official documentation](https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects). For a quick start, you could use the default values.
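
The storage figures in this hunk can be sanity-checked with rough arithmetic. The sketch below assumes the roughly 9KB-per-scrape payload quoted above and ignores index and label overhead; the function names are hypothetical. It counts the scrapes in the retention window and the raw volume before TSDB compression.

```sh
#!/bin/sh
# Number of scrapes in a retention window.
# $1 = retention in days, $2 = scrape interval in seconds
scrapes_in_window() {
  echo $(( $1 * 86400 / $2 ))
}

# Raw (uncompressed) payload volume in MB, using integer division.
# $1 = number of scrapes, $2 = payload size per scrape in KB
raw_volume_mb() {
  echo $(( $1 * $2 / 1024 ))
}

scrapes=$(scrapes_in_window 7 3)
echo "scrapes over 7 days at 3s: $scrapes"            # 201600
echo "raw volume: $(raw_volume_mb "$scrapes" 9) MB"   # 1771 MB
```

At a 3-second interval this gives about 1.7GB of raw payload against the roughly 700MB quoted. At the previous 1-second interval the same calculation gives about 5.3GB against the 2GB quoted before this change. Both pairs imply a similar reduction of roughly 2.5 to 1, which is consistent with storage scaling linearly with scrape frequency.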

@@ -182,7 +186,7 @@ Core configuration for Thanos Receive in [thanos-receive.yml](./docker-compose/t
 ##### 1. Storage configuration
 - Local Storage:
 `../receive-data:/receive/data` maps the host directory for metric TSDB storage.
-- Retention Policy: The `--tsdb.retention=30d` flag automatically purges data older than 30 days. Based on testing with a java-tron(v4.7.6+) FullNode using a 1-second metric scrape interval, storage consumption averages approximately **8GB of disk space per month**.
+- Retention Policy: The `--tsdb.retention=30d` flag automatically purges data older than 30 days. Based on testing with a java-tron(v4.7.6+) FullNode using a 3-second metric scrape interval, storage consumption averages approximately **3GB of disk space per month**.

 - External Storage:
 `../conf:/receive` mounts configuration files. The `--objstore.config-file` flag enables long-term storage in MinIO/S3-compatible buckets. In this case, it is [bucket_storage_bucket.yml](conf/bucket_storage_bucket.yml).

metric_monitor/conf/prometheus-remote-write.yml

Lines changed: 8 additions & 9 deletions
@@ -9,8 +9,8 @@ global:
 scrape_configs:
   - job_name: java-tron
     honor_timestamps: true
-    scrape_interval: 1s
-    scrape_timeout: 1s
+    scrape_interval: 3s
+    scrape_timeout: 3s
     metrics_path: /metrics
     scheme: http
     follow_redirects: true
@@ -31,18 +31,17 @@ scrape_configs:
 remote_write:
   - url: http://thanos-receive-0:10908/api/v1/receive # if Thanos Receive service run on the same host with Prometheus
     headers:
-      X-Auth-Token: "token"
       X-Service-Group: "tron-fullnode-group1"
-    remote_timeout: 10s
+    remote_timeout: 15s
     queue_config:
-      capacity: 25000
+      capacity: 50000
       max_shards: 200 # the maximum number of shards, or parallelism, Prometheus will use for each remote-write queue
       min_shards: 1
-      max_samples_per_send: 5000
-      batch_send_deadline: 1s
+      max_samples_per_send: 10000
+      batch_send_deadline: 3s
       min_backoff: 200ms
       max_backoff: 5s
     metadata_config:
       send: true
-      send_interval: 1s # How frequently metric metadata is sent to remote storage.
-      max_samples_per_send: 5000
+      send_interval: 3s # How frequently metric metadata is sent to remote storage.
+      max_samples_per_send: 50000
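
Two relationships in this config are worth re-checking whenever intervals or queue sizes change. Prometheus refuses to load a config whose `scrape_timeout` exceeds its `scrape_interval`, and, as far as I recall from the Prometheus remote-write tuning guide, `capacity` is suggested to be roughly 3 to 10 times `max_samples_per_send`. The sketch below checks both with the values from this diff; the function names are hypothetical.

```sh
#!/bin/sh
# Succeeds when the scrape timeout does not exceed the scrape interval.
# $1 = scrape_timeout in seconds, $2 = scrape_interval in seconds
timeout_fits_interval() {
  [ "$1" -le "$2" ]
}

# Number of full batches a single shard can buffer.
# $1 = capacity, $2 = max_samples_per_send
batches_per_shard() {
  echo $(( $1 / $2 ))
}

timeout_fits_interval 3 3 && echo "scrape_timeout OK"
echo "batches buffered per shard: $(batches_per_shard 50000 10000)"   # 5
```

The `queue_config` ratio is 5 both before (25000 / 5000) and after (50000 / 10000) this commit, so the change doubles the buffer and batch sizes while keeping their proportion the same.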
