diff --git a/content/embeds/rs-prometheus-metrics-transition-plan.md b/content/embeds/rs-prometheus-metrics-transition-plan.md index 536646e9c4..691aaab938 100644 --- a/content/embeds/rs-prometheus-metrics-transition-plan.md +++ b/content/embeds/rs-prometheus-metrics-transition-plan.md @@ -2,16 +2,16 @@ | V1 metric | Equivalent V2 PromQL | Description | | --------- | :------------------- | :---------- | -| bdb_avg_latency | `sum by (db) (irate(endpoint_acc_latency[1m])) / sum by (db) (irate(endpoint_total_started_res[1m])) / 1000000` | Average latency of operations on the database (seconds); returned only when there is traffic | -| bdb_avg_latency_max | `sum by (db) (irate(endpoint_acc_latency[1m])) / sum by (db) (irate(endpoint_total_started_res[1m])) / 1000000` | Highest value of average latency of operations on the database (seconds); returned only when there is traffic | -| bdb_avg_read_latency | `sum by (db) (irate(endpoint_acc_read_latency[1m])) / sum by (db) (irate(endpoint_total_started_res[1m])) / 1000000` | Average latency of read operations (seconds); returned only when there is traffic | -| bdb_avg_read_latency_max | `sum by (db) (irate(endpoint_acc_read_latency[1m])) / sum by (db) (irate(endpoint_total_started_res[1m])) / 1000000` | Highest value of average latency of read operations (seconds); returned only when there is traffic | -| bdb_avg_write_latency | `sum by (db) (irate(endpoint_acc_write_latency[1m])) / sum by (db) (irate(endpoint_total_started_res[1m])) / 1000000` | Average latency of write operations (seconds); returned only when there is traffic | -| bdb_avg_write_latency_max | `sum by (db) (irate(endpoint_acc_write_latency[1m])) / sum by (db) (irate(endpoint_total_started_res[1m])) / 1000000` | Highest value of average latency of write operations (seconds); returned only when there is traffic | +| bdb_avg_latency | `(sum by (db)(irate(endpoint_read_requests_latency_histogram_sum{db="$db"}[1m]) + 
irate(endpoint_write_requests_latency_histogram_sum{db="$db"}[1m]) + irate(endpoint_other_requests_latency_histogram_sum{db="$db"}[1m])))/(sum by (db)(irate(endpoint_read_requests{db="$db"}[1m]) + irate(endpoint_write_requests{db="$db"}[1m]) + irate(endpoint_other_requests{db="$db"}[1m])))/1000000` | Average latency of operations on the database (seconds); returned only when there is traffic | +| bdb_avg_latency_max | `avg(histogram_quantile(1, sum by (le, db) (irate(endpoint_read_requests_latency_histogram_bucket{db="$db"}[1m]) + irate(endpoint_write_requests_latency_histogram_bucket{db="$db"}[1m]) + irate(endpoint_other_requests_latency_histogram_bucket{db="$db"}[1m])))) / 1000000` | Highest value of average latency of operations on the database (seconds); returned only when there is traffic | +| bdb_avg_read_latency | `(sum(irate(endpoint_read_requests_latency_histogram_sum{db="$db"}[1m]))/sum(irate(endpoint_read_requests{db="$db"}[1m])))/1000000` | Average latency of read operations (seconds); returned only when there is traffic | +| bdb_avg_read_latency_max | `histogram_quantile(1, sum by (le) (irate(endpoint_read_requests_latency_histogram_bucket{db="$db"}[1m]))) / 1000000` | Highest value of average latency of read operations (seconds); returned only when there is traffic | +| bdb_avg_write_latency | `(sum(irate(endpoint_write_requests_latency_histogram_sum{db="$db"}[1m]))/sum(irate(endpoint_write_requests{db="$db"}[1m])))/1000000` | Average latency of write operations (seconds); returned only when there is traffic | +| bdb_avg_write_latency_max | `histogram_quantile(1, sum by (le) (irate(endpoint_write_requests_latency_histogram_bucket{db="$db"}[1m]))) / 1000000` | Highest value of average latency of write operations (seconds); returned only when there is traffic | | bdb_bigstore_shard_count | `sum((sum(label_replace(label_replace(namedprocess_namegroup_thread_count{groupname=~"redis-\d+", threadname=~"(speedb\|rocksdb).*"}, 
"redis", "$1", "groupname", "redis-(\d+)"), "driver", "$1", "threadname", "(speedb\|rocksdb).*")) by (redis, driver) > bool 0) * on (redis) group_left(db) redis_server_up) by (db, driver)` | Shard count by database and by storage engine (driver - rocksdb / speedb); Only for databases with Auto Tiering enabled | -| bdb_conns | `sum by(db) (endpoint_client_connections)` | Number of client connections to database | -| bdb_egress_bytes | `sum by(db) (irate(endpoint_egress_bytes[1m]))` | Rate of outgoing network traffic from the database (bytes/sec) | -| bdb_egress_bytes_max | `sum by(db) (irate(endpoint_egress_bytes[1m]))` | Highest value of the rate of outgoing network traffic from the database (bytes/sec) | +| bdb_conns | `sum by (db) (endpoint_client_connections{cluster="$cluster", db="$db"} - endpoint_client_disconnections{cluster="$cluster", db="$db"})` | Number of client connections to database | +| bdb_egress_bytes | `sum by(db) (irate(endpoint_egress{db="$db"}[1m]))` | Rate of outgoing network traffic from the database (bytes/sec) | +| bdb_egress_bytes_max | `max_over_time (sum by(db) (irate(endpoint_egress{db="$db"}[1m]))[$__range:])` | Highest value of the rate of outgoing network traffic from the database (bytes/sec) | | bdb_evicted_objects | `sum by (db) (irate(redis_server_evicted_keys{role="master"}[1m]))` | Rate of key evictions from database (evictions/sec) | | bdb_evicted_objects_max | `sum by (db) (irate(redis_server_evicted_keys{role="master"}[1m]))` | Highest value of the rate of key evictions from database (evictions/sec) | | bdb_expired_objects | `sum by (db) (irate(redis_server_expired_keys{role="master"}[1m]))` | Rate keys expired in database (expirations/sec) | @@ -20,8 +20,8 @@ | bdb_fork_cpu_system_max | `sum by (db) (irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="system"}[1m]))` | Highest value of % cores utilization in system mode for all Redis shard fork child processes of this database | | bdb_fork_cpu_user | `sum by 
(db) (irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="user"}[1m]))` | % cores utilization in user mode for all Redis shard fork child processes of this database | | bdb_fork_cpu_user_max | `sum by (db) (irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="user"}[1m]))` | Highest value of % cores utilization in user mode for all Redis shard fork child processes of this database | -| bdb_ingress_bytes | `sum by(db) (irate(endpoint_ingress_bytes[1m]))` | Rate of incoming network traffic to database (bytes/sec) | -| bdb_ingress_bytes_max | `sum by(db) (irate(endpoint_ingress_bytes[1m]))` | Highest value of the rate of incoming network traffic to database (bytes/sec) | +| bdb_ingress_bytes | `sum by(db) (irate(endpoint_ingress{db="$db"}[1m]))` | Rate of incoming network traffic to database (bytes/sec) | +| bdb_ingress_bytes_max | `max_over_time (sum by(db) (irate(endpoint_ingress{db="$db"}[1m]))[$__range:])` | Highest value of the rate of incoming network traffic to database (bytes/sec) | | bdb_instantaneous_ops_per_sec | `sum by(db) (redis_server_instantaneous_ops_per_sec)` | Request rate handled by all shards of database (ops/sec) | | bdb_main_thread_cpu_system | `sum by(db) (irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="system", threadname=~"redis-server.*"}[1m]))` | % cores utilization in system mode for all Redis shard main threads of this database | | bdb_main_thread_cpu_system_max | `sum by(db) (irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="system", threadname=~"redis-server.*"}[1m]))` | Highest value of % cores utilization in system mode for all Redis shard main threads of this database | @@ -32,10 +32,10 @@ | bdb_memory_limit | `sum by(db) (redis_server_maxmemory)` | Configured RAM limit for the database | | bdb_monitor_sessions_count | `sum by(db) (endpoint_monitor_sessions_count)` | Number of clients connected in monitor mode to the database | | bdb_no_of_keys | `sum by (db) (redis_server_db_keys{role="master"})` 
| Number of keys in database | -| bdb_other_req | `sum by(db) (irate(endpoint_other_req[1m]))` | Rate of other (non read/write) requests on the database (ops/sec) | -| bdb_other_req_max | `sum by(db) (irate(endpoint_other_req[1m]))` | Highest value of the rate of other (non read/write) requests on the database (ops/sec) | -| bdb_other_res | `sum by(db) (irate(endpoint_other_res[1m]))` | Rate of other (non read/write) responses on the database (ops/sec) | -| bdb_other_res_max | `sum by(db) (irate(endpoint_other_res[1m]))` | Highest value of the rate of other (non read/write) responses on the database (ops/sec) | +| bdb_other_req | `sum by(db) (irate(endpoint_other_requests{db="$db"}[1m]))` | Rate of other (non read/write) requests on the database (ops/sec) | +| bdb_other_req_max | `max_over_time (sum by(db) (irate(endpoint_other_requests{db="$db"}[1m]))[$__range:])` | Highest value of the rate of other (non read/write) requests on the database (ops/sec) | +| bdb_other_res | `sum by(db) (irate(endpoint_other_responses{db="$db"}[1m]))` | Rate of other (non read/write) responses on the database (ops/sec) | +| bdb_other_res_max | `max_over_time (sum by(db) (irate(endpoint_other_responses{db="$db"}[1m]))[$__range:])` | Highest value of the rate of other (non read/write) responses on the database (ops/sec) | | bdb_pubsub_channels | `sum by(db) (redis_server_pubsub_channels)` | Count the pub/sub channels with subscribed clients | | bdb_pubsub_channels_max | `sum by(db) (redis_server_pubsub_channels)` | Highest value of count the pub/sub channels with subscribed clients | | bdb_pubsub_patterns | `sum by(db) (redis_server_pubsub_patterns)` | Count the pub/sub patterns with subscribed clients | @@ -44,31 +44,31 @@ | bdb_read_hits_max | `sum by (db) (irate(redis_server_keyspace_read_hits{role="master"}[1m]))` | Highest value of the rate of read operations accessing an existing key (ops/sec) | | bdb_read_misses | `sum by (db) 
(irate(redis_server_keyspace_read_misses{role="master"}[1m]))` | Rate of read operations accessing a non-existing key (ops/sec) | | bdb_read_misses_max | `sum by (db) (irate(redis_server_keyspace_read_misses{role="master"}[1m]))` | Highest value of the rate of read operations accessing a non-existing key (ops/sec) | -| bdb_read_req | `sum by (db) (irate(endpoint_read_req[1m]))` | Rate of read requests on the database (ops/sec) | -| bdb_read_req_max | `sum by (db) (irate(endpoint_read_req[1m]))` | Highest value of the rate of read requests on the database (ops/sec) | -| bdb_read_res | `sum by(db) (irate(endpoint_read_res[1m]))` | Rate of read responses on the database (ops/sec) | -| bdb_read_res_max | `sum by(db) (irate(endpoint_read_res[1m]))` | Highest value of the rate of read responses on the database (ops/sec) | +| bdb_read_req | `sum by (db) (irate(endpoint_read_requests{db="$db"}[1m]))` | Rate of read requests on the database (ops/sec) | +| bdb_read_req_max | `max_over_time (sum by(db) (irate(endpoint_read_requests{db="$db"}[1m]))[$__range:])` | Highest value of the rate of read requests on the database (ops/sec) | +| bdb_read_res | `sum by (db) (irate(endpoint_read_responses{db="$db"}[1m]))` | Rate of read responses on the database (ops/sec) | +| bdb_read_res_max | `max_over_time (sum by(db) (irate(endpoint_read_responses{db="$db"}[1m]))[$__range:])` | Highest value of the rate of read responses on the database (ops/sec) | | bdb_shard_cpu_system | `sum by(db) (irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="system", role="master"}[1m]))` | % cores utilization in system mode for all Redis shard processes of this database | | bdb_shard_cpu_system_max | `sum by(db) (irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="system", role="master"}[1m]))` | Highest value of % cores utilization in system mode for all Redis shard processes of this database | | bdb_shard_cpu_user | `sum by(db) 
(irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="user", role="master"}[1m]))` | % cores utilization in user mode for the Redis shard process | | bdb_shard_cpu_user_max | `sum by(db) (irate(namedprocess_namegroup_thread_cpu_seconds_total{mode="user", role="master"}[1m]))` | Highest value of % cores utilization in user mode for the Redis shard process | | bdb_shards_used | `sum((sum(label_replace(label_replace(label_replace(namedprocess_namegroup_thread_count{groupname=~"redis-\d+"}, "redis", "$1", "groupname", "redis-(\d+)"), "shard_type", "flash", "threadname", "(bigstore).*"), "shard_type", "ram", "shard_type", "")) by (redis, shard_type) > bool 0) * on (redis) group_left(db) redis_server_up) by (db, shard_type)` | Used shard count by database and by shard type (ram / flash) | -| bdb_total_connections_received | `sum by(db) (irate(endpoint_total_connections_received[1m]))` | Rate of new client connections to database (connections/sec) | -| bdb_total_connections_received_max | `sum by(db) (irate(endpoint_total_connections_received[1m]))` | Highest value of the rate of new client connections to database (connections/sec) | -| bdb_total_req | `sum by (db) (irate(endpoint_total_req[1m]))` | Rate of all requests on the database (ops/sec) | -| bdb_total_req_max | `sum by (db) (irate(endpoint_total_req[1m]))` | Highest value of the rate of all requests on the database (ops/sec) | -| bdb_total_res | `sum by(db) (irate(endpoint_total_res[1m]))` | Rate of all responses on the database (ops/sec) | -| bdb_total_res_max | `sum by(db) (irate(endpoint_total_res[1m]))` | Highest value of the rate of all responses on the database (ops/sec) | +| bdb_total_connections_received | `sum by(db) (irate(endpoint_client_connections{db="$db"}[1m]))` | Rate of new client connections to database (connections/sec) | +| bdb_total_connections_received_max | `max_over_time (sum by(db) (irate(endpoint_client_connections{db="$db"}[1m]))[$__range:])` | Highest value of the rate of new 
client connections to database (connections/sec) | +| bdb_total_req | `sum by (db) (irate(endpoint_read_requests{db="$db"}[1m]) + irate(endpoint_write_requests{db="$db"}[1m]) + irate(endpoint_other_requests{db="$db"}[1m]))` | Rate of all requests on the database (ops/sec) | +| bdb_total_req_max | `max_over_time(sum by (db) (irate(endpoint_read_requests{db="$db"}[1m]) + irate(endpoint_write_requests{db="$db"}[1m]) + irate(endpoint_other_requests{db="$db"}[1m])) [$__range:])` | Highest value of the rate of all requests on the database (ops/sec) | +| bdb_total_res | `sum by (db) (irate(endpoint_read_responses{db="$db"}[1m]) + irate(endpoint_write_responses{db="$db"}[1m]) + irate(endpoint_other_responses{db="$db"}[1m]))` | Rate of all responses on the database (ops/sec) | +| bdb_total_res_max | `max_over_time(sum by (db) (irate(endpoint_read_responses{db="$db"}[1m]) + irate(endpoint_write_responses{db="$db"}[1m]) + irate(endpoint_other_responses{db="$db"}[1m])) [$__range:])` | Highest value of the rate of all responses on the database (ops/sec) | | bdb_up | `min by(db) (redis_up)` | Database is up and running | | bdb_used_memory | `sum by (db) (redis_server_used_memory)` | Memory used by database (in BigRedis this includes flash) (bytes) | | bdb_write_hits | `sum by (db) (irate(redis_server_keyspace_write_hits{role="master"}[1m]))` | Rate of write operations accessing an existing key (ops/sec) | | bdb_write_hits_max | `sum by (db) (irate(redis_server_keyspace_write_hits{role="master"}[1m]))` | Highest value of the rate of write operations accessing an existing key (ops/sec) | | bdb_write_misses | `sum by (db) (irate(redis_server_keyspace_write_misses{role="master"}[1m]))` | Rate of write operations accessing a non-existing key (ops/sec) | | bdb_write_misses_max | `sum by (db) (irate(redis_server_keyspace_write_misses{role="master"}[1m]))` | Highest value of the rate of write operations accessing a non-existing key (ops/sec) | -| bdb_write_req | 
`sum by (db) (irate(endpoint_write_requests[1m]))` | Rate of write requests on the database (ops/sec) | -| bdb_write_req_max | `sum by (db) (irate(endpoint_write_requests[1m]))` | Highest value of the rate of write requests on the database (ops/sec) | -| bdb_write_res | `sum by(db) (irate(endpoint_write_responses[1m]))` | Rate of write responses on the database (ops/sec) | -| bdb_write_res_max | `sum by(db) (irate(endpoint_write_responses[1m]))` | Highest value of the rate of write responses on the database (ops/sec) | +| bdb_write_req | `sum by (db) (irate(endpoint_write_requests{db="$db"}[1m]))` | Rate of write requests on the database (ops/sec) | +| bdb_write_req_max | `max_over_time(sum by (db) (irate(endpoint_write_requests{db="$db"}[1m]))[$__range:])` | Highest value of the rate of write requests on the database (ops/sec) | +| bdb_write_res | `sum by(db) (irate(endpoint_write_responses{db="$db"}[1m]))` | Rate of write responses on the database (ops/sec) | +| bdb_write_res_max | `max_over_time(sum by (db) (irate(endpoint_write_responses{db="$db"}[1m]))[$__range:])` | Highest value of the rate of write responses on the database (ops/sec) | | no_of_expires | `sum by(db) (redis_server_db_expires{role="master"})` | Current number of volatile keys in the database | ## Node metrics