RS: v2 Prometheus metrics #540
No idea about the PromQL, but language looks OK apart from some minor points you might want to consider. Also, one minor formatting glitch.
| <span class="break-all">bdb_other_req_max</span> | <span class="break-all">`sum by(bdb) (irate(endpoint_other_req[1m]))`</span> | Highest value of the rate of other (non read/write) requests on the database (ops/sec) |
| <span class="break-all">bdb_other_res</span> | <span class="break-all">`sum by(bdb) (irate(endpoint_other_res[1m]))`</span> | Rate of other (non read/write) responses on the database (ops/sec) |
| <span class="break-all">bdb_other_res_max</span> | <span class="break-all">`sum by(bdb) (irate(endpoint_other_res[1m]))`</span> | Highest value of the rate of other (non read/write) responses on the database (ops/sec) |
| <span class="break-all">bdb_pubsub_channels</span> | <span class="break-all">`sum by(bdb) (redis_server_pubsub_channels)`</span> | Count the pub/sub channels with subscribed clients |
"Count" reads like an imperative here, maybe?
| <span class="break-all">bdb_pubsub_channels</span> | <span class="break-all">`sum by(bdb) (redis_server_pubsub_channels)`</span> | Count the pub/sub channels with subscribed clients |
| <span class="break-all">bdb_pubsub_channels</span> | <span class="break-all">`sum by(bdb) (redis_server_pubsub_channels)`</span> | Count of the pub/sub channels with subscribed clients |
| <span class="break-all">bdb_other_res</span> | <span class="break-all">`sum by(bdb) (irate(endpoint_other_res[1m]))`</span> | Rate of other (non read/write) responses on the database (ops/sec) |
| <span class="break-all">bdb_other_res_max</span> | <span class="break-all">`sum by(bdb) (irate(endpoint_other_res[1m]))`</span> | Highest value of the rate of other (non read/write) responses on the database (ops/sec) |
| <span class="break-all">bdb_pubsub_channels</span> | <span class="break-all">`sum by(bdb) (redis_server_pubsub_channels)`</span> | Count the pub/sub channels with subscribed clients |
| <span class="break-all">bdb_pubsub_channels_max</span> | <span class="break-all">`sum by(bdb) (redis_server_pubsub_channels)`</span> | Highest value of count the pub/sub channels with subscribed clients |
Similar to the one above. There are a few more of these in the file.
| <span class="break-all">bdb_pubsub_channels_max</span> | <span class="break-all">`sum by(bdb) (redis_server_pubsub_channels)`</span> | Highest value of count the pub/sub channels with subscribed clients |
| <span class="break-all">bdb_pubsub_channels_max</span> | <span class="break-all">`sum by(bdb) (redis_server_pubsub_channels)`</span> | Highest value of count for the pub/sub channels with subscribed clients |
| <span class="break-all">node_available_memory</span> | <span class="break-all">`node_available_memory_bytes`</span> | Amount of free memory in the node (bytes) that is available for database provisioning |
| <span class="break-all">node_available_memory_no_overbooking</span> | <span class="break-all">`node_available_memory_no_overbooking_bytes`</span> | Available RAM in the node (bytes) without taking into account overbooking |
| <span class="break-all">node_avg_latency</span> | <span class="break-all">`sum by (proxy) (irate(endpoint_acc_latency[1m])) / sum by (proxy) (irate(endpoint_total_started_res[1m]))`</span> | Average latency of requests handled by endpoints on the node in milliseconds; returned only when there is traffic |
| <span class="break-all">node_bigstore_free</span> | <span class="break-all">`node_bigstore_free_bytes`</span> | Sum of free space of back-end flash (used by flash database's [BigRedis]) on all cluster nodes (bytes); returned only when BigRedis is enabled |
Not clear if this is supposed to be a plural or means something else.
| <span class="break-all">node_bigstore_free</span> | <span class="break-all">`node_bigstore_free_bytes`</span> | Sum of free space of back-end flash (used by flash database's [BigRedis]) on all cluster nodes (bytes); returned only when BigRedis is enabled |
| <span class="break-all">node_bigstore_free</span> | <span class="break-all">`node_bigstore_free_bytes`</span> | Sum of free space of back-end flash (used by flash databases [BigRedis]) on all cluster nodes (bytes); returned only when BigRedis is enabled |
| --------- | :------------------- | :---------- |
| <span class="break-all">listener_acc_latency</span> | <span class="break-all">N/A</span> | Accumulative latency (sum of the latencies) of all types of commands on the database. For the average latency, divide this value by listener_total_res |
| <span class="break-all">listener_acc_latency_max</span> | <span class="break-all">N/A</span> | Highest value of accumulative latency of all types of commands on the database |
| <span class="break-all">listener_acc_other_latency</span> | <span class="break-all">N/A</span> | Accumulative latency (sum of the latencies) of commands that are a type "other" on the database. For the average latency, divide this value by listener_other_res |
I guess "a type XXX" is OK for these but "of type XXX" is used elsewhere in the table (not a major problem!)
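For readers less familiar with the accumulated-latency rows in the hunk above: the descriptions say to divide the accumulated value by the matching response counter (`listener_total_res` or `listener_other_res`) to get an average. A minimal sketch of that arithmetic, assuming the two values have already been scraped; the function name and the choice to return `None` when no responses were counted are illustrative assumptions, not something the docs under review define:

```python
from typing import Optional


def average_latency(acc_latency: float, total_res: float) -> Optional[float]:
    """Return accumulated latency divided by the response count.

    Hypothetical helper: returns None when no responses were counted,
    so callers never divide by zero.
    """
    if total_res <= 0:
        return None
    return acc_latency / total_res


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(average_latency(1200.0, 400.0))
    print(average_latency(0.0, 0.0))
```

The result is in whatever unit the accumulated metric uses; the rows quoted here do not state one.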
| <span class="break-all">redis_keyspace_write_misses</span> | <span class="break-all">`redis_server_keyspace_write_misses`</span> | Number of write operations accessing a non-existing keyspace |
| <span class="break-all">redis_master_link_status</span> | <span class="break-all">`redis_server_master_link_status`</span> | Indicates if the replica is connected to its master |
| <span class="break-all">redis_master_repl_offset</span> | <span class="break-all">`redis_server_master_repl_offset`</span> | Number of bytes sent to replicas by the shard; calculate the throughput for a time period by comparing the value at different times |
| <span class="break-all">redis_master_sync_in_progress</span> | <span class="break-all">`redis_server_master_sync_in_progress`</span> | The master shard is synchronizing (1 true | 0 false) |
Looks like the vertical bar here creates an extra invisible column, so you can't see the "0 false" part.
| <span class="break-all">redis_master_sync_in_progress</span> | <span class="break-all">`redis_server_master_sync_in_progress`</span> | The master shard is synchronizing (1 true | 0 false) |
| <span class="break-all">redis_master_sync_in_progress</span> | <span class="break-all">`redis_server_master_sync_in_progress`</span> | The master shard is synchronizing (1 true; 0 false) |
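Since a few more rows in the file might have the same problem, here is a rough way to check for it. In GFM-style tables an unescaped `|` inside a cell is read as a column separator, and `\|` keeps it literal. The sketch below assumes the docs are rendered with GFM-style table rules; `count_cells` and `escape_cell` are hypothetical helper names, not part of any existing tooling in this repo:

```python
import re

# A pipe that is NOT preceded by a backslash acts as a column separator.
_UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")


def count_cells(row: str) -> int:
    """Count the cells in one pipe-delimited Markdown table row."""
    stripped = row.strip()
    # Drop the optional leading and trailing border pipes.
    if stripped.startswith("|"):
        stripped = stripped[1:]
    if stripped.endswith("|") and not stripped.endswith("\\|"):
        stripped = stripped[:-1]
    return len(_UNESCAPED_PIPE.split(stripped))


def escape_cell(text: str) -> str:
    """Escape unescaped pipes so the text stays inside a single cell."""
    return _UNESCAPED_PIPE.sub(r"\\|", text)


if __name__ == "__main__":
    bad = "| metric | synchronizing (1 true | 0 false) |"
    good = "| metric | " + escape_cell("synchronizing (1 true | 0 false)") + " |"
    print(count_cells(bad), count_cells(good))
```

Comparing `count_cells` for each row against the header's column count would flag rows like this one. Replacing the bar with a semicolon, as in the suggestion above, avoids the need for escaping altogether.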
… instead of PromQL on dedicated v2 metrics page
* DOC-3943 RS Fuya Fuya release notes placeholder * DOC-3271 Add background_op deprecation to RS Fuya Fuya release notes * DOC-3271 Deprecate background_op field in BDB object reference * DOC-3711 RS: REST API reference to change CM UI time zone (#508) * DOC-4057 RS: Add 7.6 to supported platforms table * DOC-3978 RS: Update log rotation default config to reduce filling disk space with logs (#507) * DOC-3978 Add log rotation enhancement to release notes * DOC-3943 DOC-4049 DOC-3606 DOC-4048 Add Cluster Manager UI enhancements to the 7.6 release notes * DOC-4051 Add 3 module feature sets to release notes * DOC-4051 Highlight 3 module feature sets as module management enhancements in release notes * Update wording of supported paths note in 7.6 release notes * DOC-3551 Add cluster_wd port 3349 to reserved ports (#544) * DOC-4053 Add module API and rladmin deprecations to 7.6 release notes * DOC-4053 Add module API and rladmin deprecations to 7.6 release notes index too * DOC-3954 RS: Add default_oss_sharding policy to cluster_settings REST API object and rladmin tune cluster (#548) * RS: Database tags in CM UI (#555) * DOC-4049 RS: Database tags in CM UI * Fix relref * DOC-4073 Known issue - ACL with linebreak can cause shard migration to fail * DOC-4073 Copy ACL with linebreak known issue to 7.6 release notes index * DOC-4057 Changed 7.6 release back to September * DOC-4069 Add triggers and functions end of preview note to release notes * DOC-3952 Add client-side caching to release notes * DOC-4054 Add some resolved issues to the release notes * DOC-4072 RS: Copy LDAP URI validation change to version changes section of 7.6.0 release notes * DOC-3943 Add another resolved issue to release notes * RS: Update DB config docs with CM UI changes (#560) * DOC-3606 RS: Update DB config to match CM UI changes * DOC-3606 RS: Update A-A and Replica Of DB config and screenshots to match CM UI changes * DOC-3606 RS: Update persistence and schedule backups to match CM UI DB config 
changes * DOC-3606 RS: Update OSS Cluster API and more screenshots to match CM UI DB config changes * DOC-3606 RS: Update quick DB screenshot to match CM UI DB config changes * DOC-4174 Remove legacy UI instruction for configuring A-A DBs * DOC-4174 Add database version to DB config to match the new UI * DOC-4174 Fix the clustering section of DB config to match the new UI * Fix issues caused by resolving merge conflict * DOC-3606 Add DB tags to configure DB index page * Feedback update for auto tiering quick start next steps * Feedback updates for replica HA * Feedback update for CM UI's supported web browsers * Feedback updates for Active-Active OSS Cluster API config * Feedback updates for OSS Cluster API and proxy policy - use primary shards where possible * DOC-3606 Feedback updates for enabling OSS Cluster API for Active-Active DBs * Feedback updates for DB persistence intro * RS: Edit module config for existing DB in the CM UI (#561) * DOC-4136 RS: Edit module config for existing DB in the CM UI * Update content/operate/oss_and_stack/stack-with-enterprise/bloom/config.md Co-authored-by: David Dougherty <[email protected]> * DOC-4136 Feedback updates for module config in RS --------- Co-authored-by: David Dougherty <[email protected]> * RS: 3 feature sets and module API changes and deprecations (#546) * DOC-4052 Document 3 module feature sets are bundled with RS 7.6 * DOC-4052 rladmin upgrade db also upgrades modules without requiring latest_with_modules * DOC-4052 /bdb/upgrade module API deprecations * DOC-4052 POST bdbs module_list API deprecations * DOC-4052 min_redis_version module schema deprecation * DOC-4052 POST /modules/upgrade API deprecation * DOC-4052 Add compatible_redis_version to rladmin status modules * DOC-4052 Deprecate rladmin upgrade module * DOC-4052 Remove instruction to download latest modules from the download center * DOC-4052 Update DB and module upgrade docs to account for 7.6.0 behavior changes and deprecations * RS: Update enable 
capabilities/modules screenshot * RS and RC: Client-side caching and compatibility (#550) * DOC-3951 RS: Add tracking_table_max_keys and default_tracking_table_max_keys_policy to REST API and rladmin for client-side caching * DOC-3951 RS: Add tracking-table-max-keys to Redis CE config settings compatibility reference * Fix new row format in bdb object table * DOC-3951 RS: Client-side caching and compatibility * DOC-3951 RS and RC: Mention client-side caching on compatibility index pages * Apply suggestions from code review Co-authored-by: andy-stark-redis <[email protected]> --------- Co-authored-by: andy-stark-redis <[email protected]> * DOC-4277 Change 7.6 to 7.8 in supported platforms and release notes draft * DOC-4277 Change release notes draft file names from 7.6 to 7.8 * DOC-4287 RS & RC support CLIENT TRACKING & CLIENT TRACKINGINFO commands (#685) * RS: Configure minimum password length (#690) * DOC-4279 RS: Add password_min_length to cluster REST API object reference * DOC-4279 RS: Document how to change minimum password length * Fix spacing in REST API example * Add alt text for screenshot * RS: Node actions in the CM UI (#554) * DOC-4048 RS: Update remove node instructions for new UI * DOC-4048 RS: Update verify node instructions for new UI * DOC-4048 Rename secondary node actions screenshot * DOC-4048 RS: Change node roles in new UI * Apply suggestions from code review Co-authored-by: mich-elle-luna <[email protected]> * Feedback suggestion to mark Cancel removal as a note * DOC-4048 Feedback updates for add a node * DOC-4048 Feedback update for remove a node --------- Co-authored-by: mich-elle-luna <[email protected]> * DOC-2693 RS: DB availability REST API reference (#562) * DOC-4281 RS: Add new syncer connection alert attributes to BDB object REST API reference (#679) * RS: Configure license expiration alert (#563) * DOC-4038 RS: Configure license expiration alert * DOC-4038 Add RS CMUI instructions and screenshot for license expiration alert config * 
DOC-4038 Add RS CMUI screenshot that shows cluster license expiration alert * DOC-4053 Feedback updates for module breaking changes and deprecations * DOC-4053 Copy feedback updates for module breaking changes and deprecations to 7.8 release notes index * DOC-4054 Add missing descriptions for 7.6 resolved issues * DOC-4308 Add 7.8 resolved issues to release notes * DOC-4309 Add 7.8 enhancements to release notes * DOC-4338 Ubuntu 18 not supported for RS 7.8+ * DOC-4338 Add deprecation update that Ubuntu 18 is not supported for RS 7.8+ * Change Redis Enterprise Software to Redis Software in 7.8 release notes * Redis DB version 6.0 no longer supported as of RS 7.8 * Legacy UI no longer supported as of RS 7.8 * DOC-4260 Add more features/enhancement details and links to 7.8 release notes * DOC-4260 Update module versions bundled with RS 7.8 * DOC-4308 Feedback update - emphasize RS123645 & add details of fixed behavior * DOC-4260 Update bundled Redis versions in 7.8 release notes security section * DOC-4260 Change 7.8.0 to 7.8.2 * RS: Configure query performance factor with CM UI (#711) * DOC-4276 RS: Configure query performance factor with CM UI * DOC-4276 RS: Edits for configure query performance factor with CM UI * DOC-4276 Remove restart proxies step from query performance factor CM UI config method * DOC-4260 Add DB availability API link and fix typo * DOC-4260 Placeholder for Redis DB v7.4 features in release notes * DOC-4394 RS: Add 7.8.2 bundled DB versions to upgrade DB table * DOC-4394 RS: Add 7.8.2 bundled DB versions to release notes * DOC-4388 Add Redis 7.4 breaking changes to RS 7.8 release notes * Added RS137396 & RS134238 to 7.8.2 resolved issues * Change RS 7.6.0 to RS 7.8.2 * Change RS 7.6.0 to RS 7.8.2 * DOC-4354 DOC-4309 Add new user manager role to RS 7.8.2 release notes * DOC-4376 Differentiate between locked out users & incorrect/expired passwords on the CM UI sign-in screen * DOC-4415 RS: Remove user auth_method deprecation * DOC-4260 RS: 
Changed 7.8 Oct release to Nov * DOC-4416 RS: Add metrics stream engine preview details to release notes * DOC-3953 Mention oss_sharding in version changes * DOC-4416 Feedback update - current list of v2 metrics is partial & new dashboards are planned for future releases * DOC-4416 Copy v1 Prometheus metrics deprecation to deprecations list * DOC-4058 RS: Add 7.8 release date and 7.4 EOL date * DOC-4505 RS: Add 7.8.x to cluster upgrade path table * DOC-4394 Edit bundled Redis DB versions in 7.8.2 release notes * DOC-4260 Add some Redis DB 7.4 highlights to new features in release notes * DOC-4191 RS: Add rebalance database shards REST API reference (#733) * DOC-4505 Feedback update - note/link to k8s supported distros/lifecycle * DOC-4512 RS: Updated reserved ports for 7.8.2 * DOC-4512 RS: Add new reserved ports to 7.8.2 release notes * DOC-4509 Add flush A-A DB to 7.8.2 release notes new CM UI enhancements * Update module versions for 7.8.2 release notes * Add relref link to rebalance API * Change RS 7.6 mentions to RS 7.8.2 * DOC-4512 Feedback updates for reserved ports * DOC-4308 Added 2 more resolved issues * RS: Check DB availability (#816) * DOC-4307 Create DB availability placeholder article * DOC-4307 Add DB availability API examples * DOC-4307 Feedback update to add use cases for load balancers vs. 
monitoring * DOC-4307 Feedback updates to add note about data availability & DB status link * DOC-4307 Feedback update to remove shard status table * RS: Add failover REST API reference (#732) * DOC-4280 RS: Add failover REST API reference * Edit parameter descriptions * Feedback update to add failover REST API use case * DOC-4505 Feedback update to copy k8s lifecycle note to other RS upgrade path docs * DOC-4509 RS: New CM UI instructions to flush Active-Active DB (#831) * DOC-4388 Feedback update to remove 7.4 breaking change #13326 from 7.8.2 release notes since it was a change between 7.4-RC2 and 7.4-RC1 * Add hash field expiration link to RS 7.8.2 release notes * Add HFE commands to compat. page (#844) * DOC-4055 RS: Add build number to 7.8.2 release notes * DOC-4055 RS: Add SHA256 checksums to 7.8.2 release notes * RS: Add DB stop_traffic & resume_traffic REST API request references (#750) * DOC-4035 RS: Add DB stop_traffic & resume_traffic REST API request references * DOC-4035 Feedback update to add use cases for stop_traffic and resume_traffic actions * DOC-4035 Feedback updates to change stop_traffic/resume_traffic API descriptions * RS: User manager role (#813) * DOC-4354 RS: Add new user manager role * DOC-4354 Add user_manager role to REST API permissions reference & fix table anchor format * DOC-4354 Add user_manager role to REST API permissions table * Fix table anchor style * 2nd attempt to fix merge conflict * Another fix for merge conflict * DOC-4354 Add user_manager to view_crdb_task REST API permissions * DOC-4354 Add user_manager to RBAC REST API requests * DOC-4354 Add user_manager role to other requests according to listed permissions * DOC-4354 RS: Add user manager role to CM UI permissions table * DOC-4354 Feedback update to remove duplicated page description * DOC-4415 Add missing auth_method options for user API object * Feedback update to add hash field expiration to highlights * Add missing links to RS 7.8.2 release notes * Update 
content/operate/rs/references/rest-api/requests/bdbs/availability.md Co-authored-by: mich-elle-luna <[email protected]> * Update content/operate/rs/references/rest-api/requests/bdbs/availability.md Co-authored-by: mich-elle-luna <[email protected]> * Fix screenshots to support RS 7.8 and 7.4 versioned docs * Add relref links for DB availability API in release notes * DOC-4418 RS: Enable TLS updates for new CM UI (#846) * RS: RHEL 9 FIPS mode support (#847) * DOC-4522 RS: RHEL 9 FIPS mode support * Fix inconsistent character in table * DOC-4260 Feedback update to add a missing word to client-side caching description in release notes * Update _index.md (#852) * Update _index.md Added Metrics Stream Engine data Still need to link the transition plan here and @AlonMagrafta to review * alons updates * Update _index.md added the API spec * Apply suggestions from code review Co-authored-by: mich-elle-luna <[email protected]> * Update content/operate/rs/clusters/monitoring/_index.md Co-authored-by: mich-elle-luna <[email protected]> --------- Co-authored-by: Alon Magrafta <[email protected]> Co-authored-by: Rachel Elledge <[email protected]> Co-authored-by: mich-elle-luna <[email protected]> * DOC-4055 Update RS 7.8.2 build number * DOC-4055 Update RS 7.8.2 checksums for new build * RS: v2 Prometheus metrics (#540) * DOC-3945 v2 DB Prometheus metrics * Remove list-formatted metrics preview * DOC-3946 v2 node Prometheus metrics * DOC-3949 v2 cluster Prometheus metrics * DOC-3950 v2 proxy Prometheus metrics * DOC-3947 v2 replication Prometheus metrics * DOC-3948 v2 shard Prometheus metrics * Small copy edits and fixes * More small copy edits and fixes * Change v2 not supported messages to N/A in tables * DOC-4071 Separate v2 Prometheus metrics and transition tables * DOC-3552 Cluster watchdog Prometheus metrics * DOC-3944 Placeholder - plan to provide v2 metric names when available instead of PromQL on dedicated v2 metrics page * Table formatting * DOC-4294 Add latency 
histogram metrics to v2 Prometheus metrics * Remove PromQL from dedicated v2 metrics page * DOC-4294 Moved example PromQL for latency histogram metrics to description column * DOC-4417 RS: Update v2 cert metric with new name * DOC-3944 Feedback update to remove a few v2 metrics * Update RS version to 7.8.2 * DOC-3944 Feedback updates for v2 metrics * DOC-3944 Feedback update for title of v1 to v2 metrics transition doc * DOC-3944 Add v2 replication metrics * DOC-3944 Add v2 DB metrics * DOC-3944 Change v2 metric column name * DOC-3944 Fix endpoint format & add link to metrics transition tables --------- Co-authored-by: David Dougherty <[email protected]> Co-authored-by: andy-stark-redis <[email protected]> Co-authored-by: mich-elle-luna <[email protected]> Co-authored-by: Maayan Agranat <[email protected]> Co-authored-by: Alon Magrafta <[email protected]>
DOC-3944