SUMO-251500: Adding monitor's information to OTEL Apps Set2

chetanchoudhary-sumo · chetanchoudhary-sumo · commit fbf50070e374 · 2024-12-13T12:54:06.000+05:30
diff --git a/docs/integrations/databases/opentelemetry/couchbase-opentelemetry.md b/docs/integrations/databases/opentelemetry/couchbase-opentelemetry.md
@@ -226,3 +226,20 @@ Use this dashboard to:
 - To understand user behavior accessing clusters and servers through Rest API.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Couchbase-OpenTelemetry/Couchbase-HTTP-Access.png' alt="Access" />
+
+## Create monitors for Couchbase app
+
+import CreateMonitors from '../../../reuse/apps/create-monitors.md';
+
+<CreateMonitors/>
+
+### Couchbase alerts
+
+| Name | Description | Alert Condition | Recover Condition |
+|:--|:--|:--|:--|
+| `Couchbase - Bucket Not Ready` | This alert is triggered when a bucket in the Couchbase cluster is not ready. | Count `>` 0 | Count `<=` 0 |
+| `Couchbase - High Latency HTTP Requests` | This alert is triggered when the average latency of HTTP requests to Couchbase is high. | Count `>` 1000 | Count `<=` 1000 |
+| `Couchbase - Node Down` | This alert is triggered when a node in the Couchbase cluster is down. | Count `>` 0 | Count `<=` 0 |
+| `Couchbase - Node Not Respond` | This alert is triggered when a node in the Couchbase cluster does not respond too many times. | Count `>=` 10 | Count `<` 10 |
+| `Couchbase - Too Many Error Queries on Buckets` | This alert is triggered when there are too many error queries on a bucket in a Couchbase cluster. | Count `>=` 1000 | Count `<` 1000 |
+| `Couchbase - Too Many Login Failures` | This alert is triggered when there are too many login failures to a node in a Couchbase cluster. | Count `>=` 1000 | Count `<` 1000 |
diff --git a/docs/integrations/databases/opentelemetry/mariadb-opentelemetry.md b/docs/integrations/databases/opentelemetry/mariadb-opentelemetry.md
@@ -248,3 +248,19 @@ Use this dashboard to:
 - Examine slow query trends to determine if there are periodic performance bottlenecks in your database clusters.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/MariaDB-OpenTelemetry/MariaDB-Slow-Queries.png' alt="Slow Queries" />
+
+## Create monitors for MariaDB app
+
+import CreateMonitors from '../../../reuse/apps/create-monitors.md';
+
+<CreateMonitors/>
+
+### MariaDB alerts
+
+| Name | Description | Alert Condition | Recover Condition |
+|:--|:--|:--|:--|
+| `MariaDB - Critical Errors` | This alert is triggered when there are critical database errors. | Count `>` 10 | Count `<=` 10 |
+| `MariaDB - Excessive Slow Query Detected` | This alert is triggered when the average time to execute a query is more than 15 seconds over a 5-minute interval. | Count `>=` 1 | Count `<` 1 |
+| `MariaDB - Failed Login Attempts` | This alert is triggered when there are excessive failed login attempts in a short period. | Count `>=` 1 | Count `<` 1 |
+| `MariaDB - Instance down` | This alert is triggered when we detect that a MariaDB instance is down. | Count `>=` 1 | Count `<` 1 |
+| `MariaDB - Replication Failure` | This alert is triggered when there are replication failures. | Count `>=` 1 | Count `<` 1 |
diff --git a/docs/integrations/microsoft-azure/opentelemetry/sql-server-linux-opentelemetry.md b/docs/integrations/microsoft-azure/opentelemetry/sql-server-linux-opentelemetry.md
@@ -184,3 +184,21 @@ Use this dashboard to:
 -   Monitor any errors and warnings.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/SQLServer-Linux-OpenTelemetry/SQL-Server-Operations.png' alt="Operations" />
+
+## Create monitors for SQL Server Linux app
+
+import CreateMonitors from '../../../reuse/apps/create-monitors.md';
+
+<CreateMonitors/>
+
+### SQL Server Linux alerts
+
+| Name | Description | Alert Condition | Recover Condition |
+|:--|:--|:--|:--|
+| `SQL Server - AppDomain` | This alert is triggered when we detect AppDomain related issues in your SQL Server instance. | Count `>=` 1 | Count `<` 1 |
+| `SQL Server - Backup Fail` | This alert is triggered when we detect that the SQL Server backup failed. | Count `>=` 1 | Count `<` 1 |
+| `SQL Server - Deadlock` | This alert is triggered when we detect deadlocks in a SQL Server instance. | Count `>` 5 | Count `<=` 5 |
+| `SQL Server - Instance Down` | This alert is triggered when we detect that the SQL Server instance is down for 5 minutes. | Count `>` 0 | Count `<=` 0 |
+| `SQL Server - Insufficient Space` | This alert is triggered when the SQL Server instance cannot allocate a new page for a database because of insufficient disk space in the filegroup. | Count `>` 0 | Count `<=` 0 |
+| `SQL Server - Login Fail` | This alert is triggered when we detect that a user cannot log in to SQL Server. | Count `>=` 1 | Count `<` 1 |
+| `SQL Server - Mirroring Error` | This alert is triggered when we detect a SQL Server mirroring error. | Count `>=` 1 | Count `<` 1 |
diff --git a/docs/integrations/web-servers/opentelemetry/squid-proxy-opentelemetry.md b/docs/integrations/web-servers/opentelemetry/squid-proxy-opentelemetry.md
@@ -194,3 +194,18 @@ The **The Squid Proxy - HTTP Response Analysis** dashboard provides insights int
 The **Squid Proxy - Quality of Service** dashboard provides insights into latency, the response time of requests according to HTTP action, and the response time according to location.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Squid-Proxy-OpenTelemetry/Squid-Proxy-Quality-of-Service.png' alt="Quality of Service" />
+
+## Create monitors for Squid Proxy app
+
+import CreateMonitors from '../../../reuse/apps/create-monitors.md';
+
+<CreateMonitors/>
+
+### Squid Proxy alerts
+
+| Name | Description | Alert Condition | Recover Condition |
+|:--|:--|:--|:--|
+| `Squid Proxy - High Client (HTTP 4xx) Error Rate` | This alert is triggered when there are too many HTTP requests (>5%) with a response status of 4xx. | Count `>` 0 | Count `<=` 0 |
+| `Squid Proxy - High Denied Request` | This alert is triggered when there are too many denied HTTP requests (>5%). | Count `>` 0 | Count `<=` 0 |
+| `Squid Proxy - High Response Time` | This alert is triggered when requests are taking too long to process. | Count `>` 20 | Count `<=` 20 |
+| `Squid Proxy - High Server (HTTP 5xx) Error Rate` | This alert is triggered when there are too many HTTP requests (>5%) with a response status of 5xx. | Count `>` 0 | Count `<=` 0 |
diff --git a/docs/integrations/web-servers/opentelemetry/varnish-opentelemetry.md b/docs/integrations/web-servers/opentelemetry/varnish-opentelemetry.md
@@ -184,3 +184,17 @@ The **Varnish - Visitor Traffic Insight** dashboard provides detailed informatio
 The **Varnish - Web Server Operations** dashboard provides a high-level view combined with detailed information on the top ten bots, geographic locations and data for clients with high error rates, server errors over time, and non 200 response code status codes. Dashboard panels also show information on server error logs, error log levels, error responses by server, and the top URIs responsible for 404 responses.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Varnish-OpenTelemetry/Varnish-Web-Server-Operations.png' alt="Web Server Operations" />
+
+## Create monitors for Varnish app
+
+import CreateMonitors from '../../../reuse/apps/create-monitors.md';
+
+<CreateMonitors/>
+
+### Varnish alerts
+
+| Name | Description | Alert Condition | Recover Condition |
+|:--|:--|:--|:--|
+| `Varnish - Access from Highly Malicious Sources` | This alert is triggered when Varnish is accessed from highly malicious IP addresses. | Count `>` 0 | Count `<=` 0 |
+| `Varnish - High 4XX Error Rate` | This alert is triggered when there are too many HTTP requests (>5%) with a response status of 4xx. | Count `>` 5 | Count `<=` 5 |
+| `Varnish - High 5XX Error Rate` | This alert is triggered when there are too many HTTP requests (>5%) with a response status of 5xx. | Count `>` 5 | Count `<=` 5 |